LLVM/project b2c30ac llvm/include/llvm/Analysis DependenceAnalysis.h, llvm/lib/Analysis DependenceAnalysis.cpp

[DA] Rewrite formula in the Weak Zero SIV tests
Delta File
+67 -72 llvm/lib/Analysis/DependenceAnalysis.cpp
+8 -8 llvm/test/Analysis/DependenceAnalysis/weak-zero-siv-large-btc.ll
+4 -8 llvm/include/llvm/Analysis/DependenceAnalysis.h
+2 -6 llvm/test/Analysis/DependenceAnalysis/weak-zero-siv-overflow.ll
+2 -2 llvm/test/Analysis/DependenceAnalysis/weak-crossing-siv-large-btc.ll
+83 -96, 5 files

LLVM/project f7b1107 llvm/lib/Analysis IVDescriptors.cpp, llvm/test/Transforms/LoopVectorize minmax_reduction.ll float-minmax-instruction-flag.ll

[IVDescriptors] Remove function FMF attribute check for FP min/max reduction (#183523)

Remove the use of function attributes no-nans-fp-math and
no-signed-zeros-fp-math in FP min/max reduction detection. The required
fast-math flags nnan and nsz should be present on the intrinsic calls,
fcmp and select instructions themselves.
Delta File
+55 -57 llvm/test/Transforms/LoopVectorize/minmax_reduction.ll
+26 -47 llvm/lib/Analysis/IVDescriptors.cpp
+35 -36 llvm/test/Transforms/LoopVectorize/RISCV/reductions.ll
+15 -22 llvm/test/Transforms/LoopVectorize/float-minmax-instruction-flag.ll
+17 -18 llvm/test/Transforms/LoopVectorize/X86/reduction-fastmath.ll
+10 -11 llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-reduction.ll
+158 -191, 3 files not shown
+179 -213, 9 files

LLVM/project b4e01ca llvm/lib/Target/AMDGPU AMDGPUSubtarget.h

Remove unused getFlatOffsetBitWidth()
Delta File
+0 -2 llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
+0 -2, 1 files

LLVM/project d7d26e5 mlir/include/mlir/IR Region.h Operation.h, mlir/lib/Dialect/OpenACC/IR OpenACC.cpp

[mlir][IR] Add multi-type `getParentOfType` overloads
Delta File
+7 -23 mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp
+11 -0 mlir/include/mlir/IR/Region.h
+1 -8 mlir/lib/Dialect/OpenACC/Transforms/LegalizeDataValues.cpp
+8 -0 mlir/include/mlir/IR/Operation.h
+1 -7 mlir/lib/Dialect/OpenACC/Utils/OpenACCUtils.cpp
+2 -4 mlir/lib/Dialect/SparseTensor/Transforms/Utils/CodegenUtils.cpp
+30 -42, 6 files

LLVM/project 265c1f4 llvm/lib/Transforms/Vectorize LoopVectorize.cpp, llvm/test/Transforms/LoopVectorize max-interleave-factor-debug.ll

[LV] Add debug print for TTI.MaxInterleaveFactor (NFC) (#183309)

As it's not currently visible in the debug output.

---------

Co-authored-by: Sander de Smalen <sander.desmalen at arm.com>
Delta File
+24 -0 llvm/test/Transforms/LoopVectorize/max-interleave-factor-debug.ll
+2 -0 llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+26 -0, 2 files

LLVM/project 7baa97d mlir/include/mlir/IR Region.h Operation.h, mlir/lib/Dialect/OpenACC/IR OpenACC.cpp

[mlir][IR] Add multi-type `getParentOfType` overloads
Delta File
+7 -23 mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp
+11 -0 mlir/include/mlir/IR/Region.h
+1 -8 mlir/lib/Dialect/OpenACC/Transforms/LegalizeDataValues.cpp
+8 -0 mlir/include/mlir/IR/Operation.h
+1 -7 mlir/lib/Dialect/OpenACC/Utils/OpenACCUtils.cpp
+2 -4 mlir/lib/Dialect/SparseTensor/Transforms/Utils/CodegenUtils.cpp
+30 -42, 6 files

LLVM/project daaedf9 llvm/lib/Analysis ValueTracking.cpp, llvm/test/Transforms/Attributor nofpclass-bitcast.ll

ValueTracking: Teach computeKnownFPClass to look at bitcast + integer max

The returned class will still be one of the bitpatterns.

This pattern is used in the ROCm device libraries in assorted functions, e.g.,
https://github.com/ROCm/llvm-project/blob/amd-staging/amd/device-libs/ocml/src/rlen3F.cl#L20

I believe it is blocking the elimination of finite checks in some of the more
complex functions.
Delta File
+213 -0 llvm/test/Transforms/Attributor/nofpclass-bitcast.ll
+20 -1 llvm/lib/Analysis/ValueTracking.cpp
+233 -1, 2 files
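
To illustrate why the result of a bitcast plus integer max is still one of the input bit patterns, here is a minimal standalone sketch. It is not the ValueTracking code and not the ROCm source; the helper names `floatToBits`, `bitsToFloat`, and `maxViaIntegerBits` are hypothetical, and the use of a signed 32-bit comparison is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret a float's bits as a signed 32-bit integer.
inline int32_t floatToBits(float F) {
  static_assert(sizeof(float) == sizeof(int32_t), "expects a 32-bit float");
  int32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  return Bits;
}

// Hypothetical helper: reinterpret a 32-bit integer's bits as a float.
inline float bitsToFloat(int32_t Bits) {
  float F;
  std::memcpy(&F, &Bits, sizeof(F));
  return F;
}

// Integer max over the two bit patterns, cast back to float.
// std::max returns one of its arguments unchanged, so the resulting float has
// exactly the bit pattern of A or of B. Its floating-point class (finite,
// infinity, NaN, ...) is therefore the class of one of the two inputs.
inline float maxViaIntegerBits(float A, float B) {
  return bitsToFloat(std::max(floatToBits(A), floatToBits(B)));
}
```

For example, if both inputs are known to be finite, the value returned by `maxViaIntegerBits` is finite as well, because no new bit pattern is ever created.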

LLVM/project ff32bc2 mlir/include/mlir/IR Region.h Operation.h, mlir/lib/Dialect/OpenACC/IR OpenACC.cpp

[mlir][IR] Add multi-type `getParentOfType` overloads
Delta File
+7 -23 mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp
+9 -0 mlir/include/mlir/IR/Region.h
+1 -8 mlir/lib/Dialect/OpenACC/Transforms/LegalizeDataValues.cpp
+8 -0 mlir/include/mlir/IR/Operation.h
+1 -7 mlir/lib/Dialect/OpenACC/Utils/OpenACCUtils.cpp
+2 -4 mlir/lib/Dialect/SparseTensor/Transforms/Utils/CodegenUtils.cpp
+28 -42, 6 files

LLVM/project 1c33275 mlir/include/mlir/Dialect/SPIRV/IR SPIRVTosaOps.td

[mlir][spirv] Introduce a base class for spirv.TOSA convolution ops (#183751)

Signed-off-by: Davide Grohmann <davide.grohmann at arm.com>
Delta File
+38 -108 mlir/include/mlir/Dialect/SPIRV/IR/SPIRVTosaOps.td
+38 -108, 1 files

LLVM/project 65e8bf6 llvm/include/llvm/Analysis DependenceAnalysis.h, llvm/lib/Analysis DependenceAnalysis.cpp

[DA] Rewrite formula in the Weak Zero SIV tests
Delta File
+67 -72 llvm/lib/Analysis/DependenceAnalysis.cpp
+4 -8 llvm/include/llvm/Analysis/DependenceAnalysis.h
+4 -4 llvm/test/Analysis/DependenceAnalysis/weak-zero-siv-large-btc.ll
+2 -6 llvm/test/Analysis/DependenceAnalysis/weak-zero-siv-overflow.ll
+2 -2 llvm/test/Analysis/DependenceAnalysis/weak-crossing-siv-large-btc.ll
+79 -92, 5 files

LLVM/project 14bcb1a bolt/include/bolt/Core BinaryContext.h, bolt/lib/Core BinaryFunction.cpp

[BOLT] Make sure IOAddressMap exist before lookup (NFC) (#183184)

`BinaryFunction::translateInputToOutputAddress()` contains fallback
logic for the case where querying `IOAddressMap` doesn't yield an output
address. Because this function can be called in scenarios where
`IOAddressMap` has not been set up, we should check that the map actually
exists before the lookup.
Delta File
+4 -2 bolt/lib/Core/BinaryFunction.cpp
+1 -0 bolt/include/bolt/Core/BinaryContext.h
+5 -2, 2 files
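
The "check that the map exists, then look up, then fall back" order described in this commit can be sketched in isolation as follows. This is a toy model, not BOLT code: `ToyContext`, `translateAddress`, and the use of `std::optional` to represent a map that may not have been set up are all assumptions made for illustration.

```cpp
#include <cstdint>
#include <map>
#include <optional>

// Hypothetical stand-in for a context whose address map may not be set up.
struct ToyContext {
  std::optional<std::map<uint64_t, uint64_t>> IOAddressMap;
};

// Translate an input address to an output address.
// 1. Only query the map if it exists at all.
// 2. If the map has no entry for the address, use the fallback result.
inline uint64_t translateAddress(const ToyContext &Ctx, uint64_t Input,
                                 uint64_t Fallback) {
  if (Ctx.IOAddressMap) {
    auto It = Ctx.IOAddressMap->find(Input);
    if (It != Ctx.IOAddressMap->end())
      return It->second;
  }
  return Fallback;
}
```

With a default-constructed `ToyContext`, the function never touches the map and goes straight to the fallback, which is the situation the commit message is guarding against.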

LLVM/project b4b32e8 bolt/runtime common.h instr.cpp

[BOLT][instr] Disable stderr diagnostic output when targeting Android (#183185)

Disable all stderr diagnostic output on Android, since there is typically
no terminal on which to read diagnostic messages. The `noinline` annotation
keeps the inlining decisions the same before and after this change. On AArch64,
the `.text` section of the instr runtime library is now ~4.8 KB smaller.
Delta File
+13 -1 bolt/runtime/common.h
+13 -0 bolt/runtime/instr.cpp
+26 -1, 2 files

LLVM/project 3270bbf bolt/runtime instr.cpp

[BOLT][instr] Make instrumentation counter reset thread safe (#183186)

Use `GlobalWriteProfileMutex` to synchronize between data reset and
dump. Between static counter reset and increment, the reset uses an atomic
store: the counter increment sequence inserted into user code already
takes care of thread safety, so we only need to make sure the counter
reset code is also thread safe (no torn writes to a counter).
Delta File
+10 -2 bolt/runtime/instr.cpp
+10 -2, 1 files
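
The two synchronization rules in this commit message (a mutex between reset and dump, and an atomic store in reset so it cannot tear against increments) can be sketched with standard-library primitives. This is only an illustration under stated assumptions: `ToyProfile`, its member names, and the fixed counter count are hypothetical, and the real runtime's data structures and primitives may look quite different.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical stand-in for the instrumentation counters.
struct ToyProfile {
  std::mutex WriteProfileMutex;                    // serializes reset vs. dump
  std::array<std::atomic<uint64_t>, 4> Counters{}; // zero-initialized counters

  // Models the increment sequence inserted into user code: already atomic.
  void increment(std::size_t Index) {
    Counters[Index].fetch_add(1, std::memory_order_relaxed);
  }

  // Reset takes the mutex so it cannot interleave with dump, and uses an
  // atomic store so a concurrent increment never observes a torn write.
  void reset() {
    std::lock_guard<std::mutex> Lock(WriteProfileMutex);
    for (auto &Counter : Counters)
      Counter.store(0, std::memory_order_relaxed);
  }

  // Dump takes the same mutex and returns a snapshot of all counters.
  std::vector<uint64_t> dump() {
    std::lock_guard<std::mutex> Lock(WriteProfileMutex);
    std::vector<uint64_t> Snapshot;
    for (auto &Counter : Counters)
      Snapshot.push_back(Counter.load(std::memory_order_relaxed));
    return Snapshot;
  }
};
```

Note that `increment` deliberately does not take the mutex: as in the commit message, increments rely on atomicity alone, and only reset and dump are serialized against each other.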

LLVM/project b8d0bb2 clang/lib/StaticAnalyzer/Checkers/WebKit NoDeleteChecker.cpp PtrTypesSemantics.cpp, clang/test/Analysis/Checkers/WebKit nodelete-annotation.cpp

[WebKit checkers] Trivial function analysis ignores some nodelete annotation (#183970)

This PR fixes a bug where TrivialFunctionAnalysis can ignore a nodelete
annotation that is set on some but not all declarations of a function.
Unlike alpha.webkit.NoDeleteChecker, which checks the annotation on any
declaration, it did not check the annotation on prior declarations. The
fix replaces isNoDeleteFunction with NoDeleteChecker's
hasNoDeleteAnnotation.
Delta File
+1 -20 clang/lib/StaticAnalyzer/Checkers/WebKit/NoDeleteChecker.cpp
+20 -1 clang/lib/StaticAnalyzer/Checkers/WebKit/PtrTypesSemantics.cpp
+10 -0 clang/test/Analysis/Checkers/WebKit/nodelete-annotation.cpp
+31 -21, 3 files

LLVM/project 6d82f14 clang-tools-extra/clang-tidy/performance UseStdMoveCheck.cpp UseStdMoveCheck.h, clang-tools-extra/docs ReleaseNotes.rst

[clang-tidy] New performance linter: performance-use-std-move (#179467)

This linter suggests calls to ``std::move`` when a costly copy would
happen otherwise. It does not try to suggest ``std::move`` when the move is
valid but obviously not profitable (e.g. for trivially movable types).

This is a very simple version that only considers terminating basic
blocks. Further work will extend the approach through the control flow
graph.

It has already been used successfully on llvm/lib to submit bugs
#178174, #178169, #178176, #178172, #178175, #178180, #178178, #178177,
#178179, #178173 and #178167, and on the Firefox codebase to submit most of
the dependencies of bug https://bugzilla.mozilla.org/show_bug.cgi?id=2012658
Delta File
+288 -0 clang-tools-extra/test/clang-tidy/checkers/performance/use-std-move.cpp
+122 -0 clang-tools-extra/clang-tidy/performance/UseStdMoveCheck.cpp
+35 -0 clang-tools-extra/clang-tidy/performance/UseStdMoveCheck.h
+23 -0 clang-tools-extra/docs/clang-tidy/checks/performance/use-std-move.rst
+6 -0 clang-tools-extra/docs/ReleaseNotes.rst
+2 -0 clang-tools-extra/clang-tidy/performance/PerformanceTidyModule.cpp
+476 -0, 2 files not shown
+478 -0, 8 files
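
As a rough illustration of the kind of code such a check is aimed at, the sketch below shows a costly copy of a value that is never used again, next to the version that moves instead. The function names are hypothetical, and whether the check reports exactly this pattern is an assumption; the check's own documentation and tests are the authority on what it diagnoses.

```cpp
#include <string>
#include <utility>
#include <vector>

// Before: Name is copied into the vector even though this is its last use
// in the function, so the copy does work that a move would avoid.
inline void appendByCopy(std::vector<std::string> &Out, std::string Name) {
  Out.push_back(Name);
}

// After: the last use of Name is wrapped in std::move, so the string's
// buffer can be transferred into the vector instead of being duplicated.
inline void appendByMove(std::vector<std::string> &Out, std::string Name) {
  Out.push_back(std::move(Name));
}
```

Both functions leave the same contents in `Out`; only the cost of getting there differs. For a trivially copyable parameter such as an `int`, the move would be valid but bring no benefit, which matches the commit's note about not suggesting unprofitable moves.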

LLVM/project f1620e4 clang/lib/Basic/Targets NVPTX.h AMDGPU.h, clang/test/Misc nvptx.languageOptsOpenCL.cl amdgcn.languageOptsOpenCL.cl

[OpenCL] Enable __cl_clang_function_scope_local_variables for AMDGPU and NVPTX targets (#183892)

I'd like to use this extension in our downstream SYCL compiler to
implement the __clc__group_scratch helper function, allowing us to
replace .ll files with .cl files for the two targets:
https://github.com/intel/llvm/blob/sycl/libclc/libspirv/lib/amdgcn-amdhsa/group/collectives_helpers.ll
https://github.com/intel/llvm/blob/sycl/libclc/libspirv/lib/ptx-nvidiacl/group/collectives_helpers.ll
Delta File
+5 -0 clang/test/Misc/nvptx.languageOptsOpenCL.cl
+5 -0 clang/test/Misc/amdgcn.languageOptsOpenCL.cl
+1 -0 clang/lib/Basic/Targets/NVPTX.h
+1 -0 clang/lib/Basic/Targets/AMDGPU.h
+12 -0, 4 files

LLVM/project ab1d59e clang/docs ClangFormatStyleOptions.rst, clang/include/clang/Format Format.h

[clang-format] Allow InheritParentConfig to accept a directory (#182791)

Add support for `BasedOnStyle: InheritParentConfig=<directory-path>` in
config files to redirect inheritance to the `.clang-format` or
`_clang-format` file in the `<directory-path>` directory.

Closes #107808
Delta File
+56 -17 clang/lib/Format/Format.cpp
+57 -0 clang/unittests/Format/ConfigParseTest.cpp
+6 -4 clang/docs/ClangFormatStyleOptions.rst
+1 -1 clang/include/clang/Format/Format.h
+120 -22, 4 files
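
To make the new `BasedOnStyle` form concrete, here is a small standalone sketch of how a value such as `InheritParentConfig=<directory-path>` could be split into its directory part and mapped to the two candidate file names mentioned above. This is not the clang-format implementation: `inheritDirectory` and `candidateConfigFiles` are hypothetical helpers, and the order in which the two file names are listed is an assumption.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: if the style value has the form
// "InheritParentConfig=<directory-path>", return the directory part.
// A plain "InheritParentConfig" (or any other value) yields std::nullopt.
inline std::optional<std::string>
inheritDirectory(const std::string &BasedOnStyle) {
  const std::string Prefix = "InheritParentConfig=";
  if (BasedOnStyle.compare(0, Prefix.size(), Prefix) != 0)
    return std::nullopt;
  return BasedOnStyle.substr(Prefix.size());
}

// Hypothetical helper: the two config file names that may live in Directory.
inline std::vector<std::string>
candidateConfigFiles(const std::string &Directory) {
  return {Directory + "/.clang-format", Directory + "/_clang-format"};
}
```

For instance, `inheritDirectory("InheritParentConfig=../shared")` yields `../shared`, and `candidateConfigFiles("../shared")` lists `../shared/.clang-format` and `../shared/_clang-format`.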

LLVM/project 52a9eb3 .github/workflows/upload-release-artifact action.yml

[Github] Add TODO around actions/attest
Delta File
+2 -0 .github/workflows/upload-release-artifact/action.yml
+2 -0, 1 files

LLVM/project 8fff1c0 .github/workflows/upload-release-artifact action.yml

Update actions/attest-build-provenance action to v4 (#184051)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [actions/attest-build-provenance](https://redirect.github.com/actions/attest-build-provenance) | action | major | `v3.1.0` → `v4.1.0` |

---

> [!WARNING]
> Some dependencies could not be looked up. Check the [Dependency
Dashboard](../issues/160328) for more information.

---

### Release Notes

    [120 lines not shown]
Delta File
+1 -1 .github/workflows/upload-release-artifact/action.yml
+1 -1, 1 files

LLVM/project 686987a llvm/test/CodeGen/AMDGPU atomic_optimizations_global_pointer.ll atomic_optimizations_local_pointer.ll, llvm/test/Transforms/InstCombine/AMDGPU mbcnt.ll canonicalize-add-to-gep.ll

ValueTracking/AMDGPU: handle mbcnt in computeKnownBitsFromOperator (#183229)

This helps canonicalize some address calculations, which in turn should
help fold immediates into memory load instructions in the backend.

The order changes to v_mad_u32_u24 are just because
@llvm.amdgcn.mul.u24.i32 is used in codegen prepare after this change.
They do not really change anything important.
Delta File
+61 -0 llvm/test/Transforms/InstCombine/AMDGPU/mbcnt.ll
+23 -22 llvm/test/CodeGen/AMDGPU/atomic_optimizations_global_pointer.ll
+19 -18 llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll
+29 -0 llvm/test/Transforms/InstCombine/AMDGPU/canonicalize-add-to-gep.ll
+7 -18 llvm/test/Transforms/InstCombine/AMDGPU/llvm.amdgcn.wave.shuffle.ll
+12 -12 llvm/test/CodeGen/AMDGPU/insert_waitcnt_for_precise_memory.ll
+151 -70, 6 files not shown
+197 -108, 12 files

LLVM/project e95dabe mlir/python/mlir/dialects ext.py, mlir/test/python/dialects ext.py

[MLIR][Python] Support attribute definitions in Python-defined dialects (#183907)

This PR is quite similar to
https://github.com/llvm/llvm-project/pull/182805.

It adds basic support for attribute definitions in Python-defined
dialects, including:

- IRDL codegen for attribute definitions
- Attr builders like `MyAttr.get(..)` and attr parameter accessors (e.g.
`my_attr.param1`)
- Use Python-defined attrs in Python-defined operations

Assisted by GitHub Copilot.
Delta File
+111 -4 mlir/python/mlir/dialects/ext.py
+89 -0 mlir/test/python/dialects/ext.py
+200 -4, 2 files

LLVM/project 8774da8 mlir/lib/Dialect/XeGPU/Transforms XeGPULayoutImpl.cpp, mlir/test/Dialect/XeGPU xegpu-wg-to-sg-unify-ops.mlir subgroup-distribute.mlir

[MLIR][XeGPU] Preserve anchor layouts in recoverTemporaryLayout (#182186)

Delta File
+13 -1 mlir/test/Dialect/XeGPU/xegpu-wg-to-sg-unify-ops.mlir
+6 -6 mlir/test/Dialect/XeGPU/subgroup-distribute.mlir
+1 -1 mlir/lib/Dialect/XeGPU/Transforms/XeGPULayoutImpl.cpp
+20 -8, 3 files

LLVM/project f800218 libunwind/src libunwind.cpp

[libunwind][PAC] Defang ptrauth's PC in valid CFI range abort

It turns out making the CFI check a release mode abort causes many,
if not the majority, of JITs to fail during unwinding as they do not
set up CFI sections for their generated code. As a result any JITs
that do nominally support unwinding (and catching) through their JIT
or assembly frames trip this abort.

rdar://170862047
Delta File
+11 -17 libunwind/src/libunwind.cpp
+11 -17, 1 files

LLVM/project 81872e7 clang/test/CodeGenOpenCL cl-uniform-wg-size.cl

[NFC] Fix check lines for `clang/test/CodeGenOpenCL/cl-uniform-wg-size.cl` on Darwin (#184042)

Delta File
+6 -6 clang/test/CodeGenOpenCL/cl-uniform-wg-size.cl
+6 -6, 1 files

LLVM/project e6aafae polly/lib/External/isl isl_ast_build_expr.c GIT_HEAD_ID, polly/lib/External/isl/test_inputs/codegen polly3.st polly3.c

[Polly] Update isl to isl-0.27-86-gcf471c16 (#184044)

Update isl to include
https://repo.or.cz/isl.git/commit/d1b49851aca59c1edd01cb1dc97674e6d79d07af
which fixes #180958

Closes #180958

Thanks @skimo-openhub for the fix and @thapgua for the bug report.
Delta File
+65 -9 polly/lib/External/isl/isl_ast_build_expr.c
+5 -0 polly/lib/External/isl/test_inputs/codegen/polly3.st
+2 -0 polly/lib/External/isl/test_inputs/codegen/polly3.c
+1 -1 polly/lib/External/isl/GIT_HEAD_ID
+73 -10, 4 files

LLVM/project 4287382 clang/test/CodeGenOpenCL cl-uniform-wg-size.cl

[NFC] Fix check lines for `clang/test/CodeGenOpenCL/cl-uniform-wg-size.cl` on Darwin
Delta File
+6 -6 clang/test/CodeGenOpenCL/cl-uniform-wg-size.cl
+6 -6, 1 files

LLVM/project d947f8f clang/docs ReleaseNotes.rst, clang/include/clang/Basic DiagnosticSemaKinds.td

[clang][Sema] fix crash on __type_pack_element with dependent packs (GH180307) (#180407)

Dependent pack expansions in __type_pack_element can result in
single-element template argument lists. When performing semantic
analysis for these builtins, the compiler needs to account for the
dependent expansions and handle them without triggering strict size
assertions. The patch adds this analysis and ensures we either defer
evaluation for dependent cases or report clear out-of-bounds diagnostics
instead of crashing.
AI was used for test generation and CI debugging.
Fixes #180307
Delta File
+24 -0 clang/test/SemaCXX/builtin_templates_invalid_parameters.cpp
+11 -5 clang/lib/Sema/SemaTemplate.cpp
+1 -1 clang/include/clang/Basic/DiagnosticSemaKinds.td
+1 -0 clang/docs/ReleaseNotes.rst
+37 -6, 4 files

LLVM/project f05d2e8 clang/test/CodeGenOpenCL amdgpu-enqueue-kernel.cl cl-uniform-wg-size.cl, llvm/lib/IR AutoUpgrade.cpp

[AMDGPU] Make uniform-work-group-size a valueless attribute (#183925)

The "uniform-work-group-size" function attribute previously took a
string value of "true" or "false". Since presence alone can convey the
"true" semantics and absence can convey "false", the value is
unnecessary.

This patch converts it to a valueless string attribute: presence
indicates true, absence indicates false. For backward compatibility,
auto-upgrade logic is added in both UpgradeAttributes (bitcode) and
UpgradeFunctionAttributes: if the old value is "true", the attribute is
kept without a value; if "false", the attribute is removed.
Delta File
+24 -26 clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl
+14 -13 clang/test/CodeGenOpenCL/cl-uniform-wg-size.cl
+21 -0 llvm/lib/IR/AutoUpgrade.cpp
+21 -0 llvm/test/Bitcode/upgrade-uniform-work-group-size.ll
+4 -9 llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp
+5 -6 llvm/test/CodeGen/AMDGPU/uniform-work-group-propagate-attribute.ll
+89 -54, 45 files not shown
+167 -134, 51 files
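
The upgrade rule in this commit message ("true" keeps the attribute without a value, "false" removes it) can be modeled on a plain string map. This is a sketch of the rule only, not the AutoUpgrade code: `AttrMap` and `upgradeUniformWorkGroupSize` are hypothetical, and representing a valueless attribute as a key with an empty string is an assumption made for illustration.

```cpp
#include <map>
#include <string>

// Hypothetical model of string function attributes: name -> value, where an
// empty value stands for a valueless attribute.
using AttrMap = std::map<std::string, std::string>;

// Apply the upgrade rule for the old "true"/"false" encoding.
inline void upgradeUniformWorkGroupSize(AttrMap &Attrs) {
  auto It = Attrs.find("uniform-work-group-size");
  if (It == Attrs.end())
    return; // attribute absent: nothing to upgrade
  if (It->second == "true")
    It->second.clear(); // keep the attribute, drop the value
  else if (It->second == "false")
    Attrs.erase(It); // absence now means "false"
}
```

After the upgrade, a consumer only needs to test for the attribute's presence; an attribute that is already valueless is left untouched.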

LLVM/project 89498e2 libunwind/src libunwind.cpp

[libunwind][PAC] Defang ptrauth's PC in valid CFI range abort

It turns out making the CFI check a release mode abort causes many,
if not the majority, of JITs to fail during unwinding as they do not
set up CFI sections for their generated code. As a result any JITs
that do nominally support unwinding (and catching) through their JIT
or assembly frames trip this abort.

rdar://170862047
Delta File
+10 -17 libunwind/src/libunwind.cpp
+10 -17, 1 files

LLVM/project e2ef93f clang/test/CodeGenOpenCL .gdb_history

[NFC] Remove `clang/test/CodeGenOpenCL/.gdb_history` (#184038)

Delta File
+0 -11 clang/test/CodeGenOpenCL/.gdb_history
+0 -11, 1 files