LLVM/project a2011b1llvm/test/Transforms/LoopVectorize/RISCV strided-accesses.ll

[LV][RISCV][NFC] Update strided-accesses.ll to UTC version 6 (#193211)
DeltaFile
+570-540llvm/test/Transforms/LoopVectorize/RISCV/strided-accesses.ll
+570-5401 files

LLVM/project e5925fbllvm/test/tools/llvm-objdump/MachO compact-unwind-i386.test compact-unwind-x86_64.test

[NFC][llvm-objdump] Use CHECK-NEXT in MachO tests (#192696)

DeltaFile
+25-25llvm/test/tools/llvm-objdump/MachO/compact-unwind-i386.test
+25-25llvm/test/tools/llvm-objdump/MachO/compact-unwind-x86_64.test
+21-21llvm/test/tools/llvm-objdump/MachO/dis-no-leading-addr.test
+18-18llvm/test/tools/llvm-objdump/MachO/archive-headers.test
+16-16llvm/test/tools/llvm-objdump/MachO/dis-symname.test
+15-15llvm/test/tools/llvm-objdump/MachO/section-contents.test
+120-1207 files not shown
+156-15613 files

LLVM/project dc73cabllvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 bfloat-int-cvt.ll

[X86][AVX10.2] Skip FP2I/I2FP customizations for bf16 (#193137)

AVX10.2 doesn't provide native BF16 FP2I/I2FP conversion instructions,
so skip the FP2I/I2FP customizations for bf16.
DeltaFile
+849-0llvm/test/CodeGen/X86/bfloat-int-cvt.ll
+11-5llvm/lib/Target/X86/X86ISelLowering.cpp
+860-52 files

LLVM/project 8abcce0llvm/test/Transforms/LoopVectorize/AArch64 Oz-and-forced-vectorize.ll, llvm/test/Transforms/PhaseOrdering/AArch64 Oz-and-forced-vectorize.ll

[LoopVectorize] Generate test checks (NFC) (#193216)

Also move the test to PhaseOrdering, as it tests the full pipeline.
DeltaFile
+89-0llvm/test/Transforms/PhaseOrdering/AArch64/Oz-and-forced-vectorize.ll
+0-37llvm/test/Transforms/LoopVectorize/AArch64/Oz-and-forced-vectorize.ll
+89-372 files

LLVM/project 941e8efmlir/include/mlir/Dialect/Arith/Transforms Passes.h Passes.td, mlir/lib/Dialect/Arith/Transforms ExpandOps.cpp

[mlir][arith] Add support for `arith.flush_denormals` emulation (#192660)

Add a lowering pattern and a new pass `arith-expand-flush-denormals` that
rewrites `arith.flush_denormals` ops using integer arithmetic. This
lowering is useful for target architectures that cannot pattern-match
`arith.flush_denormals` + other FP arithmetic into special instructions
with FTZ semantics.
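
As a rough illustration of what "integer arithmetic" means here, the sketch below flushes an f32 denormal by inspecting its bit pattern. It is not the MLIR lowering from this patch: `flushDenormalF32` is a hypothetical helper, and keeping the sign bit of the flushed zero is an assumption the commit message does not confirm.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not the pattern added by the patch: flush an f32
// denormal to zero using only integer operations on its bit pattern.
float flushDenormalF32(float value) {
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits)); // reinterpret the float as an integer

  const std::uint32_t signMask = 0x80000000u;     // 1 sign bit
  const std::uint32_t exponentMask = 0x7F800000u; // 8 exponent bits
  const std::uint32_t fractionMask = 0x007FFFFFu; // 23 fraction bits

  // A denormal has an all-zero exponent field and a non-zero fraction.
  const bool isDenormal =
      (bits & exponentMask) == 0 && (bits & fractionMask) != 0;
  if (isDenormal)
    bits &= signMask; // assumption: keep the sign, giving +0.0 or -0.0

  float result;
  std::memcpy(&result, &bits, sizeof(result));
  return result;
}
```

Calling `flushDenormalF32(1e-40f)` yields zero, while normal values such as `1.0f` pass through unchanged.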

Assisted-by: claude-opus-4.7-thinking-high

Depends on #192641.
DeltaFile
+105-0mlir/lib/Dialect/Arith/Transforms/ExpandOps.cpp
+96-0mlir/test/Dialect/Arith/expand-flush-denormals.mlir
+5-0mlir/include/mlir/Dialect/Arith/Transforms/Passes.h
+4-0mlir/include/mlir/Dialect/Arith/Transforms/Passes.td
+210-04 files

LLVM/project 1566b63clang/lib/Driver/ToolChains Clang.cpp, clang/lib/Driver/ToolChains/Arch X86.cpp

[X86][clang-cl] Make AVX10.2 map to the same target-cpu as AVX10.1 (#193147)

Diamondrapids includes APX, a large feature set that AVX10.2 should not
enable.
DeltaFile
+6-4clang/test/Driver/cl-x86-flags.c
+8-1clang/lib/Driver/ToolChains/Clang.cpp
+1-1clang/lib/Driver/ToolChains/Arch/X86.cpp
+15-63 files

LLVM/project 9c2d944llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp, llvm/test/CodeGen/AArch64 dag-ReplaceAllUsesOfValuesWith.ll

[DAG] Reassociate (add (add X, Y), X) --> add(add(X, X), Y) (#162242)

Attempt to bring self-additions together, to help with folding into shift/mul/address patterns.
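
The reassociation can be sanity-checked outside the DAG with wrapping unsigned arithmetic. This is only a sketch of the algebra, assuming 32-bit operands; `addOriginal` and `addReassociated` are hypothetical names and nothing here is DAGCombiner code.

```cpp
#include <cassert>
#include <cstdint>

// Original shape: (add (add X, Y), X).
std::uint32_t addOriginal(std::uint32_t x, std::uint32_t y) {
  return (x + y) + x;
}

// Reassociated shape: (add (add X, X), Y). The inner X + X is written as
// X << 1 to show the shift form that later folds can pick up.
std::uint32_t addReassociated(std::uint32_t x, std::uint32_t y) {
  return (x << 1) + y;
}
```

Because unsigned addition wraps modulo 2^32, both functions agree for every input, including values that overflow.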
DeltaFile
+39-37llvm/test/CodeGen/AMDGPU/idot2.ll
+14-0llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+6-6llvm/test/CodeGen/X86/avx-vinsertf128.ll
+1-4llvm/test/CodeGen/AArch64/dag-ReplaceAllUsesOfValuesWith.ll
+1-2llvm/test/CodeGen/Hexagon/isel-fold-shl-zext.ll
+61-495 files

LLVM/project 1697b96runtimes/cmake config-Fortran.cmake

[runtimes] Protect use of undefined CMAKE_Fortran_COMPILER (#193210)

Unlike everything else in CMake, cmake_path does not assume a default
value for undefined variables, but instead throws an error:
```
CMake Error at cmake/config-Fortran.cmake:77 (cmake_path):
  cmake_path undefined variable for input path.
Call Stack (most recent call first):
  CMakeLists.txt:284 (include)
```
Guard the use of cmake_path so that it does not trigger this error when
CMAKE_Fortran_COMPILER is undefined.

Fixes the flang-aarch64-out-of-tree buildbot after #171610.
DeltaFile
+38-36runtimes/cmake/config-Fortran.cmake
+38-361 files

LLVM/project d629a22polly/unittests CMakeLists.txt

[Polly] Disable PCH reuse for unit tests (#193209)

Polly library targets already disable PCH reuse because Polly
unconditionally builds with -fno-rtti and -fno-exceptions. Reusing LLVM
PCHs that were built with RTTI or exceptions enabled is incompatible
with Clang when compiling Polly targets under those flags.

After 47eb8b43c990 enabled PCH reuse for unit tests, Polly unit tests
can hit the same mismatch as the library targets. Pass DISABLE_PCH_REUSE
through the shared add_polly_unittest wrapper so all Polly unit tests
follow the existing Polly target policy.

cc @aengelke -- a minor fix for polly.
DeltaFile
+4-1polly/unittests/CMakeLists.txt
+4-11 files

LLVM/project 300285eclang/lib/CIR/CodeGen CIRGenModule.cpp CIRGenModule.h

[CIR][NFCI] Remove 'isConstant' from getCIRLinkageForX (#193100)

This variable has since disappeared from the classic compiler, and we
weren't using it anywhere anyway. This patch gets us back in sync with
the classic codegen for these interfaces.
DeltaFile
+9-14clang/lib/CIR/CodeGen/CIRGenModule.cpp
+2-4clang/lib/CIR/CodeGen/CIRGenModule.h
+1-2clang/lib/CIR/CodeGen/CIRGenDecl.cpp
+1-2clang/lib/CIR/CodeGen/CIRGenCXXABI.cpp
+1-2clang/lib/CIR/CodeGen/CIRGenExprConstant.cpp
+1-1clang/lib/CIR/CodeGen/CIRGenExpr.cpp
+15-256 files

LLVM/project f6f39c6llvm/test/Transforms/LoopVectorize/RISCV strided-accesses.ll

[LV] Add test for interaction between interleaved and strided load. nfc (#192990)

For #147297

Co-authored-by: Luke Lau <luke at igalia.com>
DeltaFile
+126-0llvm/test/Transforms/LoopVectorize/RISCV/strided-accesses.ll
+126-01 files

LLVM/project a976a72llvm/lib/Target/AMDGPU SIRegisterInfo.cpp, llvm/test/CodeGen/AMDGPU vgpr-spill.mir

[AMDGPU] Multi dword spilling for unaligned tuples (#183701)

While spilling unaligned tuples, rather than breaking the
spill into 32-bit accesses, spill the first register as a single
32-bit spill, and spill the remainder of the tuple as an aligned tuple.
Some additional bookkeeping is required in the spilling
loop to manage the state.
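
The splitting strategy can be modelled as follows. This is a simplified sketch, not the SIRegisterInfo implementation: `planTupleSpill` and `SpillChunk` are hypothetical, "aligned" is assumed to mean an even first register index, and any per-access size limit is ignored.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical model: each entry is {first register index, number of
// 32-bit registers} covered by one spill access.
using SpillChunk = std::pair<unsigned, unsigned>;

// Assumption: a tuple is "aligned" when its first register index is even.
std::vector<SpillChunk> planTupleSpill(unsigned firstReg, unsigned numRegs) {
  std::vector<SpillChunk> chunks;
  if (numRegs == 0)
    return chunks;
  if (firstReg % 2 != 0) {
    // Unaligned tuple: peel off the first register as a single 32-bit spill.
    chunks.push_back({firstReg, 1});
    ++firstReg;
    --numRegs;
  }
  // Whatever remains now starts at an even index and is spilled as one
  // aligned tuple instead of a series of 32-bit accesses.
  if (numRegs != 0)
    chunks.push_back({firstReg, numRegs});
  return chunks;
}
```

For example, a four-register tuple starting at register 3 becomes one single-register spill at 3 followed by a three-register spill starting at 4.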

References: https://github.com/llvm/llvm-project/pull/177317
DeltaFile
+21-37llvm/test/CodeGen/AMDGPU/vgpr-spill.mir
+34-10llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+55-472 files

LLVM/project b7cfcfellvm/lib/ProfileData/Coverage CoverageMapping.cpp

[llvm-cov] Fix error propagation in CoverageMapping::load() (#193197)

Fix a subtle issue on the error path: if loadFromFile() fails there is no error to consume.
DeltaFile
+11-7llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
+11-71 files

LLVM/project 58063ffmlir/include/mlir/Dialect/Arith/Transforms Passes.td, mlir/lib/Dialect/Arith/Transforms ExpandOps.cpp

address comments
DeltaFile
+30-46mlir/lib/Dialect/Arith/Transforms/ExpandOps.cpp
+16-28mlir/test/Dialect/Arith/expand-flush-denormals.mlir
+4-15mlir/include/mlir/Dialect/Arith/Transforms/Passes.td
+50-893 files

LLVM/project 037a48allvm/lib/Transforms/InstCombine InstCombineCompares.cpp, llvm/test/Transforms/InstCombine fcmp.ll

[InstCombine] fold fabs(uitofp(i16 a) - uitofp(i16 b)) < 1.0 to a == b (#191378)

Fixes: https://github.com/llvm/llvm-project/issues/187088

When a and b have a bit width (16 bits) smaller than the float32
significand (24 bits), their conversions are exact, so their absolute
difference is an integer of at least 1 whenever a != b. Conversely, if
the absolute difference is < 1.0, then a == b.

This patch exploits this fact to fold the expression to just a single
icmp.
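
The equivalence can be checked by brute force outside the compiler. The sketch below is not the InstCombine code; `compareViaFloat`, `compareViaInt`, and `foldAgreesForAllB` are hypothetical names used only to compare the two sides of the fold for 16-bit inputs.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Left-hand side of the fold: fabs(uitofp(a) - uitofp(b)) < 1.0.
bool compareViaFloat(std::uint16_t a, std::uint16_t b) {
  // Both conversions are exact: 16 bits fit in float32's 24-bit significand.
  const float fa = static_cast<float>(a);
  const float fb = static_cast<float>(b);
  return std::fabs(fa - fb) < 1.0f;
}

// Right-hand side of the fold: a single integer comparison.
bool compareViaInt(std::uint16_t a, std::uint16_t b) { return a == b; }

// Returns true if both sides agree for the given a and every 16-bit b.
bool foldAgreesForAllB(std::uint16_t a) {
  for (std::uint32_t b = 0; b <= 0xFFFFu; ++b) {
    const auto b16 = static_cast<std::uint16_t>(b);
    if (compareViaFloat(a, b16) != compareViaInt(a, b16))
      return false;
  }
  return true;
}
```

Calling `foldAgreesForAllB` for each `a` in turn covers the whole 16-bit input space if an exhaustive check is wanted.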
DeltaFile
+259-5llvm/test/Transforms/InstCombine/fcmp.ll
+70-0llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+329-52 files

LLVM/project 7134ce5clang-tools-extra/clang-tidy/readability IdentifierLengthCheck.cpp

Revert "[clang-tidy][NFC] add numeric include for transform_reduce" (#193200)

Experimentation showed that this didn't fix the build failure, so revert
it to keep trunk clean.

Reverts llvm/llvm-project#193165
DeltaFile
+0-1clang-tools-extra/clang-tidy/readability/IdentifierLengthCheck.cpp
+0-11 files

LLVM/project 2196957mlir/lib/Dialect/Arith/Transforms ExpandOps.cpp, mlir/test/Dialect/Arith expand-flush-denormals.mlir

address comments
DeltaFile
+16-31mlir/lib/Dialect/Arith/Transforms/ExpandOps.cpp
+15-27mlir/test/Dialect/Arith/expand-flush-denormals.mlir
+31-582 files

LLVM/project 649143cmlir/include/mlir/Dialect/Arith/Transforms Passes.td Passes.h, mlir/lib/Dialect/Arith/Transforms ExpandOps.cpp

[mlir][arith] Add support for `arith.flush_denormals` emulation
DeltaFile
+121-0mlir/lib/Dialect/Arith/Transforms/ExpandOps.cpp
+108-0mlir/test/Dialect/Arith/expand-flush-denormals.mlir
+15-0mlir/include/mlir/Dialect/Arith/Transforms/Passes.td
+5-0mlir/include/mlir/Dialect/Arith/Transforms/Passes.h
+249-04 files

LLVM/project 744279bmlir/include/mlir/Dialect/Arith/IR ArithOps.td, mlir/lib/Conversion/ArithAndMathToAPFloat ArithToAPFloat.cpp

[mlir][arith] Add `arith.flush_denormals` operation (#192641)

Add a new `arith.flush_denormals` operation. The operation takes a
floating-point value as input and returns zero if the value is denormal.
If the input is not denormal, the operation passes through the input.
This commit also adds support to the `ArithToAPFloat` infrastructure.

Running example:
```mlir
%flush_a = arith.flush_denormals %a : f32
%flush_b = arith.flush_denormals %b : f32
%res = arith.addf %flush_a, %flush_b : f32
%flush_res = arith.flush_denormals %res : f32
```

The exact lowering path depends on the backend and is not implemented as
part of this PR:
- Per-instruction mode. E.g., on NVIDIA architectures, the above example
can lower to `add.ftz.f32 dest, a, b`.

    [11 lines not shown]
DeltaFile
+39-0mlir/include/mlir/Dialect/Arith/IR/ArithOps.td
+23-13mlir/lib/Conversion/ArithAndMathToAPFloat/ArithToAPFloat.cpp
+32-0mlir/test/Dialect/Arith/ops.mlir
+26-0mlir/test/Conversion/ArithAndMathToAPFloat/arith-to-apfloat.mlir
+23-0mlir/test/Dialect/Arith/canonicalize.mlir
+22-0mlir/lib/Dialect/Arith/IR/ArithOps.cpp
+165-132 files not shown
+195-138 files

LLVM/project 95c5836llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp, llvm/test/CodeGen/AMDGPU minimumnum.ll maximumnum.ll

[AMDGPU] Add legalizer rule support for AMDGPU's regbank fminimumnum and fmaximumnum (#192719)

Part of #192497
DeltaFile
+51-24llvm/test/CodeGen/AMDGPU/minimumnum.ll
+51-24llvm/test/CodeGen/AMDGPU/maximumnum.ll
+2-1llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+104-493 files

LLVM/project 60af5a9libsycl/include/sycl/__impl queue.hpp, libsycl/include/sycl/__impl/detail arg_wrapper.hpp

fix merge errors

Signed-off-by: Tikhomirova, Kseniya <kseniya.tikhomirova at intel.com>
DeltaFile
+0-135libsycl/include/sycl/__impl/detail/arg_wrapper.hpp
+0-43libsycl/include/sycl/__impl/queue.hpp
+0-1782 files

LLVM/project c5f5458clang-tools-extra/clang-tidy/readability IdentifierLengthCheck.cpp

Revert "[clang-tidy][NFC] add numeric include for transform_reduce (#193165)"

This reverts commit 3db991b5c287617cedfdb5b2aa5b4cfdd1173a1c.
DeltaFile
+0-1clang-tools-extra/clang-tidy/readability/IdentifierLengthCheck.cpp
+0-11 files

LLVM/project 36836e6clang-tools-extra/clang-tidy/readability IdentifierLengthCheck.cpp IdentifierLengthCheck.h, clang-tools-extra/docs ReleaseNotes.rst

Revert "Revert "[clang-tidy][readability-identifier-length] Add a line count …"

This reverts commit b3647eb0830f62c1ba0fe94dc9f325b7a205d7fd.
DeltaFile
+85-0clang-tools-extra/test/clang-tidy/checkers/readability/identifier-length-line-count-threshold.cpp
+55-1clang-tools-extra/clang-tidy/readability/IdentifierLengthCheck.cpp
+18-0clang-tools-extra/docs/clang-tidy/checks/readability/identifier-length.rst
+5-0clang-tools-extra/docs/ReleaseNotes.rst
+2-0clang-tools-extra/clang-tidy/readability/IdentifierLengthCheck.h
+165-15 files

LLVM/project a1dfc8dmlir/include/mlir/Transforms GreedyPatternRewriteDriver.h CSE.h, mlir/lib/Dialect/Transform/IR TransformOps.cpp

[mlir] Add option to run CSE between greedy rewriter iterations (#193081)

The greedy pattern rewrite driver previously only deduplicated constant
ops between iterations (via the operation folder). Structurally
identical non-constant subexpressions remained distinct SSA values,
blocking fold patterns that only fire when operands match. Reaching the
true fixpoint required chaining an external `cse,canonicalize,...`
pipeline.

Add an opt-in `cseBetweenIterations` flag on `GreedyRewriteConfig` that
runs full CSE on the scoped region after each pattern-application
iteration, and surface it as a `cse-between-iterations` option on the
canonicalizer pass. Off by default to preserve existing performance
characteristics.

Assisted-by: Claude Code
DeltaFile
+435-0mlir/lib/Transforms/Utils/CSE.cpp
+12-401mlir/lib/Transforms/CSE.cpp
+81-0mlir/test/Transforms/canonicalize-cse-between-iterations.mlir
+22-26mlir/lib/Dialect/Transform/IR/TransformOps.cpp
+24-0mlir/include/mlir/Transforms/GreedyPatternRewriteDriver.h
+17-1mlir/include/mlir/Transforms/CSE.h
+591-4286 files not shown
+615-43112 files

LLVM/project ed34ee3mlir/include/mlir/Transforms GreedyPatternRewriteDriver.h, mlir/lib/Transforms/Utils GreedyPatternRewriteDriver.cpp

[mlir] Assert region is within config scope in RegionPatternRewriteDriver (#193177)

Assisted-by: Claude Code
DeltaFile
+11-1mlir/lib/Transforms/Utils/GreedyPatternRewriteDriver.cpp
+7-2mlir/include/mlir/Transforms/GreedyPatternRewriteDriver.h
+18-32 files

LLVM/project 797fc5dllvm/test/CodeGen/AMDGPU idot4u.ll idot2.ll

[AMDGPU] Prefer mul24 over mad24 on SDWA targets (#193033)

If either of a mul24's operands can potentially fold into an SDWA
pattern, then don't fold into a mad24 node (which doesn't have SDWA
variants).

Fixes regressions I first noticed in #162242, but it turns out it's an
older problem.
DeltaFile
+256-360llvm/test/CodeGen/AMDGPU/idot4u.ll
+134-230llvm/test/CodeGen/AMDGPU/idot2.ll
+136-199llvm/test/CodeGen/AMDGPU/idot4s.ll
+94-115llvm/test/CodeGen/AMDGPU/idot8u.ll
+73-93llvm/test/CodeGen/AMDGPU/idot8s.ll
+32-56llvm/test/CodeGen/AMDGPU/flat-scratch-svs.ll
+725-1,0533 files not shown
+764-1,0559 files

LLVM/project 78cb9fbllvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll

[DAG] Add Srl combine for extracting last element of BUILD_VECTOR (#181412)

While working on another combine, I noticed some redundant zext shift
pairs `v_lshrrev_b32 + v_lshlrev_b32` coming from a `build_vector(undef,
x)` created by `TargetLowering::SimplifyDemandedBits` and a `srl`
created by `lowerEXTRACT_VECTOR_ELT`.
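
The redundancy can be illustrated with plain integer operations. This is a sketch under assumptions: a two-element i16 vector packed little-endian into 32 bits, with hypothetical helpers `packV2I16` and `extractLastViaSrl`; the exact pattern matched by the combine is not spelled out in the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: element 0 in the low 16 bits, element 1 (the last
// element) in the high 16 bits.
std::uint32_t packV2I16(std::uint16_t elt0, std::uint16_t elt1) {
  return static_cast<std::uint32_t>(elt0) |
         (static_cast<std::uint32_t>(elt1) << 16);
}

// Shifting right by the element width extracts the last element, already
// zero-extended, regardless of what element 0 held.
std::uint32_t extractLastViaSrl(std::uint32_t packed) { return packed >> 16; }
```

Since `extractLastViaSrl(packV2I16(anything, x))` always equals `x` zero-extended, the shift-left and shift-right pair cancels.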
DeltaFile
+4,805-4,811llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+1,871-1,882llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+837-855llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.256bit.ll
+415-431llvm/test/CodeGen/AMDGPU/load-global-i8.ll
+202-482llvm/test/CodeGen/AMDGPU/lower-buffer-fat-pointers-nontemporal-metadata.ll
+303-329llvm/test/CodeGen/AMDGPU/load-constant-i8.ll
+8,433-8,79013 files not shown
+9,460-9,83219 files

LLVM/project 5d24bebllvm/include/llvm/IR ValueDeletionListener.h LLVMContext.h, llvm/lib/IR LLVMContext.cpp LLVMContextImpl.h

[IR] Add ValueDeletionListener for context-level Value deletion notifications
DeltaFile
+132-0llvm/unittests/IR/ValueDeletionListenerTest.cpp
+58-0llvm/include/llvm/IR/ValueDeletionListener.h
+18-0llvm/lib/IR/LLVMContext.cpp
+10-0llvm/include/llvm/IR/LLVMContext.h
+6-0llvm/lib/IR/LLVMContextImpl.h
+6-0llvm/lib/IR/Value.cpp
+230-01 files not shown
+231-07 files

LLVM/project e17fe37llvm/test/CodeGen/AMDGPU/NextUseAnalysis spill-vreg-many-lanes.mir acyclic-770bb.mir

Merge branch 'users/KseniyaTikhomirova/kernel_submit_single_3' into users/KseniyaTikhomirova/kernel_submit_parallel_4
DeltaFile
+275,101-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/spill-vreg-many-lanes.mir
+144,679-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/acyclic-770bb.mir
+57,682-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/double-nested-loops-complex-cfg.mir
+41,844-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills2.mir
+40,613-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills1.mir
+37,209-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills3.mir
+597,128-012,193 files not shown
+1,862,921-399,89412,199 files

LLVM/project 3de6b5cmlir/lib/Dialect/SPIRV/IR SPIRVTypes.cpp, mlir/test/Conversion/FuncToSPIRV types-to-spirv.mlir

[mlir][spirv] Fix Float8EXT type conversion legality (#192466)

Signed-off-by: Davide Grohmann <davide.grohmann at arm.com>
DeltaFile
+50-27mlir/test/Conversion/FuncToSPIRV/types-to-spirv.mlir
+2-0mlir/lib/Dialect/SPIRV/IR/SPIRVTypes.cpp
+52-272 files