LLVM/project 2196957 mlir/lib/Dialect/Arith/Transforms ExpandOps.cpp, mlir/test/Dialect/Arith expand-flush-denormals.mlir

address comments
Delta  File
+16 -31  mlir/lib/Dialect/Arith/Transforms/ExpandOps.cpp
+15 -27  mlir/test/Dialect/Arith/expand-flush-denormals.mlir
+31 -58  2 files

LLVM/project 649143c mlir/include/mlir/Dialect/Arith/Transforms Passes.td Passes.h, mlir/lib/Dialect/Arith/Transforms ExpandOps.cpp

[mlir][arith] Add support for `arith.flush_denormals` emulation
Delta  File
+121 -0  mlir/lib/Dialect/Arith/Transforms/ExpandOps.cpp
+108 -0  mlir/test/Dialect/Arith/expand-flush-denormals.mlir
+15 -0  mlir/include/mlir/Dialect/Arith/Transforms/Passes.td
+5 -0  mlir/include/mlir/Dialect/Arith/Transforms/Passes.h
+249 -0  4 files

LLVM/project 744279b mlir/include/mlir/Dialect/Arith/IR ArithOps.td, mlir/lib/Conversion/ArithAndMathToAPFloat ArithToAPFloat.cpp

[mlir][arith] Add `arith.flush_denormals` operation (#192641)

Add a new `arith.flush_denormals` operation. The operation takes a
floating-point value as input and returns zero if the value is denormal.
If the input is not denormal, the operation passes through the input.
This commit also adds support to the `ArithToAPFloat` infrastructure.

Running example:
```mlir
%flush_a = arith.flush_denormals %a : f32
%flush_b = arith.flush_denormals %b : f32
%res = arith.addf %flush_a, %flush_b : f32
%flush_res = arith.flush_denormals %res : f32
```
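As a rough illustration of the semantics described above (not the MLIR implementation), the following Python sketch flushes an f32 denormal to zero by inspecting its bit pattern. The function name `flush_denormals_f32` is hypothetical, and keeping the sign bit of the flushed zero is an assumption; the commit message only says the result is zero.

```python
import struct


def flush_denormals_f32(x: float) -> float:
    """Return x unchanged unless it is a denormal when stored as an f32."""
    # Reinterpret the value as the 32 bits of an IEEE 754 single.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    # Denormal: all-zero exponent field with a non-zero mantissa.
    if exponent == 0 and mantissa != 0:
        # Assumption: the flushed zero keeps the sign of the input.
        return struct.unpack("<f", struct.pack("<I", bits & 0x80000000))[0]
    return x
```

Normal values, zeros, infinities, and NaNs pass through unchanged because their exponent field is non-zero or their mantissa is zero.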

The exact lowering path depends on the backend and is not implemented as
part of this PR:
- Per-instruction mode. E.g., on NVIDIA architectures, the above example
can lower to `add.ftz.f32 dest, a, b`.

    [11 lines not shown]
Delta  File
+39 -0  mlir/include/mlir/Dialect/Arith/IR/ArithOps.td
+23 -13  mlir/lib/Conversion/ArithAndMathToAPFloat/ArithToAPFloat.cpp
+32 -0  mlir/test/Dialect/Arith/ops.mlir
+26 -0  mlir/test/Conversion/ArithAndMathToAPFloat/arith-to-apfloat.mlir
+23 -0  mlir/test/Dialect/Arith/canonicalize.mlir
+22 -0  mlir/lib/Dialect/Arith/IR/ArithOps.cpp
+165 -13  2 files not shown
+195 -13  8 files

LLVM/project 95c5836 llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp, llvm/test/CodeGen/AMDGPU minimumnum.ll maximumnum.ll

[AMDGPU] Add legalizer rule support for AMDGPU's regbank fminimumnum and fmaximumnum (#192719)

Part of #192497
Delta  File
+51 -24  llvm/test/CodeGen/AMDGPU/minimumnum.ll
+51 -24  llvm/test/CodeGen/AMDGPU/maximumnum.ll
+2 -1  llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+104 -49  3 files

LLVM/project 60af5a9 libsycl/include/sycl/__impl queue.hpp, libsycl/include/sycl/__impl/detail arg_wrapper.hpp

fix merge errors

Signed-off-by: Tikhomirova, Kseniya <kseniya.tikhomirova at intel.com>
Delta  File
+0 -135  libsycl/include/sycl/__impl/detail/arg_wrapper.hpp
+0 -43  libsycl/include/sycl/__impl/queue.hpp
+0 -178  2 files

LLVM/project c5f5458 clang-tools-extra/clang-tidy/readability IdentifierLengthCheck.cpp

Revert "[clang-tidy][NFC] add numeric include for transform_reduce (#193165)"

This reverts commit 3db991b5c287617cedfdb5b2aa5b4cfdd1173a1c.
Delta  File
+0 -1  clang-tools-extra/clang-tidy/readability/IdentifierLengthCheck.cpp
+0 -1  1 file

LLVM/project 36836e6 clang-tools-extra/clang-tidy/readability IdentifierLengthCheck.cpp IdentifierLengthCheck.h, clang-tools-extra/docs ReleaseNotes.rst

Revert "Revert "[clang-tidy][readability-identifier-length] Add a line count …"

This reverts commit b3647eb0830f62c1ba0fe94dc9f325b7a205d7fd.
Delta  File
+85 -0  clang-tools-extra/test/clang-tidy/checkers/readability/identifier-length-line-count-threshold.cpp
+55 -1  clang-tools-extra/clang-tidy/readability/IdentifierLengthCheck.cpp
+18 -0  clang-tools-extra/docs/clang-tidy/checks/readability/identifier-length.rst
+5 -0  clang-tools-extra/docs/ReleaseNotes.rst
+2 -0  clang-tools-extra/clang-tidy/readability/IdentifierLengthCheck.h
+165 -1  5 files

LLVM/project a1dfc8d mlir/include/mlir/Transforms GreedyPatternRewriteDriver.h CSE.h, mlir/lib/Dialect/Transform/IR TransformOps.cpp

[mlir] Add option to run CSE between greedy rewriter iterations (#193081)

The greedy pattern rewrite driver previously only deduplicated constant
ops between iterations (via the operation folder). Structurally
identical non-constant subexpressions remained distinct SSA values,
blocking fold patterns that only fire when operands match. Reaching the
true fixpoint required chaining an external `cse,canonicalize,...`
pipeline.

Add an opt-in `cseBetweenIterations` flag on `GreedyRewriteConfig` that
runs full CSE on the scoped region after each pattern-application
iteration, and surface it as a `cse-between-iterations` option on the
canonicalizer pass. Off by default to preserve existing performance
characteristics.
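To make the motivation concrete, here is a toy model in Python rather than the MLIR API. Ops are `(result, name, operands)` tuples, every op is assumed pure, and the names `fold_sub_self`, `cse`, and `greedy` are hypothetical. The single fold pattern rewrites `sub(v, v)` to a constant zero, so it can only fire once CSE has merged the two identical `add` ops.

```python
def fold_sub_self(ops):
    """Toy fold: rewrite sub(v, v) to const 0; needs identical operands."""
    out, changed = [], False
    for res, name, operands in ops:
        if name == "sub" and operands[0] == operands[1]:
            out.append((res, "const", (0,)))
            changed = True
        else:
            out.append((res, name, operands))
    return out, changed


def cse(ops):
    """Toy CSE: drop ops that repeat an earlier (name, operands) pair."""
    seen, replace, out = {}, {}, []
    for res, name, operands in ops:
        # Redirect uses of already-eliminated results to the surviving value.
        operands = tuple(replace.get(o, o) for o in operands)
        key = (name, operands)
        if key in seen:
            replace[res] = seen[key]
        else:
            seen[key] = res
            out.append((res, name, operands))
    return out


def greedy(ops, cse_between_iterations=False):
    """Apply the fold until nothing changes, optionally running CSE each round."""
    while True:
        ops, changed = fold_sub_self(ops)
        if cse_between_iterations:
            before = ops
            ops = cse(ops)
            changed = changed or ops != before
        if not changed:
            return ops


program = [
    ("a", "add", ("x", "y")),
    ("b", "add", ("x", "y")),
    ("c", "sub", ("a", "b")),
]
print(greedy(program))                               # unchanged: a and b differ
print(greedy(program, cse_between_iterations=True))  # c becomes const 0
```

Without the flag the driver stops immediately because `a` and `b` are distinct values. With it, the first round merges `b` into `a` and the second round folds `c`.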

Assisted-by: Claude Code
Delta  File
+435 -0  mlir/lib/Transforms/Utils/CSE.cpp
+12 -401  mlir/lib/Transforms/CSE.cpp
+81 -0  mlir/test/Transforms/canonicalize-cse-between-iterations.mlir
+22 -26  mlir/lib/Dialect/Transform/IR/TransformOps.cpp
+24 -0  mlir/include/mlir/Transforms/GreedyPatternRewriteDriver.h
+17 -1  mlir/include/mlir/Transforms/CSE.h
+591 -428  6 files not shown
+615 -431  12 files

LLVM/project ed34ee3 mlir/include/mlir/Transforms GreedyPatternRewriteDriver.h, mlir/lib/Transforms/Utils GreedyPatternRewriteDriver.cpp

[mlir] Assert region is within config scope in RegionPatternRewriteDriver (#193177)

Assisted-by: Claude Code
Delta  File
+11 -1  mlir/lib/Transforms/Utils/GreedyPatternRewriteDriver.cpp
+7 -2  mlir/include/mlir/Transforms/GreedyPatternRewriteDriver.h
+18 -3  2 files

LLVM/project 797fc5d llvm/test/CodeGen/AMDGPU idot4u.ll idot2.ll

[AMDGPU] Prefer mul24 over mad24 on SDWA targets (#193033)

If either of a mul24's operands can potentially fold into a SDWA
pattern, then don't fold into a mad24 node (which doesn't have SDWA
variants).

Fixes regressions I first noticed in #162242, although it turns out this
is an older problem.
Delta  File
+256 -360  llvm/test/CodeGen/AMDGPU/idot4u.ll
+134 -230  llvm/test/CodeGen/AMDGPU/idot2.ll
+136 -199  llvm/test/CodeGen/AMDGPU/idot4s.ll
+94 -115  llvm/test/CodeGen/AMDGPU/idot8u.ll
+73 -93  llvm/test/CodeGen/AMDGPU/idot8s.ll
+32 -56  llvm/test/CodeGen/AMDGPU/flat-scratch-svs.ll
+725 -1,053  3 files not shown
+764 -1,055  9 files

LLVM/project 78cb9fb llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll

[DAG] Add Srl combine for extracting last element of BUILD_VECTOR (#181412)

While working on another combine, I noticed some redundant zext shift
pairs `v_lshrrev_b32 + v_lshlrev_b32` coming from a `build_vector(undef,
x)` created by `TargetLowering::SimplifyDemandedBits` and a `srl`
created by `lowerEXTRACT_VECTOR_ELT`.
Delta  File
+4,805 -4,811  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+1,871 -1,882  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+837 -855  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.256bit.ll
+415 -431  llvm/test/CodeGen/AMDGPU/load-global-i8.ll
+202 -482  llvm/test/CodeGen/AMDGPU/lower-buffer-fat-pointers-nontemporal-metadata.ll
+303 -329  llvm/test/CodeGen/AMDGPU/load-constant-i8.ll
+8,433 -8,790  13 files not shown
+9,460 -9,832  19 files

LLVM/project 5d24beb llvm/include/llvm/IR ValueDeletionListener.h LLVMContext.h, llvm/lib/IR LLVMContext.cpp LLVMContextImpl.h

[IR] Add ValueDeletionListener for context-level Value deletion notifications
Delta  File
+132 -0  llvm/unittests/IR/ValueDeletionListenerTest.cpp
+58 -0  llvm/include/llvm/IR/ValueDeletionListener.h
+18 -0  llvm/lib/IR/LLVMContext.cpp
+10 -0  llvm/include/llvm/IR/LLVMContext.h
+6 -0  llvm/lib/IR/LLVMContextImpl.h
+6 -0  llvm/lib/IR/Value.cpp
+230 -0  1 file not shown
+231 -0  7 files

LLVM/project e17fe37 llvm/test/CodeGen/AMDGPU/NextUseAnalysis spill-vreg-many-lanes.mir acyclic-770bb.mir

Merge branch 'users/KseniyaTikhomirova/kernel_submit_single_3' into users/KseniyaTikhomirova/kernel_submit_parallel_4
Delta  File
+275,101 -0  llvm/test/CodeGen/AMDGPU/NextUseAnalysis/spill-vreg-many-lanes.mir
+144,679 -0  llvm/test/CodeGen/AMDGPU/NextUseAnalysis/acyclic-770bb.mir
+57,682 -0  llvm/test/CodeGen/AMDGPU/NextUseAnalysis/double-nested-loops-complex-cfg.mir
+41,844 -0  llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills2.mir
+40,613 -0  llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills1.mir
+37,209 -0  llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills3.mir
+597,128 -0  12,193 files not shown
+1,862,921 -399,894  12,199 files

LLVM/project 3de6b5c mlir/lib/Dialect/SPIRV/IR SPIRVTypes.cpp, mlir/test/Conversion/FuncToSPIRV types-to-spirv.mlir

[mlir][spirv] Fix Float8EXT type conversion legality (#192466)

Signed-off-by: Davide Grohmann <davide.grohmann at arm.com>
Delta  File
+50 -27  mlir/test/Conversion/FuncToSPIRV/types-to-spirv.mlir
+2 -0  mlir/lib/Dialect/SPIRV/IR/SPIRVTypes.cpp
+52 -27  2 files

LLVM/project 044e21f clang/lib/Headers vecintrin.h, clang/test/CodeGen/SystemZ builtins-systemz-zvector2.c

[SystemZ] Fix wrong mask for float vec_insert (#192967)

This commit fixes an error in vec_insert, where the index masking
effectively made the last two float elements of a vector non-insertable.
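The effect of a too-narrow index mask can be sketched in Python. The commit message does not state the old or new mask values, so the masks `1` and `3` below, the 4-element vector, and the helper name `vec_insert` are all assumptions chosen to reproduce the described symptom.

```python
def vec_insert(vec, value, index, mask):
    """Toy model: insert value at (index & mask), leaving the input untouched."""
    out = list(vec)
    out[index & mask] = value
    return out


floats = [0.0, 0.0, 0.0, 0.0]

# Assumed buggy mask: only indices 0 and 1 are reachable, so index 3 wraps to 1.
print(vec_insert(floats, 1.0, 3, mask=1))

# Assumed fixed mask: all four element indices are reachable.
print(vec_insert(floats, 1.0, 3, mask=3))
```

With the narrower mask, a request to write element 2 or 3 silently lands on element 0 or 1, which matches the "non-insertable" behaviour described.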

co-authored-by: @Andreas-Krebbel
Delta  File
+4 -1  clang/test/CodeGen/SystemZ/builtins-systemz-zvector2.c
+1 -1  clang/lib/Headers/vecintrin.h
+5 -2  2 files

LLVM/project cf1f7c5 llvm/test/Transforms/Attributor dereferenceable-2-inseltpoison.ll dereferenceable-2.ll, llvm/test/Transforms/Attributor/ArgumentPromotion musttail.ll

[Attributor] Regenerate test checks (NFC) (#193192)

To avoid spurious invariant.load changes in future changes.
Delta  File
+44 -22  llvm/test/Transforms/Attributor/ArgumentPromotion/X86/attributes.ll
+12 -8  llvm/test/Transforms/Attributor/ArgumentPromotion/musttail.ll
+9 -7  llvm/test/Transforms/Attributor/dereferenceable-2-inseltpoison.ll
+9 -7  llvm/test/Transforms/Attributor/dereferenceable-2.ll
+10 -6  llvm/test/Transforms/Attributor/align.ll
+8 -6  llvm/test/Transforms/Attributor/call-simplify-pointer-info.ll
+92 -56  5 files not shown
+117 -67  11 files

LLVM/project 8e132f7 cmake/Modules GetToolchainDirs.cmake, flang-rt/cmake/modules AddFlangRT.cmake

[runtimes][CMake] Move Fortran support code from flang-rt (#171610)

Common CMake code to be used by flang-rt and openmp to emit Flang module
files. Most of the code is not yet used within this PR.

Extracted out of #171515 for review by @petrhosek.
Delta  File
+250 -0  runtimes/cmake/config-Fortran.cmake
+15 -0  runtimes/CMakeLists.txt
+14 -0  flang-rt/cmake/modules/AddFlangRT.cmake
+11 -0  cmake/Modules/GetToolchainDirs.cmake
+290 -0  4 files

LLVM/project 7f72a8d llvm/lib/Target/AArch64 AArch64SystemOperands.td, llvm/lib/Target/AArch64/AsmParser AArch64AsmParser.cpp

[AArch64][llvm] Remove support for FEAT_MPAMv2_VID

`FEAT_MPAMv2_VID` instructions and system registers, as introduced
in change d30f18d2c, are being removed at this time, as they've been
removed from the latest Arm ARM, which doesn't preclude them returning
in some form in future.

Other system registers introduced with `FEAT_MPAMv2` are unaffected,
and these continue to be ungated, but since `+mpamv2` gating is now empty,
I'm removing this superfluous gating code.
Delta  File
+5 -86  llvm/test/MC/AArch64/armv9.7a-mpamv2.s
+0 -36  llvm/lib/Target/AArch64/AArch64SystemOperands.td
+5 -17  llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+0 -18  llvm/test/MC/AArch64/armv9.7a-mpamv2-diagnostics.s
+2 -12  llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp
+0 -10  llvm/test/CodeGen/AArch64/aarch64-sys-intrinsic.ll
+12 -179  7 files not shown
+13 -205  13 files

LLVM/project af5fb38 llvm/lib/Transforms/IPO AttributorAttributes.cpp, llvm/test/Transforms/Attributor undefined_behavior.ll

[Attributor] Clarify volatile null pointer behavior (NFCI) (#193190)

The comment was referring to volatile stores in particular, which
are specified as non-willreturn. However, allowing volatile accesses
on null (independently of null_pointer_is_valid) is a general
provision that is independent of the access kind.

The actual behavior was still correct, because volatile loads are
considered as writing inaccessible memory, so the mayWriteToMemory()
check was ultimately redundant.

Add a test to make sure volatile load is handled correctly.
Delta  File
+62 -85  llvm/test/Transforms/Attributor/undefined_behavior.ll
+2 -2  llvm/lib/Transforms/IPO/AttributorAttributes.cpp
+64 -87  2 files

LLVM/project 47918c2 clang/include/clang/CIR/Dialect/IR CIROps.td, clang/lib/CIR/CodeGen CIRGenBuilder.cpp

[CIR] Make array decay and get_element op preserve address spaces (#192361)

This patch makes sure that the maybeBuildArrayDecay function takes
address spaces into account and makes the get_element op preserve the
address space of the base pointer.

Assisted-by: Cursor / claude-4.6-opus-high
Delta  File
+50 -0  clang/test/CIR/CodeGen/amdgpu-array-addrspace.cpp
+9 -2  clang/include/clang/CIR/Dialect/IR/CIROps.td
+2 -1  clang/lib/CIR/CodeGen/CIRGenBuilder.cpp
+61 -3  3 files

LLVM/project c2139f1 llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/X86 copyable_reorder.ll

Revert "[SLP] Normalize copyable operand order to group loads for better vectorization"

This reverts commit 6c35bdbea235fa7f5dd10497b049ed5f328b9124 to fix
issues reported in https://github.com/llvm/llvm-project/pull/189181#issuecomment-4286829960

Pull Request: https://github.com/llvm/llvm-project/pull/193186
Delta  File
+26 -12  llvm/test/Transforms/SLPVectorizer/X86/copyable_reorder.ll
+7 -26  llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+33 -38  2 files

LLVM/project b3647eb clang-tools-extra/clang-tidy/readability IdentifierLengthCheck.cpp IdentifierLengthCheck.h, clang-tools-extra/docs ReleaseNotes.rst

Revert "[clang-tidy][readability-identifier-length] Add a line count threshold" (#193182)

Reverts llvm/llvm-project#185319
Delta  File
+0 -85  clang-tools-extra/test/clang-tidy/checkers/readability/identifier-length-line-count-threshold.cpp
+1 -55  clang-tools-extra/clang-tidy/readability/IdentifierLengthCheck.cpp
+0 -18  clang-tools-extra/docs/clang-tidy/checks/readability/identifier-length.rst
+0 -5  clang-tools-extra/docs/ReleaseNotes.rst
+0 -2  clang-tools-extra/clang-tidy/readability/IdentifierLengthCheck.h
+1 -165  5 files

LLVM/project 3600cd8 llvm/lib/Analysis ConstantFolding.cpp, llvm/test/Transforms/InstSimplify/ConstProp/AMDGPU wave.reduce.ll

[AMDGPU] Unmark wave reduce intrinsics for constant folding (#193142)

The `add`, `sub`, and `xor` wave reduction intrinsics
cannot be constant folded, as `add` and `sub` need
to be multiplied by the number of active lanes, and
`xor` depends on the parity of the number of
active lanes.
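A small Python model shows why a uniform constant input does not determine the result for these reductions. It treats a wave reduction as a plain reduction over one copy of the value per active lane; the helper name `wave_reduce` is hypothetical, and `sub` is left out because the commit message does not spell out its exact semantics.

```python
from functools import reduce
import operator


def wave_reduce(op, value, active_lanes):
    """Reduce one copy of `value` per active lane (requires active_lanes >= 1)."""
    return reduce(op, [value] * active_lanes)


for lanes in (1, 2, 3, 4):
    total = wave_reduce(operator.add, 5, lanes)  # scales with the lane count
    parity = wave_reduce(operator.xor, 5, lanes) # 5 for odd counts, 0 for even
    largest = wave_reduce(max, 5, lanes)         # always 5, independent of lanes
    print(lanes, total, parity, largest)
```

Only the idempotent reduction (`max` here) gives a result that is independent of the lane count, which is the property a compile-time fold of a constant operand would need.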
Delta  File
+54 -39  llvm/test/Transforms/InstSimplify/ConstProp/AMDGPU/wave.reduce.ll
+0 -6  llvm/lib/Analysis/ConstantFolding.cpp
+54 -45  2 files

LLVM/project 853d7c9 llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeHelper.cpp AMDGPURegBankLegalize.cpp

AMDGPU/GlobalISel: RegbankLegalize rules for merge-like opcodes (#193026)

Move RegbankLegalize handling for G_BUILD_VECTOR, G_MERGE_VALUES and
G_CONCAT_VECTORS from AMDGPURegBankLegalize to AMDGPURegBankLegalizeRules
by implementing rules for all supported types.
Delta  File
+0 -22  llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+0 -10  llvm/lib/Target/AMDGPU/AMDGPURegBankLegalize.cpp
+10 -0  llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+0 -3  llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.h
+10 -35  4 files

LLVM/project fc7c257 libcxx/src any.cpp

[libc++] Fix any.cpp not compiling with the minimum header version >= 7 (#193183)

The namespace was accidentally closed outside the header version check
while it was opened inside the check. This moves the closing code into
the check.
Delta  File
+2 -2  libcxx/src/any.cpp
+2 -2  1 file

LLVM/project 45db5e4 llvm/lib/TargetParser RISCVISAInfo.cpp

[RISCV][NFC] Remove unused RISCVExtBit (#193153)

It seems I forgot to remove it in #135600.
Delta  File
+0 -6  llvm/lib/TargetParser/RISCVISAInfo.cpp
+0 -6  1 file

LLVM/project d1f4b79 llvm/lib/Transforms/Scalar LICM.cpp, llvm/test/Transforms/LICM call-hoisting.ll pr54495.ll

[LICM] Remove unnecessary check during store hoisting (#187529)

When hoisting stores, we check for interfering uses. This is done
by getting the clobbering def for the use and checking whether it
is outside the loop, which implies that no store in the loop can
interfere with it.

However, in addition to that, we check that the memory use does
not occur before the store. I believe that this additional check
is unnecessary, as if the use could be affected by the store, the
clobber walk would have pointed to the memory phi, not outside the
loop.

I think this check was added because MemorySSA had trouble with
loop-carried dependencies in the past (like in #54682), but this
should no longer be a problem.

This allows store hoisting in cases where there are unrelated
loads before the store.
Delta  File
+31 -0  llvm/test/Transforms/LICM/call-hoisting.ll
+0 -6  llvm/lib/Transforms/Scalar/LICM.cpp
+1 -1  llvm/test/Transforms/LICM/pr54495.ll
+32 -7  3 files

LLVM/project b460f29 llvm/lib/Target/RISCV RISCVISelLowering.cpp RISCVTargetTransformInfo.h, llvm/test/CodeGen/RISCV/rvv vfsqrt-vp.ll fixed-vectors-vfsqrt-vp.ll

[RISCV] Remove codegen for vp_sqrt (#191837)

Part of the work to remove trivial VP intrinsics from the RISC-V
backend, see
https://discourse.llvm.org/t/rfc-remove-codegen-support-for-trivial-vp-intrinsics-in-the-risc-v-backend/87999

This splits off vp_sqrt from #179622.
Delta  File
+129 -218  llvm/test/CodeGen/RISCV/rvv/vfsqrt-vp.ll
+63 -88  llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vfsqrt-vp.ll
+0 -5  llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+0 -1  llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
+192 -312  4 files

LLVM/project c8af57b clang-tools-extra/clang-tidy/readability IdentifierLengthCheck.cpp IdentifierLengthCheck.h, clang-tools-extra/docs ReleaseNotes.rst

Revert "[clang-tidy][readability-identifier-length] Add a line count threshol…"

This reverts commit 3c88abe3206bb944566ff4b62aa4b9874327f37d.
Delta  File
+0 -85  clang-tools-extra/test/clang-tidy/checkers/readability/identifier-length-line-count-threshold.cpp
+1 -55  clang-tools-extra/clang-tidy/readability/IdentifierLengthCheck.cpp
+0 -18  clang-tools-extra/docs/clang-tidy/checks/readability/identifier-length.rst
+0 -5  clang-tools-extra/docs/ReleaseNotes.rst
+0 -2  clang-tools-extra/clang-tidy/readability/IdentifierLengthCheck.h
+1 -165  5 files

LLVM/project 337ad44 llvm/lib/Debuginfod BuildIDFetcher.cpp, llvm/lib/Object BuildID.cpp

[llvm] Errorize DebuginfodFetcher for inspection at call-sites (#191191)

Failure to fetch debuginfod is rarely an error, but there are cases where
we want to distinguish error reasons down the line, for example in order
to test connection timeouts.
Delta  File
+9 -11  llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
+11 -9  llvm/lib/Debuginfod/BuildIDFetcher.cpp
+9 -3  llvm/tools/llvm-objdump/llvm-objdump.cpp
+8 -3  llvm/lib/ProfileData/InstrProfCorrelator.cpp
+6 -4  llvm/tools/llvm-debuginfod-find/llvm-debuginfod-find.cpp
+6 -2  llvm/lib/Object/BuildID.cpp
+49 -32  3 files not shown
+57 -37  9 files