LLVM/project 6073fdellvm/lib/Transforms/Vectorize VPlan.h, llvm/test/Transforms/LoopVectorize tail-folding-constant-trip-counts.ll

[VPlan] Directly check if middle block is pred of scalar preheader. (#191768)

hasScalarTail currently returns incorrect results when queried after
runtime checks have been added. Generalize and harden by checking if the
middle block is a predecessor of the scalar preheader.
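
A minimal sketch of the direct predecessor check described above, using toy types. `Block` and `middleFeedsScalarPreheader` are hypothetical names for illustration and are not the VPlan API:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a plan block; only predecessor edges are modeled.
struct Block {
  std::vector<const Block *> Preds;
};

// Returns true if Middle is a direct predecessor of ScalarPH. The answer does
// not depend on how many other blocks (for example runtime-check blocks) also
// branch to ScalarPH.
inline bool middleFeedsScalarPreheader(const Block &Middle,
                                       const Block &ScalarPH) {
  assert(&Middle != &ScalarPH && "expected two distinct blocks");
  return std::find(ScalarPH.Preds.begin(), ScalarPH.Preds.end(), &Middle) !=
         ScalarPH.Preds.end();
}
```

In this toy model, adding an extra predecessor to the scalar preheader leaves the result unchanged unless the middle block itself is added.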
DeltaFile
+314-0llvm/test/Transforms/LoopVectorize/tail-folding-constant-trip-counts.ll
+5-5llvm/lib/Transforms/Vectorize/VPlan.h
+319-52 files

LLVM/project 53e01f1mlir/lib/Dialect/Tosa/IR TosaOps.cpp, mlir/test/Dialect/Tosa verifier.mlir invalid.mlir

[mlir][tosa] Improve matmul verifier to check shape information (#191300)

Updates the matmul verifier to check that input and output shapes are valid.

Also adds some tests for verifier failures which were previously not
covered.
DeltaFile
+137-0mlir/test/Dialect/Tosa/verifier.mlir
+50-23mlir/lib/Dialect/Tosa/IR/TosaOps.cpp
+0-40mlir/test/Dialect/Tosa/invalid.mlir
+187-633 files

LLVM/project a959796llvm/lib/Transforms/Vectorize VPlanTransforms.cpp

[VPlan] Assert ComputeReductionResult isn't predicated in middle block. NFC (#191767)
DeltaFile
+11-0llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+11-01 files

LLVM/project fe74f12llvm/include/llvm/CodeGen ScheduleDAG.h, llvm/lib/CodeGen ScheduleDAG.cpp

[MISched] Extract `isClustered()` method on SUnit (NFC) (#191700)

This patch encapsulates the check for whether a `SUnit` is clustered,
rather than letting it scatter across call sites. Currently there is
only a single user, but more users can show up, and I think it provides
a cleaner API even for that single user.
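
The kind of accessor this describes could look roughly like the sketch below. `SchedUnit`, `InvalidClusterId`, `ParentClusterIdx`, and `setCluster` are assumed names for illustration; the actual `SUnit` fields may differ:

```cpp
#include <cassert>
#include <limits>

// Hypothetical scheduling unit; not the real llvm::SUnit.
struct SchedUnit {
  // Sentinel meaning "not part of any cluster" (an assumption of this sketch).
  static constexpr unsigned InvalidClusterId =
      std::numeric_limits<unsigned>::max();

  unsigned ParentClusterIdx = InvalidClusterId;

  // Assigns the unit to a cluster; the sentinel is not a valid cluster index.
  void setCluster(unsigned Idx) {
    assert(Idx != InvalidClusterId && "sentinel is not a valid cluster");
    ParentClusterIdx = Idx;
  }

  // Single place that knows how "clustered" is encoded, so call sites do not
  // compare against the sentinel themselves.
  bool isClustered() const { return ParentClusterIdx != InvalidClusterId; }
};
```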
DeltaFile
+4-0llvm/include/llvm/CodeGen/ScheduleDAG.h
+1-1llvm/lib/CodeGen/ScheduleDAG.cpp
+5-12 files

LLVM/project 8327759libc/src/__support/annex_k constraint_macros.h CMakeLists.txt

[libc][annex_k] Add libc_constraint_handler macros.
DeltaFile
+44-0libc/src/__support/annex_k/constraint_macros.h
+9-0libc/src/__support/annex_k/CMakeLists.txt
+53-02 files

LLVM/project 8f2e040libc/config/gpu/amdgpu entrypoints.txt, libc/config/gpu/nvptx entrypoints.txt

apply code review suggestions
DeltaFile
+2-3libc/src/__support/annex_k/CMakeLists.txt
+2-2libc/src/__support/annex_k/abort_handler_s.h
+1-1libc/src/stdlib/abort_handler_s.cpp
+1-0libc/config/gpu/amdgpu/entrypoints.txt
+1-0libc/config/gpu/nvptx/entrypoints.txt
+0-1libc/include/stdlib.yaml
+7-73 files not shown
+10-79 files

LLVM/project e099e0alibc/config/linux/aarch64 entrypoints.txt, libc/config/linux/x86_64 entrypoints.txt

[libc][stdlib][annex_k] Add set_constraint_handler_s.
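
For context, C11 Annex K specifies that `set_constraint_handler_s` installs a new runtime-constraint handler and returns the previous one, and that a null argument selects the implementation default. The toy model below sketches that contract only. Everything in namespace `sketch` is hypothetical, and choosing an ignoring handler as the default is an assumption (the standard leaves the default implementation-defined):

```cpp
#include <cassert>

namespace sketch {
using errno_t = int;
using constraint_handler_t = void (*)(const char *Msg, void *Ptr,
                                      errno_t Error);

// Models ignore_handler_s: returns to the caller without doing anything.
inline void ignore_handler(const char *, void *, errno_t) {}

// Storage for the currently installed handler. The default chosen here is an
// assumption of this sketch.
inline constraint_handler_t &currentHandler() {
  static constraint_handler_t Handler = ignore_handler;
  return Handler;
}

// Models set_constraint_handler_s: installs NewHandler (or the default when
// NewHandler is null) and returns the previously installed handler.
inline constraint_handler_t set_constraint_handler(constraint_handler_t NewHandler) {
  constraint_handler_t Old = currentHandler();
  currentHandler() = NewHandler ? NewHandler : ignore_handler;
  return Old;
}
} // namespace sketch
```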
DeltaFile
+28-0libc/src/stdlib/set_constraint_handler_s.cpp
+21-0libc/src/stdlib/set_constraint_handler_s.h
+11-0libc/src/stdlib/CMakeLists.txt
+7-0libc/include/stdlib.yaml
+1-0libc/config/linux/x86_64/entrypoints.txt
+1-0libc/config/linux/aarch64/entrypoints.txt
+69-01 files not shown
+70-07 files

LLVM/project a9fc66alibc/src/__support/annex_k libc_constraint_handler.h CMakeLists.txt

[libc][annex_k] Add libc_constraint_handler.
DeltaFile
+26-0libc/src/__support/annex_k/libc_constraint_handler.h
+9-0libc/src/__support/annex_k/CMakeLists.txt
+35-02 files

LLVM/project df399e0libc/src/__support/annex_k libc_constraint_handler.h

fix format
DeltaFile
+1-1libc/src/__support/annex_k/libc_constraint_handler.h
+1-11 files

LLVM/project e1fd65clibc/config/linux/riscv entrypoints.txt, libc/config/linux/x86_64 entrypoints.txt

[libc][stdlib][annex_k] Add ignore_handler_s.
DeltaFile
+22-0libc/src/stdlib/ignore_handler_s.h
+16-0libc/src/stdlib/ignore_handler_s.cpp
+13-0libc/src/stdlib/CMakeLists.txt
+9-0libc/include/stdlib.yaml
+2-1libc/config/linux/x86_64/entrypoints.txt
+1-0libc/config/linux/riscv/entrypoints.txt
+63-12 files not shown
+65-18 files

LLVM/project c76cb2bclang/lib/StaticAnalyzer/Core RegionStore.cpp, clang/test/Analysis regionstore-zero-init.cpp

[analyzer] Refine default binding preservation in RegionStore (#189319)

Narrow the new setImplicitDefaultValue() guard so existing default
bindings are preserved only for aggregate-like cases.

The previous change was too broad and regressed normal
zero-initialization, causing new int[10]{} to be modeled as undefined
and emit a garbage-value warning instead of the expected analyzer
reports.
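
For reference, the language-level behavior the analyzer has to model here: `new int[10]{}` value-initializes every element, so each one reads as zero. A small self-contained check (the function name is only illustrative):

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Allocates ten ints with an empty braced initializer and verifies that every
// element was zero-initialized, as the C++ standard requires.
inline bool allZeroAfterValueInit() {
  std::unique_ptr<int[]> Buf(new int[10]{});
  for (std::size_t I = 0; I < 10; ++I)
    if (Buf[I] != 0)
      return false;
  return true;
}
```

Reading any element of such a buffer is well defined, so a garbage-value warning on it would be a false positive.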
DeltaFile
+6-5clang/lib/StaticAnalyzer/Core/RegionStore.cpp
+9-0clang/test/Analysis/regionstore-zero-init.cpp
+15-52 files

LLVM/project 8074ad6libc/include stdlib.yaml, libc/src/__support/annex_k abort_handler_s.h CMakeLists.txt

[libc][annex_k] Add abort_handler_s.
DeltaFile
+43-0libc/src/__support/annex_k/abort_handler_s.h
+22-0libc/src/stdlib/abort_handler_s.h
+20-0libc/src/stdlib/abort_handler_s.cpp
+13-1libc/include/stdlib.yaml
+12-0libc/src/__support/annex_k/CMakeLists.txt
+10-0libc/src/stdlib/CMakeLists.txt
+120-14 files not shown
+124-110 files

LLVM/project 82442a5llvm/lib/Target/AArch64 AArch64ISelLowering.cpp, llvm/test/Analysis/CostModel/AArch64 ldexp.ll

[AArch64] Fix legalization of bf16 ldexp. (#190805)

Similar to fp16 ldexp, we cannot create illegal types for bf16 during
lowering, so we should promote instead.
DeltaFile
+70-2llvm/test/CodeGen/AArch64/ldexp.ll
+5-5llvm/test/Analysis/CostModel/AArch64/ldexp.ll
+6-3llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+81-103 files

LLVM/project 1d18740clang/include/clang/AST OpenMPClause.h, clang/lib/Sema SemaOpenMP.cpp

[Clang][OpenMP] Implement Loop splitting `#pragma omp split` directive (#190397)

Implement the loop-splitting `#pragma omp split` construct with the
`counts` clause. This PR is reposted after the revert of PR
([#183261](https://github.com/llvm/llvm-project/pull/183261)).

Changes: 

1. Added `openmp/runtime/test/transform/split/lit.local.cfg`
2. Enforced ICE for `counts` clause items in `SemaOpenMP.cpp` (minor
change)
3. Updated tests `split_messages.cpp`, `split_omp_fill.cpp`,
`split_diag_errors.c`.
4. Removed `nonconstant_count.cpp`
DeltaFile
+1,986-0clang/test/OpenMP/split_codegen.cpp
+271-0clang/lib/Sema/SemaOpenMP.cpp
+139-0openmp/runtime/test/transform/split/iterfor.cpp
+123-0clang/test/OpenMP/split_counts_verify.c
+108-0clang/test/OpenMP/split_messages.cpp
+101-0clang/include/clang/AST/OpenMPClause.h
+2,728-074 files not shown
+4,191-1180 files

LLVM/project 5255317llvm/test/CodeGen/RISCV/rvv vssub-vp.ll vssubu-vp.ll

[RISCV] Remove codegen for vp_{u,s}{add,sub}sat (#191639)

Part of the work to remove trivial VP intrinsics from the RISC-V
backend, see
https://discourse.llvm.org/t/rfc-remove-codegen-support-for-trivial-vp-intrinsics-in-the-risc-v-backend/87999

This splits off 4 intrinsics from #179622.
DeltaFile
+309-364llvm/test/CodeGen/RISCV/rvv/vssub-vp.ll
+310-363llvm/test/CodeGen/RISCV/rvv/vssubu-vp.ll
+274-320llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vssubu-vp.ll
+239-303llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vssub-vp.ll
+237-290llvm/test/CodeGen/RISCV/rvv/vsaddu-vp.ll
+235-290llvm/test/CodeGen/RISCV/rvv/vsadd-vp.ll
+1,604-1,9305 files not shown
+1,992-2,46911 files

LLVM/project 682ae8bmlir/lib/Dialect/X86/Transforms VectorContractToAMXDotProduct.cpp, mlir/test/Dialect/X86/AMX vector-contract-to-tiled-dp.mlir

[mlir][x86] Lower packed type vector.contract to AMX dot-product (online-packing) (#188192)

Adds a transform pass that lowers a flat-layout `vector.contract`
operation to (a) `amx.tile_mulf` for BF16 or (b) `amx.tile_muli` for
Int8 packed types via online packing.

TODO: a follow-up patch is planned to refactor this pass and retire the
`convert-vector-to-amx` pass.
DeltaFile
+875-148mlir/lib/Dialect/X86/Transforms/VectorContractToAMXDotProduct.cpp
+480-20mlir/test/Dialect/X86/AMX/vector-contract-to-tiled-dp.mlir
+1,355-1682 files

LLVM/project 8048e36clang/lib/CIR/CodeGen CIRGenModule.cpp, clang/test/CIR/CodeGen attr-retain.c attr-used.c

add gv section attribute
DeltaFile
+1-5clang/lib/CIR/CodeGen/CIRGenModule.cpp
+2-2clang/test/CIR/CodeGen/attr-retain.c
+1-1clang/test/CIR/CodeGen/attr-used.c
+1-1clang/test/CIR/CodeGen/keep-persistent-storage-variables.cpp
+1-1clang/test/CIR/CodeGen/keep-static-consts.cpp
+6-105 files

LLVM/project 4a1d1c2clang/test/CIR/CodeGenHIP hip-cuid.hip

fix hip test
DeltaFile
+2-3clang/test/CIR/CodeGenHIP/hip-cuid.hip
+2-31 files

LLVM/project 2b71043clang/test/CIR/CodeGen keep-persistent-storage-variables.cpp keep-static-consts.cpp

add tests persistent-storage-variables and keep-static-consts
DeltaFile
+20-0clang/test/CIR/CodeGen/keep-persistent-storage-variables.cpp
+11-0clang/test/CIR/CodeGen/keep-static-consts.cpp
+31-02 files

LLVM/project 957215cclang/lib/CIR/CodeGen CIRGenModule.cpp CIRGenModule.h, clang/test/CIR/CodeGen attr-retain.c attr-used.c

use CIRGlobalValueInterface
DeltaFile
+30-29clang/lib/CIR/CodeGen/CIRGenModule.cpp
+18-0clang/test/CIR/CodeGen/attr-retain.c
+7-7clang/lib/CIR/CodeGen/CIRGenModule.h
+14-0clang/test/CIR/CodeGen/attr-used.c
+69-364 files

LLVM/project fbcdc95clang/lib/CIR/CodeGen CIRGenModule.cpp CIRGenModule.h, clang/test/CIR/CodeGenHIP hip-cuid.hip

[CIR] Add addLLVMUsed and addLLVMCompilerUsed methods to CIRGenModule
DeltaFile
+100-2clang/lib/CIR/CodeGen/CIRGenModule.cpp
+27-0clang/test/CIR/CodeGenHIP/hip-cuid.hip
+19-0clang/lib/CIR/CodeGen/CIRGenModule.h
+146-23 files

LLVM/project f647f0cllvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/RISCV revec-strided-load.ll

[SLP] Fix handling of strided loads during re-vectorization (#191294)

Fixes #191292
DeltaFile
+8-2llvm/test/Transforms/SLPVectorizer/RISCV/revec-strided-load.ll
+4-3llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+12-52 files

LLVM/project 874702ellvm/lib/Target/AMDGPU AMDGPUSwLowerLDS.cpp

use getFirstNonPHIOrDbgOrAlloca
DeltaFile
+1-3llvm/lib/Target/AMDGPU/AMDGPUSwLowerLDS.cpp
+1-31 files

LLVM/project def143aclang/lib/AST DeclTemplate.cpp, clang/test/SemaTemplate GH188759.cpp

[clang] fix getReplacedTemplateParameter for function template specializations (#189559)

(cherry picked from commit 2b439327026d45bf53e59159c8e40fccf87930b6)
DeltaFile
+13-0clang/test/SemaTemplate/GH188759.cpp
+6-4clang/lib/AST/DeclTemplate.cpp
+19-42 files

LLVM/project a98b9dallvm/lib/Target/AMDGPU AMDGPUSwLowerLDS.cpp, llvm/test/CodeGen/AMDGPU amdgpu-sw-lower-lds-static-alloca-placement.ll

splice and then move straggler allocas
DeltaFile
+40-74llvm/test/CodeGen/AMDGPU/amdgpu-sw-lower-lds-static-alloca-placement.ll
+9-6llvm/lib/Target/AMDGPU/AMDGPUSwLowerLDS.cpp
+49-802 files

LLVM/project b0a403allvm/lib/Target/AMDGPU AMDGPUSwLowerLDS.cpp, llvm/test/CodeGen/AMDGPU amdgpu-sw-lower-lds-static-alloca-placement.ll

[AMDGPU][ASAN] Move allocas to entry block in amdgpu-sw-lower-lds pass
DeltaFile
+95-0llvm/test/CodeGen/AMDGPU/amdgpu-sw-lower-lds-static-alloca-placement.ll
+13-1llvm/lib/Target/AMDGPU/AMDGPUSwLowerLDS.cpp
+108-12 files

LLVM/project c755c08llvm/lib/CodeGen/SelectionDAG TargetLowering.cpp, llvm/test/CodeGen/RISCV split-udiv-by-constant.ll split-urem-by-constant.ll

[TargetLowering] Support larger divisors in expandDIVREMByConstant. (#191119)

Instead of bailing out if the original divisor exceeds HBitWidth,
allow divisors that fit in HBitWidth after removing trailing zeros.

PartialRem now needs a low and high part. Shifting RemL left
now needs to handle shifting into RemH.
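
The arithmetic identity behind removing trailing zeros can be sketched on plain 64-bit integers. This is not the SelectionDAG code, which operates on split halves; `udivremViaOddPart` is a hypothetical helper. With D = Odd * 2^K, the quotient is (X >> K) / Odd, and the remainder is the partial remainder shifted left by K, combined with the low K bits of X:

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Returns {X / D, X % D} for unsigned values, computed by dividing only by
// D with its trailing zero bits removed.
inline std::pair<uint64_t, uint64_t> udivremViaOddPart(uint64_t X, uint64_t D) {
  assert(D != 0 && "division by zero");
  unsigned K = 0;
  while (((D >> K) & 1) == 0) // count trailing zero bits of D
    ++K;
  uint64_t Odd = D >> K;                     // divisor without trailing zeros
  uint64_t High = X >> K;                    // dividend bits above the low K
  uint64_t LowMask = (uint64_t(1) << K) - 1; // selects the low K bits of X
  uint64_t Q = High / Odd;
  // Shift the partial remainder back up and reattach the low bits of X.
  uint64_t R = ((High % Odd) << K) | (X & LowMask);
  return {Q, R};
}
```

The left shift of the partial remainder is the step that, in a split-register setting, can carry bits from the low half into the high half.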

Assisted-by: Claude Sonnet 4.5
DeltaFile
+287-2llvm/test/CodeGen/RISCV/split-udiv-by-constant.ll
+210-2llvm/test/CodeGen/RISCV/split-urem-by-constant.ll
+70-24llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+23-5llvm/test/CodeGen/X86/i128-udiv.ll
+590-334 files

LLVM/project 215f35ellvm/lib/Target/AArch64 AArch64ExpandPseudoInsts.cpp

[AArch64] Skip non-pseudo instructions in AArch64ExpandPseudoInsts (#191395)

AArch64::getSVEPseudoMap calls are visible in compile-time profiles even on
non-SVE targets. I think CodeGenMapTable could be improved: it currently
emits a constexpr array sorted by opcode and a hand-rolled binary search
over that array. However, the AArch64ExpandPseudoInsts pass is also missing a
simple check for pseudo instructions before expanding. Adding that check
avoids the compile-time cost.

https://llvm-compile-time-tracker.com/compare.php?from=0d42811ea4658b3e86a3801b3bc848324f8540f8&to=9e2434de84577ca1c5e6de8fe8d75c6b8e282b3f&stat=instructions%3Au
DeltaFile
+2-1llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
+2-11 files

LLVM/project 8e64b13llvm/lib/Target/AMDGPU FLATInstructions.td, llvm/test/CodeGen/AMDGPU llvm.amdgcn.av.global.load.b128.ll llvm.amdgcn.av.global.store.b128.ll

Address review comments

- Revert a lot of mnemonic renames caused by a brute-force sed.
- Add -filetype=null to unsupported test RUN lines
- Regenerate CHECK lines in codegen tests

Assisted-By: Claude Opus 4.6 (1M context)
DeltaFile
+696-696llvm/test/CodeGen/AMDGPU/llvm.amdgcn.av.global.load.b128.ll
+96-96llvm/test/CodeGen/AMDGPU/llvm.amdgcn.av.global.store.b128.ll
+48-48llvm/test/CodeGen/AMDGPU/amdgcn-av-scopes.ll
+7-7llvm/lib/Target/AMDGPU/FLATInstructions.td
+6-6llvm/test/CodeGen/AMDGPU/unsupported-av-global-store.ll
+6-6llvm/test/CodeGen/AMDGPU/unsupported-av-global-load.ll
+859-8591 files not shown
+863-8637 files

LLVM/project 5864733llvm/include/llvm/CodeGen/GlobalISel GIMatchTableExecutorImpl.h GIMatchTableExecutor.h, llvm/utils/TableGen/Common/GlobalISel GlobalISelMatchTable.cpp GlobalISelMatchTable.h

Skip type check for metadata operands in addTypeCheckPredicate

Metadata is trivially always metadata. So we don't actually need the predicate
introduced in #191389.
DeltaFile
+4-15llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.cpp
+0-18llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.h
+0-9llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h
+0-6llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h
+4-484 files