LLVM/project 5e9e039 libunwind/src libunwind.cpp, libunwind/test cfi_violating_handler.pass.cpp

[libunwind][PAC] Defang ptrauth's PC in valid CFI range abort

It turns out making the CFI check a release mode abort causes many,
if not the majority, of JITs to fail during unwinding as they do not
set up CFI sections for their generated code. As a result any JITs
that do nominally support unwinding (and catching) through their JIT
or assembly frames trip this abort.

rdar://170862047
Delta  File
+101 -0  libunwind/test/cfi_violating_handler.pass.cpp
+11 -17  libunwind/src/libunwind.cpp
+112 -17  2 files

LLVM/project 8a9c6a3 llvm/lib/Analysis DependenceAnalysis.cpp

[DA] refactor bounds inference in exactSIVtest and exactRDIVtest (NFC) (#185719)

Replaces the `SmallVector`-based approach for computing the min/max of
affine domain bounds with a `GetMaxOrMin` lambda returning `std::optional`,
for better readability.
Previously, the code allocated a `SmallVector` to collect the valid bounds
and relied on `smax(front(), back())` to handle the single-element case,
which was easy to misread.
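
A minimal sketch of the idea, not the code from the patch: it uses plain
`int64_t` instead of `APInt`, and the free function `getMaxOrMin` and its
parameters are hypothetical stand-ins for the lambda named above. It shows
how returning `std::optional` makes the zero-, one-, and two-bound cases
explicit.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the GetMaxOrMin lambda: combine two optional
// bounds, taking the max when IsMax is true and the min otherwise.
// With one known bound, that bound is returned unchanged; with none,
// the result is std::nullopt.
constexpr std::optional<int64_t> getMaxOrMin(std::optional<int64_t> A,
                                             std::optional<int64_t> B,
                                             bool IsMax) {
  if (A && B)
    return IsMax ? std::max(*A, *B) : std::min(*A, *B);
  return A ? A : B;
}
```

A caller would check `has_value()` on the result before using it, so the
"no valid bound" case is handled explicitly rather than implied by the size
of a container.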

---------

Signed-off-by: Ruoyu Qiu <cabbaken at outlook.com>
Delta  File
+27 -23  llvm/lib/Analysis/DependenceAnalysis.cpp
+27 -23  1 file

LLVM/project ca29695 llvm/lib/Analysis DependenceAnalysis.cpp

[DA] Remove absolute value calculations in the Weak Zero SIV tests
Delta  File
+7 -7  llvm/lib/Analysis/DependenceAnalysis.cpp
+7 -7  1 file

LLVM/project 446d5d4 llvm/test/Analysis/DependenceAnalysis weak-zero-siv-addrec-wrap.ll

[DA] Update tests for the Weak Zero SIV tests (NFC)
Delta  File
+112 -0  llvm/test/Analysis/DependenceAnalysis/weak-zero-siv-addrec-wrap.ll
+112 -0  1 file

LLVM/project c239032 llvm/include/llvm/Analysis DependenceAnalysis.h, llvm/lib/Analysis DependenceAnalysis.cpp

[DA] Consolidate the core logic of the Weak Zero SIV tests (NFCI)
Delta  File
+80 -124  llvm/lib/Analysis/DependenceAnalysis.cpp
+5 -0  llvm/include/llvm/Analysis/DependenceAnalysis.h
+85 -124  2 files

LLVM/project 4860287 llvm/include/llvm/Analysis DependenceAnalysis.h, llvm/lib/Analysis DependenceAnalysis.cpp

[DA] Extract reversing dependence logic (NFCI)
Delta  File
+10 -7  llvm/lib/Analysis/DependenceAnalysis.cpp
+6 -0  llvm/include/llvm/Analysis/DependenceAnalysis.h
+16 -7  2 files

LLVM/project 5cfe50d llvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DependenceAnalysis weak-zero-siv-addrec-wrap.ll

[DA] Add nsw check for addrecs in the Weak Zero SIV tests
Delta  File
+31 -16  llvm/test/Analysis/DependenceAnalysis/weak-zero-siv-addrec-wrap.ll
+3 -0  llvm/lib/Analysis/DependenceAnalysis.cpp
+34 -16  2 files

LLVM/project 68bcbd8 llvm/include/llvm/Analysis DependenceAnalysis.h, llvm/lib/Analysis DependenceAnalysis.cpp

[DA] Refactor the signature of the Weak Zero SIV tests (NFC)
Delta  File
+13 -22  llvm/lib/Analysis/DependenceAnalysis.cpp
+4 -8  llvm/include/llvm/Analysis/DependenceAnalysis.h
+17 -30  2 files

LLVM/project 047c2ae llvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DependenceAnalysis weak-zero-siv-large-btc.ll weak-zero-siv-overflow.ll

[DA] Rewrite formula in the Weak Zero SIV tests
Delta  File
+18 -14  llvm/lib/Analysis/DependenceAnalysis.cpp
+8 -8  llvm/test/Analysis/DependenceAnalysis/weak-zero-siv-large-btc.ll
+2 -6  llvm/test/Analysis/DependenceAnalysis/weak-zero-siv-overflow.ll
+28 -28  3 files

LLVM/project eab7049 libclc/clc/include/clc/math clc_div_fast.h clc_recip_fast.h, libclc/clc/lib/generic/math clc_sqrt_fast.cl clc_recip_fast.cl

libclc: Add fast version utility functions for div, sqrt and reciprocal

These are subtly different from the native versions, and should have
tighter requirements. They should handle the special cases correctly,
unlike the native functions from the standard.
Delta  File
+20 -0  libclc/clc/include/clc/math/clc_div_fast.h
+20 -0  libclc/clc/include/clc/math/clc_recip_fast.h
+20 -0  libclc/clc/include/clc/math/clc_sqrt_fast.h
+15 -0  libclc/clc/lib/generic/math/clc_sqrt_fast.cl
+14 -0  libclc/clc/lib/generic/math/clc_recip_fast.cl
+13 -0  libclc/clc/lib/generic/math/clc_div_fast.cl
+102 -0  3 files not shown
+131 -1  9 files

LLVM/project 16e6391 llvm/lib/Target/RISCV RISCVTargetTransformInfo.h

[RISCV] Disable use of scalable vectors for VLEN=32 (#185553)

This patch prevents the loop vectorizer from choosing a scalable vector
type when the target VLEN is less than RVVBitsPerBlock.
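
The condition can be sketched as below. This is an illustration, not the
patch itself: the free function and its parameters are hypothetical, and the
value 64 for `RVVBitsPerBlock` is an assumption based on the upstream RISC-V
backend. Under that assumption, a VLEN of 32 would give a `vscale` of
32 / 64 = 0, so scalable types cannot be described.

```cpp
// Assumption: one RVV "block" (the unit that vscale counts) is 64 bits.
constexpr unsigned RVVBitsPerBlock = 64;

// Hypothetical model of the hook: scalable vectorization is only allowed
// when vector instructions exist and VLEN covers at least one block.
constexpr bool enableScalableVectorization(bool HasVInstructions,
                                           unsigned MinVLenBits) {
  return HasVInstructions && MinVLenBits >= RVVBitsPerBlock;
}

// vscale as implied by a given VLEN under the assumption above.
constexpr unsigned vscaleFor(unsigned VLenBits) {
  return VLenBits / RVVBitsPerBlock;
}
```

With these definitions, `enableScalableVectorization(true, 32)` is false
while `enableScalableVectorization(true, 128)` is true.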
Delta  File
+3 -1  llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
+3 -1  1 file

LLVM/project eb89049 libclc/clc/lib/generic/math clc_frexp_builtin.inc

Remove private
Delta  File
+0 -1  libclc/clc/lib/generic/math/clc_frexp_builtin.inc
+0 -1  1 file

LLVM/project 8cfcf33 llvm/lib/Target/RISCV RISCVISelLowering.cpp, llvm/test/CodeGen/RISCV/rvv abd.ll fixed-vectors-sad.ll

[RISCV] Combine vwaddu_wv+vabd(u) to vwabda(u)

Note that we only support SEW=8/16 for `vwabda(u)`.
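
As a rough scalar model of what the combined operation computes per element
(an illustration under assumptions, not the backend code): a widening add of
a zero-extended absolute difference into a double-width accumulator. The
function names are hypothetical, and only the unsigned SEW=8 case is shown.

```cpp
#include <cstdint>

// Unsigned absolute difference of two 8-bit elements (the vabdu step).
constexpr uint8_t absDiffU8(uint8_t A, uint8_t B) {
  return static_cast<uint8_t>(A > B ? A - B : B - A);
}

// Hypothetical per-element model of the fused form: accumulate the
// zero-extended absolute difference into a 16-bit (2 * SEW) accumulator.
// The addition wraps modulo 2^16, like ordinary vector integer adds.
constexpr uint16_t wideningAbdAccumulateU8(uint16_t Acc, uint8_t A,
                                           uint8_t B) {
  return static_cast<uint16_t>(Acc + absDiffU8(A, B));
}
```

For example, `wideningAbdAccumulateU8(10, 3, 250)` yields 257, a value that
no longer fits in 8 bits, which is why the accumulator is widened.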

Reviewers: mgudim, preames, mshockwave

Reviewed By: mshockwave

Pull Request: https://github.com/llvm/llvm-project/pull/184603
Delta  File
+80 -1  llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+10 -14  llvm/test/CodeGen/RISCV/rvv/abd.ll
+10 -14  llvm/test/CodeGen/RISCV/rvv/fixed-vectors-sad.ll
+100 -29  3 files

LLVM/project 98eb241 llvm/test/CodeGen/RISCV/rvv abd.ll

[RISCV] Add tests for vwaddu_wv+vabd(u) combine



Reviewers: lukel97, topperc, preames, mgudim, mshockwave

Pull Request: https://github.com/llvm/llvm-project/pull/184962
Delta  File
+106 -0  llvm/test/CodeGen/RISCV/rvv/abd.ll
+106 -0  1 file

LLVM/project 5e43deb clang/lib/CIR/CodeGen TargetInfo.cpp CIRGenModule.cpp, clang/test/CIR/CodeGenHIP simple.cpp

[CIR][AMDGPU] Add AMDGPU target support to CIR CodeGen
Delta  File
+89 -0  clang/test/CIR/CodeGenHIP/simple.cpp
+20 -0  clang/lib/CIR/CodeGen/TargetInfo.cpp
+4 -1  clang/lib/CIR/CodeGen/CIRGenModule.cpp
+3 -0  clang/lib/CIR/CodeGen/TargetInfo.h
+116 -1  4 files

LLVM/project 430e2b7 llvm/lib/Transforms/Vectorize VPlanTransforms.cpp, llvm/test/Transforms/LoopVectorize/AArch64 partial-reduce-extends-shared-with-reduce.ll partial-reduce-incomplete-chains.ll

[LV] Simplify the chain traversal in `getScaledReductions()` (NFCI) (#184830)

I found the logic of this function quite hard to reason about. This
patch attempts to rectify this by separating the matching of an extended
reduction operand from the traversal of the reduction chain.

- `matchExtendedReductionOperand()` contains all the logic to match an
  extended operand.
- `getScaledReductions()` validates each operation in the chain,
  starting backwards from the exit value, walking up through the operand
  that is not extended.
Delta  File
+114 -85  llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+92 -0  llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-extends-shared-with-reduce.ll
+69 -0  llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-incomplete-chains.ll
+275 -85  3 files

LLVM/project 84e4544 libclc/clc/include/clc/math clc_sqrt_cr.h, libclc/clc/lib/generic CMakeLists.txt

libclc: Add sqrt_cr utility (#185816)

Same as the div case, add a backdoor to use the correctly
rounded sqrt builtin.
Delta  File
+26 -0  libclc/clc/include/clc/math/clc_sqrt_cr.h
+11 -0  libclc/clc/lib/generic/math/clc_sqrt_cr.cl
+11 -0  libclc/clc/lib/generic/math/clc_sqrt_cr.inc
+3 -1  libclc/clc/lib/generic/CMakeLists.txt
+51 -1  4 files

LLVM/project 67c2c56 libclc/clc/include/clc/math clc_sqrt_cr.h, libclc/clc/lib/generic CMakeLists.txt

libclc: Add sqrt_cr utility

Same as the div case, add a backdoor to use the correctly
rounded sqrt builtin.
Delta  File
+26 -0  libclc/clc/include/clc/math/clc_sqrt_cr.h
+11 -0  libclc/clc/lib/generic/math/clc_sqrt_cr.cl
+11 -0  libclc/clc/lib/generic/math/clc_sqrt_cr.inc
+3 -1  libclc/clc/lib/generic/CMakeLists.txt
+51 -1  4 files

LLVM/project 50cb784 llvm/include/llvm/Object ELF.h, llvm/test/tools/llvm-readobj/ELF many-sections.s

[Object][ELF] Fix section header zero check (#181796)

PN_XNUM is a necessary condition for reading shdr0, regardless of the
value of e_shoff. Without this check, readShdrZero wrongly returns a
garbage value from the ELF header instead of emitting a warning.
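
One way to read the rule is sketched below. This is an interpretation of the
message rather than the code in `ELF.h`: the struct, enum, and function names
are hypothetical, and only the program header count is modeled. Per the ELF
specification, `e_phnum == PN_XNUM` (0xffff) means the real count is stored
in `sh_info` of section header 0, which can only be read if a section header
table exists.

```cpp
#include <cstdint>

// Escape value defined by the ELF specification for e_phnum.
constexpr uint16_t PN_XNUM = 0xffff;

// Hypothetical subset of the ELF header fields involved in the decision.
struct ElfHeaderFields {
  uint64_t e_shoff; // file offset of the section header table, 0 if absent
  uint16_t e_phnum; // program header count, or PN_XNUM
};

// Where the program header count should come from.
enum class PhNumSource { Header, ShdrZero, Invalid };

constexpr PhNumSource phNumSource(const ElfHeaderFields &H) {
  // Ordinary case: the header value is the count; shdr0 is not consulted.
  if (H.e_phnum != PN_XNUM)
    return PhNumSource::Header;
  // Escape value but no section header table: nothing valid to read,
  // so the caller should warn instead of trusting the header value.
  if (H.e_shoff == 0)
    return PhNumSource::Invalid;
  // Escape value with a section header table: read shdr0.sh_info.
  return PhNumSource::ShdrZero;
}
```

The point of the ordering is that the PN_XNUM test comes first, so section
header 0 is never read for files that do not use the escape value.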
Delta  File
+47 -11  llvm/include/llvm/Object/ELF.h
+1 -1  llvm/test/tools/llvm-readobj/ELF/many-sections.s
+48 -12  2 files

LLVM/project 0b26b37 clang/docs ReleaseNotes.rst, llvm/docs ReleaseNotes.md

[RISCV] Add release notes for Zvabd (#185617)
Delta  File
+1 -0  clang/docs/ReleaseNotes.rst
+1 -0  llvm/docs/ReleaseNotes.md
+2 -0  2 files

LLVM/project 58c968d libclc/clc/include/clc/math clc_div_cr.h, libclc/clc/lib/generic CMakeLists.txt

libclc: Add div_cr utility function (#185730)

This is a workaround for the modal div operator precision. The
OpenCL default is not correctly rounded, so this provides a backdoor
to get a correctly rounded fdiv. Ideally clang would have a builtin
or some other mechanism to control the precision.
Delta  File
+26 -0  libclc/clc/include/clc/math/clc_div_cr.h
+12 -0  libclc/clc/lib/generic/math/clc_div_cr.inc
+11 -0  libclc/clc/lib/generic/math/clc_div_cr.cl
+4 -0  libclc/clc/lib/generic/CMakeLists.txt
+53 -0  4 files

LLVM/project 1e67bd2 llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll, llvm/test/CodeGen/RISCV clmul.ll

Merge branch 'main' into users/aokblast/readelf/pxnum_support
Delta  File
+84,317 -78,372  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+66,293 -29,491  llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+25,751 -24,782  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+23,663 -20,281  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+21,867 -18,577  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+25,051 -14,920  llvm/test/CodeGen/RISCV/clmul.ll
+246,942 -186,423  9,478 files not shown
+963,255 -485,693  9,484 files

LLVM/project f1a7df0 llvm/test/CodeGen/AMDGPU dynamic_stackalloc.ll llvm.amdgcn.reduce.sub.ll

[AMDGPU] DPP implementations for Wave Reduction

Adding DPP reduction support for i32 types.
Supported Ops: `umin`, `min`, `umax`, `max`,
`add`, `sub`, `and`, `or`, `xor`.
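
For orientation, a wave reduction combines one value per lane into a single
result in a logarithmic number of steps. The sketch below is a generic host
side model of that pattern, not the DPP instruction sequence the patch emits:
the wave size of 32, the function names, and the pairing order are all
assumptions, and it only covers associative, commutative operations (so
`sub` is left out).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t WaveSize = 32; // assumption: wave32

// Example combine operations as constexpr functors.
struct Add {
  constexpr uint32_t operator()(uint32_t A, uint32_t B) const { return A + B; }
};
struct UMax {
  constexpr uint32_t operator()(uint32_t A, uint32_t B) const {
    return A > B ? A : B;
  }
};

// Tree reduction over the lanes: at each step, a lane combines its value
// with the lane Stride positions away, and Stride doubles. After
// log2(WaveSize) steps lane 0 holds the reduced value.
template <typename Op>
constexpr uint32_t waveReduce(std::array<uint32_t, WaveSize> Lanes,
                              Op Combine) {
  for (std::size_t Stride = 1; Stride < WaveSize; Stride *= 2)
    for (std::size_t Lane = 0; Lane + Stride < WaveSize; Lane += 2 * Stride)
      Lanes[Lane] = Combine(Lanes[Lane], Lanes[Lane + Stride]);
  return Lanes[0];
}

// Helper for examples: lane I holds the value I.
constexpr std::array<uint32_t, WaveSize> laneIds() {
  std::array<uint32_t, WaveSize> Lanes{};
  for (std::size_t I = 0; I < WaveSize; ++I)
    Lanes[I] = static_cast<uint32_t>(I);
  return Lanes;
}
```

With lane I holding the value I, `waveReduce(laneIds(), Add{})` gives 496
and `waveReduce(laneIds(), UMax{})` gives 31.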
Delta  File
+2,113 -1,374  llvm/test/CodeGen/AMDGPU/dynamic_stackalloc.ll
+1,096 -146  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.sub.ll
+1,047 -142  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.add.ll
+986 -132  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.xor.ll
+894 -108  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.max.ll
+894 -108  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.min.ll
+7,030 -2,010  7 files not shown
+11,207 -2,797  13 files

LLVM/project db7e0da clang/test/Preprocessor riscv-target-features.c, llvm/lib/Target/RISCV RISCVInstrInfoZvzip.td RISCVFeatures.td

[RISCV][MC] Add support of Zvzip extension (#185614)

This adds the initial support of the `Zvzip` standard extension for
reordering structured data in vector registers.

Doc:

* https://github.com/ved-rivos/riscv-isa-manual/blob/zvzip/src/zvzip.adoc
* https://github.com/riscv/riscv-opcodes/blob/master/extensions/unratified/rv_zvzip.

Co-Authored-By: wangboyao <wangboyao at bytedance.com>

---------

Co-authored-by: wangboyao <wangboyao at bytedance.com>
Delta  File
+50 -0  llvm/test/MC/RISCV/rvv/zvzip.s
+34 -0  llvm/test/MC/RISCV/rvv/zvzip-invalid.s
+31 -0  llvm/lib/Target/RISCV/RISCVInstrInfoZvzip.td
+8 -0  llvm/lib/Target/RISCV/RISCVFeatures.td
+8 -0  clang/test/Preprocessor/riscv-target-features.c
+4 -0  llvm/test/CodeGen/RISCV/attributes.ll
+135 -0  7 files not shown
+147 -0  13 files

LLVM/project 03bb430 libclc/clc/include/clc/math clc_div_cr.h

Reorder
Delta  File
+1 -1  libclc/clc/include/clc/math/clc_div_cr.h
+1 -1  1 file

LLVM/project 121f1a8 llvm/lib/Target/AMDGPU SIInsertWaitcnts.cpp AMDGPU.td, llvm/test/CodeGen/AMDGPU asyncmark-gfx12plus.ll asyncmark-err.ll

[AMDGPU] asyncmark support for ASYNC_CNT

The ASYNC_CNT is used to track the progress of asynchronous copies between
global and LDS memories. By including it in asyncmark, the compiler can now
assist the programmer in generating waits for ASYNC_CNT.

Assisted-By: Claude Sonnet 4.5
Delta  File
+366 -0  llvm/test/CodeGen/AMDGPU/asyncmark-gfx12plus.ll
+14 -7  llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+0 -19  llvm/test/CodeGen/AMDGPU/asyncmark-err.ll
+3 -0  llvm/lib/Target/AMDGPU/AMDGPU.td
+1 -2  llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
+1 -1  llvm/lib/Target/AMDGPU/SOPInstructions.td
+385 -29  2 files not shown
+388 -30  8 files

LLVM/project 72bb66a libunwind/src libunwind.cpp, libunwind/test cfi_violating_handler.pass.cpp

[libunwind][PAC] Defang ptrauth's PC in valid CFI range abort

It turns out making the CFI check a release mode abort causes many,
if not the majority, of JITs to fail during unwinding as they do not
set up CFI sections for their generated code. As a result any JITs
that do nominally support unwinding (and catching) through their JIT
or assembly frames trip this abort.

rdar://170862047
Delta  File
+101 -0  libunwind/test/cfi_violating_handler.pass.cpp
+11 -17  libunwind/src/libunwind.cpp
+112 -17  2 files

LLVM/project 49c714e clang-tools-extra/clang-tidy/bugprone StdExceptionBaseclassCheck.cpp StdExceptionBaseclassCheck.h, clang-tools-extra/clang-tidy/hicpp ExceptionBaseclassCheck.cpp ExceptionBaseclassCheck.h

[clang-tidy] Rename hicpp-exception-baseclass to bugprone-exception-baseclass (#183474)

Part of the work in https://github.com/llvm/llvm-project/issues/183462.

Closes https://github.com/llvm/llvm-project/issues/183463.
Delta  File
+0 -284  clang-tools-extra/test/clang-tidy/checkers/hicpp/exception-baseclass.cpp
+284 -0  clang-tools-extra/test/clang-tidy/checkers/bugprone/std-exception-baseclass.cpp
+0 -57  clang-tools-extra/clang-tidy/hicpp/ExceptionBaseclassCheck.cpp
+57 -0  clang-tools-extra/clang-tidy/bugprone/StdExceptionBaseclassCheck.cpp
+0 -34  clang-tools-extra/clang-tidy/hicpp/ExceptionBaseclassCheck.h
+34 -0  clang-tools-extra/clang-tidy/bugprone/StdExceptionBaseclassCheck.h
+375 -375  8 files not shown
+426 -402  14 files

LLVM/project 3df0285 clang/include/clang/Basic DiagnosticSemaKinds.td, clang/lib/Sema SemaLookup.cpp

warning and note when user declares their own __memory_scope enum
Delta  File
+21 -2  clang/lib/Sema/SemaLookup.cpp
+3 -3  clang/test/Sema/builtin-memory-scope-conflict.c
+3 -0  clang/include/clang/Basic/DiagnosticSemaKinds.td
+27 -5  3 files

LLVM/project c53ee83 llvm/test/CodeGen/AMDGPU dynamic_stackalloc.ll wave-reduce-dpp-i32.mir

[AMDGPU] DPP implementations for Wave Reduction

Adding DPP reduction support for i32 types.
Supported Ops: `umin`, `min`, `umax`, `max`,
`add`, `sub`, `and`, `or`, `xor`.
Delta  File
+2,113 -1,374  llvm/test/CodeGen/AMDGPU/dynamic_stackalloc.ll
+1,255 -0  llvm/test/CodeGen/AMDGPU/wave-reduce-dpp-i32.mir
+1,096 -146  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.sub.ll
+1,047 -142  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.add.ll
+986 -132  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.xor.ll
+894 -108  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.min.ll
+7,391 -1,902  8 files not shown
+12,444 -2,803  14 files