LLVM/project 332fde6 llvm/lib/Transforms/Vectorize VPlanAnalysis.h VPlanAnalysis.cpp

[LV] Store DataLayout on VPTypeAnalysis (NFC) (#197231)

Using `R->getParent()->getPlan()->getDataLayout()` limits
`inferScalarType` to recipes within blocks that have been attached to a
plan.
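
A minimal sketch of the difference, using stand-in types rather than the real VPlan classes (every name below is hypothetical): looking the layout up through the parent chain fails for a detached recipe, while an analysis object that stores the layout itself does not care whether the recipe is attached.

```cpp
#include <cassert>

// Stand-in types; these are NOT the real LLVM/VPlan classes.
struct DataLayout { unsigned PointerBits; };
struct Plan { DataLayout DL; };
struct Block { Plan *ParentPlan = nullptr; };
struct Recipe { Block *Parent = nullptr; };

// Lookup through the parent chain: only works once the recipe sits in a
// block that has been attached to a plan.
inline const DataLayout *layoutViaParents(const Recipe &R) {
  if (!R.Parent || !R.Parent->ParentPlan)
    return nullptr;
  return &R.Parent->ParentPlan->DL;
}

// Analysis object that keeps its own reference to the layout, so queries do
// not depend on where (or whether) the recipe is attached.
class TypeAnalysisSketch {
  const DataLayout &DL;

public:
  explicit TypeAnalysisSketch(const DataLayout &Layout) : DL(Layout) {}
  unsigned pointerBits(const Recipe &) const { return DL.PointerBits; }
};
```

With a detached `Recipe`, `layoutViaParents` returns `nullptr`, whereas `TypeAnalysisSketch` still answers the query.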

(Hit while re-basing a PR)
Delta  File
+3 -1  llvm/lib/Transforms/Vectorize/VPlanAnalysis.h
+1 -1  llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
+4 -2  2 files

LLVM/project 1da4f4e lldb/include/lldb/Target Platform.h, lldb/source/Plugins/Architecture/Arm ArchitectureArm.cpp

[lldb] Step over non-lldb breakpoints (#190622)

Note: this is a second attempt at 304c680 / #174348, hopefully fixing
the post-commit Mac testing failures. The main differences from the
previous commit are:
* Fixing the incorrect masks in ArchitectureArm.cpp
* Declining to step in StopInfoMachException if the PC and exception
exc_sub_code don't match - implies fixup already applied
* Change to reflect explicit Address constructor - I assume this is
correct, essentially explicitly making a temporary Address object of the
pc address in SkipOverTrapInstruction
* Removing the debugserver code to step over the trap instruction as it
interacts badly with this change (without the check mentioned
previously).

---

Several languages support some sort of "breakpoint" function, which adds
ISA-specific instructions to generate an interrupt at runtime. However,

    [31 lines not shown]
Delta  File
+83 -61  lldb/source/Target/Platform.cpp
+76 -0  lldb/test/API/functionalities/builtin-debugtrap/TestBuiltinDebugTrap.py
+0 -71  lldb/test/API/macosx/builtin-debugtrap/TestBuiltinDebugTrap.py
+43 -0  lldb/source/Target/StopInfo.cpp
+30 -0  lldb/source/Plugins/Architecture/Arm/ArchitectureArm.cpp
+29 -0  lldb/include/lldb/Target/Platform.h
+261 -132  14 files not shown
+344 -169  20 files

LLVM/project 6a107d2 llvm/lib/Transforms/AggressiveInstCombine AggressiveInstCombine.cpp, llvm/test/Transforms/AggressiveInstCombine popcount.ll

[AggressiveInstCombine] POPCNT generation for bit-count pattern (#177109)

The proposal is to enhance LLVM by teaching it to recognize the pattern
and replace it with the hardware POPCNT instruction.
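
The commit message does not show which bit-count pattern is matched. As an assumed example only, the well-known branch-free "SWAR" Hamming-weight idiom below is the kind of hand-written code that computes the same value as a population-count instruction; the function name is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Classic SWAR bit count for a 32-bit value. This is an illustrative guess at
// the sort of idiom involved, not the exact form the pass recognizes.
constexpr std::uint32_t swarPopcount(std::uint32_t v) {
  v = v - ((v >> 1) & 0x55555555u);                 // sums of adjacent bits
  v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u); // sums per 4-bit group
  v = (v + (v >> 4)) & 0x0F0F0F0Fu;                 // sums per byte
  return (v * 0x01010101u) >> 24;                   // add the four bytes
}

static_assert(swarPopcount(0u) == 0u, "no bits set");
static_assert(swarPopcount(0xFFFFFFFFu) == 32u, "all bits set");
```

On a target with a population-count instruction, the whole function body can in principle collapse to that single instruction.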

---------

Co-authored-by: Rohit Aggarwal <Rohit.Aggarwal at amd.com>
Co-authored-by: Craig Topper <craig.topper at sifive.com>
Delta  File
+1,077 -0  llvm/test/Transforms/AggressiveInstCombine/popcount.ll
+136 -10  llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
+1,213 -10  2 files

LLVM/project add71ae llvm/include/llvm/IR Function.h InstructionListener.h, llvm/lib/IR BasicBlock.cpp Function.cpp

review
Delta  File
+3 -16  llvm/include/llvm/IR/Function.h
+16 -0  llvm/unittests/IR/InstructionListenerTest.cpp
+8 -7  llvm/lib/IR/BasicBlock.cpp
+2 -12  llvm/lib/IR/Function.cpp
+7 -4  llvm/include/llvm/IR/InstructionListener.h
+8 -2  llvm/lib/IR/Instruction.cpp
+44 -41  1 files not shown
+48 -42  7 files

LLVM/project 8d4a0fb llvm/lib/Target/AMDGPU AMDGPULegalizerInfo.cpp, llvm/test/CodeGen/AMDGPU fptosi-sat-vector.ll fptoui-sat-vector.ll

[AMDGPU] Align GlobalISel with SelectionDAG for f16 to i1/i8 saturated conversions (#188019)

GlobalISel now also saturates `f16` to `i1` and `i8` conversions at `i16`
where available. As a side effect, the two uniform test cases, `f16_i1`
and `f16_i8`, now use VALU instructions instead of SALU instructions.
This is potentially sub-optimal, but it makes GlobalISel consistent with
SelectionDAG and has already been highlighted as future work in #187711.
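
As a rough illustration of why saturating at a wider width first is harmless, the sketch below models a saturating float-to-signed-integer conversion and checks that "saturate at 16 bits, then clamp to the 8-bit range" agrees with saturating directly at 8 bits. It uses `float` in place of `f16`, and the two-step shape is an assumption about the idea, not a description of the actual legalizer code; all names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Saturating float -> signed integer conversion: NaN maps to 0, out-of-range
// values clamp to the integer limits, everything else truncates toward zero.
template <typename IntT>
IntT fpToSiSat(float v) {
  if (std::isnan(v))
    return 0;
  constexpr IntT lo = std::numeric_limits<IntT>::min();
  constexpr IntT hi = std::numeric_limits<IntT>::max();
  if (v <= static_cast<float>(lo))
    return lo;
  if (v >= static_cast<float>(hi))
    return hi;
  return static_cast<IntT>(v);
}

// Hypothetical two-step form: saturate at 16 bits, then clamp to the i8 range.
inline std::int8_t fpToI8ViaI16(float v) {
  const std::int16_t wide = fpToSiSat<std::int16_t>(v);
  return static_cast<std::int8_t>(std::clamp<std::int16_t>(wide, -128, 127));
}
```

For any input, `fpToI8ViaI16(v)` and `fpToSiSat<std::int8_t>(v)` are expected to agree, because the second clamp only narrows a range that the first step has already bounded.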
Delta  File
+113 -193  llvm/test/CodeGen/AMDGPU/fptosi-sat-vector.ll
+110 -162  llvm/test/CodeGen/AMDGPU/fptoui-sat-vector.ll
+10 -25  llvm/test/CodeGen/AMDGPU/fptosi-sat-scalar.ll
+8 -20  llvm/test/CodeGen/AMDGPU/fptoui-sat-scalar.ll
+4 -1  llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+245 -401  5 files

LLVM/project 54294fe flang/lib/Lower/OpenMP ClauseProcessor.cpp ClauseProcessor.h, llvm/include/llvm/Frontend/OpenMP ConstructDecompositionT.h

NFC code changes
Delta  File
+68 -68  flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+18 -18  llvm/include/llvm/Frontend/OpenMP/ConstructDecompositionT.h
+3 -3  flang/lib/Lower/OpenMP/ClauseProcessor.h
+4 -2  flang/lib/Lower/OpenMP/OpenMP.cpp
+93 -91  4 files

LLVM/project 53050a9 llvm/lib/Target/AMDGPU VOP3Instructions.td SIInstrInfo.td

Split true16HelperReg32FromSrc16 into two OutPatFrags
Delta  File
+18 -12  llvm/lib/Target/AMDGPU/VOP3Instructions.td
+5 -7  llvm/lib/Target/AMDGPU/SIInstrInfo.td
+23 -19  2 files

LLVM/project aefb53a llvm/lib/Target/AMDGPU AMDGPULibCalls.cpp, llvm/test/CodeGen/AMDGPU amdgpu-simplify-libcall-fabs.ll

[AMDGPU] AMDGPULibCalls: Set new intrinsic calling convention to C (#197364)

In #197151 libclc/test/math/fabs.cl,
tryReplaceLibcallWithSimpleIntrinsic replaces `call fastcc float
@_Z4fabsf` with `call fastcc float @llvm.fabs.f32`. However, an intrinsic
call must use `CallingConv::C`.
Delta  File
+12 -2  llvm/test/CodeGen/AMDGPU/amdgpu-simplify-libcall-fabs.ll
+1 -0  llvm/lib/Target/AMDGPU/AMDGPULibCalls.cpp
+13 -2  2 files

LLVM/project 852291b clang-tools-extra/clang-tidy/hicpp HICPPTidyModule.cpp, clang-tools-extra/docs ReleaseNotes.rst

[clang-tidy] Remove hicpp module [4/N] (#197354)

This commit removes the remaining checks of `hicpp` module.

Part of https://github.com/llvm/llvm-project/issues/183462
Delta  File
+1 -24  clang-tools-extra/clang-tidy/hicpp/HICPPTidyModule.cpp
+14 -0  clang-tools-extra/docs/ReleaseNotes.rst
+0 -11  clang-tools-extra/docs/clang-tidy/checks/hicpp/uppercase-literal-suffix.rst
+0 -9  clang-tools-extra/docs/clang-tidy/checks/hicpp/vararg.rst
+0 -8  clang-tools-extra/docs/clang-tidy/checks/hicpp/use-nullptr.rst
+0 -8  clang-tools-extra/docs/clang-tidy/checks/hicpp/use-equals-delete.rst
+15 -60  8 files not shown
+17 -103  14 files

LLVM/project f835827 llvm/lib/Transforms/InstCombine InstCombineMulDivRem.cpp, llvm/test/Transforms/InstCombine powi.ll

[InstCombine] Fix incorrect `foldPowiReassoc` on signed overflow (#197172)

Reproducer: 

```
#include <math.h>
#include <stdio.h>

__attribute__((noinline))
double f(double x) {
    return __builtin_powi(x, 1073741824) * __builtin_powi(x, 1073741824);
}

int main(void) {
    double r = f(2.0);
    printf("%f\n", r);
    return r == 0.0; // 0 = correct, 1 = miscompile
}
```

https://llvm.godbolt.org/z/sjK1EsGhx
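
Each exponent in the reproducer is 2^30 = 1073741824, so their sum, 2^31, is one past `INT32_MAX`. Presumably the miscompile comes from combining the two exponents in 32-bit signed arithmetic. The sketch below shows one way such a sum can be guarded; the helper name is hypothetical and this is not the code from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: returns a + b only when the sum fits in int32_t,
// otherwise std::nullopt so the caller can skip the reassociation.
inline std::optional<std::int32_t> addExponents(std::int32_t a, std::int32_t b) {
  // Widen first so the addition itself cannot overflow.
  const std::int64_t wide =
      static_cast<std::int64_t>(a) + static_cast<std::int64_t>(b);
  if (wide < std::numeric_limits<std::int32_t>::min() ||
      wide > std::numeric_limits<std::int32_t>::max())
    return std::nullopt;
  return static_cast<std::int32_t>(wide);
}
```

For the reproducer's exponents, `addExponents(1073741824, 1073741824)` yields no value, which is the signal not to merge the two `powi` calls.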
Delta  File
+66 -10  llvm/test/Transforms/InstCombine/powi.ll
+2 -2  llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
+68 -12  2 files

LLVM/project 6baac3f llvm/lib/Transforms/InstCombine InstCombineAddSub.cpp, llvm/test/Transforms/InstCombine sub.ll

[InstCombine] Drop `(X + Z) - (Y + Z) --> (X - Y)` fold (#197373)

The pattern below does the same thing and does it better
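
For reference, the identity behind the dropped fold holds under wrapping integer arithmetic, so any other pattern that reduces the expression to `X - Y` yields the same value. A small check with hypothetical function names:

```cpp
#include <cassert>
#include <cstdint>

// Unsigned 32-bit arithmetic wraps modulo 2^32, mirroring plain integer
// add/sub without overflow flags.
constexpr std::uint32_t beforeFold(std::uint32_t x, std::uint32_t y,
                                   std::uint32_t z) {
  return (x + z) - (y + z);
}

constexpr std::uint32_t afterFold(std::uint32_t x, std::uint32_t y) {
  return x - y;
}

// Holds even when both additions wrap around.
static_assert(beforeFold(5u, 9u, 0xFFFFFFF0u) == afterFold(5u, 9u),
              "common addend cancels");
```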
Delta  File
+33 -0  llvm/test/Transforms/InstCombine/sub.ll
+2 -9  llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
+35 -9  2 files

LLVM/project 6608431 llvm/lib/Target/RISCV RISCVInstrInfoZvvm.td, llvm/lib/Target/RISCV/AsmParser RISCVAsmParser.cpp

[RISCV][MC] add experimental `Zvvfmm` MC support (#196486)

This PR adds experimental MC layer support for the RISC-V `Zvvfmm`
extension from the Integrated Matrix Extension, based on the
[riscv-isa-release-fa55752-2026-05-04 spec
release](https://github.com/riscv/integrated-matrix-extension/releases/tag/riscv-isa-release-fa55752-2026-05-04).
It is a follow-up to `Zvvmm` in #193956.

This PR:
- Renames `RISCVInstrInfoZvvmm.td` to `RISCVInstrInfoZvvm.td` so `Zvvmm`
and `Zvvfmm` share the same IME instruction file, in line with the spec.
All future instructions from the `Zvvm` family will be placed here too.
- Adds a new `VScaleReg` asm operand to support the `v0.scale` assembly
syntax.
- Adds assembler support for floating-point matrix instructions:
`vfmmacc.vv`, `vfwmmacc.vv`, `vfqmmacc.vv`, `vf8wmmacc.vv`
- Adds integer-input floating-point accumulate scaled instructions:
`vfwimmacc.vv`, `vfqimmacc.vv`, `vf8wimmacc.vv`

    [3 lines not shown]
Delta  File
+95 -0  llvm/test/MC/RISCV/rvv/zvvfmm-invalid.s
+80 -4  llvm/lib/Target/RISCV/RISCVInstrInfoZvvm.td
+59 -0  llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
+57 -0  llvm/test/MC/RISCV/rvv/zvvfmm.s
+11 -0  llvm/lib/Target/RISCV/MCTargetDesc/RISCVInstPrinter.cpp
+9 -0  llvm/test/MC/RISCV/rvv/zvvfmm-invalid-encoding.s
+311 -4  7 files not shown
+330 -5  13 files

LLVM/project f59aca9 openmp/device/src Reduction.cpp

[OpenMP][offload] Inline target reductions (#196061)

Significantly reduces register usage and removes register spilling in
`offload/test/offloading/multiple-reductions.cpp`, for example. Provides
speedups of up to 5-10x for many reductions in such larger setups.

Based on https://github.com/llvm/llvm-project/pull/195940.
See also the discussion in
https://github.com/llvm/llvm-project/pull/195102.
Delta  File
+11 -9  openmp/device/src/Reduction.cpp
+11 -9  1 files

LLVM/project eb899cf clang-tools-extra/clang-tidy/readability NonConstParameterCheck.cpp, clang-tools-extra/docs ReleaseNotes.rst

[clang-tidy] Fix false positives for dependent initializers (#186953)

Fixes #177354.

Handle dependent initializers in `readability-non-const-parameter` more
conservatively to avoid false positives in generic lambdas.

This fixes cases like:
- `T x(*p)`
- `DependentCtor<T> s(p)`
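
A sketch of why a dependent initializer such as `T x(*p)` calls for caution, under the assumption that the problematic case looks roughly like this (the `TypeTag` helper and function name are made up): until the generic lambda is instantiated, `T` is unknown, and if it turns out to be a reference type the initializer binds a non-const reference to `*p`, so `p` cannot be made a pointer-to-const.

```cpp
#include <cassert>

// Hypothetical tag type used to pass a type into a generic lambda.
template <typename T> struct TypeTag { using type = T; };

inline int writeThroughDependentInit() {
  int value = 1;
  auto init = [](auto tag, int *p) {
    using T = typename decltype(tag)::type;
    T x(*p); // dependent initializer: T may be `int` or `int &`
    x = 42;  // writes through to *p only when T is a reference
  };
  init(TypeTag<int>{}, &value);   // copy: value stays 1
  const int afterCopy = value;
  init(TypeTag<int &>{}, &value); // reference: value becomes 42
  return afterCopy * 100 + value;
}
```

Changing the parameter to `const int *p` would make the `TypeTag<int &>` instantiation ill-formed, which is why a suggestion to add `const` here would be a false positive.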
Delta  File
+45 -0  clang-tools-extra/test/clang-tidy/checkers/readability/non-const-parameter.cpp
+23 -6  clang-tools-extra/clang-tidy/readability/NonConstParameterCheck.cpp
+1 -1  clang-tools-extra/docs/ReleaseNotes.rst
+69 -7  3 files

LLVM/project d476ab3 lld/ELF LTO.cpp, lldb/source/Core DataFileCache.cpp

[Support][Cache] Make `pruneCache` return an `Expected` (#191367)

When `sys::fs::disk_space` failed during a call to `pruneCache`,
it would report a `fatal_error`. However, a failure to prune doesn't
mean the caller should fail catastrophically.

Downstream, we use LLVM's cache in the OpenCL runtime. A failure to
prune the cache can be safely ignored without stopping the user's
application.
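
The caller-side effect can be sketched without LLVM headers. The code below uses a `std::variant` as a stand-in for an Expected-style result; it is not `llvm::Expected`, and the function names and the exact payload type are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Stand-in for an Expected-style result: either a "did prune" flag or an
// error message. This only mimics the shape of llvm::Expected.
using PruneResult = std::variant<bool, std::string>;

// Hypothetical pruning routine: a failed disk-space query becomes an error
// value handed back to the caller instead of a fatal error.
inline PruneResult pruneCacheSketch(bool diskSpaceQueryOk) {
  if (!diskSpaceQueryOk)
    return std::string("disk_space query failed");
  return true;
}

// Caller that treats a pruning failure as non-fatal: record a warning and
// carry on with the rest of the work.
inline bool runWithCache(bool diskSpaceQueryOk, std::string &log) {
  PruneResult result = pruneCacheSketch(diskSpaceQueryOk);
  if (const auto *err = std::get_if<std::string>(&result))
    log = "warning: " + *err;
  return true; // the application keeps going either way
}
```

The point of the design is that the decision moves to the caller: a tool that considers pruning essential can still abort, while a runtime can log and continue.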
Delta  File
+10 -8  llvm/lib/Support/CachePruning.cpp
+7 -1  llvm/lib/Debuginfod/Debuginfod.cpp
+6 -1  llvm/lib/LTO/ThinLTOCodeGenerator.cpp
+6 -1  lldb/source/Core/DataFileCache.cpp
+4 -2  llvm/include/llvm/Support/CachePruning.h
+2 -1  lld/ELF/LTO.cpp
+35 -14  4 files not shown
+41 -18  10 files

LLVM/project 378456f llvm/lib/Target/AArch64/GISel AArch64CallLowering.cpp, llvm/test/CodeGen/AArch64/GlobalISel optnone-sme.ll

[AArch64] Don't use GISel for optnone functions if not feasible. (#196343)

A function like the one below should still result in an SME prologue to
set up ZA.
```
void bar() __arm_inout("za");

__attribute__((optnone)) __arm_new("za")
void foo() {
    bar();
}
```
https://godbolt.org/z/aEcoKea4b

This worked in LLVM 22, but got broken by #174746.
Delta  File
+32 -0  llvm/test/CodeGen/AArch64/GlobalISel/optnone-sme.ll
+6 -15  llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
+38 -15  2 files

LLVM/project 17cc4f7 llvm/lib/Target/AMDGPU AMDGPUInstCombineIntrinsic.cpp, llvm/test/Transforms/InstCombine/AMDGPU wave-shuffle-patterns.ll llvm.amdgcn.wave.shuffle.ll

[AMDGPU][InstCombine] Optimize constant shuffle patterns (#192246)

Detect llvm.amdgcn.wave.shuffle intrinsics where the lane index is a
constant function of the lane ID and replace them with hardware-specific
intrinsics.
Delta  File
+751 -0  llvm/test/Transforms/InstCombine/AMDGPU/wave-shuffle-patterns.ll
+401 -74  llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
+12 -20  llvm/test/Transforms/InstCombine/AMDGPU/llvm.amdgcn.wave.shuffle.ll
+1,164 -94  3 files

LLVM/project b4b9b1f clang-tools-extra/clang-tidy/bugprone RandomGeneratorSeedCheck.cpp RandomGeneratorSeedCheck.h, clang-tools-extra/docs ReleaseNotes.rst

[clang-tidy] Adding note of implicit initialization to 'bugprone-random-generator-seed' (#194613)
Delta  File
+30 -4  clang-tools-extra/clang-tidy/bugprone/RandomGeneratorSeedCheck.cpp
+13 -0  clang-tools-extra/test/clang-tidy/checkers/bugprone/random-generator-seed.cpp
+5 -0  clang-tools-extra/docs/ReleaseNotes.rst
+1 -1  clang-tools-extra/clang-tidy/bugprone/RandomGeneratorSeedCheck.h
+49 -5  4 files

LLVM/project c8585c1 llvm/docs LangRef.rst

[LangRef] Clarify pointer capture spec (#194647)

This clarifies the semantics of "pointer capture" in two respects:

* For provenance capture, specify this in terms of accesses based on the
pointer being UB after the function returns, rather than whether or not
the pointer gets stored etc. The distinction does not matter for
inference, but is commonly required for frontend-generated captures
annotations (and the `!captures` metadata doesn't really make sense
otherwise). This gives provenance (non-)capture unambiguous operational
semantics.
* For address capture, specify that the observable behavior of the
function can't differ based on the address. This is to accommodate
things like loop vectorization runtime checks, which introduce pointer
comparisons on `captures(none)` pointers in a way that is harmless and
needs to be allowed. The semantics here are non-operational. If anyone
has ideas on how to formalize this, they would be very welcome.
Delta  File
+90 -85  llvm/docs/LangRef.rst
+90 -85  1 files

LLVM/project 6d54f1e llvm/lib/Analysis AliasAnalysis.cpp, llvm/test/Analysis/BasicAA atomics.ll

[AA] Consider read-only provenance capture for synchronization effects (#197157)

If only read-only provenance is captured, this means that another thread
may only read the object, not write to it. As such, we can also model
synchronizing operations as only reading the location (and thus allow
reordering of reads, but not writes, across the synchronization).
Delta  File
+26 -0  llvm/test/Analysis/BasicAA/atomics.ll
+9 -3  llvm/lib/Analysis/AliasAnalysis.cpp
+35 -3  2 files

LLVM/project 47bca23 llvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 atomic-load-store.ll

[X86] Manage atomic store of fp -> int promotion in DAG

When lowering atomic <1 x T> vector types with floats, selection can fail since
this pattern is unsupported. To support this, floats can be cast to
an integer type of the same size.
Delta  File
+130 -0  llvm/test/CodeGen/X86/atomic-load-store.ll
+4 -0  llvm/lib/Target/X86/X86ISelLowering.cpp
+134 -0  2 files

LLVM/project 3e49a56 llvm/lib/CodeGen/SelectionDAG LegalizeVectorTypes.cpp LegalizeTypes.h, llvm/test/CodeGen/X86 atomic-load-store.ll

[SelectionDAG] Scalarize <1 x T> vector types for atomic store

`store atomic <1 x T>` is not valid. This change legalizes
vector types of atomic store via scalarization in SelectionDAG
so that it can, for example, translate from `v1i32` to `i32`.
Delta  File
+57 -0  llvm/test/CodeGen/X86/atomic-load-store.ll
+12 -0  llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
+1 -0  llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
+70 -0  3 files

LLVM/project a31aee7 clang/lib/AST ASTImporter.cpp, clang/unittests/AST ASTImporterTest.cpp

[clang][ASTImporter] Add import of node 'TemplateParamObjectDecl' (#193492)
Delta  File
+34 -0  clang/unittests/AST/ASTImporterTest.cpp
+17 -0  clang/lib/AST/ASTImporter.cpp
+51 -0  2 files

LLVM/project 03326f9 mlir/include/mlir/Dialect/LLVMIR NVVMOps.td, mlir/lib/Dialect/LLVMIR/IR NVVMDialect.cpp

[MLIR][NVVM] Add rsqrt Op (#195854)

Adds `nvvm.rsqrt` op for fast approximate reciprocal square root. Supports f32 and f64 with an optional `ftz` attribute.

For more information, see PTX ISA: https://docs.nvidia.com/cuda/parallel-thread-execution/#floating-point-instructions-rsqrt
Delta  File
+28 -0  mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+21 -0  mlir/test/Target/LLVMIR/nvvm/rsqrt/rsqrt.mlir
+20 -0  mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
+69 -0  3 files

LLVM/project f9246ad llvm/test/Transforms/InstCombine sub.ll

update tests
Delta  File
+2 -2  llvm/test/Transforms/InstCombine/sub.ll
+2 -2  1 files

LLVM/project 245dcd8 llvm/lib/Transforms/InstCombine InstCombineAddSub.cpp, llvm/test/Transforms/InstCombine sub.ll

[InstCombine] Drop `(X + Z) - (Y + Z) --> (X - Y)` fold
Delta  File
+33 -0  llvm/test/Transforms/InstCombine/sub.ll
+2 -9  llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
+35 -9  2 files

LLVM/project f19c931 llvm/test/CodeGen/X86 atomic-load-store.ll

[X86] Add atomic vector store tests for unaligned >1 sizes.

Unaligned atomic vector stores with size >1 are lowered to calls.
Adding their tests separately here.
Delta  File
+1,068 -0  llvm/test/CodeGen/X86/atomic-load-store.ll
+1,068 -0  1 files

LLVM/project 1693bc2 clang/include/clang/Analysis/Analyses/LifetimeSafety Origins.h, clang/lib/Analysis/LifetimeSafety Origins.cpp FactsGenerator.cpp

only track origins for accessed fields
Delta  File
+37 -2  clang/lib/Analysis/LifetimeSafety/Origins.cpp
+11 -0  clang/include/clang/Analysis/Analyses/LifetimeSafety/Origins.h
+4 -3  clang/lib/Analysis/LifetimeSafety/FactsGenerator.cpp
+52 -5  3 files

LLVM/project ca8d74b lldb/include/lldb/Interpreter CommandReturnObject.h, lldb/source/Commands CommandObjectProtocolServer.cpp CommandObjectProcess.cpp

Revert "[lldb] Assert that CommandObject::DoExecute sets a return status (#19…"

This reverts commit 78d124eb16aa62e02a465b0fd6c3c2cab0a26dd8.
Delta  File
+0 -38  lldb/unittests/Interpreter/TestCommandReturnObject.cpp
+1 -24  lldb/source/Interpreter/CommandObject.cpp
+1 -4  lldb/include/lldb/Interpreter/CommandReturnObject.h
+2 -2  lldb/test/API/commands/command/script/TestCommandScript.py
+1 -2  lldb/source/Commands/CommandObjectProtocolServer.cpp
+0 -2  lldb/source/Commands/CommandObjectProcess.cpp
+5 -72  2 files not shown
+6 -74  8 files

LLVM/project b88008c llvm/lib/Transforms/InstCombine InstCombineMulDivRem.cpp, llvm/test/Transforms/InstCombine powi.ll powi-mul-overflow.ll

address review comments
Delta  File
+61 -5  llvm/test/Transforms/InstCombine/powi.ll
+0 -45  llvm/test/Transforms/InstCombine/powi-mul-overflow.ll
+4 -12  llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
+65 -62  3 files