LLVM/project 0eefb26libcxx CMakeLists.txt, libcxx/test/tools/clang_tidy_checks CMakeLists.txt

[libc++] Build the library with C++26 (#181021)

All supported compilers support C++26. This allows simplifying some of
the upcoming <text_encoding> implementation.
DeltaFile
+13-5libcxx/CMakeLists.txt
+0-2libcxx/test/tools/clang_tidy_checks/CMakeLists.txt
+13-72 files

LLVM/project e31db65clang/include/clang/StaticAnalyzer/Core/PathSensitive ExprEngine.h, clang/lib/StaticAnalyzer/Core ExprEngineCXX.cpp CallEvent.cpp

[NFC][analyzer] Improve computeObjectUnderConstruction (#186186)

Previously the method `ExprEngine::computeObjectUnderConstruction` took
a `NodeBuilderContext` parameter which was only used to call its
`blockCount()` method; this commit replaces this with directly taking
`NumVisitedCaller` (= number of times the caller was visited, the
`blockCount`) as an unsigned value.

In `CallEvent::getReturnValueUnderConstruction` this method is invoked
with `getNumVisitedCurrent()`, the visitation count of the _current_
`LocationContext` and `Block`, instead of calling `getNumVisited()` on
the `LocationContext` and `Block` corresponding to the `CallEvent`
instance (available through its data members). This is logically
incorrect, but (at least within the lit testsuite) there is no situation
where it leads to actually incorrect behavior. This is currently marked
with a FIXME comment; it will be fixed in a follow-up commit.
DeltaFile
+6-8clang/lib/StaticAnalyzer/Core/ExprEngineCXX.cpp
+5-1clang/lib/StaticAnalyzer/Core/CallEvent.cpp
+3-3clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
+14-123 files

LLVM/project 67cedc3libclc/clc/lib/generic/math clc_pow_base.inc, llvm/test/CodeGen/X86/apx sub.ll add.ll

Merge branch 'main' into users/kparzysz/e10-check-depth
DeltaFile
+486-145llvm/test/CodeGen/X86/apx/sub.ll
+476-140llvm/test/CodeGen/X86/apx/add.ll
+450-132llvm/test/CodeGen/X86/apx/or.ll
+448-130llvm/test/CodeGen/X86/apx/xor.ll
+542-0libclc/clc/lib/generic/math/clc_pow_base.inc
+411-121llvm/test/CodeGen/X86/apx/and.ll
+2,813-668472 files not shown
+15,974-6,853478 files

LLVM/project bec0f40llvm/lib/Target/SPIRV SPIRVSubtarget.cpp, llvm/test/CodeGen/SPIRV memory-model-md-shader.ll memory-model-md-glsl450.ll

[SPIR-V] Handle spirv.MemoryModel metadata (#186138)
DeltaFile
+31-0llvm/lib/Target/SPIRV/SPIRVSubtarget.cpp
+16-0llvm/test/CodeGen/SPIRV/memory-model-md-shader.ll
+16-0llvm/test/CodeGen/SPIRV/memory-model-md-glsl450.ll
+14-0llvm/test/CodeGen/SPIRV/memory-model-md-opencl.ll
+14-0llvm/test/CodeGen/SPIRV/memory-model-md-vulkan.ll
+12-0llvm/test/CodeGen/SPIRV/memory-model-md-unknown.ll
+103-02 files not shown
+117-18 files

LLVM/project f335bd9flang/lib/Semantics resolve-directives.cpp check-omp-loop.cpp, flang/test/Parser/OpenMP interchange-permutation.f90 do-interchange.f90

[Flang][OpenMP] Add semantic support for OpenMP Loop Interchange and permutation clause in Flang (#183435)

This patch adds semantics for the `omp interchange` directive in flang
and the permutation clause, as specified in OpenMP 6.0.
Relevant tests have been added in every step.
DeltaFile
+108-0flang/test/Semantics/OpenMP/interchange-permutation.f90
+43-14flang/lib/Semantics/resolve-directives.cpp
+43-0flang/test/Semantics/OpenMP/interchange01.f90
+35-0flang/lib/Semantics/check-omp-loop.cpp
+35-0flang/test/Parser/OpenMP/interchange-permutation.f90
+34-0flang/test/Parser/OpenMP/do-interchange.f90
+298-148 files not shown
+393-1714 files

LLVM/project 818efd5llvm/lib/Target/SPIRV SPIRVEmitIntrinsics.cpp SPIRVInstructionSelector.cpp, llvm/test/CodeGen/SPIRV undef-global-aggregate-initializer.ll

[SPIR-V] Handle undef aggregate initializers for global variables (#186785)

Expand undef aggregate global initializers into per-element spv_undef
intrinsics
DeltaFile
+73-0llvm/test/CodeGen/SPIRV/undef-global-aggregate-initializer.ll
+39-3llvm/lib/Target/SPIRV/SPIRVEmitIntrinsics.cpp
+6-3llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
+5-1llvm/lib/Target/SPIRV/SPIRVUtils.h
+123-74 files

LLVM/project 34fa16aclang/lib/Sema SemaAttr.cpp, clang/test/Sema warn-lifetime-analysis-nocfg.cpp

[LifetimeSafety] Exclude basic_string::insert from capturing methods (#186989)

Fixes https://github.com/llvm/llvm-project/issues/186817
DeltaFile
+5-0clang/test/Sema/warn-lifetime-analysis-nocfg.cpp
+3-0clang/test/Sema/Inputs/lifetime-analysis.h
+2-0clang/lib/Sema/SemaAttr.cpp
+10-03 files

LLVM/project 838f617llvm/docs AMDGPUUsage.rst

Formatting fix
DeltaFile
+2-3llvm/docs/AMDGPUUsage.rst
+2-31 files

LLVM/project defe937llvm/docs AMDGPUUsage.rst

Reintroduce barrier-phase-with usage
DeltaFile
+3-3llvm/docs/AMDGPUUsage.rst
+3-31 files

LLVM/project 9a42e5bmlir/docs/Dialects TOSA.md, mlir/include/mlir/Dialect/Tosa/IR TosaOps.td TosaOpBase.td

[mlir][tosa] Remove 'Pure' trait from operations that are not speculatable (#185700)

This commit removes the 'Pure' trait from a number of TOSA operations.
Instead of marking most ops as pure by default, the trait is now opt-in
for operations that are provably side-effect free and speculatable.

Several operations were previously marked as pure unintentionally.

The following operations have had 'Pure' removed (reason in brackets):
- ARGMAX (out-of-range index)
- AVG_POOL2D (accumulator overflow/underflow)
- AVG_POOL2D_ADAPTIVE (same as above)
- CONV2D (accumulator overflow/underflow)
- CONV2D_BLOCK_SCALED (accumulator overflow/underflow)
- CONV3D (accumulator overflow/underflow)
- DEPTHWISE_CONV2D (accumulator overflow/underflow)
- MATMUL (accumulator overflow/underflow)
- MATMUL_T_BLOCK_SCALED (accumulator overflow/underflow)
- TRANSPOSE_CONV2D (accumulator overflow/underflow)

    [25 lines not shown]
DeltaFile
+145-58mlir/include/mlir/Dialect/Tosa/IR/TosaOps.td
+20-0mlir/docs/Dialects/TOSA.md
+4-4mlir/include/mlir/Dialect/Tosa/IR/TosaOpBase.td
+169-623 files

LLVM/project 2b5e302lldb/test/API/windows/launch/replace-dll TestReplaceDLL.py

[lldb][windows] fix TestReplaceDLL.py reruns (#187002)
DeltaFile
+2-0lldb/test/API/windows/launch/replace-dll/TestReplaceDLL.py
+2-01 files

LLVM/project e1baf3allvm/lib/Target/AMDGPU AMDGPUCallLowering.cpp AMDGPUCallLowering.h

[AMDGPU] Remove AMDGPUCallLowering dependency on AMDGPUTargetLowering. NFC. (#187008)
DeltaFile
+2-3llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
+1-2llvm/lib/Target/AMDGPU/AMDGPUCallLowering.h
+3-52 files

LLVM/project 63f3463llvm/include/llvm/ADT GenericUniformityImpl.h GenericUniformityInfo.h, llvm/lib/Analysis UniformityAnalysis.cpp

review: change design to track uniform values
DeltaFile
+43-47llvm/lib/Analysis/UniformityAnalysis.cpp
+11-29llvm/include/llvm/ADT/GenericUniformityImpl.h
+17-21llvm/lib/CodeGen/MachineUniformityAnalysis.cpp
+1-1llvm/include/llvm/ADT/GenericUniformityInfo.h
+72-984 files

LLVM/project 3be7b2fllvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 shift-i512.ll

[X86] Improve handling of i512 SHL(-1,Amt) + SRL(-1,Amt) "mask shifts" (#186806)

An extension of the existing one-bit shift patterns - perform an initial
select to handle 'allones/allzeros' elements and then insert the element
that has a partial mask on top of it.

This often turns up in bit-manipulation patterns.
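
As a rough scalar model of the idea (not the actual code in
X86ISelLowering.cpp), the sketch below treats an i512 as eight 64-bit
elements. It first selects all-ones or all-zeros for every element and then
overwrites the single element that holds the partial mask. The `U512` alias,
the element order (element 0 least significant), and both function names are
assumptions made for illustration.

```cpp
#include <array>
#include <cstdint>

// Hypothetical 512-bit value: eight 64-bit elements, element 0 least significant.
using U512 = std::array<std::uint64_t, 8>;

// Models SHL(-1, amt) for amt in [0, 511].
U512 shlAllOnes(unsigned amt) {
  const unsigned elt = amt / 64; // element that receives the partial mask
  const unsigned bit = amt % 64; // shift amount inside that element
  U512 r{};
  // Step 1: select all-ones above the partial element, all-zeros below it.
  for (unsigned i = 0; i < 8; ++i)
    r[i] = i > elt ? ~std::uint64_t{0} : 0;
  // Step 2: insert the partially masked element on top of the selection.
  r[elt] = ~std::uint64_t{0} << bit;
  return r;
}

// Models SRL(-1, amt) for amt in [0, 511].
U512 srlAllOnes(unsigned amt) {
  const unsigned elt = 7 - amt / 64; // counted down from the top element
  const unsigned bit = amt % 64;
  U512 r{};
  // Step 1: select all-ones below the partial element, all-zeros above it.
  for (unsigned i = 0; i < 8; ++i)
    r[i] = i < elt ? ~std::uint64_t{0} : 0;
  // Step 2: insert the partially masked element on top of the selection.
  r[elt] = ~std::uint64_t{0} >> bit;
  return r;
}
```

For example, `shlAllOnes(65)` leaves elements 0 and 1 with their low 65 bits
cleared in total (element 0 is zero, element 1 has only bit 0 cleared) and
every higher element all-ones.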
DeltaFile
+64-90llvm/test/CodeGen/X86/shift-i512.ll
+28-6llvm/lib/Target/X86/X86ISelLowering.cpp
+92-962 files

LLVM/project 5de7c86llvm/test/CodeGen/X86/apx sub.ll add.ll

[X86][APX] Enable NDD tunings (#186049)

For the latest Intel processors with APX, the mem form of all NDD
instructions (except those using RIP-based addressing) and the imm form
of NDD add/sub (including inc/dec) instructions need to be turned off by
default for optimal hardware performance.

Two new tunings, enable-ndd-mem and enable-ndd-imm, were added and are
disabled by default. New ISA attributes are adopted for the different
alternatives of NDD-related patterns to control the generation of the
mem/imm forms.
DeltaFile
+486-145llvm/test/CodeGen/X86/apx/sub.ll
+476-140llvm/test/CodeGen/X86/apx/add.ll
+450-132llvm/test/CodeGen/X86/apx/or.ll
+448-130llvm/test/CodeGen/X86/apx/xor.ll
+411-121llvm/test/CodeGen/X86/apx/and.ll
+381-99llvm/test/CodeGen/X86/apx/sbb.ll
+2,652-76730 files not shown
+4,799-1,70436 files

LLVM/project 6c9407autils/bazel/llvm-project-overlay/clang BUILD.bazel

[Bazel] Port 9e43b35 (#187011)
DeltaFile
+2-0utils/bazel/llvm-project-overlay/clang/BUILD.bazel
+2-01 files

LLVM/project bc54aefllvm/test/Analysis/LoopAccessAnalysis invariant-dep-same-ptr.ll

[LAA] Add tests with missed aliasing invariant load/store. (NFC)

Add a set of tests showing incorrect LAA results based on
https://github.com/llvm/llvm-project/issues/186922.
DeltaFile
+343-0llvm/test/Analysis/LoopAccessAnalysis/invariant-dep-same-ptr.ll
+343-01 files

LLVM/project fdbc015lldb/source/Plugins/Platform/MacOSX PlatformDarwin.cpp, lldb/unittests/Platform PlatformDarwinTest.cpp

[lldb][PlatformDarwin][NFC] Move logic to emit warning on invalid/conflicting Python script names into helper function (#185669)

Depends on:
* https://github.com/llvm/llvm-project/pull/185666
* https://github.com/llvm/llvm-project/pull/185627

I'm planning on re-using this logic for a different API. Hence move it
into a common helper.
DeltaFile
+36-26lldb/source/Plugins/Platform/MacOSX/PlatformDarwin.cpp
+15-16lldb/unittests/Platform/PlatformDarwinTest.cpp
+51-422 files

LLVM/project bc19061utils/bazel/llvm_configs config.h.cmake

[Bazel] Port 55b271d (#187007)
DeltaFile
+3-0utils/bazel/llvm_configs/config.h.cmake
+3-01 files

LLVM/project a78d1d9mlir/lib/Conversion/VectorToLLVM ConvertVectorToLLVM.cpp, mlir/test/Conversion/VectorToLLVM vector-to-llvm-interface.mlir

[mlir][vector] Add missing tests (nfc) (#186990)

Currently, `ConvertVectorToLLVM` rejects strided memrefs when lowering
`vector.gather` and `vector.scatter`. This PR adds tests to document
that behavior.

Supporting strided memrefs in the lowering is left as future work.
However, it is still unclear whether gather/scatter on strided memrefs
should be supported at all (see the Discourse discussion [1]).

This PR also adds tests for `vector.load` and `vector.store` in
`invalid.mlir` to document that these ops do not support strided
memrefs.

[1] https://discourse.llvm.org/t/rfc-semantics-of-vector-gather-indices-with-strided-memrefs
DeltaFile
+27-0mlir/test/Conversion/VectorToLLVM/vector-to-llvm-interface.mlir
+16-0mlir/test/Dialect/Vector/invalid.mlir
+2-0mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
+45-03 files

LLVM/project f3bed35llvm/lib/Target/AArch64 AArch64InstrInfo.td AArch64InstrFormats.td, llvm/lib/Target/AArch64/MCTargetDesc AArch64InstPrinter.cpp

[AArch64][llvm] Redefine some instructions as an alias of `SYS`

Some instructions are not currently defined as an alias of `SYS`
when they should be, so they don't disassemble back into the
native instruction, but instead disassemble into `SYS`.
Fix these cases and add an additional testcase.

Note that I've left `GCSPUSHM` unchanged because of its `mayStore`,
`GCSSS1` and `GCSSS2` because they're used in AArch64ISelDAGToDAG.cpp,
and `GCSPOPM` because it has an intrinsic pattern in AArch64InstrInfo.td.
They will still disassemble correctly, as they use `InstAlias`.
DeltaFile
+116-0llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp
+24-23llvm/lib/Target/AArch64/AArch64InstrInfo.td
+40-0llvm/test/MC/AArch64/armv9.4a-gcs.s
+0-19llvm/lib/Target/AArch64/AArch64InstrFormats.td
+6-2llvm/test/MC/AArch64/brbe.s
+5-0llvm/test/MC/AArch64/armv8.9a-debug-pmu.s
+191-441 files not shown
+196-447 files

LLVM/project 04cc752mlir/lib/Dialect/Bufferization/Transforms FuncBufferizableOpInterfaceImpl.cpp, mlir/test/Dialect/Bufferization/Transforms one-shot-module-bufferize-call-copy-before-write.mlir

[mlir][bufferization] Fix crash with copy-before-write + bufferize-function-boundaries (#186446)

When `copy-before-write=1` is combined with
`bufferize-function-boundaries=1`, `bufferizeOp` creates a plain
`AnalysisState` (not `OneShotAnalysisState`) and passes it to
`insertTensorCopies`. Walking `CallOp`s during conflict resolution
called `getCalledFunction(callOp, state)`, which unconditionally cast
the `AnalysisState` to `OneShotAnalysisState` via `static_cast`, causing
UB and a stack overflow crash.

Fix by guarding the cast with `isa<OneShotAnalysisState>()` so that when
the state is a plain `AnalysisState`, the function falls through to
building a fresh `SymbolTableCollection` — the same safe fallback
already present.
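
The guard can be illustrated with a small standalone sketch. It does not use
the MLIR classes: `BaseState`, `OneShotState`, and `symbolTableSource` are
hypothetical stand-ins, and the hand-written `classof` check only imitates
what `isa<>` would do. The point is that the downcast happens only after the
dynamic kind has been checked, and everything else takes the fallback path.

```cpp
#include <string>

// Hypothetical stand-in for a plain analysis state.
struct BaseState {
  enum class Kind { Plain, OneShot };
  explicit BaseState(Kind k = Kind::Plain) : kind(k) {}
  virtual ~BaseState() = default;
  Kind kind;
};

// Hypothetical stand-in for the derived state that carries extra cached data.
struct OneShotState : BaseState {
  OneShotState() : BaseState(Kind::OneShot) {}
  // LLVM-style RTTI hook: true only if the object really is a OneShotState.
  static bool classof(const BaseState *s) { return s->kind == Kind::OneShot; }
  std::string cachedTables = "cached";
};

// Returns where symbol information would come from for the given state.
std::string symbolTableSource(const BaseState &state) {
  // Guarded downcast: only cast when the dynamic kind matches.
  if (OneShotState::classof(&state))
    return static_cast<const OneShotState &>(state).cachedTables;
  // Fallback for a plain state: build fresh data instead of casting.
  return "fresh";
}
```

Without the check, the `static_cast` on a plain `BaseState` would read a
member that does not exist, which is the kind of undefined behavior described
above.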

Fixes https://github.com/llvm/llvm-project/issues/163052

Assisted-by: Claude Code
DeltaFile
+16-0mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize-call-copy-before-write.mlir
+8-6mlir/lib/Dialect/Bufferization/Transforms/FuncBufferizableOpInterfaceImpl.cpp
+24-62 files

LLVM/project a26077ellvm/lib/Target/NVPTX NVPTXIntrinsics.td, llvm/test/CodeGen/NVPTX tcgen05-mma-scale-d.ll tcgen05-mma.ll

[NFC][NVPTX] Fix tcgen05.mma PTX instruction encoding (#186602)

.ashift should be before .collector::a::* according to PTX ISA.

ptxas accepts both orderings, but the spec-correct order is used now.
DeltaFile
+16-16llvm/test/CodeGen/NVPTX/tcgen05-mma-scale-d.ll
+12-12llvm/test/CodeGen/NVPTX/tcgen05-mma.ll
+4-4llvm/test/CodeGen/NVPTX/tcgen05-mma-i8.ll
+1-1llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
+33-334 files

LLVM/project 055322cmlir/lib/IR Diagnostics.cpp, mlir/test/mlir-opt expected-unknown-loc-unmatched.mlir

[mlir] Fix crash in diagnostic verifier for unmatched @unknown expectations (#186148)

When an expected-* directive uses the @unknown location specifier, the
associated ExpectedDiag record has an invalid (null) SMLoc as its
fileLoc. If the expected diagnostic is never produced, emitError() is
called to report the unmatched expectation, but it unconditionally
constructs an SMRange from fileLoc, triggering a null-pointer
dereference (UBSan) and an assertion failure in SMRange's constructor
which requires both endpoints to have equal validity.

Fix by guarding the SMRange construction with a fileLoc.isValid() check.
When fileLoc is invalid, call PrintMessage without a source range.
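
A minimal sketch of that guard, using hypothetical stand-ins rather than the
real `SMLoc`/`SMRange` types: `Loc`, `Range`, and `rangeFor` are names invented
for illustration. A range is only built when the start location is valid, and
the caller gets an empty optional otherwise, so it can print the message
without a source range.

```cpp
#include <cstddef>
#include <optional>
#include <utility>

// Hypothetical stand-in for a source location; a null pointer means "invalid".
struct Loc {
  const char *ptr = nullptr;
  bool isValid() const { return ptr != nullptr; }
};

// Hypothetical stand-in for a source range: [first, second).
using Range = std::pair<const char *, const char *>;

// Builds a range of `len` characters starting at `fileLoc`, or returns
// std::nullopt when the location is invalid (e.g. an @unknown expectation).
std::optional<Range> rangeFor(Loc fileLoc, std::size_t len) {
  if (!fileLoc.isValid())
    return std::nullopt; // no pointer arithmetic on a null location
  return Range{fileLoc.ptr, fileLoc.ptr + len};
}
```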

Fixes #163343

Assisted-by: Claude Code
DeltaFile
+11-3mlir/lib/IR/Diagnostics.cpp
+9-0mlir/test/mlir-opt/expected-unknown-loc-unmatched.mlir
+20-32 files

LLVM/project 48e6a61llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll

Merge branch 'arm-fp-flt' into arm-fp-faddsub

The merged changes on main include a fix for the previous denormal
handling bug in the old Thumb1 addsf3. So one of my reasons to replace
it completely is gone. Therefore I'm reinstating it, and putting the
new one alongside it as a different time/space tradeoff.
DeltaFile
+84,299-78,378llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+66,293-29,491llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+25,754-24,794llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+23,631-20,343llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+21,843-18,635llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+19,086-16,499llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.832bit.ll
+240,906-188,1409,745 files not shown
+957,718-642,6419,751 files

LLVM/project b861a28llvm/lib/Target/WebAssembly WebAssemblyISelLowering.cpp, llvm/test/CodeGen/WebAssembly simd-bitmask.ll

[WebAssembly] combine `bitmask` with `setcc <X>, 0, setlt` (#179065)

The Rust `simd_bitmask` intrinsic is UB when the lanes of its input are
not either `0` or `!0`, presumably so that the implementation can be
more efficient because it can look at any bit. To get the "mask of
MSB" behavior of WebAssembly's `bitmask`, we would like to simply
compare with a zero vector first.

```llvm
define i32 @example(<2 x i64> noundef %v) {
entry:
  %0 = bitcast <2 x i64> %v to <16 x i8>
  %1 = icmp slt <16 x i8> %0, zeroinitializer
  %2 = bitcast <16 x i1> %1 to i16
  %3 = zext i16 %2 to i32
  ret i32 %3
}
```
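
To see why the two forms agree, here is a scalar C++ model, offered as a
sketch rather than as the lowering itself. `V16i8`, `bitmaskMsb`, and
`sltZeroMask` are hypothetical names. The first function collects the most
significant bit of each lane, which is the "mask of MSB" behavior; the second
models the `icmp slt` against zero followed by packing one bit per lane, with
lane 0 assumed to land in bit 0.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model of a <16 x i8> vector.
using V16i8 = std::array<std::int8_t, 16>;

// "Mask of MSB": bit i of the result is the sign bit of lane i.
std::uint32_t bitmaskMsb(const V16i8 &v) {
  std::uint32_t m = 0;
  for (unsigned i = 0; i < 16; ++i)
    m |= static_cast<std::uint32_t>(static_cast<std::uint8_t>(v[i]) >> 7) << i;
  return m;
}

// Compare each lane with zero (signed less-than), then pack the 16
// one-bit results into an integer and zero-extend.
std::uint32_t sltZeroMask(const V16i8 &v) {
  std::uint32_t m = 0;
  for (unsigned i = 0; i < 16; ++i)
    m |= static_cast<std::uint32_t>(v[i] < 0) << i;
  return m;
}
```

A lane is negative exactly when its top bit is set, so both functions return
the same value for every input, which is why the comparison is redundant
wherever the target already has a sign-bit mask instruction.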

On x86_64, this additional comparison optimizes away, but for wasm it

    [22 lines not shown]
DeltaFile
+120-0llvm/test/CodeGen/WebAssembly/simd-bitmask.ll
+23-1llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
+143-12 files

LLVM/project 973a5a9llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll

Merge branch 'arm-fp-fix' into arm-fp-flt
DeltaFile
+84,299-78,378llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+66,293-29,491llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+25,754-24,794llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+23,631-20,343llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+21,843-18,635llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+19,086-16,499llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.832bit.ll
+240,906-188,1409,743 files not shown
+956,257-641,4579,749 files

LLVM/project d080cacllvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll

Merge branch 'arm-fp-f2d2f' into arm-fp-fix
DeltaFile
+84,299-78,378llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+66,293-29,491llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+25,754-24,794llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+23,631-20,343llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+21,843-18,635llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+19,086-16,499llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.832bit.ll
+240,906-188,1409,743 files not shown
+956,257-641,4579,749 files

LLVM/project a25bf27llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll

Merge branch 'arm-fp-fcmp' into arm-fp-f2d2f
DeltaFile
+84,299-78,378llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+66,293-29,491llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+25,754-24,794llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+23,631-20,343llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+21,843-18,635llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+19,086-16,499llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.832bit.ll
+240,906-188,1409,743 files not shown
+956,257-641,4579,749 files

LLVM/project fe187f6compiler-rt/lib/builtins CMakeLists.txt

Stop trying to crt_supersede one Arm .S file with another

Turns out that doesn't work: both versions of the assembly language
comparison were included in the output library, and the linker would
make an arbitrary choice of which to pull in to the link. Instead,
just put the old files on to the SOURCES list in an else clause.
DeltaFile
+8-4compiler-rt/lib/builtins/CMakeLists.txt
+8-41 files