LLVM/project 9b04a3c libcxx/utils/ci/docker linux-builder-base.dockerfile docker-compose.yml

[libc++] Use compiler explorer for Clang as well and update to LLVM 23 as head (#185168)

Using the Compiler Explorer infrastructure simplifies the dockerfile a
bit, since we now have a single source for compilers instead of two
independent ones. Compiler Explorer is also usually significantly faster
at providing new versions than apt.llvm.org.
Delta  File
+28 -28  libcxx/utils/ci/docker/linux-builder-base.dockerfile
+1 -1  libcxx/utils/ci/docker/docker-compose.yml
+29 -29  2 files

LLVM/project ffdb484 llvm/lib/Target/RISCV RISCVTargetTransformInfo.cpp RISCVTargetTransformInfo.h, llvm/test/Transforms/InstCombine/RISCV riscv-vmv-v-x.ll

[InstCombine/RISCV] Constant-fold bitcast(vmv.v.x) (#182630)

Constant-fold bitcast(vmv.v.x) to avoid creating redundant masks.

llc run showing vsetvli eliminated: https://godbolt.org/z/d1Gx3KqaT
Delta  File
+186 -0  llvm/test/Transforms/InstCombine/RISCV/riscv-vmv-v-x.ll
+52 -0  llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+3 -0  llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
+241 -0  3 files

LLVM/project 4a8e172 mlir/lib/Dialect/Linalg/Transforms Vectorization.cpp, mlir/test/Dialect/Linalg/vectorization unsupported.mlir

[mlir][Linalg] Prevent vectorization of generic Conv with dynamic dims (#185415)

-- We should use `isaConvolutionOpInterface` instead as it accommodates
both named as well as generic convolution ops.
-- https://github.com/llvm/llvm-project/pull/176339 missed making one
such update to `vectorizeDynamicLinalgOpPrecondition` and it got exposed
in a downstream project.
-- This commit therefore aims to fix the same.

Signed-off-by: Abhishek Varma <abhvarma at amd.com>
Delta  File
+30 -0  mlir/test/Dialect/Linalg/vectorization/unsupported.mlir
+1 -1  mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
+31 -1  2 files

LLVM/project c21a5ac mlir/lib/Dialect/Vector/Transforms VectorTransferOpTransforms.cpp, mlir/lib/Dialect/Vector/Utils VectorUtils.cpp

[mlir][vector] Flatten transfer - support multi-dim scalar element (#185417)

Adds support for flattening multi-dimensional scalar vector transfers.

The addition prevents pattern crashes on such inputs and allows for
cleaner lowering of scalar vectors.
Delta  File
+41 -0  mlir/test/Dialect/Vector/vector-transfer-flatten.mlir
+22 -6  mlir/lib/Dialect/Vector/Transforms/VectorTransferOpTransforms.cpp
+4 -0  mlir/lib/Dialect/Vector/Utils/VectorUtils.cpp
+67 -6  3 files

LLVM/project 2e3bff6 llvm/test/Analysis/DependenceAnalysis SymbolicSIV.ll SymbolicRDIV.ll, llvm/test/Transforms/LoopUnrollAndJam unroll-and-jam.ll

[DA] Test AddRecs are nsw before strong SIV test (#183421)

Currently, the Strong SIV test does not check that the AddRecs involved do
not overflow. This is required for the correctness of the test. Strictly
speaking, the range-based independence check in Strong SIV relies on
SCEV, which internally takes care of potential overflows, so this is
mainly needed for the divisibility test and the distance/direction
calculations, but putting the check early in the function covers all the
cases anyway.
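
To illustrate why the no-wrap requirement matters for the divisibility and distance steps, here is a minimal Python sketch. It is a simplified model and not the DependenceAnalysis implementation; the names `strong_siv` and `wrapped_collision`, and the 8-bit wrap used for the demonstration, are assumptions made for illustration.

```python
def strong_siv(coeff, c1, c2, upper_bound):
    """Simplified Strong SIV model for the subscripts coeff*i + c1 and
    coeff*i + c2, with 0 <= i <= upper_bound and coeff != 0.

    Returns None when independence is proven, otherwise the dependence
    distance. The reasoning is only sound if the arithmetic never wraps.
    """
    delta = c1 - c2
    # Range check: the distance cannot exceed the iteration space.
    if abs(delta) > upper_bound * abs(coeff):
        return None
    # Divisibility check: a dependence needs an integer distance.
    if delta % coeff != 0:
        return None
    return delta // coeff


def wrapped_collision(coeff, c1, c2, upper_bound, bits=8):
    """Brute-force search for two iterations whose subscripts are equal
    once the arithmetic wraps at `bits` bits (hypothetical demo helper)."""
    mask = (1 << bits) - 1
    for i1 in range(upper_bound + 1):
        for i2 in range(upper_bound + 1):
            if (coeff * i1 + c1) & mask == (coeff * i2 + c2) & mask:
                return (i1, i2)
    return None
```

With `coeff=3, c1=0, c2=1, upper_bound=200`, the divisibility check in `strong_siv` reports independence, yet `wrapped_collision` still finds a pair of iterations that touch the same wrapped subscript. That is the kind of case the nsw check rules out before the test runs.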
Delta  File
+51 -401  llvm/test/Transforms/LoopUnrollAndJam/unroll-and-jam.ll
+27 -35  llvm/test/Analysis/DependenceAnalysis/SymbolicSIV.ll
+18 -18  llvm/test/Analysis/DependenceAnalysis/SymbolicRDIV.ll
+12 -18  llvm/test/Analysis/DependenceAnalysis/StrongSIV.ll
+9 -16  llvm/test/Analysis/DependenceAnalysis/weak-crossing-siv-addrec-wrap.ll
+16 -9  llvm/test/Analysis/DependenceAnalysis/strong-siv-addrec-wrap.ll
+133 -497  20 files not shown
+196 -584  26 files

LLVM/project b8c6a39 llvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/AArch64 fp-maximumnum-minimumnum.ll

Merge branch 'main' into users/s-perron/dim_attribute
Delta  File
+1,561 -2,812  llvm/test/CodeGen/X86/wide-scalar-shift-by-byte-multiple-legalization.ll
+2,071 -1,930  llvm/test/CodeGen/AArch64/fp-maximumnum-minimumnum.ll
+3,114 -0  llvm/test/CodeGen/X86/andnot-sink-not.ll
+969 -2,001  llvm/test/CodeGen/X86/bit-manip-i512.ll
+538 -1,357  llvm/test/CodeGen/X86/shift-i512.ll
+348 -707  llvm/lib/Target/X86/X86ISelLowering.cpp
+8,601 -8,807  1,660 files not shown
+58,436 -31,120  1,666 files

LLVM/project 6b059a0 flang/include/flang/Optimizer/HLFIR Passes.td, flang/lib/Optimizer/HLFIR/Transforms SimplifyHLFIRIntrinsics.cpp

[flang] Inline max/minval according to -ffp-maxmin-behavior. (#185148)

This patch takes into account the option setting when inlining
max/minval intrinsics. It is not an NFC change for Flang, because:
  * Inlining for integer types now uses arith.max/minsi operations.
  * We do not mark the reduction loops as `unordered`
    under `reassoc` FMF. I think this was not quite correct.

Otherwise, the default Legacy setting should produce the same
MLIR as before.
Delta  File
+557 -0  flang/test/HLFIR/simplify-hlfir-intrinsics-maxmin.fir
+161 -35  flang/lib/Optimizer/HLFIR/Transforms/SimplifyHLFIRIntrinsics.cpp
+16 -7  flang/test/HLFIR/simplify-hlfir-intrinsics-maxval.fir
+16 -7  flang/test/HLFIR/simplify-hlfir-intrinsics-minval.fir
+16 -1  flang/include/flang/Optimizer/HLFIR/Passes.td
+6 -4  flang/lib/Optimizer/Passes/Pipelines.cpp
+772 -54  1 file not shown
+773 -54  7 files

LLVM/project fd676d8 mlir/lib/Dialect/LLVMIR/IR LLVMTypes.cpp, mlir/test/IR test-func-erase-result.mlir

[MLIR][LLVM] Fix crash in LLVMFunctionType::clone when erasing void function results (#185093)

LLVMFunctionType::clone(inputs, results) was asserting that
results.size() == 1, which caused a crash (later changed to return
null/failure) when erasing results from a void llvm.func via
FunctionOpInterface::eraseResults.

For LLVM function types, an empty results range maps to void return: the
FunctionOpInterface represents void llvm.func with 0 results, while the
underlying LLVMFunctionType stores an explicit LLVMVoidType. When
erasing all results (or no-op erasing 0 results from a void function),
the interface passes an empty TypeRange to clone(), which should produce
a void function type.

Fix by accepting an empty results range in LLVMFunctionType::clone() and
mapping it to LLVMVoidType. More than one result remains invalid.
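
The mapping described above can be modeled in a few lines. This is a hedged Python sketch of the behavior, not the MLIR C++ code; `FuncType`, `VOID`, and the string type names are hypothetical stand-ins for `LLVMFunctionType`, `LLVMVoidType`, and real MLIR types.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

VOID = "void"  # hypothetical stand-in for LLVMVoidType


@dataclass(frozen=True)
class FuncType:
    params: Tuple[str, ...]
    ret: str  # always stored explicitly; VOID for void functions

    def results(self) -> Tuple[str, ...]:
        # Interface view: a void function exposes zero results.
        return () if self.ret == VOID else (self.ret,)

    def clone(self, inputs: Sequence[str],
              results: Sequence[str]) -> Optional["FuncType"]:
        # More than one result stays invalid.
        if len(results) > 1:
            return None
        # An empty results range maps back to an explicit void return.
        ret = results[0] if results else VOID
        return FuncType(tuple(inputs), ret)
```

In this model, cloning a void function with its own (empty) interface results round-trips to the same type instead of failing.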

Fixes #128322

Assisted-by: Claude Code
Delta  File
+9 -2  mlir/lib/Dialect/LLVMIR/IR/LLVMTypes.cpp
+3 -1  mlir/test/IR/test-func-erase-result.mlir
+12 -3  2 files

LLVM/project f4e6226 mlir/include/mlir/IR Iterators.h, mlir/test/IR visitors.mlir

[mlir] Fix crash in ForwardDominanceIterator when encountering graph regions (#185043)

ForwardDominanceIterator<NoGraphRegions=true> was asserting when it
encountered a region without SSA dominance (a "graph region"), such as
scf.forall.in_parallel's body. This crash was triggered by
-test-ir-visitors when walking functions that contain graph-region ops.

Change the behavior of ForwardDominanceIterator<true> and
ReverseDominanceIterator<true> to silently skip graph regions instead of
asserting, and update the documentation accordingly. This matches the
intended semantics of the NoGraphRegions flag: the traversal simply does
not enumerate blocks/ops inside such regions.
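
The skip-instead-of-assert behavior can be sketched as follows. This is a simplified Python model under stated assumptions: regions hold ops directly (blocks are omitted), and `Region`, `Op`, `walk`, and `has_ssa_dominance` are hypothetical names, not the MLIR API.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Region:
    ops: List["Op"] = field(default_factory=list)
    has_ssa_dominance: bool = True  # False models a graph region


@dataclass
class Op:
    name: str
    regions: List[Region] = field(default_factory=list)


def walk(op: Op, no_graph_regions: bool = True) -> Iterator[str]:
    """Pre-order walk that yields op names. With no_graph_regions set,
    graph regions are skipped silently rather than treated as an error."""
    yield op.name
    for region in op.regions:
        if no_graph_regions and not region.has_ssa_dominance:
            continue  # do not enumerate anything inside a graph region
        for nested in region.ops:
            yield from walk(nested, no_graph_regions)
```

The op that owns a graph region is still visited; only the contents of that region are left out of the traversal.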

Fixes #116370

Assisted-by: Claude Code
Delta  File
+26 -16  mlir/include/mlir/IR/Iterators.h
+22 -0  mlir/test/IR/visitors.mlir
+9 -7  mlir/test/lib/IR/TestVisitors.cpp
+57 -23  3 files

LLVM/project 43a0e59 llvm/lib/Target/Hexagon HexagonInstrInfo.cpp HexagonInstrInfo.h, llvm/lib/Target/Hexagon/MCTargetDesc HexagonBaseInfo.h

[Hexagon] Add new register input/output types for qf instructions (#184398)

The v81 iset has been updated with input and output register
types/extensions for instructions. Currently, it supports qf32/qf16
register types. This patch implements a qf reg type lookup to query
these types. In the future, the register type extractor can be improved
and more APIs can be added to support other register types.

Co-authored-by: <santdas at qti.qualcomm.com>
Delta  File
+165 -0  llvm/lib/Target/Hexagon/HexagonInstrInfo.cpp
+30 -0  llvm/lib/Target/Hexagon/MCTargetDesc/HexagonBaseInfo.h
+11 -0  llvm/lib/Target/Hexagon/HexagonInstrInfo.h
+206 -0  3 files

LLVM/project 85d5e6f clang/lib/CodeGen/TargetBuiltins PPC.cpp, clang/test/CodeGen/PowerPC builtins-ppc-amo.c builtins-amo-err.c

Add AMO load with Compare and Swap Not Equal

This commit adds support for lwat/ldat atomic operations with function
code 16 (Compare and Swap Not Equal) via 4 clang builtins:

__builtin_amo_lwat_csne for 32-bit unsigned operations
__builtin_amo_ldat_csne for 64-bit unsigned operations
__builtin_amo_lwat_csne_s for 32-bit signed operations
__builtin_amo_ldat_csne_s for 64-bit signed operations
Delta  File
+90 -0  llvm/test/CodeGen/PowerPC/amo-enable.ll
+76 -0  clang/test/CodeGen/PowerPC/builtins-ppc-amo.c
+43 -1  llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+21 -0  llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
+16 -0  clang/test/CodeGen/PowerPC/builtins-amo-err.c
+14 -0  clang/lib/CodeGen/TargetBuiltins/PPC.cpp
+260 -1  4 files not shown
+277 -1  10 files

LLVM/project 77b29ca llvm/lib/Target/PowerPC PPCISelLowering.cpp PPCAsmPrinter.cpp, llvm/test/CodeGen/PowerPC amo-enable.ll

Address review comments
Delta  File
+3 -11  llvm/test/CodeGen/PowerPC/amo-enable.ll
+4 -3  llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+2 -2  llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
+9 -16  3 files

LLVM/project 6d3517b llvm/utils/TableGen/Common CodeGenRegisters.cpp

[TableGen] Fix ordering of register classes with artificial members.

The current implementation wouldn't advance IB to skip artificial
registers once IA has reached the end.
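
As a rough illustration of the two-iterator pattern described here, the Python sketch below compares two member lists while ignoring artificial registers, skipping them on both sides before every check, including after one side is exhausted. It is a hypothetical model and does not reproduce the code in CodeGenRegisters.cpp; the `(name, is_artificial)` tuples and both function names are assumptions.

```python
def skip_artificial(members, i):
    """Advance index i past any artificial entries."""
    while i < len(members) and members[i][1]:
        i += 1
    return i


def less_ignoring_artificial(a, b):
    """Lexicographic 'a < b' over lists of (name, is_artificial) tuples,
    comparing only the non-artificial members."""
    ia = ib = 0
    while True:
        # Skip artificial members on BOTH sides, even when the other
        # side has already reached its end.
        ia = skip_artificial(a, ia)
        ib = skip_artificial(b, ib)
        if ia == len(a) or ib == len(b):
            # a is smaller only if it ran out of real members first.
            return ia == len(a) and ib != len(b)
        if a[ia][0] != b[ib][0]:
            return a[ia][0] < b[ib][0]
        ia += 1
        ib += 1
```

In this model, two lists that differ only in trailing artificial members compare as equivalent: neither is less than the other.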
Delta  File
+8 -10  llvm/utils/TableGen/Common/CodeGenRegisters.cpp
+8 -10  1 file

LLVM/project 013bdbe llvm/lib/Target/ARM MVEGatherScatterLowering.cpp, llvm/test/CodeGen/Thumb2 mve-gather-increment.ll

[MVEGatherScatter] Fix GEP scale calculations (#185437)

The GEP scale for a single index GEP is the type alloc size of the
source element type. The pass was mostly computing it correctly, but two
places were doing something different.
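
The scale rule stated above amounts to the following arithmetic, shown as a minimal Python sketch. The helper names and the size/alignment inputs are assumptions for illustration; in LLVM the alloc size comes from the data layout, which rounds a type's store size up to its ABI alignment.

```python
def align_to(size, align):
    """Round size up to the next multiple of align."""
    return (size + align - 1) // align * align


def alloc_size(store_size_bytes, abi_align_bytes):
    """Alloc size: store size padded up to the ABI alignment."""
    return align_to(store_size_bytes, abi_align_bytes)


def single_index_gep_offset(index, store_size_bytes, abi_align_bytes):
    """Byte offset of a single-index GEP: index * alloc size of the
    source element type."""
    return index * alloc_size(store_size_bytes, abi_align_bytes)
```

For a 4-byte element with 4-byte alignment, index 5 gives a 20-byte offset. For a hypothetical 3-byte element with 4-byte ABI alignment, the scale is 4, not 3, which is where a scale derived from the raw size would go wrong.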
Delta  File
+152 -139  llvm/test/CodeGen/Thumb2/mve-gather-increment.ll
+2 -2  llvm/lib/Target/ARM/MVEGatherScatterLowering.cpp
+154 -141  2 files

LLVM/project e4580d2 llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer DependencyGraph.h, llvm/lib/Transforms/Vectorize/SandboxVectorizer DependencyGraph.cpp

[SandboxVec][DAG] Fix unscheduled succs when nodes are scheduled (#184946)

When we update use-def edges, the DAG gets notified to update the
UnscheduledSuccs counters. However, if either edge node is already
scheduled, we should not update UnscheduledSuccs, because the
UnscheduledSuccs counter value should be treated as "undefined" after a
node has been scheduled, i.e., its value has a meaning only before the
node gets scheduled.
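
A minimal Python sketch of the guard described above follows. The `Node` class and the `on_edge_added` / `on_edge_removed` callbacks are hypothetical names and do not mirror the SandboxVectorizer DependencyGraph API.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.scheduled = False
        # Only meaningful while this node is still unscheduled.
        self.unscheduled_succs = 0


def on_edge_added(pred, succ):
    """Notification for a new use-def edge pred -> succ."""
    if pred.scheduled or succ.scheduled:
        return  # counters of scheduled nodes are treated as undefined
    pred.unscheduled_succs += 1


def on_edge_removed(pred, succ):
    """Notification for a removed use-def edge pred -> succ."""
    if pred.scheduled or succ.scheduled:
        return
    pred.unscheduled_succs -= 1
```

The early return on both callbacks keeps the counter from being adjusted for any edge that touches an already scheduled node.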
Delta  File
+36 -0  llvm/unittests/Transforms/Vectorize/SandboxVectorizer/DependencyGraphTest.cpp
+6 -2  llvm/lib/Transforms/Vectorize/SandboxVectorizer/DependencyGraph.cpp
+2 -1  llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer/DependencyGraph.h
+44 -3  3 files

LLVM/project f7eba03 clang/include/clang/Basic BuiltinsAMDGPU.td, clang/test/CodeGenHIP builtins-amdgcn-gfx12-f16-w64.hip builtins-amdgcn-gfx12-f16-w32.hip

[Clang][AMDGPU] Change __fp16 to _Float16 in builtin definitions
Delta  File
+96 -0  clang/test/CodeGenHIP/builtins-amdgcn-gfx12-f16-w64.hip
+96 -0  clang/test/CodeGenHIP/builtins-amdgcn-gfx12-f16-w32.hip
+88 -0  clang/test/CodeGenHIP/builtins-amdgcn-f16-misc.hip
+70 -0  clang/test/CodeGenHIP/builtins-amdgcn-gfx1250-f16-misc.hip
+27 -0  clang/test/CodeGenHIP/builtins-amdgcn-gfx950-f16.hip
+13 -13  clang/include/clang/Basic/BuiltinsAMDGPU.td
+390 -13  8 files not shown
+399 -22  14 files

LLVM/project c0a3996 llvm/utils/gn/secondary/llvm/lib/Target/X86 BUILD.gn, llvm/utils/gn/secondary/llvm/unittests/Target/X86 BUILD.gn

[gn] port 443ce5569ee9854c more
Delta  File
+1 -0  llvm/utils/gn/secondary/llvm/lib/Target/X86/BUILD.gn
+1 -0  llvm/utils/gn/secondary/llvm/unittests/Target/X86/BUILD.gn
+2 -0  2 files

LLVM/project 2a9372f clang/docs TypeSanitizer.rst, compiler-rt/lib/tysan tysan.cpp tysan_flags.inc

halt_on_error flag for TySan and docs (#182479)
Delta  File
+23 -0  compiler-rt/test/tysan/halt_on_error.c
+12 -0  clang/docs/TypeSanitizer.rst
+5 -0  compiler-rt/lib/tysan/tysan.cpp
+2 -0  compiler-rt/lib/tysan/tysan_flags.inc
+42 -0  4 files

LLVM/project ce22796 mlir/include/mlir/Dialect/XeGPU/TransformOps XeGPUTransformOps.td, mlir/lib/Dialect/XeGPU/TransformOps XeGPUTransformOps.cpp

[mlir][xegpu] Add support for setting `order` in `SetDescLayoutOp` and `SetOpLayoutAttrOp` transform ops. (#184705)

Currently XeGPU transform dialect does not allow the user to set the
`order` attribute of a layout in `SetDescLayoutOp` and
`SetOpLayoutAttrOp`. This PR adds `order` as an optional argument to
these transform ops.
Delta  File
+66 -6  mlir/test/Dialect/XeGPU/transform-ops.mlir
+51 -0  mlir/test/python/dialects/transform_xegpu_ext.py
+26 -19  mlir/lib/Dialect/XeGPU/TransformOps/XeGPUTransformOps.cpp
+14 -2  mlir/include/mlir/Dialect/XeGPU/TransformOps/XeGPUTransformOps.td
+16 -0  mlir/python/mlir/dialects/transform/xegpu.py
+173 -27  5 files

LLVM/project 87e21ef clang/lib/Sema SemaHLSL.cpp HLSLBuiltinTypeDeclBuilder.cpp, clang/test/AST/HLSL TypedBuffers-AST.hlsl

Gemini code review.
Delta  File
+31 -14  clang/lib/Sema/SemaHLSL.cpp
+13 -8  clang/lib/Sema/HLSLBuiltinTypeDeclBuilder.cpp
+10 -0  clang/test/AST/HLSL/TypedBuffers-AST.hlsl
+4 -1  llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
+2 -2  clang/test/CodeGenHLSL/resources/Texture2D-Mips.hlsl
+2 -2  clang/test/CodeGenHLSL/resources/Texture2D-default-explicit-binding.hlsl
+62 -27  2 files not shown
+64 -30  8 files

LLVM/project 8f668ce mlir/lib/Target/LLVMIR/Dialect/OpenMP OpenMPToLLVMIRTranslation.cpp, mlir/test/Target/LLVMIR openmp-teams-distribute-reduction.mlir openmp-teams-reduction.mlir

[MLIR][OpenMP] Prevent teams reductions from deadlocking (#184625)

Currently, simple Fortran reductions like the example below cause a
deadlock at runtime:

```f90
integer :: i, x

!$omp teams distribute reduction(+:x)
do i=1, 10
  x = x + 1
end do
```

Preventing a redundant barrier from being added in that case addresses
this issue. Synchronization is already being handled by the
`__kmpc_reduce` and `__kmpc_end_reduce` runtime calls for the host, and
by the OMPIRBuilder-generated `_omp_reduction_inter_warp_copy_func`
function for GPUs.
Delta  File
+9 -5  mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+0 -6  mlir/test/Target/LLVMIR/openmp-teams-distribute-reduction.mlir
+0 -4  mlir/test/Target/LLVMIR/openmp-teams-reduction.mlir
+0 -2  mlir/test/Target/LLVMIR/omptarget-teams-distribute-reduction.mlir
+0 -1  mlir/test/Target/LLVMIR/omptarget-teams-reduction.mlir
+9 -18  5 files

LLVM/project 8ef5ff6 flang/lib/Lower/OpenMP OpenMP.cpp Utils.cpp, flang/lib/Optimizer/OpenMP DoConcurrentConversion.cpp

Review comments
Delta  File
+1 -9  flang/lib/Lower/OpenMP/OpenMP.cpp
+9 -0  flang/lib/Lower/OpenMP/Utils.cpp
+8 -0  flang/lib/Lower/OpenMP/Utils.h
+1 -6  flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+7 -0  flang/lib/Utils/OpenMP.cpp
+1 -0  flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp
+27 -15  6 files

LLVM/project ba1f3d1 lldb/unittests/Platform PlatformDarwinTest.cpp

[lldb][test] PlatformDarwinTest.cpp: add more test-cases

Add various test-cases that exercise `LocateExecutableScriptingResourcesFromDSYM` further.
Delta  File
+303 -5  lldb/unittests/Platform/PlatformDarwinTest.cpp
+303 -5  1 file

LLVM/project dcf2161 lldb/unittests/Platform PlatformDarwinTest.cpp

[lldb][test] PlatformDarwinTest.cpp: move directory creation into SetUp

So it can be shared by future test-cases.

Drive-by change:
* Make the `TestParseVersionBuildDir` not depend on the test-fixture,
  since it doesn't require any directory/debugger setup
Delta  File
+60 -47  lldb/unittests/Platform/PlatformDarwinTest.cpp
+60 -47  1 file

LLVM/project c3d040b llvm/utils/gn/secondary/bolt/unittests/Core BUILD.gn, llvm/utils/gn/secondary/llvm/lib/Target/X86 BUILD.gn

[gn] port 443ce5569ee9854c (X86 SDNodeInfo)
Delta  File
+10 -0  llvm/utils/gn/secondary/llvm/lib/Target/X86/BUILD.gn
+4 -1  llvm/utils/gn/secondary/bolt/unittests/Core/BUILD.gn
+14 -1  2 files

LLVM/project 0da2aec llvm/include/llvm/Analysis TargetTransformInfoImpl.h, llvm/test/Transforms/SLPVectorizer non-power-of-2-bswap.ll

[SLP]Invalid cost for non-power-of-2 bswaps (#185407)

bswaps are supported only for power-of-2 types, so the default cost
model needs to return an invalid cost for other types to fix a compiler crash.

Fixes
https://github.com/llvm/llvm-project/pull/184018#issuecomment-4022697189
Delta  File
+55 -0  llvm/test/Transforms/SLPVectorizer/non-power-of-2-bswap.ll
+4 -0  llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
+59 -0  2 files

LLVM/project d15ca01 llvm/utils/gn/secondary/lldb/source/Plugins BUILD.gn, llvm/utils/gn/secondary/lldb/source/Plugins/LanguageRuntime/CPlusPlus BUILD.gn

[gn] port 58efc426d70 (de-plugin lldb ItaniumABI)
Delta  File
+0 -21  llvm/utils/gn/secondary/lldb/source/Plugins/LanguageRuntime/CPlusPlus/ItaniumABI/BUILD.gn
+11 -0  llvm/utils/gn/secondary/lldb/source/Plugins/LanguageRuntime/CPlusPlus/BUILD.gn
+0 -4  llvm/utils/gn/secondary/lldb/source/Plugins/BUILD.gn
+11 -25  3 files

LLVM/project 404b3ea clang/lib/CIR/CodeGen CIRGenBuiltinAArch64.cpp, clang/lib/CodeGen/TargetBuiltins ARM.cpp

[CIR][AArch64] Add support for the remaining `vceqz` builtins

Implement the remaining CIR lowerings for the AdvSIMD (Neon)
`vceqz` intrinsic group (bitwise equal to zero).

Most `vceqz` variants were already supported; this patch
completes the rest of the group [1] that was left as a TODO.

Tests for these intrinsics are moved from:
  * test/CodeGen/AArch64/neon-intrinsics.c
  * test/CodeGen/AArch64/v8.2a-fp16-intrinsics.c

to:
  * test/CodeGen/AArch64/neon/intrinsics.c
  * test/CodeGen/AArch64/neon/fullfp16.c

respectively.

The implementation largely mirrors the existing lowering in

    [4 lines not shown]
Delta  File
+45 -2  clang/test/CodeGen/AArch64/neon/intrinsics.c
+0 -33  clang/test/CodeGen/AArch64/neon-intrinsics.c
+20 -0  clang/test/CodeGen/AArch64/neon/fullfp16.c
+8 -4  clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp
+0 -8  clang/test/CodeGen/AArch64/v8.2a-fp16-intrinsics.c
+3 -4  clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+76 -51  6 files

LLVM/project 0bf9bb5 flang/lib/Optimizer/OpenMP MapInfoFinalization.cpp, flang/test/Transforms omp-map-info-finalization-usm.fir

[Flang][OpenMP] Fix close map flag propagation for derived types in USM (#185330)

This fixes a bug in USM mode where the `close` map type modifier was
attached to some `map.info.op`s corresponding to user-defined type
members while the parent type instance itself was not marked as `close`.

This fix ensures that if a parent record type map does not have the
'close' flag, it is cleared from its members as well, maintaining
consistency.
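
The consistency rule can be sketched in a few lines of Python. This is an illustrative model only, not the MapInfoFinalization pass; the flag constants and the function name `clear_close_on_members` are hypothetical.

```python
# Hypothetical bit values, chosen only for this illustration.
MAP_TO = 0x1
MAP_CLOSE = 0x400


def clear_close_on_members(parent_flags, member_flags):
    """Return the member map flags with `close` cleared whenever the
    parent record map does not carry `close` itself."""
    if parent_flags & MAP_CLOSE:
        return list(member_flags)  # parent is close: leave members as-is
    return [flags & ~MAP_CLOSE for flags in member_flags]
```

For a parent mapped with `MAP_TO` only, a member carrying `MAP_TO | MAP_CLOSE` comes back as plain `MAP_TO`, so parent and members agree.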

Gemini was used to create tests. The AI-generated test code, which was
derived from a reproducer I was working with to debug the issue, was
reviewed line-by-line by me.

Assisted-by: Gemini <gemini at google.com>
Delta  File
+35 -0  flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp
+35 -0  offload/test/offloading/fortran/usm_derived_type_allocatable_member.f90
+24 -0  flang/test/Transforms/omp-map-info-finalization-usm.fir
+94 -0  3 files

LLVM/project fc86350 clang-tools-extra/clang-tidy/performance UseStdMoveCheck.cpp, clang-tools-extra/test/clang-tidy/checkers/performance use-std-move.cpp

[clang-tidy] Improve performance-use-std-move in presence of control-flow (#184136)
Delta  File
+87 -0  clang-tools-extra/test/clang-tidy/checkers/performance/use-std-move.cpp
+69 -11  clang-tools-extra/clang-tidy/performance/UseStdMoveCheck.cpp
+156 -11  2 files