LLVM/project e7ac60cutils/bazel/llvm-project-overlay/mlir BUILD.bazel

[Bazel] Fixes ce1a9fd (#190577)

This fixes ce1a9fd76640929fe340c5c5d1bb493ea09ca9bc.

Co-authored-by: Google Bazel Bot <google-bazel-bot at google.com>
DeltaFile
+2-0utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
+2-01 files

LLVM/project baa1e50flang/lib/Semantics check-cuda.cpp, flang/test/Semantics cuf25.cuf

[flang][cuda] Do not consider kernel result as host variable (#190626)
DeltaFile
+8-0flang/test/Semantics/cuf25.cuf
+2-2flang/lib/Semantics/check-cuda.cpp
+10-22 files

LLVM/project 9265f92mlir/include/mlir/Dialect/LLVMIR LLVMDialect.td, mlir/lib/Target/LLVMIR AttrKindDetail.h

[mlir][ABI] Add writable, dead_on_unwind, dead_on_return, nofpclass param attrs to LLVM dialect (#188374)

The MLIR LLVM dialect is missing support for several parameter attributes
that exist in LLVM IR: `writable`, `dead_on_unwind`, `dead_on_return`, and
`nofpclass`. This adds them to the kind-to-name mapping in
`AttrKindDetail.h` and the corresponding name accessors in `LLVMDialect.td`.

The existing generic conversion infrastructure in `ModuleTranslation` and
`ModuleImport` picks them up automatically: `writable` and `dead_on_unwind`
round-trip as `UnitAttr`, while `dead_on_return` and `nofpclass` round-trip
as `IntegerAttr`.

CIR needs these to match classic codegen's ABI output (sret gets
`writable

    [2 lines not shown]
DeltaFile
+35-0mlir/test/Target/LLVMIR/llvmir.mlir
+15-3mlir/test/Target/LLVMIR/Import/function-attributes.ll
+7-0mlir/lib/Target/LLVMIR/AttrKindDetail.h
+4-0mlir/include/mlir/Dialect/LLVMIR/LLVMDialect.td
+61-34 files

LLVM/project 348295aclang/include/clang/CIR/Dialect/IR CIROps.td, clang/lib/CIR/CodeGen CIRGenExprAggregate.cpp

[CIR] Use data size in emitAggregateCopy for overlapping copies (#186702)

Add skip_tail_padding property to cir.copy to handle potentially-overlapping
subobject copies directly, instead of falling back to cir.libc.memcpy. When
set, the lowering uses the record's data size (excluding tail padding) for
the memcpy length. This keeps typed semantics and promotability of cir.copy.

Also fix CXXABILowering to preserve op properties when recreating operations,
and expose RecordType::computeStructDataSize() for computing data size of
padded record types.
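
To make the difference between a record's full size and its data size concrete, here is a minimal sketch of a simplified layout model. It is not the CIR implementation: the function name `record_layout` and the `(size, align)` field encoding are assumptions made for illustration, and the model ignores ABI rules about which types actually allow tail-padding reuse.

```python
def record_layout(fields):
    """Return (data_size, full_size) for a list of (size, align) fields.

    Simplified model: fields are placed in order, each at the next offset
    that satisfies its alignment. data_size ends at the last field's final
    byte; full_size additionally rounds up to the record's alignment, and
    the difference between the two is the tail padding.
    """
    offset = 0
    record_align = 1
    for size, align in fields:
        offset = (offset + align - 1) // align * align  # align the field
        offset += size
        record_align = max(record_align, align)
    data_size = offset
    full_size = (offset + record_align - 1) // record_align * record_align
    return data_size, full_size


# An 8-byte field followed by a 1-byte field: 9 data bytes, 7 bytes of tail padding.
print(record_layout([(8, 8), (1, 1)]))
```

Copying only `data_size` bytes leaves the tail padding untouched, which matters when another subobject has been placed in that padding.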
DeltaFile
+73-0clang/test/CIR/CodeGen/aggregate-copy-overlap.cpp
+21-5clang/include/clang/CIR/Dialect/IR/CIROps.td
+23-0clang/lib/CIR/Dialect/IR/CIRTypes.cpp
+11-6clang/lib/CIR/CodeGen/CIRGenExprAggregate.cpp
+11-0clang/test/CIR/IR/invalid-copy.cir
+6-0clang/test/CIR/IR/copy.cir
+145-117 files not shown
+164-2013 files

LLVM/project 930ef77mlir/include/mlir/Dialect/AMDGPU/IR AMDGPUOps.td, mlir/lib/Conversion/AMDGPUToROCDL AMDGPUToROCDL.cpp

[mlir][amdgpu] Add optional write mask to amdgpu.global_load_async_to_lds (#190498)
DeltaFile
+17-0mlir/test/Conversion/AMDGPUToROCDL/gfx1250.mlir
+13-0mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp
+5-2mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPUOps.td
+5-1mlir/test/Dialect/AMDGPU/ops.mlir
+40-34 files

LLVM/project 06e666allvm/test/Analysis/DependenceAnalysis banerjee-overflow.ll

[DA] Add overflow test for BanerjeeMIVtest (#190468)
DeltaFile
+75-0llvm/test/Analysis/DependenceAnalysis/banerjee-overflow.ll
+75-01 files

LLVM/project 8d7823eclang/lib/CIR/CodeGen CIRGenBuiltinAArch64.cpp, clang/test/CodeGen/AArch64 neon-intrinsics.c

[CIR][AArch64] Added vector intrinsics for shift left (#187516)

Added vector intrinsics for:
vshlq_n_s8
vshlq_n_s16
vshlq_n_s32
vshlq_n_s64
vshlq_n_u8
vshlq_n_u16
vshlq_n_u32
vshlq_n_u64

vshl_n_s8
vshl_n_s16
vshl_n_s32
vshl_n_s64
vshl_n_u8
vshl_n_u16
vshl_n_u32

    [21 lines not shown]
DeltaFile
+231-45clang/test/CodeGen/AArch64/neon/intrinsics.c
+0-184clang/test/CodeGen/AArch64/neon-intrinsics.c
+31-1clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp
+262-2303 files

LLVM/project 34a1639llvm/include/llvm/Analysis DependenceAnalysis.h, llvm/lib/Analysis DependenceAnalysis.cpp

[DA] Use SmallVector instead of raw new/delete (NFC) (#190586)

Some functions used `new`/`delete` to allocate and free arrays. Raw owning
pointers make memory leaks easy to introduce, so this patch replaces them
with `SmallVector`.
DeltaFile
+33-27llvm/lib/Analysis/DependenceAnalysis.cpp
+18-16llvm/include/llvm/Analysis/DependenceAnalysis.h
+51-432 files

LLVM/project 4994a97flang/lib/Semantics openmp-utils.cpp

[flang][OpenMP] Remove namespace qualification from GetUpperName, NFC (#190619)

This applies to flang/lib/Semantics/openmp-utils.cpp, since it contains
`using namespace Fortran::parser::omp`.
DeltaFile
+5-6flang/lib/Semantics/openmp-utils.cpp
+5-61 files

LLVM/project 7ceeb36llvm/lib/Transforms/Vectorize SLPVectorizer.cpp

Address comments

Created using spr 1.3.7
DeltaFile
+3-2llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+3-21 files

LLVM/project 3fee094flang/lib/Semantics openmp-utils.cpp

[flang][OpenMP] Remove namespace qualification from GetUpperName, NFC

This applies to flang/lib/Semantics/openmp-utils.cpp, since it contains
`using namespace Fortran::parser::omp`.
DeltaFile
+5-6flang/lib/Semantics/openmp-utils.cpp
+5-61 files

LLVM/project 66f9001llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/AArch64 extracts-from-scalarizable-vector.ll

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
DeltaFile
+13-96llvm/test/Transforms/SLPVectorizer/X86/bool-mask.ll
+45-29llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+42-1llvm/test/Transforms/SLPVectorizer/X86/identity-match-splat-less-defined.ll
+17-18llvm/test/Transforms/SLPVectorizer/X86/reduced-value-stored.ll
+8-19llvm/test/Transforms/SLPVectorizer/AArch64/extracts-from-scalarizable-vector.ll
+5-16llvm/test/Transforms/SLPVectorizer/X86/inversed-icmp-to-gather.ll
+130-1794 files not shown
+161-20110 files

LLVM/project f72e1ceclang/test/Headers __clang_hip_math.hip, llvm/test/CodeGen/AMDGPU clmul.ll integer-mad-patterns.ll

Merge branch 'main' into users/cabbaken/04-04-_da_add_overflow_test_for_banerjeemivtest
DeltaFile
+3,666-5,073llvm/test/CodeGen/RISCV/rvv/expandload.ll
+4,371-0llvm/test/CodeGen/AMDGPU/clmul.ll
+1,318-117llvm/test/CodeGen/AMDGPU/integer-mad-patterns.ll
+736-647clang/test/Headers/__clang_hip_math.hip
+835-387llvm/test/CodeGen/AMDGPU/fcanonicalize.bf16.ll
+610-305llvm/test/CodeGen/AMDGPU/atomics-system-scope.ll
+11,536-6,529956 files not shown
+35,844-16,704962 files

LLVM/project bf2a97allvm/lib/Target/AMDGPU AMDGPUInstCombineIntrinsic.cpp, llvm/test/Transforms/InstCombine/AMDGPU mbcnt.ll llvm.amdgcn.wave.shuffle.ll

AMDGPU: Add range attribute to mbcnt intrinsic callsites (#189191)

It seems the known bits handling added in
686987a540bc176bceaad43ffe530cb3e88796d5 is insufficient to perform many
range based optimizations. For some reason computeConstantRange doesn't fall
back on KnownBits, and has a separate, less used form which tries to use
computeKnownBits.
DeltaFile
+236-15llvm/test/Transforms/InstCombine/AMDGPU/mbcnt.ll
+22-22llvm/test/Transforms/InstCombine/AMDGPU/llvm.amdgcn.wave.shuffle.ll
+22-2llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
+1-1llvm/test/Transforms/InstCombine/AMDGPU/canonicalize-add-to-gep.ll
+281-404 files

LLVM/project 297a70cclang/lib/CIR/CodeGen CIRGenModule.cpp, clang/test/CIR/CodeGen global-decomp-decls.cpp

[CIR] Implement global decomposition declarations (#190364)

There is no real challenge to these: the change is effectively a copy/paste
of the classic codegen, since it only requires that we properly emit the
holding variable. Everything else falls out of our existing handling of
variables.
DeltaFile
+114-0clang/test/CIR/CodeGen/global-decomp-decls.cpp
+5-6clang/lib/CIR/CodeGen/CIRGenModule.cpp
+119-62 files

LLVM/project c4281fdllvm/include/llvm/Support KnownFPClass.h, llvm/lib/Support KnownFPClass.cpp

[Support][ValueTracking] Improve KnownFPClass for fadd. Handle infinity signs (#190559)

Improve KnownFPClass reasoning for fadd:

- Refine NaN handling for infinities by checking opposite-sign cases:
  - `-inf` + `+inf` --> `nan`
  - `+inf` + `-inf` --> `nan`
  - `+inf` + `+inf` --> `+inf`
  - `-inf` + `-inf` --> `-inf`
- Introduce `cannotBeOrderedLessEqZero` as the counterpart to
  `cannotBeOrderedGreaterEqZero`.
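
The four infinity cases above are plain IEEE-754 addition, so they can be checked with Python floats. This is only a sanity check of the arithmetic, not LLVM code; the helper name `fadd_inf_class` is made up for this example.

```python
import math

INF = float("inf")


def fadd_inf_class(lhs, rhs):
    """Classify lhs + rhs for two infinite operands as 'nan', '+inf' or '-inf'."""
    result = lhs + rhs
    if math.isnan(result):
        return "nan"
    return "+inf" if result > 0 else "-inf"


for lhs, rhs in [(-INF, INF), (INF, -INF), (INF, INF), (-INF, -INF)]:
    print(lhs, "+", rhs, "-->", fadd_inf_class(lhs, rhs))
```

Only the opposite-sign pairs produce NaN, which is why knowing the sign of a possible infinity lets the analysis rule NaN out in the same-sign cases.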
DeltaFile
+44-0llvm/test/Transforms/Attributor/nofpclass.ll
+11-0llvm/include/llvm/Support/KnownFPClass.h
+4-3llvm/lib/Support/KnownFPClass.cpp
+1-4llvm/test/Transforms/InstSimplify/known-never-nan.ll
+2-2llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-fadd.ll
+62-95 files

LLVM/project 8519f41llvm/lib/Target/AMDGPU AMDGPUInstCombineIntrinsic.cpp

Update llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
DeltaFile
+1-1llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
+1-11 files

LLVM/project 59e899eclang/lib/AST/ByteCode Interp.h Interp.cpp, clang/test/AST/ByteCode cxx26.cpp

[clang][bytecode] Don't unref constexpr-unknown references (#190177)

If the pointer for a reference is constexpr-unknown, use the pointer
itself instead of dereferencing it. Unfortunately, that means
constexpr-unknown pointers now reach a lot more places than before.
DeltaFile
+45-1clang/lib/AST/ByteCode/Interp.h
+29-6clang/lib/AST/ByteCode/Interp.cpp
+20-2clang/test/AST/ByteCode/cxx26.cpp
+4-8clang/test/SemaCXX/constant-expression-p2280r4.cpp
+7-5clang/lib/AST/ByteCode/Compiler.cpp
+10-0clang/lib/AST/ByteCode/Pointer.h
+115-222 files not shown
+127-238 files

LLVM/project 2ccc941llvm/lib/Target/AMDGPU VOP2Instructions.td

[AMDGPU] Mark two instructions as DPMACC (#190391)

It appears these were accidentally missed in #170319
DeltaFile
+2-2llvm/lib/Target/AMDGPU/VOP2Instructions.td
+2-21 files

LLVM/project 74ad441llvm/test/DebugInfo/Generic debug-info-enum-dwarf2.ll incorrect-variable-debugloc1-dwarf2.ll

Split DWARF v2 tests to exclude 64-bit AIX targets (#189077)

64-bit AIX requires DWARF64 format, which was only introduced in DWARF
v3. DWARF v2 only supports 32-bit DWARF format, making it incompatible
with 64-bit AIX (the compiler throws a fatal error). These changes split
DWARF v2 tests into separate files that exclude 64-bit AIX targets while
still running on 32-bit AIX and other 64-bit platforms where DWARF v2 is
supported.
DeltaFile
+15-0llvm/test/DebugInfo/Generic/debug-info-enum-dwarf2.ll
+10-0llvm/test/DebugInfo/Generic/incorrect-variable-debugloc1-dwarf2.ll
+6-0llvm/test/DebugInfo/Generic/restrict-dwarf2.ll
+2-4llvm/test/DebugInfo/Generic/debug-info-enum.ll
+2-1llvm/test/DebugInfo/Generic/incorrect-variable-debugloc1.ll
+2-1llvm/test/DebugInfo/Generic/restrict.ll
+37-66 files

LLVM/project 4670f59llvm/test/Analysis/DependenceAnalysis banerjee-overflow.ll

update
DeltaFile
+5-4llvm/test/Analysis/DependenceAnalysis/banerjee-overflow.ll
+5-41 files

LLVM/project 4376a41llvm/test/Transforms/LoopVectorize find-last-iv-sinkable-expr-epilogue.ll, llvm/test/Transforms/LoopVectorize/AArch64 epilog-iv-live-outs.ll find-last-iv-sinkable-expr-epilogue.ll

Address comments

Created using spr 1.3.7
DeltaFile
+257-0llvm/test/Transforms/LoopVectorize/AArch64/epilog-iv-live-outs.ll
+209-23mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
+206-14mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+212-0llvm/test/Transforms/LoopVectorize/find-last-iv-sinkable-expr-epilogue.ll
+189-0mlir/test/Target/LLVMIR/nvvm/convert_s2f6x2.mlir
+172-0llvm/test/Transforms/LoopVectorize/AArch64/find-last-iv-sinkable-expr-epilogue.ll
+1,245-37135 files not shown
+2,509-1,044141 files

LLVM/project b6e7c47llvm/lib/CodeGen/SelectionDAG ScheduleDAGRRList.cpp, llvm/test/CodeGen/ARM pr190497.ll

[CodeGen] Ignore `ANNOTATION_LABEL` in scheduler (#190499)

This fixes a crash in `clang` for `armv7` targets when optimizations are
enabled.

Fixes #190497
DeltaFile
+39-0llvm/test/CodeGen/ARM/pr190497.ll
+1-0llvm/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp
+40-02 files

LLVM/project 0403639llvm/lib/Transforms/Vectorize VPlan.cpp, llvm/test/Transforms/LoopVectorize early_exit_with_outer_loop.ll

[VPlan] Skip successors outside any loop when updating LoopInfo. (#190553)

Successors outside of any loop do not contribute to the innermost loop,
skip them to avoid incorrect results due to
getSmallestCommonLoop(nullptr, X) returning nullptr.
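
The failure mode can be shown with a toy model. The sketch below is not the VPlan code: `Loop`, `smallest_common_loop`, and `innermost_loop_for` are hypothetical stand-ins, with `None` playing the role of `nullptr` for a block outside any loop. It only assumes what the message states, namely that the common-loop query returns null when either argument is null.

```python
class Loop:
    """Toy loop-nest node; parent is the enclosing loop or None."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def ancestors(loop):
    """Yield loop, its parent, its grandparent, and so on."""
    while loop is not None:
        yield loop
        loop = loop.parent


def smallest_common_loop(a, b):
    """Innermost loop containing both a and b; None if either is None."""
    if a is None or b is None:
        return None
    seen = {id(l) for l in ancestors(a)}
    for l in ancestors(b):
        if id(l) in seen:
            return l
    return None


def innermost_loop_for(successor_loops, skip_outside):
    """Fold smallest_common_loop over the loops of a block's successors."""
    result, first = None, True
    for loop in successor_loops:
        if loop is None and skip_outside:
            continue  # successor is outside any loop: it contributes nothing
        if first:
            result, first = loop, False
        else:
            result = smallest_common_loop(result, loop)
    return result


outer = Loop("outer")
inner = Loop("inner", parent=outer)
succs = [inner, None, outer]  # one successor lies outside every loop

print(innermost_loop_for(succs, skip_outside=False))      # None: the null poisons the fold
print(innermost_loop_for(succs, skip_outside=True).name)  # outer
```

Once a null enters the fold, every later query also returns null, so the block would wrongly be treated as belonging to no loop.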
DeltaFile
+115-0llvm/test/Transforms/LoopVectorize/early_exit_with_outer_loop.ll
+15-8llvm/lib/Transforms/Vectorize/VPlan.cpp
+130-82 files

LLVM/project 05ff170llvm/lib/Transforms/InstCombine InstCombineShifts.cpp InstCombineCompares.cpp, llvm/test/Transforms/InstCombine icmp-shl-add-to-add.ll apint-shift.ll

[InstCombine] Fix #163110: Support peeling off matching shifts from icmp operands via canEvaluateShifted (#165975)

Consider a pattern like `icmp (shl nsw X, L), (add nsw (shl nsw Y, L),
K)`. When the constant K is a multiple of 2^L, this can be simplified to
`icmp X, (add nsw Y, K >> L)`.
This patch extends canEvaluateShifted to support `Instruction::Add` and
updates its signature to accept `Instruction::BinaryOps` instead of a
boolean. This change allows the function to distinguish between LShr and
AShr requirements, ensuring that information is preserved according to
the signedness and overflow flags (nsw/nuw) of the operands.
The logic is integrated into `foldICmpCommutative` to enable peeling off
matching shifts from both sides of a comparison even when an offset is
present.
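
The arithmetic behind the fold can be checked by brute force on a small bit width. The sketch below is not the InstCombine code: it models 8-bit signed values with Python integers, treats any input that would violate `nsw` as poison and skips it, and checks only the signed less-than and equality predicates. The function name `check_fold` is made up for this example.

```python
from itertools import product

BITS = 8
LO, HI = -(1 << (BITS - 1)), (1 << (BITS - 1)) - 1


def fits(value):
    """True if value is representable as a signed BITS-bit integer."""
    return LO <= value <= HI


def check_fold(shift, k):
    """Check icmp (X << L), ((Y << L) + K)  ==  icmp X, (Y + (K >> L)).

    Returns the number of (X, Y) pairs for which no nsw flag is violated.
    """
    if k % (1 << shift) != 0:
        raise ValueError("K must be a multiple of 2^L")
    checked = 0
    for x, y in product(range(LO, HI + 1), repeat=2):
        lhs = x << shift
        shl_y = y << shift
        rhs = shl_y + k
        if not (fits(lhs) and fits(shl_y) and fits(rhs)):
            continue  # some nsw operation would overflow: poison, skip
        folded_rhs = y + (k >> shift)
        assert fits(folded_rhs)  # the folded add cannot overflow either
        assert (lhs < rhs) == (x < folded_rhs)
        assert (lhs == rhs) == (x == folded_rhs)
        checked += 1
    return checked


print(check_fold(shift=2, k=8))
print(check_fold(shift=3, k=-16))
```

Because `nsw` rules out wrapping, both sides are exact multiples of 2^L, and dividing them by 2^L preserves the comparison.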

Fixes: #163110
DeltaFile
+311-0llvm/test/Transforms/InstCombine/icmp-shl-add-to-add.ll
+111-41llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
+28-0llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+14-0llvm/lib/Transforms/InstCombine/InstCombineInternal.h
+3-3llvm/test/Transforms/InstCombine/apint-shift.ll
+1-1llvm/test/Transforms/InstCombine/icmp-select.ll
+468-456 files

LLVM/project 3b02210llvm/utils/gn/secondary/lldb/source/Host BUILD.gn

[gn] fix mistake from 88f6b181b6ab2 (#190601)
DeltaFile
+1-1llvm/utils/gn/secondary/lldb/source/Host/BUILD.gn
+1-11 files

LLVM/project 4539d71llvm/lib/Target/AMDGPU AMDGPUResourceUsageAnalysis.cpp, llvm/test/CodeGen/AMDGPU resource-usage-asan-O0.ll

[AMDGPU] Preserve assumed stack size for ASan-instrumented functions at -O0
DeltaFile
+29-0llvm/test/CodeGen/AMDGPU/resource-usage-asan-O0.ll
+18-4llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
+47-42 files

LLVM/project ded8e89llvm/test/CodeGen/AMDGPU amdgpu-sw-lower-lds-multi-static-dynamic-indirect-access-asan.ll amdgpu-sw-lower-lds-static-dynamic-indirect-access-asan.ll

[AMDGPU] Use ASan callback functions instead of inline checks in SW lower LDS pass
DeltaFile
+31-157llvm/test/CodeGen/AMDGPU/amdgpu-sw-lower-lds-multi-static-dynamic-indirect-access-asan.ll
+8-119llvm/test/CodeGen/AMDGPU/amdgpu-sw-lower-lds-static-dynamic-indirect-access-asan.ll
+6-117llvm/test/CodeGen/AMDGPU/amdgpu-sw-lower-lds-dynamic-indirect-access-asan.ll
+3-118llvm/test/CodeGen/AMDGPU/amdgpu-sw-lower-lds-static-lds-test-atomicrmw-asan.ll
+7-98llvm/test/CodeGen/AMDGPU/amdgpu-sw-lower-lds-static-indirect-access-asan.ll
+4-89llvm/test/CodeGen/AMDGPU/amdgpu-sw-lower-lds-static-dynamic-lds-test-asan.ll
+59-6987 files not shown
+121-96913 files

LLVM/project 64a0bd1llvm/lib/Transforms/Vectorize LoopVectorize.cpp LoopVectorizationPlanner.h

[LV] Return best VPlan together with VF from computeBestVF (NFC). (#190385)

computeBestVF iterates over all VPlans and picks the VF of the most
profitable VPlan. This VPlan is later needed for execution and
additional checks. Instead of retrieving it multiple times later, just
directly return it from computeBestVF.

This removes some redundant lookups.

PR: https://github.com/llvm/llvm-project/pull/190385
DeltaFile
+33-29llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+8-6llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
+41-352 files

LLVM/project 4cce6f8llvm/test/tools/llvm-ir2vec/bindings ir2vec-initEmbedding.py ir2vec-getInstEmbMap.py, llvm/tools/llvm-ir2vec/Bindings PyIR2Vec.cpp

[llvm-ir2vec] Added Enum for ir2vec embedding mode (#190466)

Currently, initEmbedding() takes the embedding mode as a string. This PR
changes it to take an enum value instead.
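
The general shape of such a change can be sketched in plain Python. Nothing below is the actual binding API: `EmbeddingMode`, its members, their string values, and both `init_embedding_*` functions are hypothetical names invented for illustration.

```python
from enum import Enum


class EmbeddingMode(Enum):
    # Hypothetical members; the real enum's names may differ.
    SYMBOLIC = "sym"
    FLOW_AWARE = "fa"


def init_embedding_str(mode):
    """String-based variant: a typo is only caught by a runtime lookup."""
    valid = {m.value for m in EmbeddingMode}
    if mode not in valid:
        raise ValueError(f"unknown mode {mode!r}")
    return mode


def init_embedding_enum(mode):
    """Enum-based variant: callers must pass a member of EmbeddingMode."""
    if not isinstance(mode, EmbeddingMode):
        raise TypeError("mode must be an EmbeddingMode")
    return mode.value


print(init_embedding_enum(EmbeddingMode.SYMBOLIC))
```

With an enum, the set of valid modes is discoverable from the type itself, and a misspelled member name fails at attribute lookup rather than deep inside the call.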
DeltaFile
+19-7llvm/test/tools/llvm-ir2vec/bindings/ir2vec-initEmbedding.py
+12-12llvm/tools/llvm-ir2vec/Bindings/PyIR2Vec.cpp
+3-1llvm/test/tools/llvm-ir2vec/bindings/ir2vec-getInstEmbMap.py
+3-1llvm/test/tools/llvm-ir2vec/bindings/ir2vec-getFuncNames.py
+3-1llvm/test/tools/llvm-ir2vec/bindings/ir2vec-getFuncEmbMap.py
+3-1llvm/test/tools/llvm-ir2vec/bindings/ir2vec-getFuncEmb.py
+43-231 files not shown
+46-247 files