LLVM/project 0e8db6bclang/lib/CIR/CodeGen CIRGenBuiltinAArch64.cpp, clang/test/CodeGen/AArch64/neon intrinsics.c

clang-format

Created using spr 1.3.7
DeltaFile
+4-3llvm/lib/CAS/MappedFileRegionArena.cpp
+2-2clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp
+2-1llvm/lib/CAS/OnDiskGraphDB.cpp
+2-1clang/test/CodeGen/AArch64/neon/intrinsics.c
+10-74 files

LLVM/project 3d4e02cllvm/tools/llvm-cas-fuzzer cas-fuzzer.cpp DummyCASFuzzer.cpp

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
DeltaFile
+388-0llvm/tools/llvm-cas-fuzzer/cas-fuzzer.cpp
+14-0llvm/tools/llvm-cas-fuzzer/DummyCASFuzzer.cpp
+10-0llvm/tools/llvm-cas-fuzzer/CMakeLists.txt
+412-03 files

LLVM/project 83d1aeallvm/include/llvm/CAS MappedFileRegionArena.h, llvm/lib/CAS OnDiskTrieRawHashMap.cpp MappedFileRegionArena.cpp

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
DeltaFile
+49-2llvm/lib/CAS/OnDiskTrieRawHashMap.cpp
+22-9llvm/lib/CAS/MappedFileRegionArena.cpp
+22-1llvm/lib/CAS/OnDiskGraphDB.cpp
+6-1llvm/lib/CAS/DatabaseFile.cpp
+1-1llvm/include/llvm/CAS/MappedFileRegionArena.h
+100-145 files

LLVM/project 38c53b3clang/lib/CIR/CodeGen CIRGenBuiltinAArch64.cpp, clang/test/CodeGen/AArch64/neon intrinsics.c

[clang][cir][nfc] Fix comments, add missing EOF (#190623)
DeltaFile
+2-2clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp
+2-1clang/test/CodeGen/AArch64/neon/intrinsics.c
+4-32 files

LLVM/project b44d2c9llvm/lib/Target/RISCV RISCVISelLowering.cpp, llvm/test/CodeGen/RISCV/rvv pr189037.ll

[RISCV] Use a vector MemVT when converting store+extractelt into a vector store. (#190107)

This is needed so that `allowsMemoryAccessForAlignment` checks for
unaligned vector memory support instead of unaligned scalar memory
support when called from `RISCVTargetLowering::expandUnalignedVPStore`.

While there, remove the incorrect setting of the truncating store flag
on the vector instruction, and restrict the transform to simple stores
since we don't have tests for volatile or atomic.

Fixes #189037
DeltaFile
+14-0llvm/test/CodeGen/RISCV/rvv/pr189037.ll
+6-4llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+20-42 files

LLVM/project 0d14772llvm/lib/Target/RISCV RISCVInstrInfoP.td, llvm/test/CodeGen/RISCV rv64p.ll rv32p.ll

[RISCV][P-ext] Add isel patterns for macc*.h00/macc*.w00. (#190444)

The RV32 macc*.h00 instructions take the lower half words from rs1 and
rs2, compute the full word product by extending the inputs, and
add to rd. The RV64 macc*.w00 is similar but operates on words
and produces a double word result.
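As a rough illustration of the RV32 semantics described above, the following sketch models the operation in plain C++. The helper names `macc_h00_signed` and `macc_h00_unsigned` are placeholders invented here, and the split into a sign-extending and a zero-extending variant is an assumption about what `macc*` covers; neither is taken from the patch.

```cpp
#include <cstdint>

// Hypothetical model of a sign-extending macc*.h00-style operation on RV32:
// take the low half words of rs1 and rs2, extend them to a full word,
// multiply, and add the product to rd.
constexpr uint32_t macc_h00_signed(uint32_t rd, uint32_t rs1, uint32_t rs2) {
  int32_t a = static_cast<int16_t>(rs1 & 0xFFFFu);
  int32_t b = static_cast<int16_t>(rs2 & 0xFFFFu);
  // The product of two 16-bit values always fits in 32 bits. The final add
  // is done in unsigned arithmetic so that wraparound is well defined.
  return rd + static_cast<uint32_t>(a * b);
}

// Hypothetical zero-extending counterpart.
constexpr uint32_t macc_h00_unsigned(uint32_t rd, uint32_t rs1, uint32_t rs2) {
  return rd + (rs1 & 0xFFFFu) * (rs2 & 0xFFFFu);
}
```

The upper half words of `rs1` and `rs2` are ignored in both helpers, which is why the isel patterns can accept inputs that are already `sext_inreg`/`and` or are known to have enough sign or zero bits.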

I've restricted this to the case where the multiply has a single use.
We don't have a general macc that multiplies the full xlen bits
of rs1 and rs2, so I'm allowing the input to be sext_inreg/and or
have sufficient sign/zero bits according to
ComputeNumSignBits/computeKnownBits.

We should also add mul*.h00/mul*.w00 patterns, but those we should
restrict to at least one input being sext_inreg/and and prefer
regular mul when there are no sext_inreg/and.
DeltaFile
+114-0llvm/test/CodeGen/RISCV/rv64p.ll
+114-0llvm/test/CodeGen/RISCV/rv32p.ll
+14-0llvm/lib/Target/RISCV/RISCVInstrInfoP.td
+242-03 files

LLVM/project 0bef4c7llvm/lib/Target/AMDGPU VOP3Instructions.td, llvm/test/CodeGen/AMDGPU and_or.ll or3.ll

[AMDGPU] Add v2i32 and/or patterns for VOP3 AND_OR and OR3 operations (#188375)

Add ThreeOp_v2i32_Pats pattern class to support v2i32 vector operations
for AND_OR_B32 and OR3_B32 instructions. The new patterns match the
v2i32 and-or or or-or instruction sequence, extract the individual 32-bit
elements from the v2i32 operands, and apply the and_or or or3 VOP3
operations.
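The per-element behaviour described above can be sketched in plain C++. This is only a model of the arithmetic, not the TableGen patterns themselves; the type alias `V2I32` and all function names are invented for this illustration.

```cpp
#include <array>
#include <cstdint>

using V2I32 = std::array<uint32_t, 2>; // stand-in for a v2i32 value

// Scalar three-operand operations: and_or is (a & b) | c, or3 is a | b | c.
constexpr uint32_t and_or_b32(uint32_t a, uint32_t b, uint32_t c) {
  return (a & b) | c;
}
constexpr uint32_t or3_b32(uint32_t a, uint32_t b, uint32_t c) {
  return a | b | c;
}

// Vector forms: split each operand into its 32-bit elements and apply the
// scalar operation to each lane independently.
constexpr V2I32 and_or_v2i32(const V2I32 &a, const V2I32 &b, const V2I32 &c) {
  return {and_or_b32(a[0], b[0], c[0]), and_or_b32(a[1], b[1], c[1])};
}
constexpr V2I32 or3_v2i32(const V2I32 &a, const V2I32 &b, const V2I32 &c) {
  return {or3_b32(a[0], b[0], c[0]), or3_b32(a[1], b[1], c[1])};
}
```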
DeltaFile
+299-0llvm/test/CodeGen/AMDGPU/and_or.ll
+205-0llvm/test/CodeGen/AMDGPU/or3.ll
+20-0llvm/lib/Target/AMDGPU/VOP3Instructions.td
+524-03 files

LLVM/project c067528llvm/include/llvm/TargetParser Triple.h, llvm/lib/TargetParser Triple.cpp

Triple: Add constructor from enum entries

Don't require hardcoding the string names.
DeltaFile
+40-0llvm/unittests/TargetParser/TripleTest.cpp
+7-0llvm/lib/TargetParser/Triple.cpp
+4-0llvm/include/llvm/TargetParser/Triple.h
+51-03 files

LLVM/project 5b33f85llvm/test/CodeGen/AMDGPU amdgpu-attributor-min-agpr-alloc.ll attributor-wwm.ll, llvm/test/CodeGen/AMDGPU/GlobalISel divergence-divergent-i1-phis-no-lane-mask-merging.ll

[AMDGPU] Change isSingleLaneExecution to account for WWM enabling lanes even if there's only one workitem (#188316)

This issue was discovered during some downstream work around Vulkan CTS
tests, specifically
`dEQP-VK.subgroups.arithmetic.compute.subgroupadd_float`.

---------

Co-authored-by: Matt Arsenault <arsenm2 at gmail.com>
DeltaFile
+70-110llvm/test/Transforms/SimpleLoopUnswitch/AMDGPU/nontrivial-unswitch-divergent-target.ll
+39-38llvm/test/CodeGen/AMDGPU/amdgpu-attributor-min-agpr-alloc.ll
+36-33llvm/test/CodeGen/AMDGPU/GlobalISel/divergence-divergent-i1-phis-no-lane-mask-merging.ll
+64-0llvm/test/CodeGen/AMDGPU/attributor-wwm.ll
+21-21llvm/test/CodeGen/AMDGPU/propagate-waves-per-eu.ll
+19-19llvm/test/CodeGen/AMDGPU/annotate-kernel-features-hsa-call.ll
+249-22136 files not shown
+400-34142 files

LLVM/project e7ac60cutils/bazel/llvm-project-overlay/mlir BUILD.bazel

[Bazel] Fixes ce1a9fd (#190577)

This fixes ce1a9fd76640929fe340c5c5d1bb493ea09ca9bc.

Co-authored-by: Google Bazel Bot <google-bazel-bot at google.com>
DeltaFile
+2-0utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
+2-01 files

LLVM/project baa1e50flang/lib/Semantics check-cuda.cpp, flang/test/Semantics cuf25.cuf

[flang][cuda] Do not consider kernel result as host variable (#190626)
DeltaFile
+8-0flang/test/Semantics/cuf25.cuf
+2-2flang/lib/Semantics/check-cuda.cpp
+10-22 files

LLVM/project 9265f92mlir/include/mlir/Dialect/LLVMIR LLVMDialect.td, mlir/lib/Target/LLVMIR AttrKindDetail.h

[mlir][ABI] Add writable, dead_on_unwind, dead_on_return, nofpclass param attrs to LLVM dialect (#188374)

The MLIR LLVM dialect is missing support for several parameter
attributes that exist in LLVM IR: `writable`, `dead_on_unwind`,
`dead_on_return`, and `nofpclass`. This adds them to the kind-to-name
mapping in `AttrKindDetail.h` and the corresponding name accessors in
`LLVMDialect.td`.

The existing generic conversion infrastructure in `ModuleTranslation`
and `ModuleImport` picks them up automatically: `writable` and
`dead_on_unwind` round-trip as `UnitAttr`, while `dead_on_return` and
`nofpclass` round-trip as `IntegerAttr`.

CIR needs these to match classic codegen's ABI output (sret gets
`writable

    [2 lines not shown]
DeltaFile
+35-0mlir/test/Target/LLVMIR/llvmir.mlir
+15-3mlir/test/Target/LLVMIR/Import/function-attributes.ll
+7-0mlir/lib/Target/LLVMIR/AttrKindDetail.h
+4-0mlir/include/mlir/Dialect/LLVMIR/LLVMDialect.td
+61-34 files

LLVM/project 348295aclang/include/clang/CIR/Dialect/IR CIROps.td, clang/lib/CIR/CodeGen CIRGenExprAggregate.cpp

[CIR] Use data size in emitAggregateCopy for overlapping copies (#186702)

Add skip_tail_padding property to cir.copy to handle
potentially-overlapping subobject copies directly, instead of falling
back to cir.libc.memcpy. When set, the lowering uses the record's data
size (excluding tail padding) for the memcpy length. This keeps typed
semantics and promotability of cir.copy.

Also fix CXXABILowering to preserve op properties when recreating
operations, and expose RecordType::computeStructDataSize() for computing
data size of padded record types.
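The difference between copying the full size and copying only the data size can be sketched on raw bytes. This is a simplified model, not the CIR lowering: `RecordLayout` and `copyRecord` are hypothetical names, and the sizes are supplied by hand rather than computed from a real record type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical layout summary: Size includes tail padding, DataSize does not.
struct RecordLayout {
  std::size_t Size;
  std::size_t DataSize;
};

// Copy a record. With SkipTailPadding set, only DataSize bytes are written,
// so anything that another subobject stores in the destination's tail
// padding is left untouched.
inline void copyRecord(unsigned char *Dst, const unsigned char *Src,
                       RecordLayout Layout, bool SkipTailPadding) {
  assert(Layout.DataSize <= Layout.Size);
  std::memcpy(Dst, Src, SkipTailPadding ? Layout.DataSize : Layout.Size);
}
```

When the destination is a potentially-overlapping subobject, a full-size copy could clobber a neighbouring member that reuses the tail padding, which is the case the data-size copy avoids.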
DeltaFile
+73-0clang/test/CIR/CodeGen/aggregate-copy-overlap.cpp
+21-5clang/include/clang/CIR/Dialect/IR/CIROps.td
+23-0clang/lib/CIR/Dialect/IR/CIRTypes.cpp
+11-6clang/lib/CIR/CodeGen/CIRGenExprAggregate.cpp
+11-0clang/test/CIR/IR/invalid-copy.cir
+6-0clang/test/CIR/IR/copy.cir
+145-117 files not shown
+164-2013 files

LLVM/project 930ef77mlir/include/mlir/Dialect/AMDGPU/IR AMDGPUOps.td, mlir/lib/Conversion/AMDGPUToROCDL AMDGPUToROCDL.cpp

[mlir][amdgpu] Add optional write mask to amdgpu.global_load_async_to_lds (#190498)
DeltaFile
+17-0mlir/test/Conversion/AMDGPUToROCDL/gfx1250.mlir
+13-0mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp
+5-2mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPUOps.td
+5-1mlir/test/Dialect/AMDGPU/ops.mlir
+40-34 files

LLVM/project 06e666allvm/test/Analysis/DependenceAnalysis banerjee-overflow.ll

[DA] Add overflow test for BanerjeeMIVtest (#190468)
DeltaFile
+75-0llvm/test/Analysis/DependenceAnalysis/banerjee-overflow.ll
+75-01 files

LLVM/project 8d7823eclang/lib/CIR/CodeGen CIRGenBuiltinAArch64.cpp, clang/test/CodeGen/AArch64 neon-intrinsics.c

[CIR][AArch64] Added vector intrinsics for shift left (#187516)

Added vector intrinsics for 
vshlq_n_s8
vshlq_n_s16
vshlq_n_s32
vshlq_n_s64
vshlq_n_u8
vshlq_n_u16
vshlq_n_u32
vshlq_n_u64

vshl_n_s8
vshl_n_s16
vshl_n_s32
vshl_n_s64
vshl_n_u8
vshl_n_u16
vshl_n_u32

    [21 lines not shown]
DeltaFile
+231-45clang/test/CodeGen/AArch64/neon/intrinsics.c
+0-184clang/test/CodeGen/AArch64/neon-intrinsics.c
+31-1clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp
+262-2303 files

LLVM/project 34a1639llvm/include/llvm/Analysis DependenceAnalysis.h, llvm/lib/Analysis DependenceAnalysis.cpp

[DA] Use SmallVector instead of raw new/delete (NFC) (#190586)

Some functions used `new`/`delete` to allocate/free arrays. To avoid
memory leaks, it would be better to avoid using raw pointers. This patch
replaces the use of them with `SmallVector`.
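The kind of change described above can be sketched as a before/after pair. To keep the example self-contained it uses `std::vector` as a stand-in for `SmallVector`, and the function names are invented for illustration; the actual code in `DependenceAnalysis.cpp` is different.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Before: a raw array that must be freed by hand on every exit path.
inline long sumSquaresRaw(std::size_t N) {
  long *Buf = new long[N];
  for (std::size_t I = 0; I < N; ++I)
    Buf[I] = static_cast<long>(I * I);
  long Total = 0;
  for (std::size_t I = 0; I < N; ++I)
    Total += Buf[I];
  delete[] Buf; // Forgetting this, e.g. on an early return, leaks the array.
  return Total;
}

// After: the container owns the storage and releases it automatically.
inline long sumSquaresOwned(std::size_t N) {
  std::vector<long> Buf(N); // stand-in for llvm::SmallVector
  assert(Buf.size() == N);
  for (std::size_t I = 0; I < N; ++I)
    Buf[I] = static_cast<long>(I * I);
  long Total = 0;
  for (long V : Buf)
    Total += V;
  return Total; // No explicit cleanup needed.
}
```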
DeltaFile
+33-27llvm/lib/Analysis/DependenceAnalysis.cpp
+18-16llvm/include/llvm/Analysis/DependenceAnalysis.h
+51-432 files

LLVM/project 4994a97flang/lib/Semantics openmp-utils.cpp

[flang][OpenMP] Remove namespace qualification from GetUpperName, NFC (#190619)

This applies to flang/lib/Semantics/openmp-utils.cpp, since it contains
`using namespace Fortran::parser::omp`.
DeltaFile
+5-6flang/lib/Semantics/openmp-utils.cpp
+5-61 files

LLVM/project 7ceeb36llvm/lib/Transforms/Vectorize SLPVectorizer.cpp

Address comments

Created using spr 1.3.7
DeltaFile
+3-2llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+3-21 files

LLVM/project 3fee094flang/lib/Semantics openmp-utils.cpp

[flang][OpenMP] Remove namespace qualification from GetUpperName, NFC

This applies to flang/lib/Semantics/openmp-utils.cpp, since it contains
`using namespace Fortran::parser::omp`.
DeltaFile
+5-6flang/lib/Semantics/openmp-utils.cpp
+5-61 files

LLVM/project 66f9001llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/AArch64 extracts-from-scalarizable-vector.ll

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
DeltaFile
+13-96llvm/test/Transforms/SLPVectorizer/X86/bool-mask.ll
+45-29llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+42-1llvm/test/Transforms/SLPVectorizer/X86/identity-match-splat-less-defined.ll
+17-18llvm/test/Transforms/SLPVectorizer/X86/reduced-value-stored.ll
+8-19llvm/test/Transforms/SLPVectorizer/AArch64/extracts-from-scalarizable-vector.ll
+5-16llvm/test/Transforms/SLPVectorizer/X86/inversed-icmp-to-gather.ll
+130-1794 files not shown
+161-20110 files

LLVM/project f72e1ceclang/test/Headers __clang_hip_math.hip, llvm/test/CodeGen/AMDGPU clmul.ll integer-mad-patterns.ll

Merge branch 'main' into users/cabbaken/04-04-_da_add_overflow_test_for_banerjeemivtest
DeltaFile
+3,666-5,073llvm/test/CodeGen/RISCV/rvv/expandload.ll
+4,371-0llvm/test/CodeGen/AMDGPU/clmul.ll
+1,318-117llvm/test/CodeGen/AMDGPU/integer-mad-patterns.ll
+736-647clang/test/Headers/__clang_hip_math.hip
+835-387llvm/test/CodeGen/AMDGPU/fcanonicalize.bf16.ll
+610-305llvm/test/CodeGen/AMDGPU/atomics-system-scope.ll
+11,536-6,529956 files not shown
+35,844-16,704962 files

LLVM/project bf2a97allvm/lib/Target/AMDGPU AMDGPUInstCombineIntrinsic.cpp, llvm/test/Transforms/InstCombine/AMDGPU mbcnt.ll llvm.amdgcn.wave.shuffle.ll

AMDGPU: Add range attribute to mbcnt intrinsic callsites (#189191)

It seems the known bits handling added in
686987a540bc176bceaad43ffe530cb3e88796d5 is insufficient to perform many
range based optimizations. For some reason computeConstantRange doesn't
fall back on KnownBits, and has a separate, less used form which tries
to use computeKnownBits.
DeltaFile
+236-15llvm/test/Transforms/InstCombine/AMDGPU/mbcnt.ll
+22-22llvm/test/Transforms/InstCombine/AMDGPU/llvm.amdgcn.wave.shuffle.ll
+22-2llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
+1-1llvm/test/Transforms/InstCombine/AMDGPU/canonicalize-add-to-gep.ll
+281-404 files

LLVM/project 297a70cclang/lib/CIR/CodeGen CIRGenModule.cpp, clang/test/CIR/CodeGen global-decomp-decls.cpp

[CIR] Implement global decomposition declarations (#190364)

No real challenge to these; it is effectively a copy/paste of the
classic codegen, as it just requires that we properly emit the holding
variable. The rest falls out of our existing handling of variables.
DeltaFile
+114-0clang/test/CIR/CodeGen/global-decomp-decls.cpp
+5-6clang/lib/CIR/CodeGen/CIRGenModule.cpp
+119-62 files

LLVM/project c4281fdllvm/include/llvm/Support KnownFPClass.h, llvm/lib/Support KnownFPClass.cpp

[Support][ValueTracking] Improve KnownFPClass for fadd. Handle infinity signs (#190559)

Improve KnownFPClass reasoning for fadd:

- Refine NaN handling for infinities by checking opposite-sign cases:
  - `-inf` + `+inf` --> `nan`
  - `+inf` + `-inf` --> `nan`
  - `+inf` + `+inf` --> `+inf`
  - `-inf` + `-inf` --> `-inf`
- Introduce `cannotBeOrderedLessEqZero` as pair to
`cannotBeOrderedGreaterEqZero`.
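The opposite-sign rule in the list above can be sketched as a small stand-alone check. `InfKnowledge` and `faddMayYieldNaNFromInfs` are hypothetical names made up for this illustration and are not part of LLVM's `KnownFPClass` API; the sketch models only the infinity cases and ignores NaN inputs.

```cpp
// Hypothetical summary of what is known about one fadd operand.
struct InfKnowledge {
  bool MayBePosInf;
  bool MayBeNegInf;
};

// Adding two infinities yields NaN only when they have opposite signs, so
// the result can be ruled out as NaN (for this cause) unless one operand
// may be +inf while the other may be -inf.
constexpr bool faddMayYieldNaNFromInfs(InfKnowledge L, InfKnowledge R) {
  return (L.MayBePosInf && R.MayBeNegInf) || (L.MayBeNegInf && R.MayBePosInf);
}
```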
DeltaFile
+44-0llvm/test/Transforms/Attributor/nofpclass.ll
+11-0llvm/include/llvm/Support/KnownFPClass.h
+4-3llvm/lib/Support/KnownFPClass.cpp
+1-4llvm/test/Transforms/InstSimplify/known-never-nan.ll
+2-2llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-fadd.ll
+62-95 files

LLVM/project 8519f41llvm/lib/Target/AMDGPU AMDGPUInstCombineIntrinsic.cpp

Update llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
DeltaFile
+1-1llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
+1-11 files

LLVM/project 59e899eclang/lib/AST/ByteCode Interp.h Interp.cpp, clang/test/AST/ByteCode cxx26.cpp

[clang][bytecode] Don't unref constexpr-unknown references (#190177)

If the pointer for a reference is constexpr-unknown, use the pointer
itself instead of dereferencing it. Unfortunately, that means
constexpr-unknown pointers reach a lot more places than before.
DeltaFile
+45-1clang/lib/AST/ByteCode/Interp.h
+29-6clang/lib/AST/ByteCode/Interp.cpp
+20-2clang/test/AST/ByteCode/cxx26.cpp
+4-8clang/test/SemaCXX/constant-expression-p2280r4.cpp
+7-5clang/lib/AST/ByteCode/Compiler.cpp
+10-0clang/lib/AST/ByteCode/Pointer.h
+115-222 files not shown
+127-238 files

LLVM/project 2ccc941llvm/lib/Target/AMDGPU VOP2Instructions.td

[AMDGPU] Mark two instructions as DPMACC (#190391)

It appears these were accidentally missed in #170319.
DeltaFile
+2-2llvm/lib/Target/AMDGPU/VOP2Instructions.td
+2-21 files

LLVM/project 74ad441llvm/test/DebugInfo/Generic debug-info-enum-dwarf2.ll incorrect-variable-debugloc1-dwarf2.ll

Split DWARF v2 tests to exclude 64-bit AIX targets (#189077)

64-bit AIX requires DWARF64 format, which was only introduced in DWARF
v3. DWARF v2 only supports 32-bit DWARF format, making it incompatible
with 64-bit AIX (the compiler throws a fatal error). These changes split
DWARF v2 tests into separate files that exclude 64-bit AIX targets while
still running on 32-bit AIX and other 64-bit platforms where DWARF v2 is
supported.
DeltaFile
+15-0llvm/test/DebugInfo/Generic/debug-info-enum-dwarf2.ll
+10-0llvm/test/DebugInfo/Generic/incorrect-variable-debugloc1-dwarf2.ll
+6-0llvm/test/DebugInfo/Generic/restrict-dwarf2.ll
+2-4llvm/test/DebugInfo/Generic/debug-info-enum.ll
+2-1llvm/test/DebugInfo/Generic/incorrect-variable-debugloc1.ll
+2-1llvm/test/DebugInfo/Generic/restrict.ll
+37-66 files

LLVM/project 4670f59llvm/test/Analysis/DependenceAnalysis banerjee-overflow.ll

update
DeltaFile
+5-4llvm/test/Analysis/DependenceAnalysis/banerjee-overflow.ll
+5-41 files