LLVM/project 9870ef1 clang/include/clang/Basic BuiltinsAMDGPU.def, clang/lib/CodeGen/TargetBuiltins AMDGPU.cpp

Review comments: remove the float overload.
DeltaFile
+48-48 clang/test/CodeGenOpenCL/builtins-amdgcn.cl
+12-8 clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
+4-4 clang/include/clang/Basic/BuiltinsAMDGPU.def
+64-60 3 files

LLVM/project 95b31cf clang/include/clang/Basic BuiltinsAMDGPU.def, clang/lib/CodeGen/TargetBuiltins AMDGPU.cpp

[AMDGPU] Add builtins for wave reduction intrinsics
DeltaFile
+84-0 clang/test/CodeGenOpenCL/builtins-amdgcn.cl
+8-0 clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
+4-0 clang/include/clang/Basic/BuiltinsAMDGPU.def
+96-0 3 files

LLVM/project 131cf7d clang/lib/CodeGen CGExpr.cpp, clang/test/CodeGenCXX alloc-token.cpp

[AllocToken] Enable alloc token instrumentation for size-returning functions (#168840)

Consider a newly added "malloc_span" attribute in the allocation token
instrumentation to ensure that allocation functions with the
"malloc_span" attribute are processed similarly to other memory
allocation functions.

Update the tests to demonstrate applicability to __size_returning_new.
DeltaFile
+8-9 clang/test/CodeGenCXX/alloc-token.cpp
+1-0 clang/lib/CodeGen/CGExpr.cpp
+9-9 2 files

LLVM/project dc343d2 flang/test/Lower/Intrinsics modulo.f90, flang/test/Lower/OpenMP/Todo omp-clause-indirect.f90 omp-declarative-allocate.f90

[NFC][flang] Replace use of flang -fc1 with %flang_fc1 in a few test cases (#168830)

Replace use of flang -fc1 with %flang_fc1 in a few test cases.
DeltaFile
+2-2 flang/test/Semantics/indirect02.f90
+1-1 flang/test/Lower/Intrinsics/modulo.f90
+1-1 flang/test/Lower/OpenMP/Todo/omp-clause-indirect.f90
+1-1 flang/test/Lower/OpenMP/Todo/omp-declarative-allocate.f90
+1-1 flang/test/Lower/OpenMP/Todo/omp-declare-reduction-initsub.f90
+1-1 flang/test/Lower/OpenMP/Todo/omp-declare-reduction.f90
+7-7 8 files not shown
+15-15 14 files

LLVM/project 2ae7caa clang/include/clang/Basic BuiltinsAMDGPU.def, clang/lib/CodeGen/TargetBuiltins AMDGPU.cpp

Review comments: remove the float overload.
DeltaFile
+48-48 clang/test/CodeGenOpenCL/builtins-amdgcn.cl
+12-8 clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
+4-4 clang/include/clang/Basic/BuiltinsAMDGPU.def
+64-60 3 files

LLVM/project f6ed1d8 clang/include/clang/Basic BuiltinsAMDGPU.def, clang/lib/CodeGen/TargetBuiltins AMDGPU.cpp

[AMDGPU] Add builtins for wave reduction intrinsics
DeltaFile
+84-0 clang/test/CodeGenOpenCL/builtins-amdgcn.cl
+8-0 clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
+4-0 clang/include/clang/Basic/BuiltinsAMDGPU.def
+96-0 3 files

LLVM/project bdf598f llvm/lib/Target/ARC ARCISelLowering.cpp, llvm/lib/Target/CSKY CSKYISelLowering.cpp

CodeGen: Add missing subtarget to TargetLoweringBase constructor for ARC, CSKY and M68K (#168811)

Those were missing in https://github.com/llvm/llvm-project/pull/168620.
DeltaFile
+1-1 llvm/lib/Target/CSKY/CSKYISelLowering.cpp
+1-1 llvm/lib/Target/ARC/ARCISelLowering.cpp
+1-1 llvm/lib/Target/M68k/M68kISelLowering.cpp
+3-3 3 files

LLVM/project 07a31ad llvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 merge-consecutive-loads-128.ll merge-consecutive-loads-256.ll

[X86] EltsFromConsecutiveLoads - recognise reverse load patterns. (#168706)

See if we can create a vector load from the src elements in reverse and
then shuffle these back into place.

SLP will (usually) catch this in the middle-end, but there are a few
BUILD_VECTOR scalarizations etc. that appear during DAG legalization.

I did start looking at a more general permute fold, but I haven't found
any good test examples for this yet - happy to take another look if
somebody has examples.
DeltaFile
+48-216 llvm/test/CodeGen/X86/merge-consecutive-loads-128.ll
+75-154 llvm/test/CodeGen/X86/merge-consecutive-loads-256.ll
+14-113 llvm/test/CodeGen/X86/merge-consecutive-loads-512.ll
+8-12 llvm/test/CodeGen/X86/bitcnt-big-integer.ll
+15-2 llvm/lib/Target/X86/X86ISelLowering.cpp
+2-3 llvm/test/CodeGen/X86/build-vector-256.ll
+162-500 1 files not shown
+163-502 7 files
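
The equivalence the X86 commit relies on can be sketched in plain C++. This is only a scalar model, not the DAG code: `Vec4`, `buildReversedScalar` and `buildReversedVector` are hypothetical names, and a 4-lane vector of `int` stands in for the real vector types.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical model of a 4-lane vector.
using Vec4 = std::array<int, 4>;

// Element-by-element build: lane i reads mem[3 - i], so the source elements
// are consecutive in memory but consumed in reverse order.
Vec4 buildReversedScalar(const int *mem) {
  return {mem[3], mem[2], mem[1], mem[0]};
}

// The same value produced by one contiguous load followed by a shuffle that
// reverses the lanes (mask <3,2,1,0>).
Vec4 buildReversedVector(const int *mem) {
  Vec4 v;
  std::copy(mem, mem + 4, v.begin()); // single wide load
  std::reverse(v.begin(), v.end());   // shuffle the lanes back into place
  return v;
}
```

Both functions return the same vector for any four consecutive elements, which is why four scalar loads can be replaced by one vector load plus one shuffle.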

LLVM/project 1a0a6b8 clang/include/clang/Basic BuiltinsAMDGPU.def, clang/lib/CodeGen/TargetBuiltins AMDGPU.cpp

[AMDGPU] Add builtins for wave reduction intrinsics
DeltaFile
+84-0 clang/test/CodeGenOpenCL/builtins-amdgcn.cl
+8-0 clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
+4-0 clang/include/clang/Basic/BuiltinsAMDGPU.def
+96-0 3 files

LLVM/project c5cf1b2 clang/include/clang/Basic BuiltinsAMDGPU.def, clang/lib/CodeGen/TargetBuiltins AMDGPU.cpp

Review comments: remove the float overload.
DeltaFile
+48-48 clang/test/CodeGenOpenCL/builtins-amdgcn.cl
+12-8 clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
+4-4 clang/include/clang/Basic/BuiltinsAMDGPU.def
+64-60 3 files

LLVM/project e44646b llvm/lib/Target/WebAssembly WebAssemblyISelLowering.cpp, llvm/test/CodeGen/WebAssembly simd-arith.ll simd-vecreduce-bool.ll

[WebAssembly] Lower ANY_EXTEND_VECTOR_INREG (#167529)

Treat it in the same manner as zero_extend_vector_inreg and generate an
extend_low_u if possible. This is to try to prevent expensive shuffles
from being generated instead. computeKnownBitsForTargetNode has also
been updated to specify known zeros on extend_low_u.
DeltaFile
+14-22 llvm/test/CodeGen/WebAssembly/simd-arith.ll
+26-1 llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
+2-2 llvm/test/CodeGen/WebAssembly/simd-vecreduce-bool.ll
+42-25 3 files
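
Why an unsigned extend has known zero bits can be seen in a small scalar model. This is a sketch under assumptions: it models only the i8x16 to i16x8 case, and `extendLowU` and `kKnownZeroMask` are hypothetical names, not identifiers from the patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Model of an unsigned "extend low" from 16 x i8 to 8 x i16: the low eight
// input lanes are widened with zero extension.
std::array<std::uint16_t, 8> extendLowU(const std::array<std::uint8_t, 16> &in) {
  std::array<std::uint16_t, 8> out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = static_cast<std::uint16_t>(in[i]); // upper 8 bits become zero
  return out;
}

// Bits that are zero in every result lane, whatever the input was.
constexpr std::uint16_t kKnownZeroMask = 0xFF00;
```

Because an any-extend leaves the upper bits unspecified, a zero extend is always a valid way to implement it, and the mask above is the kind of fact a known-bits query can then report.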

LLVM/project dcab4cb llvm/include/llvm/IR IntrinsicsAMDGPU.td, llvm/lib/Target/AMDGPU SIISelLowering.cpp SIInstructions.td

[AMDGPU] Add wave reduce intrinsics for float types - 2 (#161815)

Supported Ops: `fadd`, `fsub`
DeltaFile
+1,001-0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.sub.ll
+1,001-0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.add.ll
+39-3 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+2-0 llvm/lib/Target/AMDGPU/SIInstructions.td
+2-0 llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
+1-1 llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+2,046-4 6 files

LLVM/project 8fda2cf mlir/include/mlir/Target/LLVMIR ModuleImport.h, mlir/lib/Target/LLVMIR ModuleImport.cpp ConvertFromLLVMIR.cpp

[mlir][llvm] Handle debug record import edge cases

This commit enables the direct import of debug records by default,
fixes issues with two edge cases, and adds support for debug label
records:
- Detect early on if the address operand is an argument list
  (calling getAddress() for argument lists asserts).
- Use getAddress() to check if the address operand is null, which
  means the address operand is an empty metadata node, which is
  currently not supported.
- Add support for debug label records.

This is a follow-up to:
https://github.com/llvm/llvm-project/pull/167812
DeltaFile
+81-47 mlir/lib/Target/LLVMIR/ModuleImport.cpp
+9-5 mlir/include/mlir/Target/LLVMIR/ModuleImport.h
+5-7 mlir/test/Target/LLVMIR/Import/import-failure.ll
+1-1 mlir/lib/Target/LLVMIR/ConvertFromLLVMIR.cpp
+96-60 4 files

LLVM/project 14d480d llvm/include/llvm/IR IntrinsicsAMDGPU.td, llvm/lib/Target/AMDGPU SIISelLowering.cpp SIInstructions.td

Review comments: remove the `.float` suffix and overload.
DeltaFile
+130-96 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.sub.ll
+129-77 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.add.ll
+2-2 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+2-2 llvm/lib/Target/AMDGPU/SIInstructions.td
+2-0 llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
+1-1 llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+266-178 6 files

LLVM/project 396067b llvm/lib/Target/AMDGPU SIISelLowering.cpp SIInstructions.td, llvm/test/CodeGen/AMDGPU llvm.amdgcn.reduce.sub.ll llvm.amdgcn.reduce.add.ll

[AMDGPU] Add wave reduce intrinsics for float types - 2

Supported Ops: `fadd`, `fsub`
DeltaFile
+967-0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.sub.ll
+949-0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.add.ll
+39-3 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+2-0 llvm/lib/Target/AMDGPU/SIInstructions.td
+1,957-3 4 files

LLVM/project dbf4525 llvm/include/llvm/IR IntrinsicsAMDGPU.td, llvm/lib/Target/AMDGPU SIISelLowering.cpp SIInstructions.td

[AMDGPU] Add wave reduce intrinsics for float types - 1 (#161814)

Supported Ops: `fmin`, `fmax`
DeltaFile
+911-0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.min.ll
+911-0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.max.ll
+39-4 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+4-1 llvm/lib/Target/AMDGPU/SIInstructions.td
+2-2 llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+2-0 llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
+1,869-7 6 files
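
As a rough mental model of a wave-level `fmax` reduction, the sketch below folds one value per active lane into a single result. It is a conceptual model only, not the lowering in the patch: `waveReduceMax` is a hypothetical name, the model assumes at least one active lane, and NaN handling is deliberately left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Conceptual model only: one float per lane plus an "active" flag per lane.
// Assumes at least one lane is active and that no input is NaN.
float waveReduceMax(const std::vector<float> &lanes,
                    const std::vector<bool> &active) {
  assert(lanes.size() == active.size());
  bool seen = false;
  float result = 0.0f;
  for (std::size_t i = 0; i < lanes.size(); ++i) {
    if (!active[i])
      continue; // inactive lanes do not contribute
    result = seen ? std::max(result, lanes[i]) : lanes[i];
    seen = true;
  }
  assert(seen && "model requires at least one active lane");
  return result; // the same value would be visible to every active lane
}
```

An `fmin` reduction follows the same shape with `std::min` in place of `std::max`.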

LLVM/project 0ce2f67 llvm/lib/Target/AMDGPU SIISelLowering.cpp

Hardcode quietNaN val.
DeltaFile
+2-4 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+2-4 1 files

LLVM/project e756fc6 llvm/include/llvm/IR IntrinsicsAMDGPU.td, llvm/lib/Target/AMDGPU SIISelLowering.cpp SIInstructions.td

Review comments: remove `.float` suffix
DeltaFile
+90-60 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.min.ll
+90-60 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.max.ll
+2-3 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+2-2 llvm/lib/Target/AMDGPU/SIInstructions.td
+2-0 llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
+1-1 llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+187-126 6 files

LLVM/project b73e88a llvm/include/llvm/IR IntrinsicsAMDGPU.td, llvm/lib/Target/AMDGPU SIISelLowering.cpp SIInstructions.td

[AMDGPU] Add wave reduce intrinsics for float types - 1

Supported Ops: `fmin`, `fmax`
DeltaFile
+881-0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.max.ll
+881-0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.min.ll
+42-4 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+4-1 llvm/lib/Target/AMDGPU/SIInstructions.td
+1-1 llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+1,809-6 5 files

LLVM/project 3e5fafd llvm/lib/Target/RISCV RISCVISelLowering.cpp RISCVInstrInfoP.td, llvm/test/CodeGen/RISCV rvp-ext-rv32.ll rvp-ext-rv64.ll

[RISCV][llvm] Select splat_vector(constant) with PLI (#168204)

The default DAG combiner combines a BUILD_VECTOR with identical elements
into a SPLAT_VECTOR, so we can just map a constant splat to PLI if possible.
DeltaFile
+1-32 llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+27-0 llvm/test/CodeGen/RISCV/rvp-ext-rv32.ll
+11-9 llvm/lib/Target/RISCV/RISCVInstrInfoP.td
+14-0 llvm/test/CodeGen/RISCV/rvp-ext-rv64.ll
+2-0 llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
+55-41 5 files

LLVM/project fde2aad llvm/include/llvm/Support CodeGen.h

[CodeGen] update code generation optimization level(nfc) (#168190)
DeltaFile
+1-1 llvm/include/llvm/Support/CodeGen.h
+1-1 1 files

LLVM/project 11294e7 llvm/lib/Target/AMDGPU SIISelLowering.cpp

Hardcode quietNaN val.
DeltaFile
+2-4 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+2-4 1 files

LLVM/project 1feee7f llvm/include/llvm/IR IntrinsicsAMDGPU.td, llvm/lib/Target/AMDGPU SIISelLowering.cpp SIInstructions.td

[AMDGPU] Add wave reduce intrinsics for float types - 1

Supported Ops: `fmin`, `fmax`
DeltaFile
+881-0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.max.ll
+881-0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.min.ll
+42-4 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+4-1 llvm/lib/Target/AMDGPU/SIInstructions.td
+1-1 llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+1,809-6 5 files

LLVM/project 7b5a092 llvm/include/llvm/IR IntrinsicsAMDGPU.td, llvm/lib/Target/AMDGPU SIISelLowering.cpp SIInstructions.td

Review comments: remove `.float` suffix
DeltaFile
+90-60 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.min.ll
+90-60 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.max.ll
+2-3 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+2-2 llvm/lib/Target/AMDGPU/SIInstructions.td
+2-0 llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
+1-1 llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+187-126 6 files

LLVM/project 8608344 llvm/lib/CodeGen CFIInstrInserter.cpp, llvm/test/CodeGen/RISCV cfi-multiple-locations.mir

[CFIInserter] Turn a reachable llvm_unreachable into a report_fatal_error. (#168777)

This prevents it from being optimized out in non-asserts builds.

Update X86 test to remove REQUIRES: asserts and check for LLVM ERROR.
Add FileCheck to RISC-V test and remove UNSUPPORTED.

This is the more complete fix for #168772 and #168525.
DeltaFile
+4-3 llvm/test/CodeGen/RISCV/cfi-multiple-locations.mir
+1-3 llvm/test/CodeGen/X86/cfi-inserter-verify-inconsistent-loc.mir
+2-1 llvm/lib/CodeGen/CFIInstrInserter.cpp
+7-7 3 files
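
The difference between a check that may vanish and one that always fires can be sketched without LLVM. In builds without assertions, `llvm_unreachable` can become an optimizer hint, so a path that is actually reachable needs an unconditional error. The names below are hypothetical stand-ins: `fatalError` throws so the behaviour can be observed, whereas LLVM's `report_fatal_error` terminates the process, and the message text is invented for illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Stand-in for an always-on fatal error (hypothetical; it throws instead of
// terminating so that a caller can observe it).
[[noreturn]] void fatalError(const std::string &msg) {
  throw std::runtime_error("fatal: " + msg);
}

// Hypothetical consistency check. The comparison is ordinary code, so it is
// compiled in every build mode and cannot be removed as "unreachable".
void checkSameLocation(int expectedReg, int actualReg) {
  if (expectedReg != actualReg)
    fatalError("inconsistent location"); // invented message
}
```

Calling `checkSameLocation(3, 3)` returns normally, while mismatched arguments always raise the error, independent of `NDEBUG`.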

LLVM/project b887217 llvm/lib/Target/AMDGPU SIShrinkInstructions.cpp

[AMDGPU] Make SIShrinkInstructions pass return valid changed state
DeltaFile
+62-34 llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
+62-34 1 files
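
The general contract behind a "changed" return value can be sketched as follows. This is not the SIShrinkInstructions code: `clampPass` is a hypothetical toy transform, used only to show a pass reporting `true` exactly when it rewrote something.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy "pass": clamps every value above `limit` down to `limit`.
// Returns true only if at least one element was actually rewritten.
bool clampPass(std::vector<int> &values, int limit) {
  bool changed = false;
  for (int &v : values) {
    if (v > limit) {
      v = limit;
      changed = true; // record each real modification
    }
  }
  return changed;
}
```

Running the toy pass a second time on its own output returns `false`, which is the property a caller relies on when deciding whether cached analyses are still valid.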

LLVM/project 27b71d1 llvm/include/llvm/Support JSON.h

[Support] add vector::erase to JSON::Array
DeltaFile
+2-0 llvm/include/llvm/Support/JSON.h
+2-0 1 files
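
Forwarding `erase` from a wrapper to its underlying `std::vector` can look like the sketch below. The `Array` class here is a hypothetical stand-in that stores strings, not the real `llvm::json::Array`, and the exact signature added by the commit is not shown in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a JSON array: a thin wrapper over std::vector.
class Array {
  std::vector<std::string> V; // real JSON values are replaced by strings here

public:
  using iterator = std::vector<std::string>::iterator;
  using const_iterator = std::vector<std::string>::const_iterator;

  void push_back(std::string s) { V.push_back(std::move(s)); }
  std::size_t size() const { return V.size(); }
  iterator begin() { return V.begin(); }
  const std::string &operator[](std::size_t i) const { return V[i]; }

  // Forward erase to the underlying vector; like std::vector::erase, this
  // returns the iterator following the removed element.
  iterator erase(const_iterator pos) { return V.erase(pos); }
};
```

With three elements `x`, `y`, `z`, calling `erase(begin() + 1)` leaves `x`, `z` and returns an iterator to `z`.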

LLVM/project eddb8ad llvm/lib/Target/AMDGPU SIISelLowering.cpp

Hardcode quietNaN val.
DeltaFile
+2-4 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+2-4 1 files

LLVM/project 6c51566 llvm/include/llvm/IR IntrinsicsAMDGPU.td, llvm/lib/Target/AMDGPU SIISelLowering.cpp SIInstructions.td

Review comments: remove `.float` suffix
DeltaFile
+90-60 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.min.ll
+90-60 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.max.ll
+2-3 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+2-2 llvm/lib/Target/AMDGPU/SIInstructions.td
+1-1 llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+2-0 llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
+187-126 6 files

LLVM/project af61f4d llvm/include/llvm/IR IntrinsicsAMDGPU.td, llvm/lib/Target/AMDGPU SIISelLowering.cpp SIInstructions.td

[AMDGPU] Add wave reduce intrinsics for float types - 1

Supported Ops: `fmin`, `fmax`
DeltaFile
+881-0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.max.ll
+881-0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.min.ll
+42-4 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+4-1 llvm/lib/Target/AMDGPU/SIInstructions.td
+1-1 llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+1,809-6 5 files