LLVM/project 30fa415llvm/lib/Target/X86 X86FastISel.cpp, llvm/test/CodeGen/X86 fast-isel-struct-ret.ll bf16-fast-isel.ll

[X86][FastISel] Restore support for struct returns (#194586)

After #180322, X86 FastISel forces SDAG fallback for any call with a
struct return. This caused major compile-time regressions for debug
builds in Rust, where struct returns are very common.

The type legality check should work on the de-aggregated types, not on
the return type directly.
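
A toy model of the difference between checking the return type directly and checking its de-aggregated parts. This is only a sketch, not the X86FastISel code: `ToyType`, `isLegalScalar`, `isLegalDirect` and `allLeavesLegal` are hypothetical names, and which scalars count as "legal" (including treating `BF16` as unsupported) is an assumption made for the example.

```cpp
#include <vector>

// Toy type model (hypothetical; this is not LLVM's Type hierarchy).
struct ToyType {
  enum Kind { I32, I64, F64, BF16, Struct } kind;
  std::vector<ToyType> fields; // only meaningful when kind == Struct
};

// Assumption for the example: only these scalars are handled by the fast path.
bool isLegalScalar(const ToyType &t) {
  return t.kind == ToyType::I32 || t.kind == ToyType::I64 ||
         t.kind == ToyType::F64;
}

// Checking the aggregate itself rejects every struct, legal members or not.
bool isLegalDirect(const ToyType &t) { return isLegalScalar(t); }

// Checking the flattened leaves accepts a struct whose members are all legal.
bool allLeavesLegal(const ToyType &t) {
  if (t.kind != ToyType::Struct)
    return isLegalScalar(t);
  for (const ToyType &f : t.fields)
    if (!allLeavesLegal(f))
      return false;
  return true;
}
```

With this model, a struct of `{I32, F64}` fails `isLegalDirect` but passes `allLeavesLegal`, while a struct containing a `BF16` member fails both.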
DeltaFile
+58-0llvm/test/CodeGen/X86/fast-isel-struct-ret.ll
+30-0llvm/test/CodeGen/X86/bf16-fast-isel.ll
+16-11llvm/lib/Target/X86/X86FastISel.cpp
+104-113 files

LLVM/project a39ba6elld/COFF InputFiles.cpp Driver.cpp

[LLD][COFF] Move Archive::create call to LinkerDriver::addBuffer (NFC) (#194346)

This allows an upcoming change to Archive::create() to make decisions
based on the archive type.
DeltaFile
+5-5lld/COFF/InputFiles.cpp
+5-4lld/COFF/Driver.cpp
+2-1lld/COFF/InputFiles.h
+12-103 files

LLVM/project 2696eb3lldb/include/lldb/Target Process.h, lldb/source/Plugins/Process/MacOSX-Kernel ProcessKDP.cpp

fixup! replace pointer overload with references
DeltaFile
+6-12lldb/source/Target/Process.cpp
+6-6lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
+3-3lldb/source/Plugins/Process/Utility/StopInfoMachException.cpp
+2-2lldb/source/Plugins/Process/MacOSX-Kernel/ProcessKDP.cpp
+1-2lldb/include/lldb/Target/Process.h
+1-1lldb/source/Plugins/Process/scripted/ScriptedProcess.cpp
+19-262 files not shown
+21-288 files

LLVM/project 4d33c69mlir/lib/ExecutionEngine CudaRuntimeWrappers.cpp RocmRuntimeWrappers.cpp, mlir/lib/Target/LLVMIR/Dialect/GPU SelectObjectAttr.cpp

[MLIR][GPU] Add cooperative launch support to gpu.launch_func (#190639)

Add a `cooperative` UnitAttr to `gpu.launch_func` that enables
cooperative kernel launch semantics. Cooperative launches guarantee that
all thread blocks in the grid are co-resident on the GPU simultaneously,
enabling grid-wide synchronization patterns.

## Implementation

When `cooperative` is set (with or without cluster sizes), the lowering
emits a call to the new `mgpuLaunchKernelCooperative` runtime function,
which uses `cuLaunchKernelEx` with a `CUlaunchConfig` and
`CU_LAUNCH_ATTRIBUTE_COOPERATIVE`. This API is guarded behind
`CUDA_VERSION >= 12000`. The HIP path funnels through
`hipModuleLaunchCooperativeKernel`.

## Changes

- **GPUOps.td**: add `cooperative` UnitAttr and assembly format keyword

    [17 lines not shown]
DeltaFile
+67-0mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp
+45-1mlir/test/Target/LLVMIR/gpu.mlir
+33-2mlir/lib/Target/LLVMIR/Dialect/GPU/SelectObjectAttr.cpp
+20-0mlir/lib/ExecutionEngine/RocmRuntimeWrappers.cpp
+18-0mlir/test/Dialect/GPU/ops.mlir
+14-0mlir/test/Dialect/GPU/outlining.mlir
+197-34 files not shown
+219-510 files

LLVM/project 0541a00lldb/include/lldb/Breakpoint BreakpointSite.h, lldb/include/lldb/Target Process.h

[lldb][NFC] Move BreakpointSite::IsEnabled/SetEnabled into Process

The Process class is the one responsible for managing the state of a
BreakpointSite inside the process. As such, it should be the one
answering questions about the state of the site.

Future patches will make this even more important by introducing a
"logical" enabled state, which delays the moment at which breakpoints
are actually updated in the process.

The following PRs are related to the MultiBreakpoint feature:

* https://github.com/llvm/llvm-project/pull/192910
* https://github.com/llvm/llvm-project/pull/192914
* https://github.com/llvm/llvm-project/pull/192915
* https://github.com/llvm/llvm-project/pull/192919
* https://github.com/llvm/llvm-project/pull/192962
* https://github.com/llvm/llvm-project/pull/192964
* https://github.com/llvm/llvm-project/pull/192971

    [3 lines not shown]
DeltaFile
+23-29lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
+16-7lldb/source/Target/Process.cpp
+6-14lldb/include/lldb/Breakpoint/BreakpointSite.h
+10-0lldb/include/lldb/Target/Process.h
+5-5lldb/source/Plugins/Process/MacOSX-Kernel/ProcessKDP.cpp
+8-0lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h
+68-555 files not shown
+76-6411 files

LLVM/project 1ef7d35lldb/source/Plugins/Process/gdb-remote GDBRemoteCommunicationClient.cpp GDBRemoteCommunicationClient.h, lldb/unittests/Process/gdb-remote GDBRemoteCommunicationClientTest.cpp

[lldb][GDBRemote] Parse MultiBreakpoint+ capability (#192962)

The following PRs are related to the MultiBreakpoint feature:

* https://github.com/llvm/llvm-project/pull/192910
* https://github.com/llvm/llvm-project/pull/192914
* https://github.com/llvm/llvm-project/pull/192915
* https://github.com/llvm/llvm-project/pull/192919
* https://github.com/llvm/llvm-project/pull/192962
* https://github.com/llvm/llvm-project/pull/192964
* https://github.com/llvm/llvm-project/pull/192971
* https://github.com/llvm/llvm-project/pull/192988
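
For readers unfamiliar with how a `name+` capability is recognized, here is a minimal sketch of scanning a `;`-separated feature list of the kind used in gdb-remote `qSupported` replies. The function name `replySupports` and the sample reply string are hypothetical, and that the capability arrives in a `qSupported` reply is an assumption; this is not the code added to `GDBRemoteCommunicationClient`. It needs C++17 for `std::string_view`.

```cpp
#include <cstddef>
#include <string_view>

// Returns true if `reply` (a ';'-separated feature list) contains exactly
// the item "<name>+". Items such as "<name>-" or "<name>=value" do not match.
bool replySupports(std::string_view reply, std::string_view name) {
  while (!reply.empty()) {
    std::size_t end = reply.find(';');
    std::string_view item = reply.substr(0, end); // npos means "to the end"
    if (item.size() == name.size() + 1 &&
        item.substr(0, name.size()) == name && item.back() == '+')
      return true;
    if (end == std::string_view::npos)
      break;
    reply.remove_prefix(end + 1);
  }
  return false;
}
```

For example, `replySupports("PacketSize=1000;MultiBreakpoint+", "MultiBreakpoint")` is true, while a reply containing `MultiBreakpoint-` is not.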
DeltaFile
+22-0lldb/unittests/Process/gdb-remote/GDBRemoteCommunicationClientTest.cpp
+10-0lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationClient.cpp
+3-0lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationClient.h
+35-03 files

LLVM/project 284bf51clang/lib/StaticAnalyzer/Checkers/UninitializedObject UninitializedObjectChecker.cpp, clang/test/Analysis cxx-uninitialized-object.cpp cxx-uninitialized-object-ptr-ref.cpp

[clang][analyzer] Add support for detecting uninitialized dynamically-allocated objects

Adapt the allocated region into a `TypedValueRegion` by retrieving its
type and wrapping it in an `ElementRegion`.

The `willObjectBeAnalyzedLater` function must therefore fall back on
using `SubRegion`s.

CPP-7677
DeltaFile
+81-66clang/test/Analysis/cxx-uninitialized-object.cpp
+59-47clang/test/Analysis/cxx-uninitialized-object-ptr-ref.cpp
+44-34clang/test/Analysis/cxx-uninitialized-object-inheritance.cpp
+26-22clang/test/Analysis/cxx-uninitialized-object-unguarded-access.cpp
+28-11clang/lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp
+14-5clang/test/Analysis/cxx-uninitialized-object-unionlike-constructs.cpp
+252-1853 files not shown
+283-1919 files

LLVM/project 3102d40mlir/lib/Conversion/MemRefToLLVM MemRefToLLVM.cpp, mlir/test/Conversion/MemRefToLLVM memref-to-llvm.mlir

[mlir][MemRefToLLVM] Support floating-point types in GenericAtomicRMWOp lowering (#194300)

`llvm.cmpxchg` only accepts integer or pointer operands. When the memref
element type is floating-point (e.g. `f16`), bitcast values to a
same-width integer for the CAS and bitcast the newly loaded result back to
the original float type.
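
The same bitcast-around-CAS pattern can be sketched in plain C++. This is an analogy under stated assumptions, not the MLIR lowering: it uses 32-bit `float` instead of `f16` (C++ has no portable half type), and `floatToBits`, `bitsToFloat` and `atomicRMWFloat` are hypothetical helper names.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

// Reinterpret the bits of a float as a same-width integer, and back.
static std::uint32_t floatToBits(float f) {
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof u);
  return u;
}
static float bitsToFloat(std::uint32_t u) {
  float f;
  std::memcpy(&f, &u, sizeof f);
  return f;
}

// Atomic read-modify-write on a float stored as integer bits.
// The compare-exchange only ever sees integers; the user-supplied update
// function only ever sees floats. Returns the value held before the update.
template <typename Fn>
float atomicRMWFloat(std::atomic<std::uint32_t> &cell, Fn update) {
  std::uint32_t oldBits = cell.load();
  for (;;) {
    float oldVal = bitsToFloat(oldBits);
    std::uint32_t newBits = floatToBits(update(oldVal));
    // On failure, compare_exchange_weak reloads oldBits with the current
    // contents, so the next iteration recomputes from the fresh value.
    if (cell.compare_exchange_weak(oldBits, newBits))
      return oldVal;
  }
}
```

A call such as `atomicRMWFloat(cell, [](float v) { return v + 2.0f; })` adds 2 to the stored value and returns the previous one.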
DeltaFile
+28-5mlir/lib/Conversion/MemRefToLLVM/MemRefToLLVM.cpp
+31-0mlir/test/Conversion/MemRefToLLVM/memref-to-llvm.mlir
+59-52 files

LLVM/project 224a386clang/lib/AST/ByteCode Compiler.cpp Compiler.h

[clang][bytecode] Add visitInitializerPop() helper (#194802)

To pop the pointer as part of the finishInit op
DeltaFile
+20-35clang/lib/AST/ByteCode/Compiler.cpp
+2-0clang/lib/AST/ByteCode/Compiler.h
+22-352 files

LLVM/project 6e92a11llvm/test/CodeGen/LoongArch/lasx rotl-rotr.ll, llvm/test/CodeGen/LoongArch/lsx rotl-rotr.ll

update tests
DeltaFile
+2-6llvm/test/CodeGen/LoongArch/lasx/rotl-rotr.ll
+2-4llvm/test/CodeGen/LoongArch/lsx/rotl-rotr.ll
+4-102 files

LLVM/project d067b83llvm/lib/Target/SPIRV SPIRVBuiltins.td SPIRVBuiltins.cpp, llvm/test/CodeGen/SPIRV/transcoding/OpenCL atomic_fetch_min_max.ll

[SPIRV] Add missing OpenCL atomic_fetch_min/max builtin mappings (#190443)

## Summary

The SPIR-V backend maps OpenCL `atomic_fetch_add`/`sub`/`or`/`xor`/`and`
(and their `_explicit` variants) to SPIR-V atomic opcodes, but was
missing support for `atomic_fetch_min`/`atomic_fetch_max`, their
`_explicit` variants, and the legacy `atom_min`/`atom_max` builtins.
This caused OpenCL programs using these atomics to emit unresolved
function calls instead of the correct
`OpAtomicSMin`/`OpAtomicSMax`/`OpAtomicUMin`/`OpAtomicUMax`
instructions.

### Approach

Unlike add/sub/or/xor/and (which are sign-agnostic), min/max require
distinct signed vs unsigned SPIR-V opcodes. Rather than inspecting the
`OpTypeInt` signedness bit at runtime (which is always 0 in this
backend), this patch uses the existing prefix-based builtin lookup

    [17 lines not shown]
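
Why min/max need separate signed and unsigned opcodes while add does not can be shown on raw 32-bit patterns. This sketch only illustrates that point; it says nothing about the elided part of the commit message, and `addBits`, `minUnsigned` and `minSigned` are hypothetical names.

```cpp
#include <algorithm>
#include <cstdint>

// Add is sign-agnostic in two's complement: the result bits are the same
// whether the operands are read as signed or unsigned.
std::uint32_t addBits(std::uint32_t a, std::uint32_t b) { return a + b; }

// Min depends on how the bits are interpreted.
std::uint32_t minUnsigned(std::uint32_t a, std::uint32_t b) {
  return std::min(a, b);
}

std::uint32_t minSigned(std::uint32_t a, std::uint32_t b) {
  // The unsigned-to-signed conversion is modular in C++20 and behaves the
  // same way on mainstream two's-complement compilers before that.
  auto sa = static_cast<std::int32_t>(a);
  auto sb = static_cast<std::int32_t>(b);
  return static_cast<std::uint32_t>(std::min(sa, sb));
}
```

For the operands `0xFFFFFFFF` and `1`, the unsigned minimum is `1`, but the signed minimum is `0xFFFFFFFF` (that is, -1).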
DeltaFile
+164-0llvm/test/CodeGen/SPIRV/transcoding/OpenCL/atomic_fetch_min_max.ll
+12-0llvm/lib/Target/SPIRV/SPIRVBuiltins.td
+4-0llvm/lib/Target/SPIRV/SPIRVBuiltins.cpp
+180-03 files

LLVM/project 9d63b2ellvm/lib/Target/LoongArch LoongArchISelLowering.cpp

[LoongArch] Legalize BUILD_VECTOR into a broadcast when all non-undef elements are identical

When all non-undef elements of a BUILD_VECTOR are identical,
it is better to emit a broadcast than multiple insertions.

Some floating-point cases suffer performance regressions, so they
are excluded in this commit:

- only one element is non-undef,
- only two elements are non-undef, and one of them is at index 0,
- for the v8f32 vector type, the only two non-undef elements are at
indices (1,2), (1,3), or (2,3).
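
The core condition, "same element, ignoring undefs", can be sketched as follows. This is a simplified illustration and not the LoongArchISelLowering code: lanes are modeled as `std::optional<int>` with `std::nullopt` standing for undef, `getSplatIgnoringUndef` is a hypothetical name, and the floating-point exclusions listed above are not modeled. It needs C++17.

```cpp
#include <optional>
#include <vector>

// A lane is either a concrete value or undef (std::nullopt).
using Lane = std::optional<int>;

// Returns the common value if every defined lane holds the same value.
// Returns std::nullopt if two defined lanes differ, or if all lanes are undef.
std::optional<int> getSplatIgnoringUndef(const std::vector<Lane> &lanes) {
  std::optional<int> splat;
  for (const Lane &l : lanes) {
    if (!l)
      continue; // undef lanes place no constraint on the broadcast value
    if (splat && *splat != *l)
      return std::nullopt; // two different defined values: not a splat
    splat = *l;
  }
  return splat;
}
```

A vector like `{7, undef, 7, 7}` yields 7 and could be emitted as one broadcast, while `{7, 8, undef, 7}` yields no splat value.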
DeltaFile
+31-5llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+31-51 files

LLVM/project f5c1cc3llvm/test/CodeGen/LoongArch/lasx build-vector.ll scalar-to-vector.ll, llvm/test/CodeGen/LoongArch/lasx/ir-instruction insertelement.ll

update tests
DeltaFile
+10-40llvm/test/CodeGen/LoongArch/lasx/build-vector.ll
+7-19llvm/test/CodeGen/LoongArch/lsx/build-vector.ll
+4-6llvm/test/CodeGen/LoongArch/lasx/ir-instruction/insertelement.ll
+4-4llvm/test/CodeGen/LoongArch/lasx/scalar-to-vector.ll
+4-4llvm/test/CodeGen/LoongArch/lsx/scalar-to-vector.ll
+29-735 files

LLVM/project 736024cllvm/test/CodeGen/LoongArch/lasx build-vector.ll, llvm/test/CodeGen/LoongArch/lsx build-vector.ll

[LoongArch][NFC] Add tests for build_vector containing same elements except for undefs
DeltaFile
+231-18llvm/test/CodeGen/LoongArch/lasx/build-vector.ll
+149-18llvm/test/CodeGen/LoongArch/lsx/build-vector.ll
+380-362 files

LLVM/project 3713325llvm/lib/Target/LoongArch LoongArchISelLowering.cpp, llvm/test/CodeGen/LoongArch/lasx/ir-instruction shuffle-broadcast.ll

[LoongArch] Legalize broadcasting the first element of 256-bit vector using `xvreplve0`
DeltaFile
+19-0llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+6-10llvm/test/CodeGen/LoongArch/lasx/ir-instruction/shuffle-broadcast.ll
+25-102 files

LLVM/project eeef5d5llvm/test/CodeGen/LoongArch/lasx/ir-instruction shuffle-broadcast.ll

[LoongArch][NFC] Add tests for 256-bit vector broadcast
DeltaFile
+179-0llvm/test/CodeGen/LoongArch/lasx/ir-instruction/shuffle-broadcast.ll
+179-01 files

LLVM/project 53b4c84llvm/test/CodeGen/X86 ucmp.ll pr45563-2.ll

[X86] Attempt to fold extract_vector_elt(logicop(x,y),i) -> extract_vector_elt(x,i) (#194581)

When extracting from logicops, we often don't need to extract the result
if one of the element sources is identity (and(x,-1) -> x, or/xor(x,0)
-> x etc.), so this patch uses SimplifyMultipleUseDemandedVectorElts to
peek through to an underlying build_vector.

I had hoped to make this generic, but there's still a lot of yak shaving
to deal with first, as usual - I've included the minimal x86-specific
fixes:
 * missing constant folding of (vXi1 logicop(bitcast(c1),bitcast(c2)))
 * fold kshiftr(concat_vectors(x,y,z,w),c) -> concat_vectors(z,w,0,0)

Fixes #193700
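
The identity cases mentioned above can be checked lane by lane. The sketch below models a four-lane vector of 32-bit values and a constant second operand; `Vec4`, `LogicOp`, `laneIsIdentity` and `applyLane` are hypothetical names, and this is not how `SimplifyMultipleUseDemandedVectorElts` is implemented.

```cpp
#include <array>
#include <cstdint>

using Vec4 = std::array<std::uint32_t, 4>;
enum class LogicOp { And, Or, Xor };

// True if lane `i` of the constant operand `c` is the identity for `op`,
// i.e. extracting lane i of op(x, c) gives the same value as extracting
// lane i of x directly.
bool laneIsIdentity(LogicOp op, const Vec4 &c, unsigned i) {
  switch (op) {
  case LogicOp::And:
    return c[i] == 0xFFFFFFFFu; // and(x, -1) -> x
  case LogicOp::Or:
  case LogicOp::Xor:
    return c[i] == 0u; // or(x, 0) -> x, xor(x, 0) -> x
  }
  return false;
}

// Reference semantics of the logic op on a single lane.
std::uint32_t applyLane(LogicOp op, std::uint32_t a, std::uint32_t b) {
  switch (op) {
  case LogicOp::And:
    return a & b;
  case LogicOp::Or:
    return a | b;
  case LogicOp::Xor:
    return a ^ b;
  }
  return 0;
}
```

Whenever `laneIsIdentity(op, c, i)` holds, `applyLane(op, x[i], c[i]) == x[i]`, so the extract can read from `x` and skip the logic op for that lane.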
DeltaFile
+861-864llvm/test/CodeGen/X86/ucmp.ll
+107-109llvm/test/CodeGen/X86/pr45563-2.ll
+68-138llvm/test/CodeGen/X86/masked_store.ll
+97-101llvm/test/CodeGen/X86/pr45833.ll
+22-92llvm/test/CodeGen/X86/pr193700.ll
+21-22llvm/test/CodeGen/X86/pr173924.ll
+1,176-1,3263 files not shown
+1,211-1,3319 files

LLVM/project 402d309flang/include/flang/Parser tools.h, flang/lib/Lower PFTBuilder.cpp OpenACC.cpp

[flang][pft] visit original symbol in acc use_device (#194588)

Fix regression after https://github.com/llvm/llvm-project/pull/193689
when a use_device is referring to variables from a host module.

The original symbol needs to be visited in the PFT so that it will be
instantiated, but it is no longer visible from the parse tree and is not
directly connected to the new symbol (this is because variables in
use_device are treated in a special way in order to give them the DEVICE
attribute; other data clauses do not need such handling).

Look into the parent scope for a symbol with the same name and visit it.
DeltaFile
+75-0flang/test/Lower/host_module_variable_instantiation_use_device.f90
+19-0flang/lib/Lower/PFTBuilder.cpp
+2-13flang/lib/Lower/OpenACC.cpp
+5-0flang/lib/Parser/tools.cpp
+1-0flang/include/flang/Parser/tools.h
+102-135 files

LLVM/project df0bec5clang/include/clang/Basic BuiltinsAMDGPU.td, clang/lib/CodeGen/TargetBuiltins AMDGPU.cpp

[AMDGPU] Add builtins for wave reduction intrinsics

Assisted by - Claude-sonnet:4.6
DeltaFile
+189-0clang/test/CodeGenOpenCL/builtins-amdgcn.cl
+18-0clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
+9-0clang/include/clang/Basic/BuiltinsAMDGPU.td
+216-03 files

LLVM/project 9465d29llvm/lib/Target/AMDGPU SIISelLowering.cpp SIInstructions.td, llvm/test/CodeGen/AMDGPU llvm.amdgcn.reduce.xor.ll llvm.amdgcn.reduce.or.ll

[AMDGPU] Support Wave Reduction for true-16 types - 3

Supporting true-16 versions of the reduction intrinsics
Supported Ops: `and`, `or`, `xor`.
Supports only the iterative strategy; DPP is yet
to be supported.
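
One way to picture an iterative wave reduction on 16-bit values is a scalar loop over the active lanes. This is only a conceptual sketch under assumptions: the commit does not spell out the strategy, the 32-lane wave size is chosen for the example, and `waveReduceXor` is a hypothetical name rather than anything in SIISelLowering.cpp.

```cpp
#include <cstdint>

// Iterative reduction over active lanes: find the lowest set bit in the
// execution mask, fold that lane's value into the accumulator, clear the
// bit, and repeat until no active lanes remain.
std::uint16_t waveReduceXor(const std::uint16_t (&lanes)[32],
                            std::uint32_t execMask) {
  std::uint16_t acc = 0; // identity element for xor
  while (execMask != 0) {
    unsigned lane = 0;
    while (((execMask >> lane) & 1u) == 0)
      ++lane; // index of the lowest active lane
    acc ^= lanes[lane];
    execMask &= execMask - 1; // clear the lowest set bit
  }
  return acc;
}
```

Swapping the identity and the combining operator gives the `and` and `or` variants (identity `0xFFFF` for `and`, `0` for `or`).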
DeltaFile
+292-145llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.xor.ll
+251-124llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.or.ll
+251-124llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.and.ll
+20-3llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+4-1llvm/lib/Target/AMDGPU/SIInstructions.td
+818-3975 files

LLVM/project 4be91fdllvm/test/CodeGen/LoongArch expandmemcmp.ll expandmemcmp-optsize.ll

update tests
DeltaFile
+670-297llvm/test/CodeGen/LoongArch/expandmemcmp.ll
+612-155llvm/test/CodeGen/LoongArch/expandmemcmp-optsize.ll
+1,282-4522 files

LLVM/project 2c0b673llvm/lib/Target/LoongArch LoongArchISelLowering.cpp LoongArchTargetTransformInfo.cpp

[LoongArch] Support memcmp expansion for vectors and combine for i128/i256 setcc

This commit enables memcmp expansion for lsx/lasx. As a result,
i128 and i256 loads, which are illegal types on LoongArch, are
generated. Without further handling, they would be split into
legal scalar types.

So this commit also enables a combine for `setcc` that bitcasts
i128/i256 operands to vector types before type legalization, so that
vector instructions are generated.
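
To make the "split into scalars" shape concrete, here is a sketch of an equality-only comparison of 16 bytes written as two 64-bit chunks. This is roughly what a 128-bit compare looks like once it has been broken into legal scalar pieces; with the combine described above, the same comparison can instead be done on a vector register. `equal16` is a hypothetical name and the code is plain C++, not LoongArch codegen output.

```cpp
#include <cstdint>
#include <cstring>

// Equality-only comparison of exactly 16 bytes, expressed as two 64-bit
// loads per side. XOR yields zero for equal chunks; OR-ing the two XOR
// results lets a single test against zero cover all 16 bytes.
bool equal16(const void *a, const void *b) {
  const char *pa = static_cast<const char *>(a);
  const char *pb = static_cast<const char *>(b);
  std::uint64_t a0, a1, b0, b1;
  std::memcpy(&a0, pa, 8);
  std::memcpy(&a1, pa + 8, 8);
  std::memcpy(&b0, pb, 8);
  std::memcpy(&b1, pb + 8, 8);
  return ((a0 ^ b0) | (a1 ^ b1)) == 0;
}
```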

Inspired by x86 and riscv.
DeltaFile
+114-8llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+8-3llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.cpp
+122-112 files

LLVM/project a70898bllvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DependenceAnalysis Banerjee.ll gcd-miv-overflow.ll

[DA] Disable the BanerjeeMIV dependence test (#174733)

The various `findBounds` helpers (e.g. `findBoundsLT`) are suspected to
be incorrect because they do not account for potential integer overflow,
which can lead the dependence analysis to produce incorrect results.
Since these helpers are used by the BanerjeeMIV dependence test, this
patch disables BanerjeeMIV by default to avoid unsafe results and to
make progress toward enabling DA by default. The Banerjee test is
required for our motivating example, so we will work on the correctness
issues and re-enable it after default enablement.

This works around issue #169813.
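
The kind of overflow at issue can be illustrated with a checked multiply. Bound computations of this sort multiply a coefficient by an iteration-range extreme, and the raw 64-bit product can wrap. The sketch below is not the DependenceAnalysis code; `checkedMul` is a hypothetical helper that reports overflow instead of silently wrapping. It needs C++17.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Returns a * b, or std::nullopt if the exact product does not fit in
// int64_t. Overflow is detected with divisions before multiplying, so the
// multiplication itself never overflows.
std::optional<std::int64_t> checkedMul(std::int64_t a, std::int64_t b) {
  constexpr std::int64_t Max = std::numeric_limits<std::int64_t>::max();
  constexpr std::int64_t Min = std::numeric_limits<std::int64_t>::min();
  if (a == 0 || b == 0)
    return 0;
  if (a > 0) {
    if (b > 0) {
      if (a > Max / b) return std::nullopt;
    } else {
      if (b < Min / a) return std::nullopt;
    }
  } else {
    if (b > 0) {
      if (a < Min / b) return std::nullopt;
    } else {
      if (a < Max / b) return std::nullopt;
    }
  }
  return a * b;
}
```

A caller that gets `std::nullopt` would have to treat the bound as unknown rather than use a wrapped value.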
DeltaFile
+11-2llvm/lib/Analysis/DependenceAnalysis.cpp
+3-3llvm/test/Analysis/DependenceAnalysis/Banerjee.ll
+1-1llvm/test/Transforms/LoopInterchange/legality-check.ll
+1-1llvm/test/Transforms/LoopInterchange/interchange-insts-between-indvar.ll
+1-1llvm/test/Transforms/LoopInterchange/inner-indvar-depend-on-outer-indvar.ll
+1-1llvm/test/Analysis/DependenceAnalysis/gcd-miv-overflow.ll
+18-94 files not shown
+22-1310 files

LLVM/project 317a942llvm/lib/Target/AMDGPU SIISelLowering.cpp SIInstructions.td, llvm/test/CodeGen/AMDGPU llvm.amdgcn.reduce.xor.ll llvm.amdgcn.reduce.and.ll

[AMDGPU] Support Wave Reduction for i16 types - 3

Supported Ops: `and`, `or`, `xor`.
Supports only the iterative strategy; DPP is yet
to be supported.
Supports only Fake-16 versions of the lowering.
True-16 support is yet to be added.
DeltaFile
+587-160llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.xor.ll
+487-136llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.and.ll
+487-136llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.or.ll
+24-2llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+3-0llvm/lib/Target/AMDGPU/SIInstructions.td
+1,588-4345 files

LLVM/project bcb38c1llvm/test/CodeGen/LoongArch expandmemcmp.ll expandmemcmp-optsize.ll

[LoongArch][NFC] Add lsx/lasx checks for memcmp expansion tests

Add checks for lsx/lasx and check-prefixes to reduce the duplication.
DeltaFile
+1,801-193llvm/test/CodeGen/LoongArch/expandmemcmp.ll
+1,023-197llvm/test/CodeGen/LoongArch/expandmemcmp-optsize.ll
+2,824-3902 files

LLVM/project 5d172f7llvm/lib/Target/AMDGPU SIISelLowering.cpp SIInstructions.td, llvm/test/CodeGen/AMDGPU llvm.amdgcn.reduce.sub.ll llvm.amdgcn.reduce.add.ll

[AMDGPU] Support Wave Reduction for true-16 types - 2

Supporting true-16 versions of the reduction intrinsics
Supported Ops: `add`, `sub`.
Supports only the iterative strategy; DPP is yet
to be supported.
DeltaFile
+373-185llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.sub.ll
+357-176llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.add.ll
+15-2llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+3-1llvm/lib/Target/AMDGPU/SIInstructions.td
+748-3644 files

LLVM/project dff6f0cllvm/lib/Target/AMDGPU SIISelLowering.cpp SIInstructions.td, llvm/test/CodeGen/AMDGPU llvm.amdgcn.reduce.min.ll llvm.amdgcn.reduce.max.ll

[AMDGPU] Support Wave Reduction for true-16 types - 1

Supporting true-16 versions of the reduction intrinsics
Supported Ops: `min`, `umin`, `max`, `umax`.
Supports only the iterative strategy; DPP is yet
to be supported.
DeltaFile
+255-124llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.min.ll
+255-123llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.max.ll
+255-123llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.umin.ll
+272-101llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.umax.ll
+44-6llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+17-8llvm/lib/Target/AMDGPU/SIInstructions.td
+1,098-4856 files

LLVM/project dddef41llvm/lib/Target/AMDGPU SIISelLowering.cpp SIInstructions.td, llvm/test/CodeGen/AMDGPU llvm.amdgcn.reduce.sub.ll llvm.amdgcn.reduce.add.ll

[AMDGPU] Support Wave Reduction for i16 types - 2

Supported Ops: `add`, `sub`.
Supports only the iterative strategy; DPP is yet
to be supported.
Supports only Fake-16 versions of the lowering.
True-16 support is yet to be added.
DeltaFile
+658-177llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.sub.ll
+637-173llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.add.ll
+31-5llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+2-0llvm/lib/Target/AMDGPU/SIInstructions.td
+1,328-3554 files

LLVM/project 8508e57llvm/lib/Target/AMDGPU SIISelLowering.cpp SIInstructions.td, llvm/test/CodeGen/AMDGPU llvm.amdgcn.reduce.max.ll llvm.amdgcn.reduce.min.ll

[AMDGPU] Support Wave Reduction for i16 types - 1

Supported Ops: `min`, `umin`, `max`, `umax`.
Supports only the iterative strategy; DPP is yet
to be supported.
Supports only Fake-16 versions of the lowering.
True-16 support is yet to be added.

Assisted by - Claude-sonnet:4.6
DeltaFile
+494-136llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.max.ll
+494-136llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.min.ll
+493-136llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.umin.ll
+480-136llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.umax.ll
+96-5llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+5-0llvm/lib/Target/AMDGPU/SIInstructions.td
+2,062-5493 files not shown
+2,069-5499 files

LLVM/project 852a449clang/include/clang/Basic BuiltinsAMDGPU.td, clang/lib/CodeGen/TargetBuiltins AMDGPU.cpp

[AMDGPU] Add builtins for wave reduction intrinsics

Assisted by - Claude-sonnet:4.6
DeltaFile
+189-0clang/test/CodeGenOpenCL/builtins-amdgcn.cl
+18-0clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
+9-0clang/include/clang/Basic/BuiltinsAMDGPU.td
+216-03 files