LLVM/project 03347e0 libclc/clc/include/clc/workitem clc_get_max_sub_group_size.h, libclc/clc/lib/amdgcn/workitem clc_get_max_sub_group_size.cl

libclc: Correctly declare __clc_get_max_sub_group_size as taking no arguments (#185265)
Delta File
+1 -1 libclc/clc/lib/amdgcn/workitem/clc_get_max_sub_group_size.cl
+1 -1 libclc/clc/include/clc/workitem/clc_get_max_sub_group_size.h
+2 -2 2 files

LLVM/project 7752b2e libclc/opencl/lib/generic/atomic atomic_fetch_add.cl atomic_fetch_sub.cl

libclc: Add uintptr_t overloads of atomic_fetch_add and sub (#185263)

This is a special case because the pointee type is unsigned, but the
input value is signed. Directly use the opencl builtins, because these
work correctly without any ugly casting required, and it's not worth
putting a wrapper in clc.
Delta File
+28 -0 libclc/opencl/lib/generic/atomic/atomic_fetch_add.cl
+28 -0 libclc/opencl/lib/generic/atomic/atomic_fetch_sub.cl
+56 -0 2 files
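
The signed-operand, unsigned-pointee pairing described in the commit message above can be illustrated with a minimal host-side C11 sketch. This is not the libclc code: the helper names `fetch_add_signed` and `fetch_sub_signed` are hypothetical, and the explicit cast to `uintptr_t` is an assumption of this sketch (it is the kind of conversion the commit says the OpenCL builtins make unnecessary). It relies only on the fact that converting a negative `ptrdiff_t` to `uintptr_t` wraps modulo 2^N, so adding the converted value is the same as subtracting its magnitude.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of libclc): atomically add a signed
   delta to an unsigned pointer-sized counter and return the old value. */
static uintptr_t fetch_add_signed(atomic_uintptr_t *counter, ptrdiff_t delta) {
    assert(counter != NULL);
    /* A negative delta converts to a large unsigned value; unsigned
       addition wraps, so the net effect is a subtraction. */
    return atomic_fetch_add(counter, (uintptr_t)delta);
}

/* Hypothetical helper: the matching subtraction. */
static uintptr_t fetch_sub_signed(atomic_uintptr_t *counter, ptrdiff_t delta) {
    assert(counter != NULL);
    return atomic_fetch_sub(counter, (uintptr_t)delta);
}
```

Starting from a counter of 100, `fetch_add_signed(&c, -4)` returns 100 and leaves 96; `fetch_sub_signed(&c, -4)` then returns 96 and restores 100.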

LLVM/project a2fd44a libclc/opencl/lib/amdgcn/async wait_group_events.cl

libclc: Fix wait_group_events build for targets without generic (#185264)
Delta File
+5 -1 libclc/opencl/lib/amdgcn/async/wait_group_events.cl
+5 -1 1 files

LLVM/project 2e453f5 libclc/opencl/lib/amdgcn/async wait_group_events.cl

libclc: Fix wait_group_events build for targets without generic
Delta File
+5 -1 libclc/opencl/lib/amdgcn/async/wait_group_events.cl
+5 -1 1 files

LLVM/project abb3b98 libclc/clc/include/clc/collective clc_work_group_broadcast.h clc_work_group_broadcast.inc, libclc/clc/lib/generic/collective clc_work_group_broadcast.inc clc_work_group_broadcast.cl

libclc: Add work_group_broadcast (#185261)
Delta File
+51 -0 libclc/clc/lib/generic/collective/clc_work_group_broadcast.inc
+29 -0 libclc/opencl/lib/generic/collective/work_group_broadcast.inc
+23 -0 libclc/clc/lib/generic/collective/clc_work_group_broadcast.cl
+20 -0 libclc/clc/include/clc/collective/clc_work_group_broadcast.h
+16 -0 libclc/clc/include/clc/collective/clc_work_group_broadcast.inc
+15 -0 libclc/opencl/lib/generic/collective/work_group_broadcast.cl
+154 -0 2 files not shown
+156 -0 8 files

LLVM/project 68bb8a0 libclc/clc/include/clc/collective clc_work_group_any_all.h, libclc/clc/lib/generic SOURCES

libclc: Add work_group_any/work_group_all implementation (#185260)
Delta File
+58 -0 libclc/clc/lib/generic/collective/clc_work_group_any_all.cl
+17 -0 libclc/clc/include/clc/collective/clc_work_group_any_all.h
+17 -0 libclc/opencl/lib/generic/collective/work_group_any_all.cl
+1 -0 libclc/clc/lib/generic/SOURCES
+1 -0 libclc/opencl/lib/generic/SOURCES
+94 -0 5 files

LLVM/project 81a9f1e libclc/opencl/lib/generic/atomic atomic_fetch_add.cl atomic_fetch_sub.cl

libclc: Add uintptr_t overloads of atomic_fetch_add and sub

This is a special case because the pointee type is unsigned, but the
input value is signed. Directly use the opencl builtins, because these
work correctly without any ugly casting required, and it's not worth
putting a wrapper in clc.
Delta File
+6 -0 libclc/opencl/lib/generic/atomic/atomic_fetch_add.cl
+6 -0 libclc/opencl/lib/generic/atomic/atomic_fetch_sub.cl
+12 -0 2 files

LLVM/project d568570 clang/include/clang/CIR/Dialect/IR CIROps.td

[CIR] Extract base classes for complex ops

Extract CIR_ComplexPartOp for ComplexRealOp/ComplexImagOp,
CIR_ComplexPartPtrOp for ComplexRealPtrOp/ComplexImagPtrOp,
CIR_ComplexBinOp for ComplexAddOp/ComplexSubOp, and
CIR_ComplexRangeBinOp for ComplexMulOp/ComplexDivOp to
eliminate duplicated arguments, results, format, and traits.
Delta File
+69 -123 clang/include/clang/CIR/Dialect/IR/CIROps.td
+69 -123 1 files

LLVM/project 78c6ebd libclc/clc/include/clc/amdgpu amdgpu_utils.h, libclc/clc/include/clc/subgroup clc_subgroup.h

libclc: Move subgroup functions into clc (#185220)

It turns out there was a generic implementation of the id and sizes.
The practice of splitting every single function into its own file is
kind of a pain here, so introduce a utility header for amdgpu.
Delta File
+0 -60 libclc/opencl/lib/amdgcn/subgroup/subgroup.cl
+41 -0 libclc/opencl/lib/generic/subgroup/subgroup.cl
+28 -0 libclc/clc/lib/amdgcn/subgroup/subgroup.cl
+27 -0 libclc/clc/include/clc/amdgpu/amdgpu_utils.h
+23 -0 libclc/clc/include/clc/subgroup/clc_subgroup.h
+18 -0 libclc/clc/lib/amdgcn/workitem/clc_get_sub_group_size.cl
+137 -60 6 files not shown
+178 -63 12 files

LLVM/project 8ba7837 libclc/clc/include/clc/collective clc_work_group_broadcast.h clc_work_group_broadcast.inc, libclc/clc/lib/generic/collective clc_work_group_broadcast.inc clc_work_group_broadcast.cl

libclc: Add work_group_broadcast
Delta File
+51 -0 libclc/clc/lib/generic/collective/clc_work_group_broadcast.inc
+28 -0 libclc/opencl/lib/generic/collective/work_group_broadcast.inc
+24 -0 libclc/clc/lib/generic/collective/clc_work_group_broadcast.cl
+18 -0 libclc/clc/include/clc/collective/clc_work_group_broadcast.h
+16 -0 libclc/clc/include/clc/collective/clc_work_group_broadcast.inc
+15 -0 libclc/opencl/lib/generic/collective/work_group_broadcast.cl
+152 -0 2 files not shown
+154 -0 8 files

LLVM/project 1a39e70 libclc/clc/include/clc/collective clc_work_group_any_all.h, libclc/clc/lib/generic SOURCES

libclc: Add work_group_any/work_group_all implementation
Delta File
+58 -0 libclc/clc/lib/generic/collective/clc_work_group_any_all.cl
+17 -0 libclc/clc/include/clc/collective/clc_work_group_any_all.h
+17 -0 libclc/opencl/lib/generic/collective/work_group_any_all.cl
+1 -0 libclc/clc/lib/generic/SOURCES
+1 -0 libclc/opencl/lib/generic/SOURCES
+94 -0 5 files

LLVM/project 363d14c libclc/clc/include/clc/amdgpu amdgpu_utils.h, libclc/clc/include/clc/subgroup clc_subgroup.h

libclc: Move subgroup functions into clc

It turns out there was a generic implementation of the id and sizes.
The practice of splitting every single function into its own file is
kind of a pain here, so introduce a utility header for amdgpu.
Delta File
+0 -60 libclc/opencl/lib/amdgcn/subgroup/subgroup.cl
+41 -0 libclc/opencl/lib/generic/subgroup/subgroup.cl
+28 -0 libclc/clc/lib/amdgcn/subgroup/subgroup.cl
+27 -0 libclc/clc/include/clc/amdgpu/amdgpu_utils.h
+23 -0 libclc/clc/include/clc/subgroup/clc_subgroup.h
+18 -0 libclc/clc/lib/amdgcn/workitem/clc_get_sub_group_size.cl
+137 -60 6 files not shown
+178 -63 12 files

LLVM/project 327f16f libclc/clc/include/clc/subgroup clc_sub_group_broadcast.h sub_group_broadcast.h, libclc/clc/lib/amdgcn/subgroup sub_group_broadcast.cl

libclc: Rename sub_group_broadcast header (#185212)

The other clc headers have the clc prefix, so add it.
Delta File
+22 -0 libclc/clc/include/clc/subgroup/clc_sub_group_broadcast.h
+0 -22 libclc/clc/include/clc/subgroup/sub_group_broadcast.h
+1 -1 libclc/opencl/lib/generic/subgroup/sub_group_broadcast.cl
+1 -1 libclc/clc/lib/amdgcn/subgroup/sub_group_broadcast.cl
+24 -24 4 files

LLVM/project 4c9d448 libclc/clc/include/clc/synchronization clc_sub_group_barrier.h, libclc/clc/lib/amdgcn/synchronization clc_sub_group_barrier.cl

libclc: Move sub_group_barrier to clc (#185208)
Delta File
+23 -0 libclc/opencl/lib/generic/synchronization/sub_group_barrier.cl
+21 -0 libclc/clc/include/clc/synchronization/clc_sub_group_barrier.h
+0 -21 libclc/opencl/lib/amdgcn/synchronization/sub_group_barrier.cl
+18 -0 libclc/clc/lib/amdgcn/synchronization/clc_sub_group_barrier.cl
+14 -0 libclc/clc/lib/generic/subgroup/sub_group_barrier.cl
+1 -0 libclc/clc/lib/generic/SOURCES
+77 -21 3 files not shown
+79 -22 9 files

LLVM/project 5785568 libclc/opencl/lib/amdgcn SOURCES, libclc/opencl/lib/amdgcn/mem_fence fence.cl

libclc: Remove target opencl copies of mem_fence (#185207)
Delta File
+31 -0 libclc/opencl/lib/generic/mem_fence/fence.cl
+0 -31 libclc/opencl/lib/amdgcn/mem_fence/fence.cl
+0 -31 libclc/opencl/lib/ptx-nvidiacl/mem_fence/fence.cl
+0 -1 libclc/opencl/lib/amdgcn/SOURCES
+1 -0 libclc/opencl/lib/generic/SOURCES
+0 -1 libclc/opencl/lib/ptx-nvidiacl/SOURCES
+32 -64 6 files

LLVM/project 871946a libclc/clc/include/clc/amdgpu amdgpu_utils.h, libclc/clc/include/clc/subgroup clc_subgroup.h

libclc: Move subgroup functions into clc

It turns out there was a generic implementation of the id and sizes.
The practice of splitting every single function into its own file is
kind of a pain here, so introduce a utility header for amdgpu.
Delta File
+0 -60 libclc/opencl/lib/amdgcn/subgroup/subgroup.cl
+41 -0 libclc/opencl/lib/generic/subgroup/subgroup.cl
+33 -0 libclc/clc/lib/amdgcn/subgroup/subgroup.cl
+27 -0 libclc/clc/include/clc/amdgpu/amdgpu_utils.h
+23 -0 libclc/clc/include/clc/subgroup/clc_subgroup.h
+18 -0 libclc/clc/lib/amdgcn/workitem/clc_get_sub_group_size.cl
+142 -60 6 files not shown
+183 -63 12 files

LLVM/project 80a2ea0 libclc/clc/include/clc/subgroup clc_sub_group_broadcast.h sub_group_broadcast.h, libclc/clc/lib/amdgcn/subgroup sub_group_broadcast.cl

libclc: Rename sub_group_broadcast header

The other clc headers have the clc prefix, so add it.
Delta File
+22 -0 libclc/clc/include/clc/subgroup/clc_sub_group_broadcast.h
+0 -22 libclc/clc/include/clc/subgroup/sub_group_broadcast.h
+1 -1 libclc/clc/lib/amdgcn/subgroup/sub_group_broadcast.cl
+1 -1 libclc/opencl/lib/generic/subgroup/sub_group_broadcast.cl
+24 -24 4 files

LLVM/project a7f32d0 libclc/clc/include/clc/synchronization clc_sub_group_barrier.h, libclc/clc/lib/amdgcn/synchronization clc_sub_group_barrier.cl

libclc: Move sub_group_barrier to clc
Delta File
+23 -0 libclc/opencl/lib/generic/synchronization/sub_group_barrier.cl
+21 -0 libclc/clc/include/clc/synchronization/clc_sub_group_barrier.h
+0 -21 libclc/opencl/lib/amdgcn/synchronization/sub_group_barrier.cl
+18 -0 libclc/clc/lib/amdgcn/synchronization/clc_sub_group_barrier.cl
+14 -0 libclc/clc/lib/generic/subgroup/sub_group_barrier.cl
+1 -0 libclc/clc/lib/generic/SOURCES
+77 -21 3 files not shown
+79 -22 9 files

LLVM/project ec10439 libclc/opencl/lib/amdgcn SOURCES, libclc/opencl/lib/amdgcn/mem_fence fence.cl

libclc: Remove target opencl copies of mem_fence
Delta File
+0 -31 libclc/opencl/lib/amdgcn/mem_fence/fence.cl
+31 -0 libclc/opencl/lib/generic/mem_fence/fence.cl
+0 -31 libclc/opencl/lib/ptx-nvidiacl/mem_fence/fence.cl
+0 -1 libclc/opencl/lib/ptx-nvidiacl/SOURCES
+0 -1 libclc/opencl/lib/amdgcn/SOURCES
+31 -64 5 files

LLVM/project 6577520 libclc/clc/include/clc/synchronization clc_work_group_barrier.h, libclc/clc/lib/amdgcn/synchronization clc_work_group_barrier.cl

libclc: Use separate acquire and release fences in work_group_barrier (#185190)
Delta File
+17 -3 libclc/clc/lib/amdgcn/synchronization/clc_work_group_barrier.cl
+1 -2 libclc/clc/lib/ptx-nvidiacl/synchronization/clc_work_group_barrier.cl
+1 -2 libclc/opencl/lib/generic/synchronization/work_group_barrier.cl
+1 -1 libclc/clc/include/clc/synchronization/clc_work_group_barrier.h
+20 -8 4 files

LLVM/project 592c0f5 libclc/clc/lib/generic/workitem clc_get_global_id.cl

libclc: Use enqueued local size to implement get_global_id (#185181)
Delta File
+2 -2 libclc/clc/lib/generic/workitem/clc_get_global_id.cl
+2 -2 1 files
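
The commit title above gives no rationale, so the following is only a sketch of the likely reasoning, not the libclc source. With non-uniform work-groups (OpenCL 2.0 and later), the trailing work-group in a dimension can be smaller than the enqueued size, so a global ID computed from the group's actual local size would be wrong for that group. The helper name `global_id_1d` and its parameters are hypothetical; it assumes the usual relation global ID = global offset + group ID * enqueued local size + local ID.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not libclc's code): global ID along one dimension. */
static size_t global_id_1d(size_t global_offset, size_t group_id,
                           size_t enqueued_local_size, size_t local_id) {
    /* A local ID is always below the enqueued size, even in a smaller
       trailing work-group. */
    assert(local_id < enqueued_local_size);
    /* Every preceding group spans the full enqueued size, so the stride
       must be the enqueued size rather than this group's actual size. */
    return global_offset + group_id * enqueued_local_size + local_id;
}
```

For a global size of 10 and an enqueued local size of 4, group 2 holds only 2 work-items. The item with local ID 1 in that group gets `global_id_1d(0, 2, 4, 1)`, which is 9. Using the group's actual size of 2 as the stride would give 5, which collides with an ID that belongs to group 1.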

LLVM/project b1b8a00 libclc/clc/lib/generic/workitem clc_get_global_id.cl

libclc: Use enqueued local size to implement get_global_id
Delta File
+2 -2 libclc/clc/lib/generic/workitem/clc_get_global_id.cl
+2 -2 1 files

LLVM/project b740343 libclc/clc/lib/generic SOURCES, libclc/clc/lib/generic/workitem clc_get_global_id.cl

libclc: Move get_global_id into clc (#185180)
Delta File
+17 -0 libclc/clc/lib/generic/workitem/clc_get_global_id.cl
+2 -3 libclc/opencl/lib/generic/workitem/get_global_id.cl
+1 -0 libclc/clc/lib/generic/SOURCES
+20 -3 3 files

LLVM/project 80ff736 libclc/opencl/lib/amdgcn SOURCES, libclc/opencl/lib/amdgcn/printf __printf_alloc.cl

libclc: Add __printf_alloc implementation (#185231)

AMDGPU OpenCL printf implementation emits a call to this helper
function.
Delta File
+36 -0 libclc/opencl/lib/amdgcn/printf/__printf_alloc.cl
+1 -0 libclc/opencl/lib/amdgcn/SOURCES
+37 -0 2 files

LLVM/project a729e0a clang/include/clang/CIR/Dialect/IR CIROps.td

[CIR] Extract CIR_VAOp base class for VAStartOp and VAEndOp

Both ops share identical arguments and assembly format. Extract a common
base class to eliminate the duplication.
Delta File
+10 -13 clang/include/clang/CIR/Dialect/IR/CIROps.td
+10 -13 1 files

LLVM/project 7693489 libunwind/src libunwind.cpp, libunwind/test cfi_violating_handler.pass.cpp

[libunwind][PAC] Defang ptrauth's PC in valid CFI range abort

It turns out making the CFI check a release mode abort causes many,
if not the majority, of JITs to fail during unwinding as they do not
set up CFI sections for their generated code. As a result any JITs
that do nominally support unwinding (and catching) through their JIT
or assembly frames trip this abort.

rdar://170862047
Delta File
+52 -0 libunwind/test/cfi_violating_handler.pass.cpp
+11 -17 libunwind/src/libunwind.cpp
+63 -17 2 files

LLVM/project 4a82b2d clang/test/CIR/CodeGenOpenACC private-clause-pointer-array-recipes-CtorDtor.cpp loop-reduction-clause-default-ops.cpp, clang/test/CIR/IR cmp.cir

[CIR] Change CmpOp assembly format to use bare keyword style

Update the assembly format of cir.cmp from the parenthesized style
  cir.cmp(gt, %a, %b) : !s32i, !cir.bool
to the bare keyword style used by other CIR ops like cir.cast:
  cir.cmp gt %a, %b : !s32i

The result type (!cir.bool) is now automatically inferred as it is
always cir::BoolType.
Delta File
+64 -64 clang/test/CIR/CodeGenOpenACC/private-clause-pointer-array-recipes-CtorDtor.cpp
+60 -60 clang/test/CIR/IR/cmp.cir
+57 -57 clang/test/CIR/CodeGenOpenACC/loop-reduction-clause-default-ops.cpp
+57 -57 clang/test/CIR/CodeGenOpenACC/compute-reduction-clause-default-ops.cpp
+57 -57 clang/test/CIR/CodeGenOpenACC/combined-reduction-clause-default-ops.cpp
+57 -57 clang/test/CIR/CodeGenOpenACC/compute-reduction-clause-default-ops.c
+352 -352 82 files not shown
+1,334 -1,308 88 files

LLVM/project d61d45c lldb/source/Plugins/LanguageRuntime/CPlusPlus ItaniumABIRuntime.cpp, lldb/source/Plugins/LanguageRuntime/CPlusPlus/ItaniumABI ItaniumABILanguageRuntime.cpp

Merge branch 'main' into users/c8ef/apinotes-for-bounds-safety
Delta File
+0 -690 lldb/source/Plugins/LanguageRuntime/CPlusPlus/ItaniumABI/ItaniumABILanguageRuntime.cpp
+564 -0 llvm/test/Transforms/LoopVectorize/X86/epilog-vectorization-ordered-reduction.ll
+452 -0 lldb/source/Plugins/LanguageRuntime/CPlusPlus/ItaniumABIRuntime.cpp
+305 -32 llvm/tools/llubi/lib/Context.cpp
+192 -0 llvm/test/tools/llubi/loadstore_le.ll
+190 -0 llvm/test/tools/llubi/loadstore_be.ll
+1,703 -722 82 files not shown
+3,207 -1,354 88 files

LLVM/project 0b2aaac libclc/opencl/lib/amdgcn/workitem get_group_id.cl, libclc/opencl/lib/generic/workitem get_group_id.cl get_local_id.cl

libclc: Remove target definitions of opencl workitem functions (#185206)

These were just calling the __clc implementation, so move it
to generic.
Delta File
+0 -13 libclc/opencl/lib/amdgcn/workitem/get_group_id.cl
+13 -0 libclc/opencl/lib/generic/workitem/get_group_id.cl
+13 -0 libclc/opencl/lib/generic/workitem/get_local_id.cl
+0 -13 libclc/opencl/lib/ptx-nvidiacl/workitem/get_local_linear_id.cl
+0 -13 libclc/opencl/lib/ptx-nvidiacl/workitem/get_sub_group_id.cl
+0 -13 libclc/opencl/lib/ptx-nvidiacl/workitem/get_max_sub_group_size.cl
+26 -52 14 files not shown
+54 -180 20 files

LLVM/project fc12d06 llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll

Merge branch 'main' into users/c8ef/improve_fold_left_test
Delta File
+84,419 -78,498 llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+66,293 -29,491 llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+25,751 -24,782 llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+23,663 -20,281 llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+21,867 -18,577 llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+19,112 -16,445 llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.832bit.ll
+241,105 -188,074 3,777 files not shown
+507,172 -331,661 3,783 files