LLVM/project 2f6a8a7mlir/docs/Dialects NVVMDialect.md

[MLIR][NVVM] Add operations and interfaces
DeltaFile
+14-1mlir/docs/Dialects/NVVMDialect.md
+14-11 files

LLVM/project e38529dllvm/lib/CodeGen/SelectionDAG SelectionDAG.cpp, llvm/test/CodeGen/X86 vector-compress-freeze.ll

[DAG] Update canCreateUndefOrPoison to handle ISD::VECTOR_COMPRESS (#168010)

Fixes #167710
DeltaFile
+36-0llvm/test/CodeGen/X86/vector-compress-freeze.ll
+3-0llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+39-02 files

LLVM/project 0730913llvm/lib/Transforms/Vectorize VPlanRecipes.cpp, llvm/test/Transforms/LoopVectorize vplan-printing.ll

[VPlan] Print debug info for all recipes. (#168454)

Use the recently refactored VPRecipeBase::print to print debug location
for all recipes.

PR: https://github.com/llvm/llvm-project/pull/168454
DeltaFile
+11-22llvm/test/Transforms/LoopVectorize/vplan-printing.ll
+4-5llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+15-272 files

LLVM/project 915e9adclang-tools-extra/clang-tidy/google AvoidCStyleCastsCheck.cpp, clang-tools-extra/docs ReleaseNotes.rst

[clang-tidy] Provide fix-its for casts to void* in google-readability-casting (#167655)

DeltaFile
+6-0clang-tools-extra/clang-tidy/google/AvoidCStyleCastsCheck.cpp
+4-0clang-tools-extra/test/clang-tidy/checkers/google/readability-casting.cpp
+1-1clang-tools-extra/docs/ReleaseNotes.rst
+11-13 files

LLVM/project 907e851clang/lib/Interpreter IncrementalExecutor.cpp, llvm/include/llvm/ExecutionEngine/Orc EPCDebugObjectRegistrar.h

[ORC] Remove now unused EPCDebugObjectRegistrar (NFC) (#167868)

EPCDebugObjectRegistrar is unused now that the ELF debugger support plugin uses AllocActions
(see https://github.com/llvm/llvm-project/pull/167866).
DeltaFile
+0-69llvm/include/llvm/ExecutionEngine/Orc/EPCDebugObjectRegistrar.h
+0-61llvm/lib/ExecutionEngine/Orc/EPCDebugObjectRegistrar.cpp
+0-16llvm/lib/ExecutionEngine/Orc/TargetProcess/JITLoaderGDB.cpp
+1-3clang/lib/Interpreter/IncrementalExecutor.cpp
+2-2utils/bazel/llvm-project-overlay/llvm/BUILD.bazel
+0-3llvm/include/llvm/ExecutionEngine/Orc/TargetProcess/JITLoaderGDB.h
+3-1549 files not shown
+5-16415 files

LLVM/project a2af185mlir/lib/Dialect/Tosa/Transforms CMakeLists.txt

[mlir][tosa] Fix linker failure in build bots introduced by #165581 (#168581)

This commit fixes the linker failures seen on some build bots.
DeltaFile
+1-0mlir/lib/Dialect/Tosa/Transforms/CMakeLists.txt
+1-01 files

LLVM/project 4ab1d06mlir/include/mlir/Dialect/LLVMIR NVVMOps.td, mlir/lib/Dialect/LLVMIR/IR NVVMDialect.cpp

Reland "[MLIR][NVVM] Add tcgen05.mma MLIR Ops (#164356)" (#168638)

Reland commit fb829bf11feeb53f815a3abf539e63ec3a23ed3d with additional fixes for the following post-merge CI failure:

```
/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp: In function ‘constexpr llvm::nvvm::CTAGroupKind getNVVMCtaGroupKind(mlir::NVVM::CTAGroupKind)’:
/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/llvm/include/llvm/Support/ErrorHandling.h:165:36: error: call to non-constexpr function ‘void llvm::llvm_unreachable_internal(const char*, const char*, unsigned int)’
   ::llvm::llvm_unreachable_internal(msg, __FILE__, __LINE__)
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~
/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp:73:3: note: in expansion of macro ‘llvm_unreachable’
   llvm_unreachable("unsupported cta_group value");
   ^
```
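The log above shows GCC 7 rejecting a `constexpr` function whose trailing path calls the non-`constexpr` `llvm_unreachable_internal`. A minimal sketch of one way to avoid that class of error is to declare the mapping as a plain inline function. The enum and function names below are hypothetical stand-ins, and dropping `constexpr` is an assumption for illustration, not necessarily what the reland does.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-ins for the dialect-side and intrinsic-side enums.
enum class DialectCtaGroup { CTA_1, CTA_2 };
enum class IntrinsicCtaGroup { CG_NONE, CG_1, CG_2 };

// Stand-in for llvm_unreachable_internal: deliberately not constexpr.
[[noreturn]] inline void unreachableInternal(const char *) { std::abort(); }

// A plain inline function (no constexpr), so calling the non-constexpr
// helper on the fallthrough path is accepted by older compilers as well.
inline IntrinsicCtaGroup toIntrinsicCtaGroup(DialectCtaGroup Kind) {
  switch (Kind) {
  case DialectCtaGroup::CTA_1:
    return IntrinsicCtaGroup::CG_1;
  case DialectCtaGroup::CTA_2:
    return IntrinsicCtaGroup::CG_2;
  }
  unreachableInternal("unsupported cta_group value");
}
```

The trade-off is that the result can no longer be used in constant expressions, which only matters if a caller relied on that.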
DeltaFile
+634-0mlir/test/Target/LLVMIR/nvvm/tcgen05-mma-sp-tensor.mlir
+633-0mlir/test/Target/LLVMIR/nvvm/tcgen05-mma-tensor.mlir
+612-0mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
+545-0mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+442-0mlir/test/Target/LLVMIR/nvvm/tcgen05-mma-shared.mlir
+442-0mlir/test/Target/LLVMIR/nvvm/tcgen05-mma-sp-shared.mlir
+3,308-09 files not shown
+4,875-015 files

LLVM/project 711a295llvm/lib/Target/AMDGPU AMDGPUBarrierLatency.cpp AMDGPUTargetMachine.cpp, llvm/test/CodeGen/AMDGPU schedule-barrier-latency.mir

[AMDGPU] Ignore wavefront barrier latency during scheduling DAG mutation (#168500)

Do not add latency for wavefront and singlethread scope fences during
barrier latency DAG mutation.
These scopes do not typically introduce any latency and adjusting
schedules based on them significantly impacts latency hiding.
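The scope filter described above can be sketched as a small predicate. The `SyncScope` enum and the function name are hypothetical, and the sketch assumes that every scope wider than wavefront still receives barrier latency, which the commit message does not spell out.

```cpp
#include <cassert>

// Hypothetical model of fence synchronization scopes, narrowest first.
enum class SyncScope { SingleThread, Wavefront, Workgroup, Agent, System };

// Returns true if a fence with this scope should get extra latency in the
// barrier-latency DAG mutation. Singlethread and wavefront fences are
// skipped because they typically introduce no latency.
inline bool fenceAddsBarrierLatency(SyncScope Scope) {
  return Scope != SyncScope::SingleThread && Scope != SyncScope::Wavefront;
}
```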
DeltaFile
+277-2llvm/test/CodeGen/AMDGPU/schedule-barrier-latency.mir
+17-5llvm/lib/Target/AMDGPU/AMDGPUBarrierLatency.cpp
+3-3llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+4-1llvm/lib/Target/AMDGPU/AMDGPUBarrierLatency.h
+301-114 files

LLVM/project fddfc70clang-tools-extra/docs/clang-tidy/checks list.rst

[clang-tidy][NFC] Fix order in `list.rst` (#168683)

This issue was introduced in
https://github.com/llvm/llvm-project/pull/167689
DeltaFile
+1-1clang-tools-extra/docs/clang-tidy/checks/list.rst
+1-11 files

LLVM/project de9c182orc-rt/docs Design.md

[orc-rt] Initial ORC Runtime design documentation. (#168681)

This document aims to lay out the high level design and goals of the ORC
runtime, and the relationships between key components.
DeltaFile
+119-0orc-rt/docs/Design.md
+119-01 files

LLVM/project f8e83c4mlir/include/mlir/Transforms Passes.h

[mlir] Use dictionary order to order the pass decl (NFC) (#168648)

DeltaFile
+7-7mlir/include/mlir/Transforms/Passes.h
+7-71 files

LLVM/project 429e315llvm/lib/Target/RISCV RISCVSubtarget.cpp

[RISCV] Convert -mtune=generic to generic-rv32/rv64 in RISCVSubtarget::initializeSubtargetDependencies. (#168612)

The "generic" entry in tablegen is really a dummy entry. We shouldn't
use it for anything. Remap "generic" to either generic-rv32 or
generic-rv64 based on the triple.
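The remapping described above amounts to a small string substitution. The sketch below uses a hypothetical helper name and a plain `bool` in place of a triple query; it is not the code in `RISCVSubtarget.cpp`.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: replaces the placeholder "generic" tune CPU with the
// concrete per-XLEN name; any other name is passed through unchanged.
inline std::string remapGenericTuneCPU(const std::string &TuneCPU,
                                       bool Is64Bit) {
  if (TuneCPU == "generic")
    return Is64Bit ? "generic-rv64" : "generic-rv32";
  return TuneCPU;
}
```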
DeltaFile
+2-0llvm/lib/Target/RISCV/RISCVSubtarget.cpp
+2-01 files

LLVM/project 58d9e47bolt/test lit.local.cfg

[NFCI][bolt][test] Use AT&T syntax explicitly (#167225)

This enables building LLVM with `-mllvm -x86-asm-syntax=intel` in one's
Clang config files (i.e. a global preference for Intel syntax).

`-masm=att` is insufficient as it doesn't override a specification of `-mllvm -x86-asm-syntax`.
DeltaFile
+1-1bolt/test/lit.local.cfg
+1-11 files

LLVM/project 669c30cmlir/docs/Dialects NVVMDialect.md, mlir/docs/Dialects/NVVM _index.md

[MLIR][NVVM] Move docs to correct folder
DeltaFile
+87-0mlir/docs/Dialects/NVVMDialect.md
+0-84mlir/docs/Dialects/NVVM/_index.md
+87-842 files

LLVM/project ec90912clang/include/clang/Basic BuiltinsNVPTX.td, clang/test/CodeGen builtins-nvptx.c

[clang][NVPTX] Add remaining float to fp16 conversions (#167641)

This change adds intrinsics and clang builtins for the remaining float
to fp16 conversions. This includes the following conversions:

- float to bf16x2 - satfinite variants
- float to f16x2 - satfinite variants
- float to bf16 - satfinite variants
- float to f16 - all variants

Tests are added in `convert-sm80.ll` and `convert-sm80-sf.ll` for the
intrinsics and in `builtins-nvptx.c` for the clang builtins.
DeltaFile
+260-0llvm/test/CodeGen/NVPTX/convert-sm80-sf.ll
+65-0llvm/test/CodeGen/NVPTX/convert-sm80.ll
+49-0clang/test/CodeGen/builtins-nvptx.c
+29-1llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
+21-0clang/include/clang/Basic/BuiltinsNVPTX.td
+11-6llvm/include/llvm/IR/IntrinsicsNVVM.td
+435-71 files not shown
+449-77 files

LLVM/project 208be48llvm/lib/Target/LoongArch LoongArchLateBranchOpt.cpp

update isLoadImm
DeltaFile
+17-11llvm/lib/Target/LoongArch/LoongArchLateBranchOpt.cpp
+17-111 files

LLVM/project 50e7702llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs loongarch_generated_funcs.ll.generated.expected loongarch_generated_funcs.ll.nogenerated.expected

tests passed
DeltaFile
+1-1llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/loongarch_generated_funcs.ll.generated.expected
+1-1llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/loongarch_generated_funcs.ll.nogenerated.expected
+2-22 files

LLVM/project ac68dd5llvm/lib/Target/RISCV RISCVCodeGenPrepare.cpp RISCVPassRegistry.def, llvm/test/CodeGen/RISCV riscv-codegenprepare.ll

[RISCV][NewPM] Port RISCVCodeGenPrepare to the new pass manager (#168381)

As suggested in the review for #160536 it would be good to follow up and
port the RISC-V passes to the new pass manager. This PR starts that
task. It provides the bare minimum necessary to run RISCVCodeGenPrepare
with opt -passes=riscv-codegenprepare. The approach used is modeled on
my observations of the AMDGPU backend and the recent work to port the
X86 passes.

The testing approach is to add a `-passes=riscv-foo` RUN line to at
least one test, if an appropriate test exists.
DeltaFile
+53-29llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp
+20-0llvm/lib/Target/RISCV/RISCVPassRegistry.def
+10-2llvm/lib/Target/RISCV/RISCV.h
+5-2llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+1-0llvm/test/CodeGen/RISCV/riscv-codegenprepare.ll
+89-335 files

LLVM/project 5109f2allvm/utils profcheck-xfail.txt

Exclude from profcheck a vplan test under phase ordering (#168669)

DeltaFile
+1-0llvm/utils/profcheck-xfail.txt
+1-01 files

LLVM/project ed1c8d7lld/test/ELF dso-undef-extract-lazy.s

ELF,test: Test unversioned undefined symbols of index 0 and 1

My 2020 change that added versioned symbol recognition
(reviews.llvm.org/D80059) checks both VER_NDX_LOCAL and VER_NDX_GLOBAL,
though test coverage was missing. lld/test/ELF/dso-undef-extract-lazy.s
checks that the undefined symbol is indeed considered unversioned.
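As a rough illustration of the check being tested, a version index of 0 (`VER_NDX_LOCAL`) or 1 (`VER_NDX_GLOBAL`) marks a symbol as unversioned. The constant and function names below are local to this sketch, and masking off the hidden bit first is an assumption about how the index is extracted.

```cpp
#include <cassert>
#include <cstdint>

// ELF version-index constants, renamed locally to avoid clashing with <elf.h>.
constexpr uint16_t VerNdxLocal = 0;      // VER_NDX_LOCAL
constexpr uint16_t VerNdxGlobal = 1;     // VER_NDX_GLOBAL
constexpr uint16_t VersymHidden = 0x8000; // VERSYM_HIDDEN

// Returns true if the versym entry refers to no named version.
inline bool isUnversioned(uint16_t Versym) {
  // Strip the hidden bit so only the version index remains.
  const uint16_t Idx = static_cast<uint16_t>(Versym & ~VersymHidden);
  return Idx == VerNdxLocal || Idx == VerNdxGlobal;
}
```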
DeltaFile
+41-0lld/test/ELF/dso-undef-extract-lazy.s
+41-01 files

LLVM/project 5bba4fdlibc/test/src/stdio fileop_test.cpp

[libc] Fix -Wshorten-64-to-32 in fileop_test. (#168451)

Explicitly cast 0 to size_t type to match fread() return type. This
follows the pattern used elsewhere in this file, and fixes
-Wshorten-64-to-32 warnings when building the test.
DeltaFile
+2-2libc/test/src/stdio/fileop_test.cpp
+2-21 files

LLVM/project be1a504orc-rt/include/orc-rt Session.h, orc-rt/lib/executor Session.cpp

[orc-rt] Simplify Session shutdown. (#168664)

Moves all Session member variables dedicated to shutdown into a new
ShutdownInfo struct, and uses the presence / absence of this struct as
the flag to indicate that we've entered the "shutting down" state. This
simplifies the implementation of the shutdown process.
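The "presence of the struct is the flag" idea can be sketched as follows. This is a single-threaded toy with hypothetical member names, not the actual `Session` class, which would also need synchronization.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

class Session {
public:
  using OnComplete = std::function<void()>;

  // No separate boolean: a non-null ShutdownInfo means "shutting down".
  bool isShuttingDown() const { return SI != nullptr; }

  // The first call enters the shutting-down state; every call queues a
  // callback to run once shutdown completes.
  void shutdown(OnComplete Callback) {
    if (!SI)
      SI = std::make_unique<ShutdownInfo>();
    SI->Callbacks.push_back(std::move(Callback));
  }

  // Runs and clears the queued callbacks.
  void completeShutdown() {
    assert(SI && "completeShutdown called before shutdown");
    for (auto &CB : SI->Callbacks)
      CB();
    SI->Callbacks.clear();
  }

private:
  // All shutdown-only state lives here, so it exists only while needed.
  struct ShutdownInfo {
    std::vector<OnComplete> Callbacks;
  };
  std::unique_ptr<ShutdownInfo> SI;
};
```

Grouping the state this way means there is no flag that can drift out of sync with the data it guards.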
DeltaFile
+18-25orc-rt/lib/executor/Session.cpp
+8-7orc-rt/include/orc-rt/Session.h
+26-322 files

LLVM/project 9dc4ebfmlir/include/mlir/Dialect/XeGPU/IR XeGPUTypes.td XeGPUOps.td, mlir/lib/Conversion/XeGPUToXeVM XeGPUToXeVM.cpp

[MLIR][XeGPU] Allow create mem desc from 2d memref (#167767)

This PR relaxes create_mem_desc's restriction on the source memref,
allowing it to be a 2D memref.
DeltaFile
+52-29mlir/test/Conversion/XeGPUToXeVM/loadstore_matrix.mlir
+9-29mlir/lib/Conversion/XeGPUToXeVM/XeGPUToXeVM.cpp
+21-0mlir/test/Dialect/XeGPU/ops.mlir
+11-0mlir/include/mlir/Dialect/XeGPU/IR/XeGPUTypes.td
+3-8mlir/include/mlir/Dialect/XeGPU/IR/XeGPUOps.td
+1-1mlir/test/Dialect/XeGPU/invalid.mlir
+97-676 files

LLVM/project f38cf01libclc/opencl/lib/amdgcn/cl_khr_int64_extended_atomics minmax_helpers.ll, libclc/opencl/lib/generic/atomic atom_min.cl atom_max.cl

[libclc] Use CLC atomic functions for legacy OpenCL atom/atomic builtins (#168325)

Main changes:
* OpenCL legacy atom/atomic builtins now call CLC atomic functions
(which use Clang __scoped_atomic_*), replacing previous Clang __sync_*
functions.
* Change memory order from seq_cst to relaxed; keep device scope (spec
permits broader than workgroup). LLVM IR for _Z8atom_decPU3AS1Vi in
amdgcn--amdhsa.bc:
  Before:
    %2 = atomicrmw volatile sub ptr addrspace(1) %0, i32 1 syncscope("agent") seq_cst
  After:
    %2 = atomicrmw volatile sub ptr addrspace(1) %0, i32 1 syncscope("agent") monotonic
* Also adds OpenCL 1.0 atom_* variants without volatile on the pointer.
They are added for backward compatibility.
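The change in memory order can be illustrated with portable `std::atomic` as a stand-in. This is only an analogy: standard C++ atomics have no notion of a sync scope such as `agent`, and the function names are hypothetical. LLVM's `monotonic` ordering corresponds to C++ `memory_order_relaxed`.

```cpp
#include <atomic>
#include <cassert>

// Previous behaviour: decrement with sequentially consistent ordering.
inline int atomDecSeqCst(std::atomic<int> &Counter) {
  return Counter.fetch_sub(1, std::memory_order_seq_cst);
}

// New behaviour: the same decrement with relaxed (LLVM "monotonic")
// ordering. It is still atomic, but imposes no ordering on other accesses.
inline int atomDecRelaxed(std::atomic<int> &Counter) {
  return Counter.fetch_sub(1, std::memory_order_relaxed);
}
```

Both variants return the value held before the decrement; they differ only in the ordering guarantees given to surrounding memory operations.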
DeltaFile
+0-55libclc/opencl/lib/amdgcn/cl_khr_int64_extended_atomics/minmax_helpers.ll
+20-25libclc/opencl/lib/generic/atomic/atom_min.cl
+20-25libclc/opencl/lib/generic/atomic/atom_max.cl
+19-16libclc/opencl/lib/generic/atomic/atom_xor.cl
+19-16libclc/opencl/lib/generic/atomic/atom_xchg.cl
+19-16libclc/opencl/lib/generic/atomic/atom_sub.cl
+97-15320 files not shown
+208-28326 files

LLVM/project f7f4135llvm/lib/Transforms/Vectorize LoopVectorize.cpp, llvm/test/Transforms/LoopVectorize/AArch64 interleave-with-gaps.ll store-costs-sve.ll

[LV]: Skip Epilogue scalable VF greater than RemainingIterations. (#156724)

Skip epilogue scalable VFs that are greater than RemainingIterations,
as is already done for fixed VFs.
Exclude scalable RemainingIterations from that comparison, because
SCEV currently can't evaluate non-canonical vscale-based expressions.
DeltaFile
+17-119llvm/test/Transforms/LoopVectorize/AArch64/interleave-with-gaps.ll
+28-27llvm/test/Transforms/LoopVectorize/AArch64/store-costs-sve.ll
+18-7llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+63-1533 files

LLVM/project c942ebd.github new-prs-labeler.yml, .github/workflows new-prs.yml

Reapply "[Github] Update PR labeller to v6.0.1 (#167246)"

This reverts commit d772663a9f003a08ee76414397963c58e80b27d7.

This fixes the final issue with the labeller landing. There were
two remaining issues:
1. There was an extra quote on one of the globs
2. Some of the yaml keys were named incorrectly (should have been
   plural)
DeltaFile
+1,130-812.github/new-prs-labeler.yml
+1-4.github/workflows/new-prs.yml
+1,131-8162 files

LLVM/project fa50a68llvm/lib/Target/PowerPC PPCISelLowering.cpp PPCISelLowering.h, llvm/test/CodeGen/PowerPC saddo-ssubo.ll

[PowerPC] Add custom lowering for SADD overflow for i32 and i64 (#159255)

This patch improves the codegen for saddo on i32 and i64 in both 32-bit
and 64-bit modes by custom lowering. It implements signed-add overflow
detection using the `(x eqv y) & (sum xor x)` bit-level sequence.
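The bit-level sequence can be checked in plain C++: signed addition overflows exactly when both operands have the same sign (`x eqv y`, i.e. XNOR, has the sign bit set) and the sum's sign differs from `x` (`sum xor x` has the sign bit set). The sketch below is an illustration for `i32` with hypothetical names, not the PowerPC lowering itself.

```cpp
#include <cassert>
#include <cstdint>

struct SAddResult {
  int32_t Sum;   // wrapped two's-complement sum
  bool Overflow; // true if the mathematical sum does not fit in int32_t
};

inline SAddResult saddo32(int32_t X, int32_t Y) {
  // Work in unsigned arithmetic so wraparound is well defined.
  const uint32_t UX = static_cast<uint32_t>(X);
  const uint32_t UY = static_cast<uint32_t>(Y);
  const uint32_t USum = UX + UY;
  const uint32_t Eqv = ~(UX ^ UY);         // sign bit set: same-sign operands
  const uint32_t Flag = Eqv & (USum ^ UX); // sign bit set: overflow occurred
  return {static_cast<int32_t>(USum), (Flag >> 31) != 0};
}
```

Only the top bit of the final value is meaningful, which is why the sketch shifts it down before converting to `bool`.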
DeltaFile
+37-1llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+10-12llvm/test/CodeGen/PowerPC/saddo-ssubo.ll
+1-0llvm/lib/Target/PowerPC/PPCISelLowering.h
+48-133 files

LLVM/project 56c9655compiler-rt/test/asan/TestCases/Darwin lit.local.cfg.py interface_symbols_darwin.cpp

[𝘀𝗽𝗿] changes to main this commit is based on

Created using spr 1.3.7

[skip ci]
DeltaFile
+25-0compiler-rt/test/asan/TestCases/Darwin/lit.local.cfg.py
+11-7compiler-rt/test/asan/TestCases/Darwin/interface_symbols_darwin.cpp
+2-13compiler-rt/test/asan/TestCases/Darwin/dyld_insert_libraries_reexec.cpp
+38-203 files

LLVM/project ae8c0e1compiler-rt/test/asan/TestCases/Darwin lit.local.cfg.py interface_symbols_darwin.cpp, compiler-rt/test/asan/TestCases/Darwin/Inputs check-syslog.sh

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
DeltaFile
+25-0compiler-rt/test/asan/TestCases/Darwin/lit.local.cfg.py
+11-7compiler-rt/test/asan/TestCases/Darwin/interface_symbols_darwin.cpp
+2-13compiler-rt/test/asan/TestCases/Darwin/dyld_insert_libraries_reexec.cpp
+3-7compiler-rt/test/asan/TestCases/Darwin/duplicate_os_log_reports.cpp
+6-0compiler-rt/test/asan/TestCases/Darwin/Inputs/check-syslog.sh
+47-275 files

LLVM/project 725ef09compiler-rt/test/asan/TestCases/Darwin interface_symbols_darwin.cpp

[𝘀𝗽𝗿] changes to main this commit is based on

Created using spr 1.3.7

[skip ci]
DeltaFile
+11-7compiler-rt/test/asan/TestCases/Darwin/interface_symbols_darwin.cpp
+11-71 files