LLVM/project 08f554f cross-project-tests lit.cfg.py

[cross-project-tests] Use lit internal shell (#203138)
Delta File
+1 -1 cross-project-tests/lit.cfg.py
+1 -1 1 files

LLVM/project 433a41e .github/workflows test-suite.yml

[GitHub] React to /test-suite comment (#203151)

This lets the user know that the workflow has kicked off. The reaction runs in a
separate job with write permissions, so the main job should still have only a
read-only token.
Delta File
+18 -5 .github/workflows/test-suite.yml
+18 -5 1 files

LLVM/project 5d74065 .github/workflows subscriber.yml

workflows/subscriber: Use github-automation container (#202777)

This simplifies the workflow and might help it run faster too.
Delta File
+4 -14 .github/workflows/subscriber.yml
+4 -14 1 files

LLVM/project 5cf20a6 llvm/lib/Target/NVPTX NVPTXISelLowering.cpp, llvm/test/CodeGen/NVPTX math-intrins.ll

Reapply "[NVPTX] Support lowering of `(l)lround`" (#202876)

Reverts llvm/llvm-project#202500

Original PR llvm/llvm-project#183901 was mistakenly reverted due to an
unrelated build failure.
Delta File
+151 -0 llvm/test/CodeGen/NVPTX/math-intrins.ll
+2 -0 llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
+153 -0 2 files

LLVM/project 9b06039 mlir/examples/standalone/test lit.cfg.py

[MLIR] Use internal shell for standalone tests (#203134)

The external shell will be removed soon
(https://discourse.llvm.org/t/rfc-removal-of-the-lit-external-shell/90951),
and this is one of the places where the internal shell hasn't been enabled
by default. There are no test failures caused by this, so we can turn it on
by not setting execute_external explicitly, since it defaults to False.
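
As a rough sketch of the idea (not the actual diff), the snippet below builds a lit test format without passing execute_external, so lit's internal shell is used. The guarded import and the helper name internal_shell_format are assumptions made so that the snippet also runs where lit is not installed; a real lit.cfg.py receives its `config` object from lit itself.

```python
# Sketch only: a real lit.cfg.py is executed by lit, which injects `config`.
try:
    import lit.formats as lit_formats  # LLVM's lit, if it is installed
except ImportError:
    lit_formats = None


def internal_shell_format():
    """Return a ShTest that uses lit's internal shell (None if lit is absent)."""
    if lit_formats is None:
        return None
    # No execute_external argument: it defaults to False, i.e. internal shell.
    return lit_formats.ShTest()


fmt = internal_shell_format()
if fmt is None:
    print("lit is not installed")
else:
    print("execute_external =", fmt.execute_external)
```

In a lit.cfg.py the equivalent line would typically be `config.test_format = lit.formats.ShTest()`.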
Delta File
+1 -1 mlir/examples/standalone/test/lit.cfg.py
+1 -1 1 files

LLVM/project dd315a5 llvm/lib/Target/AMDGPU AMDGPULowerBufferFatPointers.cpp, llvm/test/CodeGen/AMDGPU lower-buffer-fat-pointers-memops.ll

[AMDGPU] Set success flag for weak cmpxchg in LowerBufferFatPointers (#203033)
Delta File
+2 -4 llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp
+3 -1 llvm/test/CodeGen/AMDGPU/lower-buffer-fat-pointers-memops.ll
+5 -5 2 files

LLVM/project 4a3fe8e llvm/lib/Target/AMDGPU AMDGPULibCalls.cpp, llvm/test/CodeGen/AMDGPU amdgpu-simplify-libcall-rootn.ll amdgpu-simplify-libcall-rootn-fast.ll

[AMDGPU] Gate rootn(x, +-2) -> sqrt/rsqrt fold on nsz/ninf (#200578)
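
The commit title does not say which flag guards which fold, so the following is only an illustration of why signed zeros and infinities matter here. It assumes the OpenCL edge cases rootn(+-0, n) = +0 for even n > 0 and rootn(+-0, n) = +inf for even n < 0; the helper names are hypothetical and only non-negative or zero inputs are modelled.

```python
import math


def sign_bit(v):
    """True when the IEEE sign bit of v is set (distinguishes -0.0 from +0.0)."""
    return math.copysign(1.0, v) < 0.0


def rootn2_reference(x):
    """rootn(x, 2) for x >= 0, with the assumed edge case rootn(-0, 2) = +0."""
    return 0.0 if x == 0.0 else math.sqrt(x)


def rootn_neg2_reference(x):
    """rootn(x, -2) for x >= 0, with the assumed edge case rootn(+-0, -2) = +inf."""
    return math.inf if x == 0.0 else 1.0 / math.sqrt(x)


def rsqrt(x):
    """IEEE-style 1/sqrt(x); Python raises on division by zero, so handle it."""
    s = math.sqrt(x)
    return math.copysign(math.inf, s) if s == 0.0 else 1.0 / s


# sqrt keeps the sign of a zero input, the rootn reference does not.
print(sign_bit(math.sqrt(-0.0)), sign_bit(rootn2_reference(-0.0)))
# rsqrt and the rootn reference disagree on the sign of the infinity at -0.0.
print(rsqrt(-0.0), rootn_neg2_reference(-0.0))
```

Both mismatches occur only at a negative zero input, and the second one only shows up as the sign of an infinite result, which is consistent with gating the folds on nsz/ninf.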
Delta File
+51 -28 llvm/test/CodeGen/AMDGPU/amdgpu-simplify-libcall-rootn.ll
+20 -12 llvm/test/CodeGen/AMDGPU/amdgpu-simplify-libcall-rootn-fast.ll
+7 -2 llvm/lib/Target/AMDGPU/AMDGPULibCalls.cpp
+78 -42 3 files

LLVM/project 7588c95 compiler-rt/test/builtins/Unit lit.cfg.py

Revert "[Compiler-rt][test] Fix circular link dependency between builtins and libc" (#203152)

Reverts llvm/llvm-project#199482 due to failures when it's used on
platforms with non-ELF linkers. The patch needs additional guards, but
it's not immediately clear which platform linkers support the required
options.
Delta File
+1 -3 compiler-rt/test/builtins/Unit/lit.cfg.py
+1 -3 1 files

LLVM/project 2feff1f compiler-rt/test/builtins/Unit lit.cfg.py

Revert "[Compiler-rt][test] Fix circular link dependency between builtins and…"

This reverts commit 8b4902300521d4a0980d9d35210c02b405f0df86.
Delta File
+1 -3 compiler-rt/test/builtins/Unit/lit.cfg.py
+1 -3 1 files

LLVM/project b19f4db llvm/lib/Target/AMDGPU EvergreenInstructions.td AMDGPUInstrInfo.td

AMDGPU: Remove AMDGPUbfm (#203148)

It wasn't actually used. We select [SV]_BFM_B32 by directly matching
shift-based patterns.
Delta File
+1 -4 llvm/lib/Target/AMDGPU/EvergreenInstructions.td
+0 -3 llvm/lib/Target/AMDGPU/AMDGPUInstrInfo.td
+1 -2 llvm/lib/Target/AMDGPU/SOPInstructions.td
+2 -9 3 files

LLVM/project 1cda91d llvm/lib/Demangle DLangDemangle.cpp

[Demangle] Fix leak of temporary TypeBuf buffer in DLangDemangle (#203116)

Detected by sanitizer
https://lab.llvm.org/buildbot/#/builders/55/builds/28902 after merge.
Delta File
+1 -0 llvm/lib/Demangle/DLangDemangle.cpp
+1 -0 1 files

LLVM/project 4b97a8d llvm/lib/Target/AMDGPU GCNHazardRecognizer.cpp AMDGPU.td, llvm/test/CodeGen/AMDGPU wmma-hazards-gfx1250-w32.mir wmma-coexecution-valu-hazards.mir

[AMDGPU] Handle gfx1251 wmma hazard

The generic target is affected too, in a pessimistic way.
Delta File
+1,537 -0 llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx1250-w32.mir
+895 -0 llvm/test/CodeGen/AMDGPU/wmma-coexecution-valu-hazards.mir
+33 -10 llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
+42 -0 llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx1251-w32.mir
+7 -1 llvm/lib/Target/AMDGPU/AMDGPU.td
+2,514 -11 5 files

LLVM/project d67eca2 llvm/include/llvm/IR IntrinsicsAMDGPU.td, llvm/lib/Target/AMDGPU AMDGPUISelDAGToDAG.cpp

[AMDGPU] Intrinsic and codegen for wmma_f64_16x16x4_f64
Delta File
+145 -0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wmma.imod.gfx1251.w32.ll
+144 -0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wmma.imm.gfx1251.w32.ll
+61 -0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wmma.gfx1251.w32.ll
+23 -0 llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
+7 -0 llvm/test/Analysis/UniformityAnalysis/AMDGPU/intrinsics.ll
+5 -1 llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+385 -1 2 files not shown
+390 -1 8 files

LLVM/project b9b64c5 clang/include/clang/Basic BuiltinsAMDGPUDocs.td BuiltinsAMDGPU.td, clang/lib/CodeGen/TargetBuiltins AMDGPU.cpp

[AMDGPU] Builtin support for wmma_f64_16x16x4_f64
Delta File
+19 -0 clang/test/CodeGenOpenCL/builtins-amdgcn-gfx1251-wmma-w32.cl
+17 -0 clang/test/SemaOpenCL/builtins-amdgcn-error-gfx1251-wmma-w32-param.cl
+15 -0 clang/include/clang/Basic/BuiltinsAMDGPUDocs.td
+6 -0 clang/include/clang/Basic/BuiltinsAMDGPU.td
+5 -0 clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
+2 -2 clang/test/CodeGenCXX/dynamic-cast-address-space.cpp
+64 -2 3 files not shown
+68 -4 9 files

LLVM/project 77825c8 llvm/lib/Target/AMDGPU SISchedule.td GCNProcessors.td

[AMDGPU] Add gfx1251 speed model

Adjust the generic speed model to account for the slowest case.
Delta File
+60 -5 llvm/lib/Target/AMDGPU/SISchedule.td
+2 -2 llvm/lib/Target/AMDGPU/GCNProcessors.td
+62 -7 2 files

LLVM/project 674fc1a llvm/lib/Target/AMDGPU VOP3PInstructions.td AMDGPU.td, llvm/test/MC/AMDGPU gfx1251_asm_wmma_w32.s gfx1251_asm_wmma_w32_err.s

[AMDGPU] MC support for v_wmma_f64_16x16x4_f64
Delta File
+49 -0 llvm/test/MC/AMDGPU/gfx1251_asm_wmma_w32.s
+29 -0 llvm/test/MC/Disassembler/AMDGPU/gfx1251_dasm_wmma_w32.txt
+7 -0 llvm/lib/Target/AMDGPU/VOP3PInstructions.td
+7 -0 llvm/test/MC/AMDGPU/gfx1251_asm_wmma_w32_err.s
+5 -0 llvm/lib/Target/AMDGPU/AMDGPU.td
+97 -0 5 files

LLVM/project a36610c llvm/include/llvm Pass.h, llvm/include/llvm/IR PrintPasses.h

[CodeGen] Support --print-changed for legacy codegen IR passes (#202252)

--print-changed is only wired into MachineFunctionPass
(https://reviews.llvm.org/D133055), so the IR-level passes in the codegen
pipeline (atomic-expand, codegenprepare, etc.) are not reported.

Report them from FPPassManager/MPPassManager instead, via a new
Pass::printIRUnit hook that MachineFunctionPass overrides to print MIR.
Analyses are skipped, matching the new pass manager.
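
The mechanism described above can be sketched roughly as follows. This is a toy model, not LLVM code: SketchPass, print_ir_unit, and run_passes are hypothetical names, and the report strings are illustrative rather than the real --print-changed banners.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SketchPass:
    """Hypothetical stand-in for a legacy pass; not an LLVM class."""
    name: str
    run: Callable[[List[str]], None]  # mutates the IR unit in place
    is_analysis: bool = False

    def print_ir_unit(self, unit: List[str]) -> str:
        # Plays the role of the printIRUnit hook: render the unit as text.
        # A machine-level pass would override this to print MIR instead.
        return "\n".join(unit)


def run_passes(passes: List[SketchPass], unit: List[str]) -> List[str]:
    """Run passes in order; compare the printed unit before and after each."""
    report = []
    for p in passes:
        if p.is_analysis:
            p.run(unit)  # analyses run but are never reported
            continue
        before = p.print_ir_unit(unit)
        p.run(unit)
        after = p.print_ir_unit(unit)
        if before != after:
            report.append(f"changed by {p.name}:\n{after}")
        else:
            report.append(f"{p.name}: no change")
    return report


def fold_add_zero(unit: List[str]) -> None:
    """Toy transformation: replace the whole unit with a simplified body."""
    unit[:] = ["ret i32 %x"]


passes = [
    SketchPass("noop", lambda u: None),
    SketchPass("some-analysis", lambda u: None, is_analysis=True),
    SketchPass("fold-add-zero", fold_add_zero),
]
for entry in run_passes(passes, ["%a = add i32 %x, 0", "ret i32 %a"]):
    print(entry)
```

Doing the comparison in the pass manager rather than in each pass is what lets IR-level and machine-level passes share one reporting path in this model.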

Aided by Claude Opus 4.8
Delta File
+9 -63 llvm/lib/CodeGen/MachineFunctionPass.cpp
+66 -1 llvm/lib/IR/LegacyPassManager.cpp
+50 -0 llvm/lib/IR/PrintPasses.cpp
+34 -1 llvm/test/Other/print-changed-machine.ll
+7 -0 llvm/include/llvm/IR/PrintPasses.h
+5 -0 llvm/include/llvm/Pass.h
+171 -65 2 files not shown
+178 -65 8 files

LLVM/project 2ec6d5d llvm/lib/Target/AMDGPU GCNHazardRecognizer.cpp AMDGPU.td, llvm/test/CodeGen/AMDGPU wmma-hazards-gfx1250-w32.mir wmma-coexecution-valu-hazards.mir

[AMDGPU] Handle gfx1251 wmma hazard

The generic target is affected too, in a pessimistic way.
Delta File
+1,537 -0 llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx1250-w32.mir
+895 -0 llvm/test/CodeGen/AMDGPU/wmma-coexecution-valu-hazards.mir
+33 -10 llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
+42 -0 llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx1251-w32.mir
+7 -1 llvm/lib/Target/AMDGPU/AMDGPU.td
+2,514 -11 5 files

LLVM/project 0717455 llvm/lib/Target/AMDGPU GCNHazardRecognizer.cpp AMDGPU.td, llvm/test/CodeGen/AMDGPU wmma-hazards-gfx1250-w32.mir wmma-coexecution-valu-hazards.mir

[AMDGPU] Handle gfx1251 wmma hazard

The generic target is affected too, in a pessimistic way.
Delta File
+1,537 -0 llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx1250-w32.mir
+895 -0 llvm/test/CodeGen/AMDGPU/wmma-coexecution-valu-hazards.mir
+42 -0 llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx1251-w32.mir
+31 -8 llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
+7 -1 llvm/lib/Target/AMDGPU/AMDGPU.td
+2,512 -9 5 files

LLVM/project 2edc546 clang/include/clang/Basic BuiltinsAMDGPUDocs.td BuiltinsAMDGPU.td, clang/lib/CodeGen/TargetBuiltins AMDGPU.cpp

[AMDGPU] Builtin support for wmma_f64_16x16x4_f64
Delta File
+19 -0 clang/test/CodeGenOpenCL/builtins-amdgcn-gfx1251-wmma-w32.cl
+17 -0 clang/test/SemaOpenCL/builtins-amdgcn-error-gfx1251-wmma-w32-param.cl
+15 -0 clang/include/clang/Basic/BuiltinsAMDGPUDocs.td
+6 -0 clang/include/clang/Basic/BuiltinsAMDGPU.td
+5 -0 clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
+2 -2 clang/test/CodeGenCXX/dynamic-cast-address-space.cpp
+64 -2 3 files not shown
+68 -4 9 files

LLVM/project fe76f30 llvm/lib/Target/AMDGPU SISchedule.td GCNProcessors.td

[AMDGPU] Add gfx1251 speed model

Adjust the generic speed model to account for the slowest case.
Delta File
+60 -5 llvm/lib/Target/AMDGPU/SISchedule.td
+2 -2 llvm/lib/Target/AMDGPU/GCNProcessors.td
+62 -7 2 files

LLVM/project e5ff461 llvm/include/llvm/IR IntrinsicsAMDGPU.td, llvm/lib/Target/AMDGPU AMDGPUISelDAGToDAG.cpp

[AMDGPU] Intrinsic and codegen for wmma_f64_16x16x4_f64
Delta File
+145 -0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wmma.imod.gfx1251.w32.ll
+144 -0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wmma.imm.gfx1251.w32.ll
+61 -0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wmma.gfx1251.w32.ll
+23 -0 llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
+7 -0 llvm/test/Analysis/UniformityAnalysis/AMDGPU/intrinsics.ll
+5 -1 llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+385 -1 2 files not shown
+390 -1 8 files

LLVM/project 097a528 llvm/lib/Target/AMDGPU VOP3PInstructions.td AMDGPU.td, llvm/test/MC/AMDGPU gfx1251_asm_wmma_w32.s gfx1251_asm_wmma_w32_err.s

[AMDGPU] MC support for v_wmma_f64_16x16x4_f64
Delta File
+49 -0 llvm/test/MC/AMDGPU/gfx1251_asm_wmma_w32.s
+29 -0 llvm/test/MC/Disassembler/AMDGPU/gfx1251_dasm_wmma_w32.txt
+7 -0 llvm/lib/Target/AMDGPU/VOP3PInstructions.td
+7 -0 llvm/test/MC/AMDGPU/gfx1251_asm_wmma_w32_err.s
+5 -0 llvm/lib/Target/AMDGPU/AMDGPU.td
+97 -0 5 files

LLVM/project 91b9f3f llvm/lib/Target/AMDGPU EvergreenInstructions.td AMDGPUInstrInfo.td

AMDGPU: Remove AMDGPUbfm

It wasn't actually used. We select [SV]_BFM_B32 by directly matching
shift-based patterns.

commit-id:b5cd6327
Delta File
+1 -4 llvm/lib/Target/AMDGPU/EvergreenInstructions.td
+0 -3 llvm/lib/Target/AMDGPU/AMDGPUInstrInfo.td
+1 -2 llvm/lib/Target/AMDGPU/SOPInstructions.td
+2 -9 3 files

LLVM/project 1272df2 llvm/test/Transforms/SLPVectorizer/X86 runtime-alias-checks.ll

[SLP][NFC] Add tests with non-movable calls, NFC



Pull Request: https://github.com/llvm/llvm-project/pull/203140
Delta File
+240 -1 llvm/test/Transforms/SLPVectorizer/X86/runtime-alias-checks.ll
+240 -1 1 files

LLVM/project 3c7cea8 llvm/include/llvm/Target/GlobalISel Combine.td, llvm/test/CodeGen/AArch64/GlobalISel combine-or-and-xor.ll combine-or-and-xor.mir

Revert "[GlobalISel] Add `or_and_xor_to_or` pattern from SelectionDAG" (#203136)

Reverts llvm/llvm-project#201108
Delta File
+0 -213 llvm/test/CodeGen/AArch64/GlobalISel/combine-or-and-xor.ll
+0 -206 llvm/test/CodeGen/AArch64/GlobalISel/combine-or-and-xor.mir
+1 -40 llvm/include/llvm/Target/GlobalISel/Combine.td
+1 -459 3 files

LLVM/project 8b625b2 llvm/test/MC/RISCV rv32c-invalid.s xqcibm-invalid.s

[RISC-V] Add --implicit-check-not="error:" to a few tests

Ensures that the test checks for every error emitted by llvm-mc. To do this
we have to move the CHECK lines to the next line rather than the same line
since otherwise we get a false-positive match.
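
A toy model of why the CHECK lines had to move, assuming the assembler echoes the offending source line (including any trailing comment) under each diagnostic. The function filecheck_model is a simplified stand-in for FileCheck, and the instruction and error text are made up for illustration.

```python
def filecheck_model(output, checks, implicit_not):
    """Toy model of FileCheck with one --implicit-check-not pattern.

    Positive `checks` must match in order; `implicit_not` must not occur
    before, between, or after the matched spans.
    """
    pos = 0
    for pattern in checks:
        start = output.find(pattern, pos)
        if start < 0:
            return False  # a positive CHECK failed to match
        if implicit_not in output[pos:start]:
            return False  # unexpected text in the gap before this match
        pos = start + len(pattern)
    return implicit_not not in output[pos:]


def diagnostic(source_line):
    """Hypothetical assembler output: the message, then the echoed source line."""
    return f"t.s:1:1: error: invalid operand\n{source_line}\n^\n"


checks = ["error: invalid operand"]

# CHECK comment on the same line as the instruction: it is echoed in the output.
same_line = diagnostic("c.lwsp x0, 0(sp) # CHECK: error: invalid operand")
# CHECK comment moved to the next line: only the instruction is echoed.
next_line = diagnostic("c.lwsp x0, 0(sp)")

print(filecheck_model(same_line, checks, "error:"))
print(filecheck_model(next_line, checks, "error:"))
```

In the same-line layout the echoed comment itself contains "error:", so the implicit check fires even though every real diagnostic was matched.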

This adds a few missing CHECK lines in the xqcibm-invalid test and is needed
to minimize the diff in one of my subsequent commits.

Pull Request: https://github.com/llvm/llvm-project/pull/203091
Delta File
+112 -57 llvm/test/MC/RISCV/rv32c-invalid.s
+44 -23 llvm/test/MC/RISCV/xqcibm-invalid.s
+36 -19 llvm/test/MC/RISCV/rv64c-invalid.s
+20 -11 llvm/test/MC/RISCV/rvc-hints-invalid.s
+212 -110 4 files

LLVM/project 9617b2a mlir/lib/Dialect/XeGPU/Transforms XeGPUSgToLaneDistribute.cpp, mlir/test/Dialect/XeGPU sg-to-lane-distribute-unit.mlir

[MLIR][XeGPU] Support partial subgroup lane distribution for convert_layout (#201667)

Add lowering support in XeGPUSgToLaneDistribute for values that are
distributed across only a fraction of the subgroup.

- SgToLaneConvertLayout now lowers a rank-2 xegpu.convert_layout that
  shrinks the lane layout along the outer (distributed) dimension while
  keeping lane_data unchanged (e.g. [16, 1] -> [8, 1]). The partial-subgroup
  case is detected directly in the pattern: equal order, rank 2, unit inner
  lane layout, and a genuinely distributed outer lane layout (> 1, which also
  rules out the degenerate [1, 1] layout). Because the data is no longer
  replicated in every lane, it is gathered across lanes and the distributed
  outer dimension is doubled when the lane count is halved.

- The cross-lane gather is factored into a dedicated helper,
  shuffleDataAsLaneLayoutChange(): it bitcasts the source to i32, issues
  gpu.shuffle up to fetch the values from the dropped lanes, and concatenates

    [9 lines not shown]
Delta File
+112 -6 mlir/lib/Dialect/XeGPU/Transforms/XeGPUSgToLaneDistribute.cpp
+68 -0 mlir/test/Dialect/XeGPU/sg-to-lane-distribute-unit.mlir
+180 -6 2 files

LLVM/project afe5014 llvm/test/CodeGen/RISCV clmul.ll clmulr.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll clmul-sdnode.ll

rebase, update name of internal feature flag

Created using spr 1.3.8-beta.1
Delta File
+38,494 -84,026 llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+22,388 -22,086 llvm/test/CodeGen/RISCV/rvv/clmul-sdnode.ll
+19,087 -24,391 llvm/test/CodeGen/RISCV/clmul.ll
+10,473 -12,572 llvm/test/CodeGen/RISCV/clmulr.ll
+10,281 -12,374 llvm/test/CodeGen/RISCV/clmulh.ll
+8,361 -8,920 llvm/test/CodeGen/RISCV/rvv/expandload.ll
+109,084 -164,369 5,547 files not shown
+533,116 -396,875 5,553 files

LLVM/project 47a1d53 llvm/test/CodeGen/RISCV clmul.ll clmulr.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll clmul-sdnode.ll

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.8-beta.1

[skip ci]
Delta File
+38,494 -84,026 llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+22,388 -22,086 llvm/test/CodeGen/RISCV/rvv/clmul-sdnode.ll
+19,087 -24,391 llvm/test/CodeGen/RISCV/clmul.ll
+10,473 -12,572 llvm/test/CodeGen/RISCV/clmulr.ll
+10,281 -12,374 llvm/test/CodeGen/RISCV/clmulh.ll
+8,361 -8,920 llvm/test/CodeGen/RISCV/rvv/expandload.ll
+109,084 -164,369 5,540 files not shown
+533,158 -396,844 5,546 files