LLVM/project 79d9ae7 llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp, llvm/test/CodeGen/AMDGPU/GlobalISel regbankselect-amdgpu-wave-address.mir

[AMDGPU][GISel] Add RegBankLegalize support for G_AMDGPU_WAVE_ADDRESS (#167456)

Delta   File
+3 -4   llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgpu-wave-address.mir
+2 -0   llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+5 -4   2 files

LLVM/project 2acd652 lldb/test/API/tools/lldb-dap/stackTraceCompilerGeneratedCode TestDAP_stackTraceCompilerGeneratedCode.py main.c, lldb/tools/lldb-dap DAP.cpp ProtocolUtils.cpp

Fix lldb-dap non-leaf frame source resolution issue (#165944)

Summary
-------

While dogfooding lldb-dap, I observed that VSCode frequently displays
certain stack frames as greyed out. Although these frames have valid
debug information, double-clicking them shows disassembly instead of
source code. However, running `bt` from the LLDB command line correctly
displays source file and line information for these same frames,
indicating this is an lldb-dap specific issue.

Root Cause
----------

Investigation revealed that `DAP::ResolveSource()` incorrectly uses a
frame's PC address directly to determine whether valid source line
information exists. This approach works for leaf frames, but fails for
non-leaf (caller) frames where the PC points to the return address

    [29 lines not shown]
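
The root cause described above can be illustrated with a toy line table. This is only a sketch under stated assumptions: `LineEntry`, `LINE_TABLE`, `lookup`, and `resolve_line` are hypothetical names, not lldb-dap APIs, and backing the address up by one for non-leaf frames is a common debugger convention assumed here, since the rest of the commit message is not shown.

```python
from bisect import bisect_right
from typing import NamedTuple, Optional


class LineEntry(NamedTuple):
    start: int  # first address covered by this entry
    end: int    # one past the last address covered
    line: int   # source line; 0 marks compiler-generated code


# Hypothetical line table: a call on line 12 occupies [0x100, 0x108);
# the bytes after it belong to compiler-generated code (line 0).
LINE_TABLE = [
    LineEntry(0x100, 0x108, 12),
    LineEntry(0x108, 0x110, 0),
]


def lookup(addr: int) -> Optional[LineEntry]:
    """Return the line entry whose address range contains addr, if any."""
    starts = [entry.start for entry in LINE_TABLE]
    index = bisect_right(starts, addr) - 1
    if index >= 0 and LINE_TABLE[index].start <= addr < LINE_TABLE[index].end:
        return LINE_TABLE[index]
    return None


def resolve_line(pc: int, is_leaf: bool) -> int:
    """Resolve a frame's source line; 0 means no usable source line.

    For a non-leaf frame the PC is the return address, which may already
    belong to the next line entry, so the lookup uses pc - 1 (assumption).
    """
    lookup_addr = pc if is_leaf else pc - 1
    entry = lookup(lookup_addr)
    return entry.line if entry else 0
```

With a return address of `0x108`, looking up the PC directly lands in the line-0 entry, while looking up `pc - 1` lands in the call on line 12, which mirrors the greyed-out-frame symptom described in the summary.
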
Delta      File
+66 -0     lldb/test/API/tools/lldb-dap/stackTraceCompilerGeneratedCode/TestDAP_stackTraceCompilerGeneratedCode.py
+19 -0     lldb/test/API/tools/lldb-dap/stackTraceCompilerGeneratedCode/main.c
+7 -5      lldb/tools/lldb-dap/DAP.cpp
+3 -4      lldb/tools/lldb-dap/ProtocolUtils.cpp
+3 -0      lldb/test/API/tools/lldb-dap/stackTraceCompilerGeneratedCode/Makefile
+2 -1      lldb/tools/lldb-dap/ProtocolUtils.h
+100 -10   6 files

LLVM/project 53455f7 llvm/test/CodeGen/AMDGPU flat-saddr-atomics.ll memory-legalizer-local-workgroup.ll

[AMDGPU] Insert `s_wait_xcnt(0)` before atomics to work around write-combining miss hazard

This patch adds a workaround for a hazard on GFX1250: it inserts an `s_wait_xcnt(0)` instruction before any atomic operation that might write to memory.

Fixes SWDEV-543703.
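
The insertion rule in this message can be sketched on a toy instruction list. This is not the LLVM implementation: `Inst`, `insert_xcnt_waits`, and the opcode strings other than `s_wait_xcnt(0)` are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Inst:
    opcode: str              # placeholder mnemonic, not a real ISA name
    is_atomic: bool = False  # instruction is an atomic operation
    may_write: bool = False  # instruction might write to memory


# The wait instruction named in the commit message.
WAIT = Inst("s_wait_xcnt(0)")


def insert_xcnt_waits(block: List[Inst]) -> List[Inst]:
    """Return a copy of block with a wait before each writing atomic."""
    result: List[Inst] = []
    for inst in block:
        if inst.is_atomic and inst.may_write:
            result.append(WAIT)
        result.append(inst)
    return result
```

In this sketch a plain store or load is left alone; only instructions that are both atomic and potentially writing get the extra wait.
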
Delta       File
+192 -0     llvm/test/CodeGen/AMDGPU/flat-saddr-atomics.ll
+128 -0     llvm/test/CodeGen/AMDGPU/memory-legalizer-local-workgroup.ll
+128 -0     llvm/test/CodeGen/AMDGPU/memory-legalizer-local-wavefront.ll
+128 -0     llvm/test/CodeGen/AMDGPU/memory-legalizer-local-singlethread.ll
+128 -0     llvm/test/CodeGen/AMDGPU/memory-legalizer-local-cluster.ll
+128 -0     llvm/test/CodeGen/AMDGPU/memory-legalizer-local-agent.ll
+832 -0     40 files not shown
+1,680 -1   46 files

LLVM/project 0ab946b mlir/lib/ExecutionEngine CMakeLists.txt

hidden vis
Delta   File
+6 -1   mlir/lib/ExecutionEngine/CMakeLists.txt
+6 -1   1 file

LLVM/project 362119d llvm/utils/gn/secondary/libcxx/include BUILD.gn

[gn build] Port 5c3323a59fd2
Delta   File
+0 -1   llvm/utils/gn/secondary/libcxx/include/BUILD.gn
+0 -1   1 file

LLVM/project fcba304 llvm/lib/Target/AMDGPU SIInstrInfo.cpp SIInstrInfo.h

AMDGPU: Remove override of TargetInstrInfo::getRegClass (#159886)

This should not be overridable, and the special-case hacks
have been replaced with RegClassByHwMode.
Delta    File
+0 -12   llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+0 -3    llvm/lib/Target/AMDGPU/SIInstrInfo.h
+0 -15   2 files

LLVM/project 95dfe79 llvm/test/MC/MachO invalid-section-index.s

[MachO] Fix test failure. (#167598)

Add a `REQUIRES` line so that the `invalid-section-index.s` test does not
run in environments without AArch64 support.
Delta   File
+3 -1   llvm/test/MC/MachO/invalid-section-index.s
+3 -1   1 file

LLVM/project 49ca1c9 llvm/tools/llvm-lto2 llvm-lto2.cpp

Revert llvm-lto2 case
Delta   File
+1 -0   llvm/tools/llvm-lto2/llvm-lto2.cpp
+1 -0   1 file

LLVM/project 79423fd llvm/tools/opt optdriver.cpp

Revert opt case
Delta   File
+1 -0   llvm/tools/opt/optdriver.cpp
+1 -0   1 file

LLVM/project 7b7b462 clang-tools-extra/clang-tidy/tool ClangTidyMain.cpp, lld/tools/lld lld.cpp

Revert failing test cases
Delta   File
+1 -0   lld/tools/lld/lld.cpp
+1 -0   clang-tools-extra/clang-tidy/tool/ClangTidyMain.cpp
+2 -0   2 files

LLVM/project df04182 clang-tools-extra/clang-tidy/tool ClangTidyMain.cpp, lld/tools/lld lld.cpp

tools: Remove unused PluginLoader includes

As far as I can tell there are 2 parallel plugin mechanisms.
opt -load=plugin does not work, and is ignored. opt -load-pass-plugin
does work. The only user of PluginLoader appears to be bugpoint.
Delta   File
+0 -1   llvm/tools/llvm-lto2/llvm-lto2.cpp
+0 -1   llvm/tools/opt/optdriver.cpp
+0 -1   llvm/tools/llc/llc.cpp
+0 -1   clang-tools-extra/clang-tidy/tool/ClangTidyMain.cpp
+0 -1   lld/tools/lld/lld.cpp
+0 -1   llvm/tools/lli/lli.cpp
+0 -6   6 files

LLVM/project 90edf23 mlir/include/mlir/Conversion/ArithToAPFloat ArithToAPFloat.h, mlir/lib/Conversion/ArithToAPFloat ArithToAPFloat.cpp

Reapply "Reapply "Reapply "[mlir] Add FP software implementation lowering pass: `arith-to-apfloat` (#166618)" (#167431)"" (#167549)

This reverts commit e67ac07881c215c91fe1ec714be6f3582178073c.
Delta     File
+158 -0   mlir/lib/Conversion/ArithToAPFloat/ArithToAPFloat.cpp
+128 -0   mlir/test/Conversion/ArithToApfloat/arith-to-apfloat.mlir
+89 -0    mlir/lib/ExecutionEngine/APFloatWrappers.cpp
+36 -0    mlir/test/Integration/Dialect/Arith/CPU/test-apfloat-emulation.mlir
+25 -0    mlir/lib/Dialect/Func/Utils/Utils.cpp
+21 -0    mlir/include/mlir/Conversion/ArithToAPFloat/ArithToAPFloat.h
+457 -0   11 files not shown
+542 -0   17 files

LLVM/project ce17599 mlir/python/mlir/dialects scf.py, mlir/test/python/dialects scf.py

[MLIR][Python] Add wrappers for scf.index_switch (#167458)

The C++ index switch op has utilities for `getCaseBlock(int i)` and
`getDefaultBlock()`, so these have been added.
Optional body builder args have been added: one for the default case and
one for the switch cases.
Delta     File
+122 -4   mlir/test/python/dialects/scf.py
+75 -0    mlir/python/mlir/dialects/scf.py
+1 -3     mlir/test/python/ir/operation.py
+198 -7   3 files

LLVM/project b07f8b0 llvm/test/ExecutionEngine/JITLink/systemz ELF_systemz_reloc_pcdbl.s

[JITLINK] Fix large offset issue (#167600)

Removed the large-offset test. It caused an issue on 32-bit ARM because
of the large offset.

Co-authored-by: anoopkg6 <anoopkg6 at github.com>
Delta     File
+10 -20   llvm/test/ExecutionEngine/JITLink/systemz/ELF_systemz_reloc_pcdbl.s
+10 -20   1 file

LLVM/project 730fc7e llvm/lib/CodeGen/SelectionDAG SelectionDAG.cpp, llvm/test/CodeGen/AArch64 frem-power2.ll

DAG: exp opcodes cannotBeOrderedNegativeFP
Delta     File
+3 -15    llvm/test/CodeGen/AArch64/frem-power2.ll
+11 -1    llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+14 -16   2 files

LLVM/project cd2660d llvm/include/llvm/IR RuntimeLibcalls.td

Remove __chkstk_ms
Delta   File
+0 -2   llvm/include/llvm/IR/RuntimeLibcalls.td
+0 -2   1 file

LLVM/project 0d91798 llvm/include/llvm/IR RuntimeLibcalls.td, llvm/lib/Target/AArch64 AArch64ISelLowering.cpp AArch64PrologueEpilogue.cpp

RuntimeLibcalls: Add entries for stack probe functions

Stop hardcoding different variants of __chkstk and query the
name through RuntimeLibcalls.
Delta     File
+35 -2    llvm/include/llvm/IR/RuntimeLibcalls.td
+9 -3     llvm/lib/Target/ARM/ARMFrameLowering.cpp
+7 -2     llvm/lib/Target/ARM/ARMISelLowering.cpp
+6 -2     llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+4 -3     llvm/lib/Target/X86/X86ISelLowering.cpp
+6 -1     llvm/lib/Target/AArch64/AArch64PrologueEpilogue.cpp
+67 -13   1 files not shown
+67 -19   7 files

LLVM/project cd456a2 llvm/test/CodeGen/AArch64 frem-power2.ll

AArch64: Add baseline test for treating exp as known positive
Delta    File
+92 -0   llvm/test/CodeGen/AArch64/frem-power2.ll
+92 -0   1 file

LLVM/project 8196459 mlir/lib/Dialect/Tensor/Transforms RuntimeOpVerification.cpp, mlir/test/Integration/Dialect/Tensor extract_slice-runtime-verification.mlir

[mlir][tensor] Fix runtime verification for tensor.extract_slice for empty tensor slices (#166569)

I hit another runtime verification issue (similar to
https://github.com/llvm/llvm-project/pull/164878) while working with
TFLite models. The verifier is incorrectly rejecting
`tensor.extract_slice` operations when extracting an empty slice
(size=0) that starts exactly at the tensor boundary.

The current runtime verification unconditionally enforces `offset <
dim_size`. This makes sense for non-empty slices, but it's too strict
for empty slices, causing false positives that lead to spurious runtime
assertions.
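
The difference between the strict check and a check that tolerates empty slices can be sketched in a few lines. This is an illustrative model only: the function names are hypothetical, and the relaxed rule shown (allowing `offset == dim_size` when `size == 0`) is an assumption, because the exact condition emitted by the patch is in the part of the message that is not shown.

```python
def offset_in_bounds_strict(offset: int, dim_size: int) -> bool:
    """Model of the check described above: always require offset < dim_size."""
    return 0 <= offset < dim_size


def offset_in_bounds_relaxed(offset: int, size: int, dim_size: int) -> bool:
    """Assumed relaxed rule: an empty slice may start at the boundary."""
    if size == 0:
        # No element is read, so offset == dim_size is harmless.
        return 0 <= offset <= dim_size
    return 0 <= offset < dim_size
```

For a dimension of size 10, an empty slice at offset 10 fails the strict check but passes the relaxed one, while a non-empty slice at offset 10 is rejected by both.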

**Simple example that demonstrates the issue:**

```mlir
func.func @extract_empty_slice(%tensor: tensor<?xf32>, %offset: index, %size: index) {
  // When called with: tensor size=10, offset=10, size=0
  // Runtime verification fails: "offset 0 is out-of-bounds"
```

    [111 lines not shown]
Delta     File
+52 -33   mlir/lib/Dialect/Tensor/Transforms/RuntimeOpVerification.cpp
+9 -0     mlir/test/Integration/Dialect/Tensor/extract_slice-runtime-verification.mlir
+61 -33   2 files

LLVM/project a664f58 mlir/lib/Dialect/MemRef/Transforms RuntimeOpVerification.cpp, mlir/test/Integration/Dialect/MemRef subview-runtime-verification.mlir

[mlir][memref] Fix runtime verification for memref.subview for empty memref subviews (#166581)

This PR applies the same fix from #166569 to `memref.subview`. That PR
fixed the issue for `tensor.extract_slice`, and this one addresses the
identical problem for `memref.subview`.

The runtime verification for `memref.subview` incorrectly rejects valid
empty subviews (size=0) starting at the memref boundary.

**Example that demonstrates the issue:**

```mlir
func.func @subview_with_empty_slice(%memref: memref<10x4x1xf32, strided<[?, ?, ?], offset: ?>>,
                                     %dim_0: index,
                                     %dim_1: index,
                                     %dim_2: index,
                                     %offset: index) {
    // When called with: offset=10, dim_0=0, dim_1=4, dim_2=1
    // Runtime verification fails: "offset 0 is out-of-bounds"
```

    [22 lines not shown]
Delta     File
+55 -34   mlir/lib/Dialect/MemRef/Transforms/RuntimeOpVerification.cpp
+15 -0    mlir/test/Integration/Dialect/MemRef/subview-runtime-verification.mlir
+70 -34   2 files

LLVM/project b34bc0a llvm/lib/Target/AMDGPU SIInstrInfo.cpp SIInstrInfo.h

AMDGPU: Remove override of TargetInstrInfo::getRegClass

This should not be overridable, and the special-case hacks
have been replaced with RegClassByHwMode.
Delta    File
+0 -12   llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+0 -3    llvm/lib/Target/AMDGPU/SIInstrInfo.h
+0 -15   2 files

LLVM/project e3a9ac5 llvm/lib/Target/AMDGPU SIRegisterInfo.cpp SIFoldOperands.cpp

AMDGPU: Remove wrapper around TRI::getRegClass (#159885)

This shadows the member in the base class, but differs slightly
in behavior. The base method doesn't check for the invalid case.
Delta    File
+0 -11   llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+4 -3    llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
+3 -2    llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+0 -2    llvm/lib/Target/AMDGPU/SIRegisterInfo.h
+7 -18   4 files

LLVM/project 441e511 llvm/test/CodeGen/AMDGPU limit-coalesce.mir

AMDGPU: Update register class numbers in test (#167601)

Delta   File
+2 -2   llvm/test/CodeGen/AMDGPU/limit-coalesce.mir
+2 -2   1 file

LLVM/project c8a41c4 llvm/lib/Target/AMDGPU SIInstrInfo.cpp SIInstrInfo.h

AMDGPU: Remove override of TargetInstrInfo::getRegClass

This should not be overridable, and the special-case hacks
have been replaced with RegClassByHwMode.
Delta    File
+0 -12   llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+0 -3    llvm/lib/Target/AMDGPU/SIInstrInfo.h
+0 -15   2 files

LLVM/project 5bfad60 llvm/include/llvm/IR RuntimeLibcalls.td

Remove __chkstk_ms
Delta   File
+0 -2   llvm/include/llvm/IR/RuntimeLibcalls.td
+0 -2   1 file

LLVM/project 5b62057 llvm/include/llvm/IR RuntimeLibcalls.td, llvm/lib/Target/AArch64 AArch64ISelLowering.cpp AArch64PrologueEpilogue.cpp

RuntimeLibcalls: Add entries for stack probe functions

Stop hardcoding different variants of __chkstk and query the
name through RuntimeLibcalls.
Delta     File
+35 -2    llvm/include/llvm/IR/RuntimeLibcalls.td
+9 -3     llvm/lib/Target/ARM/ARMFrameLowering.cpp
+7 -2     llvm/lib/Target/ARM/ARMISelLowering.cpp
+6 -2     llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+6 -1     llvm/lib/Target/AArch64/AArch64PrologueEpilogue.cpp
+4 -3     llvm/lib/Target/X86/X86ISelLowering.cpp
+67 -13   1 files not shown
+67 -19   7 files

LLVM/project f5c71b3 llvm/lib/Target/AMDGPU SIRegisterInfo.cpp SIFoldOperands.cpp

AMDGPU: Remove wrapper around TRI::getRegClass

This shadows the member in the base class, but differs slightly
in behavior. The base method doesn't check for the invalid case.
Delta    File
+0 -11   llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+4 -3    llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
+3 -2    llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+0 -2    llvm/lib/Target/AMDGPU/SIRegisterInfo.h
+7 -18   4 files

LLVM/project 2bf9278 llvm/lib/Target/AMDGPU SIRegisterInfo.td AMDGPU.td, llvm/test/CodeGen/AMDGPU local-stack-alloc-add-references.gfx8.mir coalesce-copy-to-agpr-to-av-registers.mir

AMDGPU: Start using RegClassByHwMode for wavesize operands (#159884)

This eliminates the pseudo register classes used to hack the
wave register class; they are now replaced with RegClassByHwMode.
Most of the diff is register class ID renumbering.
Delta       File
+180 -180   llvm/test/CodeGen/AMDGPU/local-stack-alloc-add-references.gfx8.mir
+120 -120   llvm/test/CodeGen/AMDGPU/coalesce-copy-to-agpr-to-av-registers.mir
+90 -90     llvm/test/CodeGen/AMDGPU/local-stack-alloc-add-references.gfx9.mir
+71 -24     llvm/lib/Target/AMDGPU/SIRegisterInfo.td
+18 -18     llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr-subreg-src2-chain.mir
+33 -2      llvm/lib/Target/AMDGPU/AMDGPU.td
+512 -434   30 files not shown
+672 -603   36 files

LLVM/project 9821ae7 llvm/lib/Transforms/Scalar MemCpyOptimizer.cpp, llvm/test/Transforms/MemCpyOpt memset-memcpy-dbgloc.ll

[MemCpyOpt][profcheck] Set `unknown` branch weights for certain selects
Delta   File
+5 -3   llvm/test/Transforms/MemCpyOpt/memset-memcpy-dbgloc.ll
+3 -0   llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
+8 -3   2 files

LLVM/project 196ea57 .github/workflows libclang-abi-tests.yml

workflows/libclang-abi-tests: Use new container (#167459)

Delta    File
+4 -15   .github/workflows/libclang-abi-tests.yml
+4 -15   1 file