LLVM/project 8a8c0cd mlir/lib/Dialect/MemRef/Transforms FoldMemRefAliasOps.cpp, mlir/test/Dialect/MemRef fold-memref-alias-ops.mlir

[mlir][MemRef] Make fold-memref-alias-ops use memref interfaces

This replaces the large switch-cases and operation-specific patterns
in FoldMemRefAliasOps with patterns that use the new
IndexedAccessOpInterface and IndexedMemCopyOpInterface, which will
allow us to remove the memref transforms' dependency on the NVGPU
dialect.

This also resolves some bugs and potential unsoundness:
1. We will no longer fold expand_shape into vector.load or
vector.transfer_read in cases where that would alter the strides
between dimensions in multi-dimensional loads. For example, if we have
a `vector.load %e[%i, %j, %k] : memref<8x8x9xf32>, vector<2x3xf32>`
where %e is
`expand_shape %m [[0], [1], [2, 3]] : memref<8x8x3x3xf32> to 8x8x9xf32`,
we will no longer fold in that shape, since that would change which
values are read (the previous patterns tried to account for this
but failed).
2. Subviews that have non-unit strides in positions that aren't being

    [15 lines not shown]
DeltaFile
+425 -440 mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp
+294 -3 mlir/test/Dialect/MemRef/fold-memref-alias-ops.mlir
+719 -443 2 files

LLVM/project f665cf3 llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeHelper.cpp, llvm/test/CodeGen/AMDGPU/GlobalISel unmerge-sgpr-s16.mir

AMDGPU/GlobalISel: Fix sgpr s16 unmerge lowering in regbanklegalize

Used to fail EXPENSIVE_CHECKS because of type mismatch.
DeltaFile
+5 -3 llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+4 -4 llvm/test/CodeGen/AMDGPU/GlobalISel/unmerge-sgpr-s16.mir
+9 -7 2 files

LLVM/project 4cae644 libc/config/baremetal config.json

[libc] Disable strong stack protector for baremetal (#179559)

The strong stack protector introduces references to __stack_chk_guard
symbols with GOT relocations on 32-bit ARM targets, which are not supported
in typical baremetal environments. This turns it off for baremetal.
DeltaFile
+5 -0 libc/config/baremetal/config.json
+5 -0 1 files

LLVM/project efee25d llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeHelper.cpp, llvm/test/CodeGen/AMDGPU/GlobalISel unmerge-sgpr-s16.mir

AMDGPU/GlobalISel: Fix sgpr s16 unmerge lowering in regbanklegalize

Used to fail EXPENSIVE_CHECKS because of type mismatch.
DeltaFile
+5 -3 llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+4 -4 llvm/test/CodeGen/AMDGPU/GlobalISel/unmerge-sgpr-s16.mir
+9 -7 2 files

LLVM/project 65cc695 llvm/include/llvm/CodeGen SelectionDAGISel.h, llvm/lib/CodeGen/SelectionDAG SelectionDAGISel.cpp

Reapply "[SelectionDAGISel] Separate the operand numbers in OPC_EmitNode/MorphNodeTo into their own table. (#178722)"

This includes a fix to use size_t instead of uint64_t in one place.
DeltaFile
+57 -6 llvm/utils/TableGen/DAGISelMatcherEmitter.cpp
+17 -8 llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+6 -2 llvm/test/TableGen/CPtrWildcard.td
+7 -0 llvm/test/TableGen/RegClassByHwMode.td
+3 -3 llvm/test/TableGen/dag-isel-regclass-emit-enum.td
+1 -1 llvm/include/llvm/CodeGen/SelectionDAGISel.h
+91 -20 6 files

LLVM/project 4f04770 llvm/test/CodeGen/RISCV/GlobalISel/instruction-select/rvv sext.mir zext.mir, llvm/test/CodeGen/RISCV/rvv vl-opt-op-info.mir vl-opt.mir

[RISCV] Print MIR comments for AVL and VEC_RM operands (#179542)

Such that we can now have something like:
```
PseudoVFMACC_VV_M2_E64 %1, %28, %28, 7 /* frm=dyn */, %21 /* vl */, 6 /* e64 */, 0 /* tu, mu */
```
or
```
PseudoVFMACC_VV_M2_E64 %1, %28, %28, 7 /* frm=dyn */, -1 /* vl=VLMAX */, 6 /* e64 */, 0 /* tu, mu */
```
Hopefully this could make reading RISC-V MIR (a little) less painful.
DeltaFile
+414 -414 llvm/test/CodeGen/RISCV/rvv/vl-opt-op-info.mir
+115 -115 llvm/test/CodeGen/RISCV/rvv/vl-opt.mir
+60 -60 llvm/test/CodeGen/RISCV/rvv/subregister-undef-early-clobber.mir
+56 -56 llvm/test/CodeGen/RISCV/GlobalISel/instruction-select/rvv/sext.mir
+56 -56 llvm/test/CodeGen/RISCV/GlobalISel/instruction-select/rvv/zext.mir
+56 -56 llvm/test/CodeGen/RISCV/GlobalISel/instruction-select/rvv/anyext.mir
+757 -757 50 files not shown
+1,232 -1,218 56 files

LLVM/project b0b9046 llvm/lib/Target/BPF BTFDebug.cpp BPFISelLowering.cpp

[BPF] Replace copy-assign by move-assign in llvm/lib/Target/BPF/ (#179462)

An SDLoc transitively contains a TrackingMDRef, which has a specialized
move constructor. It's more efficient to move elements into it instead of
copying them.

FileContent contains std::vector<...> values. It's more efficient to
move than to copy the whole vector.
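
The difference between the two assignment forms can be sketched as follows. This is a minimal illustration, not code from the patch: `FileLines`, `assignByCopy`, and `assignByMove` are hypothetical names standing in for a record that owns vector data.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a record that owns vector data.
struct FileLines {
  std::vector<std::string> Lines;
};

// Copy-assign: every string in Src is duplicated; Src is left unchanged.
inline void assignByCopy(FileLines &Dst, const FileLines &Src) { Dst = Src; }

// Move-assign: Dst takes over Src's storage instead of duplicating it.
// Src remains valid afterwards, but its contents are unspecified.
inline void assignByMove(FileLines &Dst, FileLines &&Src) {
  Dst = std::move(Src);
}
```

A move only pays off when the source is not read again afterwards, which is presumably why the change is limited to assignments from values that are about to go out of use.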
DeltaFile
+1 -1 llvm/lib/Target/BPF/BTFDebug.cpp
+1 -1 llvm/lib/Target/BPF/BPFISelLowering.cpp
+2 -2 2 files

LLVM/project d0ee00b llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeHelper.cpp, llvm/test/CodeGen/AMDGPU/GlobalISel unmerge-sgpr-s16.mir

AMDGPU/GlobalISel: Fix sgpr s16 unmerge lowering in regbanklegalize

Used to fail EXPENSIVE_CHECKS because of type mismatch.
DeltaFile
+5 -3 llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+4 -4 llvm/test/CodeGen/AMDGPU/GlobalISel/unmerge-sgpr-s16.mir
+9 -7 2 files

LLVM/project 254b3b1 mlir/lib/TableGen AttrOrTypeDef.cpp, mlir/test/IR test-verifiers-type.mlir

[mlir][tblgen] Add PredTypeTrait/PredAttrTrait support (#169153)

This patch adds support for `PredTypeTrait` and `PredAttrTrait` in type
and attribute definitions, enabling declarative predicate-based
verification similar to how `PredOpTrait` works for operations.

  ## Motivation

In 802bf02 (from 2021), `PredTypeTrait`/`PredAttrTrait` were defined in
TableGen but not implemented in the code generator. Using them causes
mlir-tblgen to crash with an assertion failure when trying to cast
`PredTrait` to `InterfaceTrait`. This patch fixes the crash and
implements the actual verification code generation.

  ## Usage

Use `$paramName` syntax in predicates to reference type/attribute
parameters:


    [15 lines not shown]
DeltaFile
+48 -0 mlir/test/IR/test-verifiers-type.mlir
+30 -6 mlir/tools/mlir-tblgen/AttrOrTypeDefGen.cpp
+30 -0 mlir/test/lib/Dialect/Test/TestTypeDefs.td
+5 -3 mlir/lib/TableGen/AttrOrTypeDef.cpp
+113 -9 4 files

LLVM/project 43faefd llvm/lib/Transforms/IPO ArgumentPromotion.cpp, llvm/test/Transforms/ArgumentPromotion dbg.ll

[ArgPromotion] Add DW_CC_nocall to DISubprogram (#178973)

The ArgumentPromotion pass may change function signatures. If this happens
and debuginfo is enabled, adding DW_CC_nocall allows DWARF to generate
    DW_AT_calling_convention        (DW_CC_nocall)
for DW_TAG_subprogram.
DeadArgumentElimination ([1]) already has a similar implementation.

The pahole tool ([2]) is used in the Linux kernel build to generate vmlinux
BTF. One of its inputs is the Linux kernel DWARF. Currently, pahole
checks *all* DW_TAG_subprogram functions to find whether the source
signature matches the architecture ABI or not. On a mismatch, pahole will
try to adjust those parameters. See [3]
and function parameter__new().

The Linux kernel typically has ~65K functions, and roughly 1100 functions
may have their signatures changed by compiler optimization. Without
DW_CC_nocall,
the signatures of all 64K functions will be checked in parameter__new().

    [34 lines not shown]
DeltaFile
+16 -1 llvm/test/Transforms/ArgumentPromotion/dbg.ll
+11 -0 llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
+27 -1 2 files

LLVM/project d835071 mlir/lib/Conversion/GPUToROCDL LowerGpuOpsToROCDLOps.cpp, mlir/test/Conversion/GPUToROCDL gpu-to-rocdl-subgroup-id.mlir

[mlir] GPUToROCDL: lower `gpu.subgroup_id` to the intrinsic where possible (#179422)

Lower `gpu.subgroup_id` to the `wave.id` intrinsic on gfx12+, and to
`linearized_thread_id / subgroup_size` on older targets.
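
The fallback formula can be sketched in plain C++ as below. This is only an illustration of the arithmetic, not the lowering itself: the `Dim3` struct, the function names, and the x-fastest linearization order are assumptions made for the example.

```cpp
#include <cstdint>

// Hypothetical 3-D index/extent type used only for this sketch.
struct Dim3 {
  uint32_t x, y, z;
};

// Linearize a 3-D thread index within a workgroup, with x varying fastest
// (assumed ordering).
constexpr uint32_t linearizedThreadId(Dim3 tid, Dim3 blockDim) {
  return tid.x + blockDim.x * (tid.y + blockDim.y * tid.z);
}

// Subgroup id as described for the older targets:
// linearized_thread_id / subgroup_size.
constexpr uint32_t subgroupId(Dim3 tid, Dim3 blockDim, uint32_t subgroupSize) {
  return linearizedThreadId(tid, blockDim) / subgroupSize;
}
```

For instance, with a 64x4x1 workgroup and a subgroup size of 32, thread (0, 1, 0) has linear id 64 and therefore lands in subgroup 2.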
DeltaFile
+63 -2 mlir/lib/Conversion/GPUToROCDL/LowerGpuOpsToROCDLOps.cpp
+40 -0 mlir/test/Conversion/GPUToROCDL/gpu-to-rocdl-subgroup-id.mlir
+103 -2 2 files

LLVM/project f9b5ab1 lldb/include/lldb/DataFormatters FormatterBytecode.h

[lldb] Add missing include guard in FormatterBytecode.h (#179528)

DeltaFile
+5 -0 lldb/include/lldb/DataFormatters/FormatterBytecode.h
+5 -0 1 files

LLVM/project f646131 llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeHelper.cpp, llvm/test/CodeGen/AMDGPU/GlobalISel unmerge-sgpr-s16.mir

AMDGPU/GlobalISel: Fix sgpr s16 unmerge lowering in regbanklegalize

Used to fail EXPENSIVE_CHECKS because of type mismatch.
DeltaFile
+5 -3 llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+4 -4 llvm/test/CodeGen/AMDGPU/GlobalISel/unmerge-sgpr-s16.mir
+9 -7 2 files

LLVM/project 7c29a09 utils/bazel/llvm-project-overlay/lldb BUILD.bazel

[bazel][lldb] Port #179355: data formatters location (#179552)

DeltaFile
+1 -2 utils/bazel/llvm-project-overlay/lldb/BUILD.bazel
+1 -2 1 files

LLVM/project 19cf75c llvm/include/llvm/CodeGen SelectionDAGISel.h, llvm/lib/CodeGen/SelectionDAG SelectionDAGISel.cpp

Revert "[SelectionDAGISel] Separate the operand numbers in OPC_EmitNode/MorphNodeTo into their own table. (#178722)"

This reverts commit caab98284166784459a2fb76df7bca3f1d35e41e.

This is failing some build bots.
DeltaFile
+6 -57 llvm/utils/TableGen/DAGISelMatcherEmitter.cpp
+7 -16 llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+2 -6 llvm/test/TableGen/CPtrWildcard.td
+0 -7 llvm/test/TableGen/RegClassByHwMode.td
+3 -3 llvm/test/TableGen/dag-isel-regclass-emit-enum.td
+1 -1 llvm/include/llvm/CodeGen/SelectionDAGISel.h
+19 -90 6 files

LLVM/project 3ce60c4 utils/bazel/llvm-project-overlay/mlir BUILD.bazel

[bazel][mlir][NFC] Run buildifier (#179554)

DeltaFile
+0 -3 utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
+0 -3 1 files

LLVM/project 2dfd20d llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeHelper.cpp, llvm/test/CodeGen/AMDGPU/GlobalISel unmerge-sgpr-s16.mir

AMDGPU/GlobalISel: Fix sgpr s16 unmerge lowering in regbanklegalize

Used to fail EXPENSIVE_CHECKS because of type mismatch.
DeltaFile
+4 -4 llvm/test/CodeGen/AMDGPU/GlobalISel/unmerge-sgpr-s16.mir
+5 -3 llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+9 -7 2 files

LLVM/project a6c926b clang/lib/Analysis ThreadSafety.cpp, clang/test/SemaCXX warn-thread-safety-analysis.cpp

[Thread Safety Analysis] Fix a bug of context saving in alias-analysis (#178825)

Commit b4c98fcbe1504841203e610c351a3227f36c92a4 introduced
alias analysis and conservatively invalidates variable definitions at
function calls. For each invalidated argument, it creates and pushes a
context. So if multiple arguments are invalidated, more than one context
is pushed. However, the analysis expects one
context at the program point of a call, causing a context mismatch. This
issue could lead to false negatives.
For example,
```
    MyLock->Lock();          // 'MyLock' holds the lock
    Lock_t *Ptr = MyLock;    // 'Ptr' aliases with 'MyLock'
    // Before the fix, two contexts are saved and pushed at the call below, causing context mismatch later.
    escapeAliasMultiple(&Irrelevant, &Ptr);
    Ptr->Unlock();           // 'Ptr' may no longer hold the lock but the analyzer missed it due to context mismatch
```
This commit fixes the issue.


    [2 lines not shown]
DeltaFile
+10 -0 clang/test/SemaCXX/warn-thread-safety-analysis.cpp
+4 -3 clang/lib/Analysis/ThreadSafety.cpp
+14 -3 2 files

LLVM/project 4b4c32c llvm/utils/gn/secondary/compiler-rt/lib/builtins sources.gni

[gn] port e1f69ee8e847
DeltaFile
+3 -0 llvm/utils/gn/secondary/compiler-rt/lib/builtins/sources.gni
+3 -0 1 files

LLVM/project 72d86a5 llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeHelper.cpp, llvm/test/CodeGen/AMDGPU/GlobalISel unmerge-sgpr-s16.mir

AMDGPU/GlobalISel: Fix sgpr s16 unmerge lowering in regbanklegalize

Used to fail EXPENSIVE_CHECKS because of type mismatch.
DeltaFile
+5 -3 llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+4 -4 llvm/test/CodeGen/AMDGPU/GlobalISel/unmerge-sgpr-s16.mir
+9 -7 2 files

LLVM/project 6716acd llvm/test/TableGen TargetLibraryInfo.td, llvm/utils/TableGen/Basic TargetLibraryInfoEmitter.cpp

[NFC][TableGen] Adopt IfDefEmitter in TargetLibraryInfoEmitter (#179388)

DeltaFile
+29 -34 llvm/utils/TableGen/Basic/TargetLibraryInfoEmitter.cpp
+7 -7 llvm/test/TableGen/TargetLibraryInfo.td
+36 -41 2 files

LLVM/project e4c7ef2 llvm/utils/TableGen X86MnemonicTables.cpp

[NFC][TableGen] Adopt CodeGenHelpers in X86MnemonicEmitter (#179324)

Additionally, clean up the code a bit to use nested namespace definitions,
emitted per code section, and emit spaces instead of tabs.
DeltaFile
+24 -24 llvm/utils/TableGen/X86MnemonicTables.cpp
+24 -24 1 files

LLVM/project 078f6bd llvm/include/llvm/TableGen CodeGenHelpers.h, llvm/test/TableGen bare-minimum-psets.td pset-enum.td

[NFC][TableGen] Adopt CodeGenHelpers in RegInfoEmitter (#179017)

- Change `NamespaceEmitter` to allow emitting anonymous namespaces.
- Adopt IfDef and namespace emitters in RegInfoEmitter.
DeltaFile
+68 -93 llvm/utils/TableGen/RegisterInfoEmitter.cpp
+11 -0 llvm/include/llvm/TableGen/CodeGenHelpers.h
+3 -3 llvm/test/TableGen/bare-minimum-psets.td
+2 -2 llvm/test/TableGen/pset-enum.td
+84 -98 4 files

LLVM/project caab982 llvm/include/llvm/CodeGen SelectionDAGISel.h, llvm/lib/CodeGen/SelectionDAG SelectionDAGISel.cpp

[SelectionDAGISel] Separate the operand numbers in OPC_EmitNode/MorphNodeTo into their own table. (#178722)

The operand lists for these opcodes require 1 byte per operand, and the
operands are usually small values that fit in 3-4 bits. This makes their
storage inefficient. In addition, many EmitNode/MorphNodeTo entries in the
isel table use the same list of operand numbers.

This patch proposes to separate the operand lists into their own table
where they can be de-duplicated. The OPC_EmitNode/MorphNodeTo in the
main table will only store an index into this smaller table.
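
The de-duplication idea can be sketched as a small interning table. This is a simplified illustration rather than the TableGen emitter's actual code: the class name `OperandListTable`, its methods, and the choice of a byte offset as the stored index are all assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical helper: interns operand lists into one flat side table so
// that identical lists are stored only once.
class OperandListTable {
  std::vector<uint8_t> Flat;                      // concatenated unique lists
  std::map<std::vector<uint8_t>, size_t> Offsets; // list -> start offset

public:
  // Returns the offset of Ops in the flat table, appending it only if this
  // exact list has not been seen before.
  size_t intern(const std::vector<uint8_t> &Ops) {
    auto It = Offsets.find(Ops);
    if (It != Offsets.end())
      return It->second;
    size_t Offset = Flat.size();
    Flat.insert(Flat.end(), Ops.begin(), Ops.end());
    Offsets.emplace(Ops, Offset);
    return Offset;
  }

  size_t sizeInBytes() const { return Flat.size(); }
};
```

Each node in the main table would then carry only the returned offset, so a list shared by many nodes costs its bytes once in the side table.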

This is a reduced version of a suggestion from this very old FIXME.
https://github.com/llvm/llvm-project/blob/d8d4096c0be0a6a3248c8deae96608913a85debf/llvm/utils/TableGen/DAGISelMatcherGen.cpp#L1070

For RISC-V this reduces the main table from 1437353 bytes to 1276015
bytes plus a 929 byte operand list table. A savings of about 11%.

For X86 this reduces the main table from 719237 bytes to 623612 bytes
plus a 1042 byte operand list table. A savings of about 13%.
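
The savings can be recomputed from the byte counts quoted above. The helper name `savingsPercent` is hypothetical and exists only for this check.

```cpp
// Percentage saved when the old main table is replaced by a smaller main
// table plus a separate operand list table. All sizes are in bytes.
constexpr double savingsPercent(double oldMain, double newMain,
                                double operandTable) {
  return 100.0 * (oldMain - (newMain + operandTable)) / oldMain;
}
```

With the quoted sizes this gives a little over 11% for RISC-V (1437353 to 1276015 + 929) and a little over 13% for X86 (719237 to 623612 + 1042).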

I expect further savings could be had by moving more bytes over.
DeltaFile
+57 -6 llvm/utils/TableGen/DAGISelMatcherEmitter.cpp
+17 -8 llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+6 -2 llvm/test/TableGen/CPtrWildcard.td
+7 -0 llvm/test/TableGen/RegClassByHwMode.td
+3 -3 llvm/test/TableGen/dag-isel-regclass-emit-enum.td
+1 -1 llvm/include/llvm/CodeGen/SelectionDAGISel.h
+91 -20 6 files

LLVM/project 22c8344 libc/src/__support ctype_utils.h, libc/src/__support/math common_constants.h range_reduction_double_common.h

[libc] Address size bloat issues (#179398)

This refactoring addresses bloat by removing static function specifiers.
DeltaFile
+17 -17 libc/src/__support/math/common_constants.h
+13 -15 libc/src/__support/ctype_utils.h
+2 -2 libc/src/__support/math/range_reduction_double_common.h
+1 -1 libc/src/__support/math/atan2.h
+1 -1 libc/src/__support/math/cos.h
+1 -1 libc/src/__support/math/sin.h
+35 -37 6 files

LLVM/project 792f7b0 llvm/lib/Transforms/Vectorize VPlanTransforms.cpp VPlanPatternMatch.h, llvm/test/Transforms/LoopVectorize/AArch64 partial-reduce-fdot-product.ll

[VPlan] Refine exit select check in transformtoPartialReduction.

Make sure we find the actual select for the exit users and only use it
for the final link in the chain. This fixes a miscompile after
90b3712d8a20efa2cbaadc177da576e485dce038.
DeltaFile
+54 -0 llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-fdot-product.ll
+10 -6 llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+4 -0 llvm/lib/Transforms/Vectorize/VPlanPatternMatch.h
+68 -6 3 files

LLVM/project 19fab95 llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeHelper.cpp, llvm/test/CodeGen/AMDGPU/GlobalISel unmerge-sgpr-s16.mir

AMDGPU/GlobalISel: Fix sgpr s16 unmerge lowering in regbanklegalize

Used to fail EXPENSIVE_CHECKS because of type mismatch.
DeltaFile
+5 -3 llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+4 -4 llvm/test/CodeGen/AMDGPU/GlobalISel/unmerge-sgpr-s16.mir
+9 -7 2 files

LLVM/project d510c4c llvm/lib/Transforms/Vectorize VPlanCFG.h, llvm/unittests/Transforms/Vectorize VPlanTest.cpp

[VPlan] Generalize `VPAllSuccessorsIterator` to support predecessors (#178724)

To be used in Mel's https://github.com/llvm/llvm-project/pull/173265.

---------

Co-authored-by: Florian Hahn <flo at fhahn.com>
Co-authored-by: Luke Lau <luke_lau at icloud.com>
DeltaFile
+103 -61 llvm/lib/Transforms/Vectorize/VPlanCFG.h
+33 -2 llvm/unittests/Transforms/Vectorize/VPlanTest.cpp
+136 -63 2 files

LLVM/project 3db2fd8 clang/lib/CIR/CodeGen CIRGenCall.cpp, clang/test/CIR/CodeGen alloc-size.c

[CIR] Implement 'allocsize' function/call attribute lowering (#179342)

The alloc_size attribute takes the argument number (normalized to the
index!) of the element size and count, for things like 'malloc' or
'calloc'.

This ends up being slightly more complicated than others, as this has
data that we have to decide on a format for. LLVM chooses to pack both
of these 32 bit values into a single i64, but unpacks it for the purpose
of input/output. The second value, the number of elements, is optional.

This patch uses a DenseI32ArrayAttr to store them for the LLVMIR
dialect, which gets us the packed nature, but doesn't require us to do
any work to unpack it.
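
Packing two 32-bit argument indices into one 64-bit value, with the second one optional, can be sketched as below. The bit layout (element-size index in the high half) and the all-ones sentinel for an absent count are assumptions made for illustration; the function and constant names are hypothetical.

```cpp
#include <cstdint>
#include <optional>
#include <utility>

// Assumed sentinel meaning "no element-count argument".
constexpr uint32_t kNumElemsAbsent = 0xFFFFFFFFu;

// Pack the element-size argument index into the high 32 bits and the
// optional element-count argument index into the low 32 bits.
constexpr uint64_t packAllocSize(uint32_t elemSizeArg,
                                 std::optional<uint32_t> numElemsArg) {
  return (uint64_t(elemSizeArg) << 32) | numElemsArg.value_or(kNumElemsAbsent);
}

// Recover both fields; the count comes back empty if the sentinel is found.
inline std::pair<uint32_t, std::optional<uint32_t>>
unpackAllocSize(uint64_t packed) {
  uint32_t elemSizeArg = uint32_t(packed >> 32);
  uint32_t numElems = uint32_t(packed & 0xFFFFFFFFu);
  if (numElems == kNumElemsAbsent)
    return {elemSizeArg, std::nullopt};
  return {elemSizeArg, numElems};
}
```

Storing the two indices as an array of 32-bit integers, as the patch describes for the LLVMIR dialect, avoids this shift-and-mask step because each index is already its own element.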
DeltaFile
+79 -0 clang/test/CIR/CodeGen/alloc-size.c
+52 -0 mlir/test/Target/LLVMIR/llvmir.mlir
+24 -0 mlir/test/Target/LLVMIR/Import/instructions.ll
+21 -0 mlir/lib/Target/LLVMIR/ModuleImport.cpp
+14 -5 clang/lib/CIR/CodeGen/CIRGenCall.cpp
+18 -0 mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
+208 -5 11 files not shown
+263 -16 17 files

LLVM/project 414ec6a llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeHelper.cpp, llvm/test/CodeGen/AMDGPU/GlobalISel unmerge-sgpr-s16.mir

AMDGPU/GlobalISel: Fix sgpr s16 unmerge lowering in regbanklegalize

Used to fail EXPENSIVE_CHECKS because of type mismatch.
DeltaFile
+5 -3 llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+4 -4 llvm/test/CodeGen/AMDGPU/GlobalISel/unmerge-sgpr-s16.mir
+9 -7 2 files