LLVM/project 071d1fb llvm/lib/Transforms/Vectorize VPlanRecipes.cpp VPlan.h, llvm/test/Transforms/LoopVectorize/AArch64 vplan-printing.ll

[LV] Use VPReductionRecipe for partial reductions (#147513)

Partial reductions can easily be represented by the VPReductionRecipe
class by setting their scale factor to something greater than 1. This PR
merges the two together and gives VPReductionRecipe a VFScaleFactor so
that it can choose to generate the partial reduction intrinsic at
execute time.
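
To make the scale factor concrete, here is a minimal Python model of a partial add-reduction: a vector of `VF` elements is folded into an accumulator with `VF / scale` lanes. The function name and the lane mapping (`i % n`) are assumptions made for illustration; as far as I know the real intrinsic leaves the lane mapping unspecified and only the overall sum is meaningful.

```python
def partial_reduce_add(acc, vec):
    """Fold a wide vector into a narrower accumulator (illustrative model).

    The i % n lane mapping is an assumption of this sketch, not something
    the commit specifies.
    """
    n = len(acc)
    if n == 0 or len(vec) % n != 0:
        raise ValueError("input length must be a multiple of the accumulator length")
    out = list(acc)
    for i, x in enumerate(vec):
        out[i % n] += x
    return out


VF = 16    # elements in the wide input vector
SCALE = 4  # plays the role of the scale factor: VF / SCALE accumulator lanes

acc = [0] * (VF // SCALE)
vec = list(range(VF))
result = partial_reduce_add(acc, vec)

print(result)       # 4 lanes instead of 16
print(sum(result))  # same total as sum(vec)
```

With a scale factor of 1 the accumulator has as many lanes as the input, which is the ordinary (non-partial) case.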

Stacked PRs:
1. https://github.com/llvm/llvm-project/pull/147026
2. https://github.com/llvm/llvm-project/pull/147255
3. https://github.com/llvm/llvm-project/pull/156976
4. https://github.com/llvm/llvm-project/pull/160154
5. https://github.com/llvm/llvm-project/pull/147302
6. https://github.com/llvm/llvm-project/pull/162503
7. -> https://github.com/llvm/llvm-project/pull/147513

Replaces https://github.com/llvm/llvm-project/pull/146073 .
Delta  File
+48 -146  llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+74 -101  llvm/lib/Transforms/Vectorize/VPlan.h
+25 -15  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+11 -20  llvm/unittests/Transforms/Vectorize/VPlanTest.cpp
+10 -9  llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
+6 -6  llvm/test/Transforms/LoopVectorize/AArch64/vplan-printing.ll
+174 -297  3 files not shown
+180 -307  9 files

LLVM/project 96cbbeb llvm/lib/Target/RISCV RISCVInstrInfoXAndes.td

[MC][RISCV] Add missing Predicates for NDS_FMV_BF16_X (#169662)

Run:
```shell
build/bin/llvm-exegesis -mode=latency -mtriple=riscv64-unknown-linux-gnu --mcpu=generic --benchmark-phase=assemble-measured-code -opcode-index=-1
```

Error:
```
---
mode:            latency
key:
  instructions:
    - 'NDS_FMV_BF16_X F2_H X11'
    - 'NDS_FMV_X_BF16 X26 F2_H'
  config:          ''
  register_initial_values:
    - 'X11=0x0'
cpu_name:        generic

    [8 lines not shown]
Delta  File
+0 -2  llvm/lib/Target/RISCV/RISCVInstrInfoXAndes.td
+0 -2  1 files

LLVM/project c333f7d mlir/lib/Dialect/XeGPU/Transforms XeGPUSubgroupDistribute.cpp, mlir/test/Dialect/XeGPU subgroup-distribute-unit.mlir

[mlir][xegpu] Add layout based SIMT distribution support for `vector.extract/insert_strided_slice` (#168626)

This PR adds general SIMT distribution support for
`vector.extract/insert_strided_slice`. Vector distribution already
supports these operations, but with restrictions that avoid requiring
layouts in the distribution logic. For example,
`extract_strided_slice` requires that the distributed dimension be fully
extracted. However, more complex cases may require extracting only part
of the distributed dimension (e.g. an 8x16xf16 extraction from
8x32xf16). Such cases need the layouts to reason about how the data is
spread across SIMT lanes.
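
A small Python model may help show why the layout matters for the 8x16-from-8x32 case. Both ownership functions below (`round_robin`, `blocked`) and the subgroup size of 16 are hypothetical; they are not XeGPU's layout attributes, only two plausible ways of spreading 32 columns over 16 lanes.

```python
LANES = 16  # assumed subgroup size


def round_robin(col):
    """Hypothetical layout: column c lives on lane c % 16."""
    return col % LANES


def blocked(col):
    """Hypothetical layout: each lane owns 2 adjacent columns."""
    return col // 2


def local_slice(owner, total_cols, offset, size, lane):
    """Indices into a lane's own fragment that fall inside the extracted slice."""
    owned = [c for c in range(total_cols) if owner(c) == lane]
    return [i for i, c in enumerate(owned) if offset <= c < offset + size]


# Extract columns [0, 16) of a 32-column vector under each layout.
per_lane_rr = [local_slice(round_robin, 32, 0, 16, lane) for lane in range(LANES)]
per_lane_blk = [local_slice(blocked, 32, 0, 16, lane) for lane in range(LANES)]

print(per_lane_rr[0], per_lane_rr[15])    # every lane keeps one of its two columns
print(per_lane_blk[0], per_lane_blk[15])  # low lanes keep both, high lanes keep none
```

Under the first layout every lane contributes the same local sub-slice, so the extraction distributes cleanly; under the second the same extraction leaves half the lanes empty. Telling these cases apart is not possible from the vector shape alone.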

Currently, we don't have layout access in vector distribution, so these
new patterns are placed on the XeGPU side. They have a higher pattern
benefit so that they are tried before the regular
vector-distribution-based patterns.
Delta  File
+558 -289  mlir/test/Dialect/XeGPU/subgroup-distribute-unit.mlir
+242 -3  mlir/lib/Dialect/XeGPU/Transforms/XeGPUSubgroupDistribute.cpp
+800 -292  2 files

LLVM/project c082a37 llvm/test/CodeGen/AMDGPU constant-address-space-32bit.ll, llvm/test/CodeGen/RISCV/GlobalISel/rvv vluxei.ll vloxei.ll

rework

Created using spr 1.3.8-beta.1
Delta  File
+4,734 -0  llvm/test/tools/llvm-mca/RISCV/tt-ascalon-d8/vlseg-vsseg.s
+1,560 -19  llvm/test/CodeGen/AMDGPU/constant-address-space-32bit.ll
+0 -1,554  llvm/test/CodeGen/RISCV/rvv/vloxei.ll
+0 -1,554  llvm/test/CodeGen/RISCV/rvv/vluxei.ll
+0 -1,484  llvm/test/CodeGen/RISCV/GlobalISel/rvv/vluxei.ll
+0 -1,484  llvm/test/CodeGen/RISCV/GlobalISel/rvv/vloxei.ll
+6,294 -6,095  1,664 files not shown
+26,853 -132,370  1,670 files

LLVM/project 71507d3 llvm/test/CodeGen/AMDGPU constant-address-space-32bit.ll, llvm/test/CodeGen/RISCV/GlobalISel/rvv vloxei.ll vluxei.ll

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.8-beta.1

[skip ci]
Delta  File
+4,734 -0  llvm/test/tools/llvm-mca/RISCV/tt-ascalon-d8/vlseg-vsseg.s
+1,560 -19  llvm/test/CodeGen/AMDGPU/constant-address-space-32bit.ll
+0 -1,554  llvm/test/CodeGen/RISCV/rvv/vluxei.ll
+0 -1,554  llvm/test/CodeGen/RISCV/rvv/vloxei.ll
+0 -1,484  llvm/test/CodeGen/RISCV/GlobalISel/rvv/vloxei.ll
+0 -1,484  llvm/test/CodeGen/RISCV/GlobalISel/rvv/vluxei.ll
+6,294 -6,095  1,658 files not shown
+26,808 -132,312  1,664 files

LLVM/project 7414daa llvm/test/CodeGen/AMDGPU constant-address-space-32bit.ll, llvm/test/CodeGen/RISCV/GlobalISel/rvv vluxei.ll vloxei.ll

rework

Created using spr 1.3.8-beta.1
Delta  File
+4,734 -0  llvm/test/tools/llvm-mca/RISCV/tt-ascalon-d8/vlseg-vsseg.s
+1,560 -19  llvm/test/CodeGen/AMDGPU/constant-address-space-32bit.ll
+0 -1,554  llvm/test/CodeGen/RISCV/rvv/vloxei.ll
+0 -1,554  llvm/test/CodeGen/RISCV/rvv/vluxei.ll
+0 -1,484  llvm/test/CodeGen/RISCV/GlobalISel/rvv/vluxei.ll
+0 -1,484  llvm/test/CodeGen/RISCV/GlobalISel/rvv/vloxei.ll
+6,294 -6,095  1,656 files not shown
+26,800 -132,303  1,662 files

LLVM/project 2ca8cd7 llvm/lib/IR ReplaceConstant.cpp, llvm/test/CodeGen/AMDGPU lower-module-lds-constantexpr.ll lower-kernel-lds-constexpr.ll

fix dominance issue and comment
Delta  File
+47 -41  llvm/test/CodeGen/AMDGPU/lower-module-lds-constantexpr.ll
+26 -17  llvm/test/CodeGen/AMDGPU/lower-kernel-lds-constexpr.ll
+9 -6  llvm/lib/IR/ReplaceConstant.cpp
+82 -64  3 files

LLVM/project 398edfb llvm/lib/IR ReplaceConstant.cpp, llvm/test/CodeGen/AMDGPU same-lds-variable-multiple-use-in-one-phi-node.ll

[ReplaceConstant] Don't create instructions for the same constant multiple times in the same basic block

Fixes #167500.
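
One way to picture the fix is a per-block cache, so that a second request for the same constant in the same block reuses the first instruction. The sketch below is a Python model, not the code in `ReplaceConstant.cpp`; `Block`, `materialize`, and the string "instructions" are all hypothetical. The test name suggests the motivating case is a phi with several incoming entries from one predecessor, where the incoming values for that predecessor have to agree.

```python
class Block:
    """Hypothetical stand-in for a basic block: a name and a list of instructions."""

    def __init__(self, name):
        self.name = name
        self.instructions = []


def materialize(constant, block, cache):
    """Return the instruction that computes `constant` in `block`, creating it at most once."""
    key = (block.name, constant)
    if key not in cache:
        inst = f"%{constant}.in.{block.name}"
        block.instructions.append(inst)
        cache[key] = inst
    return cache[key]


pred = Block("entry")
other = Block("loop")
cache = {}

# Two phi operands that both come from `pred` and use the same constant.
incoming = [materialize("lds_addr", pred, cache) for _ in range(2)]
# The same constant in a different block still gets its own instruction.
elsewhere = materialize("lds_addr", other, cache)

print(incoming, pred.instructions, elsewhere)
```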
Delta  File
+51 -0  llvm/test/CodeGen/AMDGPU/same-lds-variable-multiple-use-in-one-phi-node.ll
+9 -1  llvm/lib/IR/ReplaceConstant.cpp
+60 -1  2 files

LLVM/project 44c9d3a clang/lib/Driver/ToolChains Linux.cpp, compiler-rt/cmake/Modules AllSupportedArchDefs.cmake

[scudo] Add scudo_standalone support for SystemZ (#166187)

Add support for scudo_standalone on SystemZ without enabling gwp_asan.

Co-authored-by: anoopkg6 <anoopkg6 at github.com>
Delta  File
+1 -1  compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake
+1 -1  clang/lib/Driver/ToolChains/Linux.cpp
+2 -2  2 files

LLVM/project e8b9d42 clang/lib/Driver/ToolChains Linux.cpp, compiler-rt/cmake/Modules AllSupportedArchDefs.cmake

[tysan] Type Sanitizer support for SystemZ (#162396)

Type Sanitizer support for SystemZ.

Co-authored-by: anoopkg6 <anoopkg6 at github.com>
Delta  File
+6 -0  compiler-rt/lib/tysan/tysan_platform.h
+1 -1  compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake
+1 -1  clang/lib/Driver/ToolChains/Linux.cpp
+8 -2  3 files

LLVM/project 3d16bc8 llvm/include/llvm/IR RuntimeLibcalls.td, llvm/test/Transforms/Util/DeclareRuntimeLibcalls aix.ll

PowerPC: Add vec_malloc functions to AIX in RuntimeLibcalls
Delta  File
+7 -0  llvm/test/Transforms/Util/DeclareRuntimeLibcalls/aix.ll
+4 -0  llvm/include/llvm/IR/RuntimeLibcalls.td
+11 -0  2 files

LLVM/project 5c91bba llvm/include/llvm/IR RuntimeLibcalls.td, llvm/test/Transforms/Util/DeclareRuntimeLibcalls xcore.ll

XCore: Add iprintf to RuntimeLibcalls system library
Delta  File
+6 -0  llvm/test/Transforms/Util/DeclareRuntimeLibcalls/xcore.ll
+1 -0  llvm/include/llvm/IR/RuntimeLibcalls.td
+7 -0  2 files

LLVM/project cc5c05b llvm/include/llvm/IR RuntimeLibcalls.td, llvm/test/Transforms/Util/DeclareRuntimeLibcalls darwin.ll

RuntimeLibcalls: Add macos unlocked IO functions to systems

This is another of the easier-to-understand conditions from
TargetLibraryInfo.
Delta  File
+9 -2  llvm/test/Transforms/Util/DeclareRuntimeLibcalls/darwin.ll
+7 -1  llvm/include/llvm/IR/RuntimeLibcalls.td
+16 -3  2 files

LLVM/project cb88ecd llvm/include/llvm/IR RuntimeLibcalls.td, llvm/test/Transforms/Util/DeclareRuntimeLibcalls emscripten.ll

RuntimeLibcalls: Add small_printf functions to emscripten
Delta  File
+6 -0  llvm/test/Transforms/Util/DeclareRuntimeLibcalls/emscripten.ll
+4 -0  llvm/include/llvm/IR/RuntimeLibcalls.td
+10 -0  2 files

LLVM/project 9e1d3ca llvm/include/llvm/IR RuntimeLibcalls.td RuntimeLibcalls.h, llvm/test/Transforms/Util/DeclareRuntimeLibcalls darwin.ll

RuntimeLibcalls: Add memset_pattern* calls to darwin systems (#167083)

This is one of the easier cases to comprehend in TargetLibraryInfo's
setup.
Delta  File
+22 -0  llvm/test/Transforms/Util/DeclareRuntimeLibcalls/darwin.ll
+9 -3  llvm/include/llvm/IR/RuntimeLibcalls.td
+10 -0  llvm/include/llvm/IR/RuntimeLibcalls.h
+41 -3  3 files

LLVM/project 820b0b6 clang/include/clang/Analysis/Analyses/LifetimeSafety Origins.h, clang/lib/Analysis/LifetimeSafety FactsGenerator.cpp Origins.cpp

Multi-origin changes
Delta  File
+357 -30  clang/test/Sema/warn-lifetime-safety.cpp
+216 -87  clang/lib/Analysis/LifetimeSafety/FactsGenerator.cpp
+119 -64  clang/lib/Analysis/LifetimeSafety/Origins.cpp
+89 -20  clang/include/clang/Analysis/Analyses/LifetimeSafety/Origins.h
+56 -30  clang/unittests/Analysis/LifetimeSafetyTest.cpp
+27 -7  clang/lib/Analysis/LifetimeSafety/LifetimeSafety.cpp
+864 -238  7 files not shown
+902 -265  13 files

LLVM/project 59b3d18 llvm/include/llvm/IR RuntimeLibcalls.td

RuntimeLibcalls: Add more function entries from TargetLibraryInfo (#167082)

Script-scraped dump of most functions in TargetLibraryInfo.def,
with existing entries and a few special cases removed. This only
adds the definitions, and doesn't add them to any system yet.

Adding them in the correct places is the hard part, since it's
all written as opt-out with manually written exemptions in
TargetLibraryInfo.
Delta  File
+645 -0  llvm/include/llvm/IR/RuntimeLibcalls.td
+645 -0  1 files

LLVM/project 43e69b1 llvm/include/llvm/IR RuntimeLibcalls.td, llvm/lib/IR RuntimeLibcalls.cpp

RuntimeLibcalls: Add malloc and free entries (#167081)

Calloc was already here, but not the others. Also add
manual type information.
Delta  File
+75 -0  llvm/lib/IR/RuntimeLibcalls.cpp
+9 -0  llvm/test/Transforms/Util/DeclareRuntimeLibcalls/basic.ll
+5 -0  llvm/include/llvm/IR/RuntimeLibcalls.td
+89 -0  3 files

LLVM/project b343a44 bolt/lib/Core BinaryFunction.cpp, bolt/lib/Rewrite RewriteInstance.cpp

[BOLT][BTI] Disassemble PLT entries when processing BTI binaries

PLT entries are PseudoFunctions, and are not disassembled or emitted.
For BTI, we need to check the first MCInst of PLT entries, to see
if indirectly calling them is safe or not.

This patch disassembles PLTs for binaries using BTI, while not changing
the behaviour for binaries without BTI.

The PLTs are only disassembled, not emitted.
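
As a rough illustration of what "checking the first instruction" of a PLT entry can mean, the Python sketch below decodes the first 32-bit word of an entry and tests whether it is a `BTI c` or `BTI jc` landing pad. This is not BOLT's API (BOLT works on `MCInst`, not raw bytes), the helper names are hypothetical, and the sketch ignores other instructions that can also act as landing pads.

```python
import struct

# BTI variants are HINT instructions: base encoding 0xD503201F with the
# hint number placed at bit 5.
HINT_BASE = 0xD503201F


def hint(imm):
    return HINT_BASE | (imm << 5)


BTI_C = hint(34)   # valid target for indirect calls
BTI_J = hint(36)   # valid target for indirect jumps
BTI_JC = hint(38)  # valid target for both


def first_word(code):
    """Read the first little-endian 32-bit instruction word of an entry."""
    return struct.unpack_from("<I", code, 0)[0]


def safe_for_indirect_call(code):
    """True if the entry starts with a landing pad that accepts indirect calls."""
    return first_word(code) in (BTI_C, BTI_JC)


# Two made-up PLT entries: one starting with `bti c`, one starting directly
# with an ADRP-style word (0x90000010 is used here only as a non-BTI word).
with_bti = struct.pack("<II", BTI_C, 0x90000010)
without_bti = struct.pack("<I", 0x90000010)

print(hex(BTI_C), safe_for_indirect_call(with_bti), safe_for_indirect_call(without_bti))
```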
Delta  File
+31 -0  bolt/test/runtime/AArch64/disassemble-plts.c
+6 -0  bolt/lib/Rewrite/RewriteInstance.cpp
+5 -0  bolt/lib/Core/BinaryFunction.cpp
+42 -0  3 files

LLVM/project 9b88cd9 llvm/include/llvm/MC MCInstrDesc.h, llvm/include/llvm/Target Target.td

CodeGen: Remove PointerLikeRegClass handling from codegen (#159883)

All uses have been migrated to RegClassByHwMode. This is now
an implementation detail of InstrInfoEmitter for pseudoinstructions.
Delta  File
+13 -13  llvm/include/llvm/Target/Target.td
+1 -12  llvm/include/llvm/MC/MCInstrDesc.h
+1 -12  llvm/utils/TableGen/Common/CodeGenDAGPatterns.cpp
+1 -5  llvm/utils/TableGen/InstrInfoEmitter.cpp
+0 -4  llvm/lib/CodeGen/TargetInstrInfo.cpp
+0 -3  llvm/utils/TableGen/Common/InstructionEncoding.cpp
+16 -49  2 files not shown
+17 -51  8 files

LLVM/project 7bf459b llvm/test/TableGen target-specialized-pseudos.td RegClassByHwMode.td, llvm/utils/TableGen InstrInfoEmitter.cpp

CodeGen: Make target overrides of PointerLikeRegClass mandatory (#159882)

Most targets should now use the convenience multiclass to fix up
the operand definitions of pointer-using pseudoinstructions:

defm : RemapAllTargetPseudoPointerOperands<target_ptr_regclass>;
Delta  File
+26 -8  llvm/test/TableGen/target-specialized-pseudos.td
+15 -3  llvm/utils/TableGen/InstrInfoEmitter.cpp
+14 -1  llvm/test/TableGen/RegClassByHwMode.td
+2 -0  llvm/test/TableGen/def-multiple-operands.td
+2 -0  llvm/test/TableGen/get-named-operand-idx.td
+2 -0  llvm/test/TableGen/get-operand-type-no-expand.td
+61 -12  2 files not shown
+64 -12  8 files

LLVM/project e7bcd80 llvm/lib/Target/SPIRV SPIRVInstructionSelector.cpp, llvm/test/CodeGen/SPIRV/llvm-intrinsics logical-memcpy.ll

[SPIRV] Use OpCopyMemory for logical SPIRV memcpy (#169348)

This commit modifies the SPIRV instruction selector to emit
`OpCopyMemory` instead of `OpCopyMemorySized` when generating SPIRV for
logical addressing.

Previously, `G_MEMCPY` was translated to `OpCopyMemorySized`, which
requires an explicit size operand. However, for logical SPIRV, the size
of the pointee type is implicitly known. This change ensures that
`OpCopyMemory` is used, which is more appropriate for logical SPIRV and
aligns with the SPIR-V specification for logical addressing.
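
The selection rule described here can be sketched in a few lines of Python. The function name and the tuple representation of an instruction are hypothetical; only the two opcode names and the presence or absence of the size operand come from the commit message.

```python
def select_memcpy(addressing_model, dst, src, size_bytes):
    """Pick the SPIR-V copy instruction for a G_MEMCPY (illustrative model).

    Returns (opcode, operands). For logical addressing the size is implied
    by the pointee type, so no size operand is emitted.
    """
    if addressing_model == "Logical":
        return ("OpCopyMemory", [dst, src])
    return ("OpCopyMemorySized", [dst, src, size_bytes])


logical = select_memcpy("Logical", "%dst", "%src", 64)
physical = select_memcpy("Physical64", "%dst", "%src", 64)

print(logical)
print(physical)
```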
Delta  File
+97 -44  llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
+32 -0  llvm/test/CodeGen/SPIRV/llvm-intrinsics/logical-memcpy.ll
+129 -44  2 files

LLVM/project 1cd2273 bolt/lib/Core BinaryFunction.cpp, bolt/lib/Rewrite RewriteInstance.cpp

[BOLT][BTI] Disassemble PLT entries when processing BTI binaries

PLT entries are PseudoFunctions, and are not disassembled or emitted.
For BTI, we need to check the first MCInst of PLT entries, to see
if indirectly calling them is safe or not.

This patch disassembles PLTs for binaries using BTI, while not changing
the behaviour for binaries without BTI.

The PLTs are only disassembled, not emitted.
Delta  File
+30 -0  bolt/test/runtime/AArch64/disassemble-plts.c
+6 -0  bolt/lib/Rewrite/RewriteInstance.cpp
+5 -0  bolt/lib/Core/BinaryFunction.cpp
+41 -0  3 files

LLVM/project 6e983e3 llvm/lib/Target/SPIRV SPIRVUtils.cpp SPIRVGlobalRegistry.cpp, llvm/test/CodeGen/SPIRV/hlsl-resources cbuffer-peeled-array-minimal.ll cbuffer-peeled-array.ll

[SPIRV] Support Peeled Array Layouts for HLSL CBuffers (#169078)

This commit adds support for 'peeled arrays' in HLSL constant buffers.
HLSL CBuffers may have padding between array elements but not after the
last element. This is represented in LLVM IR as {[N-1 x {T, pad}], T}.

Changes include:
- Recognition of the peeled array pattern.
- Logic to reconstitute these into SPIR-V compatible arrays.
- Support for spirv.Padding type in GlobalRegistry and Builtins.
- Updates to SPIRVCBufferAccess to correctly calculate member offsets
  in these padded structures.
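
The arithmetic behind the `{[N-1 x {T, pad}], T}` shape can be checked with a short Python sketch. The function name is hypothetical, and the example sizes (a 4-byte element padded to a 16-byte stride) are an assumption chosen for illustration rather than something stated in the commit.

```python
def peeled_array_layout(n, elem_size, pad_size):
    """Byte offsets and total size for {[N-1 x {T, pad}], T}.

    Every element but the last is followed by padding, so elements sit at a
    fixed stride while the struct ends right after the last element.
    """
    if n < 1:
        raise ValueError("need at least one element")
    stride = elem_size + pad_size
    offsets = [i * stride for i in range(n)]
    total = (n - 1) * stride + elem_size
    return offsets, total


# Assumed example: 4 elements of 4 bytes each, 12 bytes of padding between them.
offsets, total = peeled_array_layout(4, 4, 12)

print(offsets)  # element offsets at a 16-byte stride
print(total)    # 52, not 64: no padding after the last element
```

Because the stride is uniform, the structure can be treated as an array with that stride even though its total size is smaller than `N * stride`.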

Depends on https://github.com/llvm/llvm-project/pull/169076
Delta  File
+90 -0  llvm/test/CodeGen/SPIRV/hlsl-resources/cbuffer-peeled-array-minimal.ll
+74 -0  llvm/test/CodeGen/SPIRV/hlsl-resources/cbuffer-peeled-array.ll
+69 -0  llvm/lib/Target/SPIRV/SPIRVUtils.cpp
+23 -0  llvm/lib/Target/SPIRV/SPIRVGlobalRegistry.cpp
+15 -0  llvm/lib/Target/SPIRV/SPIRVUtils.h
+8 -3  llvm/lib/Target/SPIRV/SPIRVCBufferAccess.cpp
+279 -3  4 files not shown
+289 -3  10 files

LLVM/project 0c2701f llvm/lib/Target/AMDGPU R600.td SIInstructions.td, llvm/lib/Target/ARM ARM.td

CodeGen: Make all targets override pseudos with pointers (#159881)

This eliminates the need to have PointerLikeRegClass handling in
codegen.
Delta  File
+12 -9  llvm/lib/Target/AMDGPU/R600.td
+11 -0  llvm/lib/Target/AMDGPU/SIInstructions.td
+10 -0  llvm/lib/Target/NVPTX/NVPTX.td
+8 -0  llvm/lib/Target/ARM/ARM.td
+8 -0  llvm/lib/Target/WebAssembly/WebAssembly.td
+4 -0  llvm/lib/Target/PowerPC/PPCRegisterInfo.td
+53 -9  20 files not shown
+94 -9  26 files

LLVM/project ddead38 llvm/test/Assembler thinlto-summary.ll

fix test
Delta  File
+10 -10  llvm/test/Assembler/thinlto-summary.ll
+10 -10  1 files

LLVM/project 98456c8 llvm/lib/Target/AMDGPU SIInsertWaitcnts.cpp, llvm/test/CodeGen/AMDGPU waitcnt-global-inv-wb.mir

address review
Delta  File
+14 -0  llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+1 -1  llvm/test/CodeGen/AMDGPU/waitcnt-global-inv-wb.mir
+15 -1  2 files

LLVM/project 35dfeb7 llvm/lib/Target/SPIRV SPIRVInstructionSelector.cpp, llvm/test/CodeGen/SPIRV logical-struct-access.ll freeze.ll

[SPIRV] Enable DCE in instruction selection and update tests (#168428)

The instruction selection pass for SPIR-V now performs dead code
elimination (DCE). This change removes unused instructions, leading to
more optimized SPIR-V output.
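
For readers who want a concrete picture, here is a minimal Python model of DCE over straight-line code: an instruction survives only if it has a side effect or its result is used by a surviving instruction. The instruction tuples and the set of side-effecting opcodes are assumptions of this sketch, not the selector's actual logic.

```python
SIDE_EFFECTS = {"store", "ret"}  # assumed set of opcodes that must be kept


def dce(instrs):
    """Remove dead instructions from a straight-line list.

    Each instruction is (result_name, opcode, operand_names), with every
    definition appearing before its uses. Walking backwards lets a single
    pass discover everything that is transitively live.
    """
    live = set()
    kept = []
    for name, op, operands in reversed(instrs):
        if op in SIDE_EFFECTS or name in live:
            kept.append((name, op, operands))
            live.update(operands)
    kept.reverse()
    return kept


prog = [
    ("a", "const", []),
    ("b", "fcmp", ["a", "a"]),
    ("c", "add", ["a", "a"]),      # never used: dead
    ("s", "store", ["b", "@g"]),   # store to a global keeps b (and a) alive
]

with_store = [name for name, _, _ in dce(prog)]
without_store = [name for name, _, _ in dce(prog[:3])]

print(with_store)     # the fcmp survives because its result is stored
print(without_store)  # with no store, nothing is live
```

This is why a test that only computes a value needs some observable use, such as a store to a global, for its codegen to remain checkable once DCE runs.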

As a consequence of this, several tests were updated to ensure their
continued correctness and to prevent previously tested code from being
optimized away.
Specifically:
- Many tests now store computed values into global variables to ensure
  they are not eliminated by DCE, allowing their code generation to be
  verified.
- The test `keep-tracked-const.ll` was removed because it no longer
  tested its original intent. The check statements in this test were for
  constants

    [12 lines not shown]
Delta  File
+195 -8  llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
+182 -0  llvm/test/CodeGen/SPIRV/transcoding/fcmp.ll
+75 -26  llvm/test/CodeGen/SPIRV/logical-struct-access.ll
+94 -0  llvm/test/CodeGen/SPIRV/extensions/SPV_KHR_float_controls2/decoration.ll
+39 -14  llvm/test/CodeGen/SPIRV/llvm-intrinsics/bitreverse_small_type.ll
+33 -13  llvm/test/CodeGen/SPIRV/freeze.ll
+618 -61  48 files not shown
+1,133 -142  54 files

LLVM/project ff0c347 llvm/include/llvm/MC MCTargetOptionsCommandFlags.h, llvm/lib/MC MCTargetOptionsCommandFlags.cpp

opt: Try to respect target-abi command line option (#169604)

Mips seems kind of broken with these options. n32 seems to
override the 64-bit arch with 32-bit pointers, and trying
to use any 32-bit mips triple also just errors with any
options.
Delta  File
+9 -2  llvm/lib/MC/MCTargetOptionsCommandFlags.cpp
+9 -0  llvm/test/tools/opt/infer-data-layout-target-abi.ll
+5 -4  llvm/tools/opt/optdriver.cpp
+3 -2  llvm/include/llvm/MC/MCTargetOptionsCommandFlags.h
+26 -8  4 files

LLVM/project 4aeaa1e llvm/lib/Target/AMDGPU SIInsertWaitcnts.cpp, llvm/test/CodeGen/AMDGPU waitcnt-global-inv-wb.mir

address review
Delta  File
+9 -13  llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+1 -1  llvm/test/CodeGen/AMDGPU/waitcnt-global-inv-wb.mir
+10 -14  2 files