LLVM/project f92034f clang/lib/CIR/Dialect/Transforms/TargetLowering CIRABIRewriteContext.cpp, clang/test/CIR/Transforms/abi-lowering direct-flatten.cir

[CIR] Load flattened struct args from coerce slot

At the call site, a struct argument that flattens into scalar wire
arguments was coerced to the ABI struct as a whole value and then
decomposed with cir.extract_member.  When the coercion goes through
memory, read each field from the coerced slot with cir.get_member +
cir.load instead, so the lowering takes pointers to the members it
wants rather than loading the entire structure and extracting from the
value.  The shared memory half of the coercion is factored into
emitCoercionToMemory, which returns the destination-typed pointer to
the coerce slot; emitCoercion now builds on it and loads the whole
value, so its existing callers are unchanged.  The no-coercion call
site (the operand already has the coerced type) keeps cir.extract_member
because that value has no backing slot to take member pointers from.

The remaining changes are mechanical: llvm::append_range and
SmallVector::append for the per-field loops, spelling out cir::RecordType
instead of auto at the getFlattenedCoercedType call sites, an enumerate
loop over the coerced members, and renaming the builder parameter from

    [5 lines not shown]
Delta  File
+97 -71  clang/lib/CIR/Dialect/Transforms/TargetLowering/CIRABIRewriteContext.cpp
+33 -12  clang/test/CIR/Transforms/abi-lowering/direct-flatten.cir
+130 -83  2 files

LLVM/project fd8096b llvm/test/CodeGen/AMDGPU splitkit-copy-live-lanes.mir ra-inserted-scalar-instructions.mir, llvm/test/CodeGen/X86 statepoint-invoke-ra-inline-spiller.mir

[MIR] Serialize/Deserialize MachineInstr::LRSplit attribute

The LRSplit MachineInstr flag is set by SplitKit on copies inserted for
live-range splitting.
Until now the flag had no MIR-text representation.

This patch fixes that, making it easier to reproduce and capture issues
that involve SplitKit.

Round-trip coverage in
llvm/test/CodeGen/MIR/AMDGPU/lr-split-flag.mir.
Delta  File
+168 -168  llvm/test/CodeGen/AMDGPU/splitkit-copy-live-lanes.mir
+36 -36  llvm/test/CodeGen/AMDGPU/ra-inserted-scalar-instructions.mir
+32 -32  llvm/test/CodeGen/AMDGPU/splitkit-copy-bundle.mir
+27 -27  llvm/test/CodeGen/AMDGPU/ran-out-of-sgprs-allocation-failure.mir
+22 -22  llvm/test/CodeGen/AMDGPU/inflated-reg-class-snippet-copy-use-after-free.mir
+22 -22  llvm/test/CodeGen/X86/statepoint-invoke-ra-inline-spiller.mir
+307 -307  32 files not shown
+439 -405  38 files

LLVM/project c3a146a llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp, llvm/test/CodeGen/AMDGPU llvm.amdgcn.sched.group.barrier.ll llvm.amdgcn.iglp.opt.ll

AMDGPU/GlobalISel: RegBankLegalize rules for sched barriers intrinsics (#203425)

Add rules for sched barrier intrinsics. Note that there are regressions
due to AGPR results being copied back to VGPR unnecessarily. That will be
addressed in a follow-up patch.
Delta  File
+1,813 -654  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll
+784 -230  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.iglp.opt.ll
+2 -1  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.barrier.ll
+3 -0  llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+2,602 -885  4 files

LLVM/project 664a595 llvm/lib/Target/AMDGPU SIISelLowering.cpp

Rebase for new isVALU calls

Change-Id: Id2280498a63994268e902d90b787e32fdccc912a
Delta  File
+2 -2  llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+2 -2  1 files

LLVM/project 044c9b7 llvm/lib/IR AutoUpgrade.cpp, llvm/test/CodeGen/AArch64 ptrauth-init-fini-autoupgrade.ll

Fix module flag name spelling in AutoUpgrade.cpp
Delta  File
+2 -2  llvm/lib/IR/AutoUpgrade.cpp
+1 -1  llvm/test/CodeGen/AArch64/ptrauth-init-fini-autoupgrade.ll
+3 -3  2 files

LLVM/project a1811b8 llvm/lib/Target/AMDGPU GCNHazardRecognizer.cpp SIInstrInfo.cpp

Formatting

Change-Id: I0fbcad129f96986d2a448bfa4b5a027a2a5c07bd
Delta  File
+27 -16  llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
+6 -3  llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+4 -4  llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp
+2 -2  llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
+2 -1  llvm/lib/Target/AMDGPU/AMDGPUSetWavePriority.cpp
+2 -1  llvm/lib/Target/AMDGPU/SIInstrInfo.h
+43 -27  1 files not shown
+45 -28  7 files

LLVM/project f4ae9c3 llvm/lib/Target/AMDGPU GCNHazardRecognizer.cpp SIInstrInfo.h

[AMDGPU] NFC: Obviously show isVALU includes LDSDMA instructions

Change-Id: I3854fe397cafad4484c5af53c739e2117287d2c9
Delta  File
+41 -41  llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
+13 -7  llvm/lib/Target/AMDGPU/SIInstrInfo.h
+8 -8  llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+3 -3  llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+2 -2  llvm/lib/Target/AMDGPU/AMDGPUSetWavePriority.cpp
+2 -2  llvm/lib/Target/AMDGPU/AMDGPUInsertDelayAlu.cpp
+69 -63  7 files not shown
+78 -72  13 files

LLVM/project 329b484 llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp, llvm/test/CodeGen/AMDGPU packed-fp64.ll

AMDGPU/GlobalISel: RegBankLegalize rules for pk_f64 fadd, fmul and fma
Delta  File
+238 -22  llvm/test/CodeGen/AMDGPU/packed-fp64.ll
+9 -2  llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+247 -24  2 files

LLVM/project deb0c89 llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp, llvm/test/CodeGen/AMDGPU llvm.amdgcn.wmma.gfx1251.w32.ll llvm.amdgcn.wmma.imod.gfx1251.w32.ll

AMDGPU/GlobalISel: RegBankLegalize rules for gfx1251 wmma intrinsics
Delta  File
+1 -1  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wmma.gfx1251.w32.ll
+1 -1  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wmma.imod.gfx1251.w32.ll
+1 -1  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wmma.imm.gfx1251.w32.ll
+2 -0  llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+5 -3  4 files

LLVM/project 3f51f92 llvm/test/CodeGen/AMDGPU llvm.log.ll llvm.log10.ll, llvm/test/CodeGen/AMDGPU/GlobalISel sdiv.i64.ll srem.i64.ll

AMDGPU/GlobalISel: Switch some tests to -new-reg-bank-select
Delta  File
+1,134 -744  llvm/test/CodeGen/AMDGPU/llvm.log.ll
+1,134 -744  llvm/test/CodeGen/AMDGPU/llvm.log10.ll
+808 -685  llvm/test/CodeGen/AMDGPU/GlobalISel/sdiv.i64.ll
+806 -681  llvm/test/CodeGen/AMDGPU/GlobalISel/srem.i64.ll
+555 -426  llvm/test/CodeGen/AMDGPU/llvm.exp10.ll
+554 -425  llvm/test/CodeGen/AMDGPU/llvm.exp.ll
+4,991 -3,705  7 files not shown
+6,219 -4,969  13 files

LLVM/project 52751a0 clang/lib/Driver/ToolChains Clang.cpp, clang/test/Driver hip-toolchain-no-rdc.hip

[AMDGPU][SPIR-V] Fix treating SPIR-V input as the wrong LLVM-IR (#202986)

Summary:
This hack is intended for non-RDC AMDGCN.
Delta  File
+15 -8  clang/test/Driver/hip-toolchain-no-rdc.hip
+12 -10  clang/lib/Driver/ToolChains/Clang.cpp
+27 -18  2 files

LLVM/project d46513a bolt/lib/Profile Heatmap.cpp, bolt/test/X86 heatmap-preagg.test

[BOLT] Fix heatmap with external addresses (#203479)

External samples (X:0) were breaking heatmap printing, e.g.
```
  0x00000000: O0x00000000: ........
```
Explicitly track `IsFirst` instead of relying on zero.
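
A minimal sketch of why a zero sentinel stops working once address 0 is a legitimate sample. `renderRow`, `demoRenderRow`, and the row format are hypothetical and far simpler than the code in Heatmap.cpp; only the idea of tracking `IsFirst` explicitly comes from the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical, simplified row printer: emits an address header before the
// first bucket only, then one 'O' per sampled bucket.
std::string renderRow(const std::vector<uint64_t> &Addresses,
                      bool TrackIsFirst) {
  std::string Out;
  uint64_t Prev = 0;   // sentinel variant: 0 means "nothing printed yet"
  bool IsFirst = true; // flag variant: explicit state, independent of values
  for (uint64_t Addr : Addresses) {
    bool AtStart = TrackIsFirst ? IsFirst : (Prev == 0);
    if (AtStart) {
      char Header[32];
      std::snprintf(Header, sizeof(Header), "0x%08llx: ",
                    static_cast<unsigned long long>(Addr));
      Out += Header;
    }
    Out += 'O';
    Prev = Addr;
    IsFirst = false;
  }
  return Out;
}

// Usage: an external sample at address 0 followed by a real address.
void demoRenderRow() {
  std::vector<uint64_t> Samples = {0x0, 0x1000};
  // Sentinel variant: Prev is still 0 after the first bucket, so a second
  // header lands in the middle of the row.
  assert(renderRow(Samples, false) == "0x00000000: O0x00001000: O");
  // Flag variant: exactly one header.
  assert(renderRow(Samples, true) == "0x00000000: OO");
}
```

The sentinel variant reproduces the shape of the broken output quoted above, with a header repeated mid-row. The flag variant does not depend on which address values occur.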

Test Plan:
updated heatmap-preagg.test
Delta  File
+13 -0  bolt/test/X86/heatmap-preagg.test
+7 -5  bolt/lib/Profile/Heatmap.cpp
+20 -5  2 files

LLVM/project 6c3d7ed llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp, llvm/test/CodeGen/AMDGPU llvm.amdgcn.cvt.sr.ll

[AMDGPU][GISel] Add register bank legalization rules for amdgcn_cvt_sr_f16_f32. (#203253)
Delta  File
+3 -0  llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+1 -1  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.cvt.sr.ll
+4 -1  2 files

LLVM/project 593568e llvm/lib/IR AutoUpgrade.cpp

Improve readability
Delta  File
+35 -25  llvm/lib/IR/AutoUpgrade.cpp
+35 -25  1 files

LLVM/project 30c451e llvm/lib/Target/AMDGPU GCNHazardRecognizer.cpp SIInstrInfo.cpp

Formatting

Change-Id: I0fbcad129f96986d2a448bfa4b5a027a2a5c07bd
Delta  File
+27 -16  llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
+6 -3  llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+4 -4  llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp
+2 -2  llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
+2 -1  llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp
+2 -1  llvm/lib/Target/AMDGPU/AMDGPUSetWavePriority.cpp
+43 -27  1 files not shown
+45 -28  7 files

LLVM/project 09e3e00 llvm/utils/gn/secondary/lldb/test BUILD.gn

[gn] port 127a4c1a883d333 (LLVM_TARGETS_TO_BUILD for lldb shell tests) (#203547)
Delta  File
+13 -0  llvm/utils/gn/secondary/lldb/test/BUILD.gn
+13 -0  1 files

LLVM/project 3255d4d llvm/lib/Bitcode/Reader BitcodeReader.cpp, llvm/test/Bitcode byte-constants.ll

[Bitcode] Decode small byte constants as signed values (#203408)

Decode small byte constants the same way we encode them. The bitcode
writer stores ConstantByte values as signed integers, so the reader must
rebuild them using the signed ConstantByte::get path. As a result,
high-bit values such as b8 255 round-trip as their canonical signed form,
b8 -1, instead of tripping the APInt width assertion. This matches current
i8 behavior.

Before the fix, the new test crashes in llvm-dis with: "APInt.h:
Assertion `llvm::isUIntN(BitWidth, val) && "Value is not an N-bit
unsigned value"' failed."
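
A minimal sketch of the mismatch, assuming the byte constant is written in the same sign-rotated record form that bitcode uses for signed integers (magnitude shifted left, sign in bit 0). All function names here are hypothetical stand-ins, not LLVM APIs, and the sketch only covers 8-bit values.

```cpp
#include <cassert>
#include <cstdint>

// Sign-rotated form: magnitude in the upper bits, sign flag in bit 0.
uint64_t encodeSigned(int64_t V) {
  assert(V >= -128 && V <= 127 && "sketch only covers 8-bit values");
  if (V >= 0)
    return static_cast<uint64_t>(V) << 1;
  return (static_cast<uint64_t>(-V) << 1) | 1;
}

// Inverse of encodeSigned for the 8-bit range used in this sketch.
int64_t decodeSigned(uint64_t R) {
  int64_t Magnitude = static_cast<int64_t>(R >> 1);
  return (R & 1) ? -Magnitude : Magnitude;
}

// True if the 64-bit pattern has no bits set at or above `Width`.
bool fitsUnsigned(uint64_t Bits, unsigned Width) {
  return Width >= 64 || (Bits >> Width) == 0;
}

// True if V is representable as a `Width`-bit two's-complement integer.
bool fitsSigned(int64_t V, unsigned Width) {
  int64_t Limit = int64_t(1) << (Width - 1);
  return V >= -Limit && V < Limit;
}

void demoByteRoundTrip() {
  int64_t Written = -1; // canonical signed form of b8 255
  uint64_t Record = encodeSigned(Written);
  int64_t Decoded = decodeSigned(Record);
  assert(Record == 3 && Decoded == -1);
  // Treating the decoded value as unsigned fails an 8-bit width check ...
  assert(!fitsUnsigned(static_cast<uint64_t>(Decoded), 8));
  // ... while the signed interpretation fits and keeps the 0xFF bit pattern.
  assert(fitsSigned(Decoded, 8));
  assert(static_cast<uint8_t>(Decoded) == 0xFF);
}
```

The unsigned check failing on a sign-extended value is the kind of condition the quoted `isUIntN` assertion reports. Rebuilding the constant through a signed path avoids it.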

Bug found while investigating this PR
(https://github.com/llvm/llvm-project/pull/177908), which transitions
the LSV to emitting the byte type. Fix assisted by AI.
Delta  File
+7 -0  llvm/test/Bitcode/byte-constants.ll
+2 -1  llvm/lib/Bitcode/Reader/BitcodeReader.cpp
+9 -1  2 files

LLVM/project 999e754 llvm/lib/Target/AMDGPU GCNHazardRecognizer.cpp SIInstrInfo.h

[AMDGPU] NFC: Obviously show isVALU includes LDSDMA instructions

Change-Id: I3854fe397cafad4484c5af53c739e2117287d2c9
Delta  File
+41 -41  llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
+13 -7  llvm/lib/Target/AMDGPU/SIInstrInfo.h
+8 -8  llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+3 -3  llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+2 -2  llvm/lib/Target/AMDGPU/AMDGPUSetWavePriority.cpp
+2 -2  llvm/lib/Target/AMDGPU/AMDGPUHazardLatency.cpp
+69 -63  7 files not shown
+78 -72  13 files

LLVM/project 8b18543 llvm/test/CodeGen/AMDGPU splitkit-copy-live-lanes.mir ra-inserted-scalar-instructions.mir, llvm/test/CodeGen/X86 statepoint-invoke-ra-inline-spiller.mir

[MIR] Serialize/Deserialize MachineInstr::LRSplit attribute

The LRSplit MachineInstr flag is set by SplitKit on copies inserted for
live-range splitting.
Until now the flag had no MIR-text representation.

This patch fixes that, making it easier to reproduce and capture issues
that involve SplitKit.

Round-trip coverage in
llvm/test/CodeGen/MIR/AMDGPU/lr-split-flag.mir.
Delta  File
+168 -168  llvm/test/CodeGen/AMDGPU/splitkit-copy-live-lanes.mir
+36 -36  llvm/test/CodeGen/AMDGPU/ra-inserted-scalar-instructions.mir
+32 -32  llvm/test/CodeGen/AMDGPU/splitkit-copy-bundle.mir
+27 -27  llvm/test/CodeGen/AMDGPU/ran-out-of-sgprs-allocation-failure.mir
+22 -22  llvm/test/CodeGen/X86/statepoint-invoke-ra-inline-spiller.mir
+22 -22  llvm/test/CodeGen/AMDGPU/inflated-reg-class-snippet-copy-use-after-free.mir
+307 -307  31 files not shown
+437 -403  37 files

LLVM/project 305faf4 llvm/utils/gn/secondary/clang-tools-extra/clang-tidy/readability BUILD.gn

[gn build] Port fc1f754c397b (#203542)
Delta  File
+1 -0  llvm/utils/gn/secondary/clang-tools-extra/clang-tidy/readability/BUILD.gn
+1 -0  1 files

LLVM/project 422d559 llvm/utils/gn/secondary/llvm/lib/Target/X86 BUILD.gn, llvm/utils/gn/secondary/llvm/unittests/MC BUILD.gn

[gn build] Port df75b5d458b9 (#203541)
Delta  File
+1 -0  llvm/utils/gn/secondary/llvm/lib/Target/X86/BUILD.gn
+1 -0  llvm/utils/gn/secondary/llvm/unittests/MC/BUILD.gn
+2 -0  2 files

LLVM/project 71ff21a llvm/utils/gn/secondary/lldb/source/Plugins/ObjectFile/Mach-O BUILD.gn

[gn build] Port d0a1f86e7890 (#203540)
Delta  File
+4 -1  llvm/utils/gn/secondary/lldb/source/Plugins/ObjectFile/Mach-O/BUILD.gn
+4 -1  1 files

LLVM/project e302e85 llvm/utils/gn/secondary/libcxx/include BUILD.gn

[gn build] Port caea95990515 (#203539)
Delta  File
+1 -1  llvm/utils/gn/secondary/libcxx/include/BUILD.gn
+1 -1  1 files

LLVM/project c8711e5 llvm/utils/gn/secondary/lldb/source/Plugins/Process/Windows/Common BUILD.gn

[gn build] Port b57c32db810b (#203538)
Delta  File
+1 -0  llvm/utils/gn/secondary/lldb/source/Plugins/Process/Windows/Common/BUILD.gn
+1 -0  1 files

LLVM/project d6ddc21 llvm/utils/gn/secondary/clang/lib/Headers BUILD.gn

[gn build] Port b000f9032911 (#203537)
Delta  File
+1 -0  llvm/utils/gn/secondary/clang/lib/Headers/BUILD.gn
+1 -0  1 files

LLVM/project ad6449f llvm/utils/gn/secondary/compiler-rt/lib/builtins BUILD.gn

[gn] "port" 93e03fc2666e (#203536)
Delta  File
+7 -0  llvm/utils/gn/secondary/compiler-rt/lib/builtins/BUILD.gn
+7 -0  1 files

LLVM/project 43dc65d llvm/lib/Analysis ValueTracking.cpp, llvm/test/Analysis/ValueTracking known-non-zero-shr-add.ll

[ValueTracking] Infer non-zero from shr (add nuw A, B), C  (#203039)

...if either A or B has a known-one bit at position >= C.
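
The condition above can be checked exhaustively at a small bit width. The sketch below is an illustration only, assuming 8-bit values and a logical right shift; `shrOfNuwAddIsNonZero` is a hypothetical helper, not code from ValueTracking.cpp, and it uses concrete set bits where the analysis reasons about known-one bits.

```cpp
#include <cassert>
#include <cstdint>

// Exhaustively checks, for 8-bit values, that if A + B does not wrap (the
// `nuw` condition) and A has a set bit at position >= C, then
// (A + B) >> C is non-zero. The case where B carries the bit is symmetric.
bool shrOfNuwAddIsNonZero(unsigned C) {
  assert(C < 8 && "shift amount must be below the bit width");
  for (unsigned A = 0; A < 256; ++A) {
    if ((A >> C) == 0)
      continue; // A has no set bit at position >= C
    for (unsigned B = 0; A + B < 256; ++B) { // nuw: the 8-bit sum must not wrap
      uint8_t Sum = static_cast<uint8_t>(A + B);
      if ((Sum >> C) == 0)
        return false; // would be a counterexample
    }
  }
  return true;
}
```

The reasoning behind the result: a set bit at position >= C means A >= 2^C, and without wrapping A + B >= A, so the sum still has a bit at or above position C. The no-wrap condition matters, since in 8 bits 0x80 + 0x80 wraps to 0 and shifts to 0.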

https://alive2.llvm.org/ce/z/ELYTjh

This eliminates null checks in some internal workloads.

Assisted-by: claude
Delta  File
+119 -0  llvm/test/Analysis/ValueTracking/known-non-zero-shr-add.ll
+15 -0  llvm/lib/Analysis/ValueTracking.cpp
+134 -0  2 files

LLVM/project 7125490 mlir/lib/Conversion/ArithToSPIRV ArithToSPIRV.cpp

[mlir][SPIR-V] Collapse duplicated i1-extension patterns in ArithToSPIRV (NFC) (#203247)
Delta  File
+16 -59  mlir/lib/Conversion/ArithToSPIRV/ArithToSPIRV.cpp
+16 -59  1 files

LLVM/project d426cca mlir/lib/Dialect/SPIRV/IR SPIRVCanonicalization.cpp, mlir/test/Dialect/SPIRV/Transforms canonicalize.mlir

[mlir][SPIR-V] Guard UMod canonicalization against zero divisor (#203513)

Chained `spirv.UMod` ops with a zero outer divisor reached `APInt::urem`,
which causes undefined behavior (UB).
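
A minimal sketch of such a guard on plain integers, assuming the canonicalization rewrites `(x umod Inner) umod Outer` to `x umod Outer` when `Inner` is a multiple of `Outer`. `foldChainedUMod` is a hypothetical stand-in for the pattern, and the extra `Inner == 0` check is this sketch's own addition, not something the commit message states.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical fold for (x umod Inner) umod Outer -> x umod Outer.
// Returns the divisor to keep, or nullopt when the pattern must not fire.
std::optional<uint32_t> foldChainedUMod(uint32_t Inner, uint32_t Outer) {
  if (Outer == 0)
    return std::nullopt; // guard: never evaluate Inner % 0
  if (Inner == 0 || Inner % Outer != 0)
    return std::nullopt; // not a multiple, so the fold would change results
  return Outer;
}

void demoFoldChainedUMod() {
  assert(!foldChainedUMod(32, 0)); // zero outer divisor: bail out
  assert(foldChainedUMod(32, 4) == std::optional<uint32_t>(4));
  assert(!foldChainedUMod(10, 4)); // 10 is not a multiple of 4
  // Spot-check that the fold preserves the value when it does fire.
  for (uint32_t X = 0; X < 1000; ++X)
    assert((X % 32) % 4 == X % 4);
}
```

Placing the zero check ahead of the divisibility test means the remainder is never computed with a zero divisor.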
Delta  File
+30 -0  mlir/test/Dialect/SPIRV/Transforms/canonicalize.mlir
+5 -0  mlir/lib/Dialect/SPIRV/IR/SPIRVCanonicalization.cpp
+35 -0  2 files

LLVM/project 0e704a0 llvm/lib/Target/AMDGPU GCNSchedStrategy.cpp GCNSchedStrategy.h, llvm/test/CodeGen/AMDGPU sched_mfma_rewrite_copies.mir

[AMDGPU] Fix illegal AGPR reclassification in RewriteMFMAFormStage (#200972)

If src2 escapes the rewrite group, a bridge copy AGPR -> VGPR must be
inserted.

Fixes a regression after
https://github.com/llvm/llvm-project/pull/198555
Delta  File
+196 -0  llvm/test/CodeGen/AMDGPU/sched_mfma_rewrite_copies.mir
+41 -9  llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+11 -1  llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
+248 -10  3 files