LLVM/project fdd3b84llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/RISCV vec3-base.ll

[SLP] Fix FMA regression in FMA-candidate retry

When tryToVectorize is called with AllowFMACandidates=true, falling
through to tryToVectorizeList vectorizes the fmul operands of an
FMA-candidate fadd without accounting for the lost FMA opportunity.
canConvertToFMA requires those fmuls to have one use, so vectorizing
them always breaks FMA formation. The cost model for tryToVectorizeList
omits the fadd from the tree and compares "2 fmuls vs 1 vfmul", missing
the scalar FMA savings entirely.

Block tryToVectorizeList when AllowFMACandidates=true. TryToReduce is
safe because computeReductionCost accounts for FMA in the scalar
baseline via canConvertToFMA on the fadd user.

Fixes a 4.5% regression in SPEC17 imagemagick on AArch64 introduced by
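
The cost-model gap described in this message can be illustrated with made-up unit costs. The sketch below is not SLPVectorizer code: `ToyCosts`, `narrowDelta`, and `fullDelta` are hypothetical names, every cost is an assumed value of 1, and the vector side assumes the fadds stay scalar and need one extract per lane.

```cpp
#include <cassert>

// Illustrative unit costs only (assumptions); real costs come from the
// target's cost model, not from constants like these.
struct ToyCosts {
  int FMul = 1;    // scalar fmul
  int FAdd = 1;    // scalar fadd
  int FMA = 1;     // fused fmul + fadd
  int VFMul = 1;   // one vector fmul covering all lanes
  int Extract = 1; // extracting one lane from a vector
};

// The comparison the message attributes to tryToVectorizeList:
// N scalar fmuls vs. one vector fmul, with the fadd users left out.
// A negative result looks profitable.
int narrowDelta(const ToyCosts &C, int N) { return C.VFMul - N * C.FMul; }

// The same decision with the fadd users included. The scalar baseline is
// N FMAs; the vector side is one vfmul plus, per lane, an extract and a
// scalar fadd (assumed, since the fmul can no longer fuse).
int fullDelta(const ToyCosts &C, int N) {
  int Scalar = N * C.FMA;
  int Vector = C.VFMul + N * (C.Extract + C.FAdd);
  return Vector - Scalar;
}
```

With N = 2 and these assumed costs, `narrowDelta` is -1 (looks profitable) while `fullDelta` is 3 (a loss once the scalar FMAs are counted).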

Reviewers: sushgokh, bababuck

Pull Request: https://github.com/llvm/llvm-project/pull/199706
DeltaFile
+24-15llvm/test/Transforms/SLPVectorizer/RISCV/vec3-base.ll
+16-10llvm/test/Transforms/SLPVectorizer/X86/dot-product.ll
+4-8llvm/test/Transforms/SLPVectorizer/X86/slp-fma-loss.ll
+6-4llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+50-374 files

LLVM/project 5157be7llvm/docs SandboxIR.md, llvm/include/llvm/SandboxIR Tracker.h Context.h

[SandboxIR][Tracker] Implement accept(/*AcceptAll*/) and revert(/*RevertAll*/) (#197289)

In the context of nested checkpoints the tracker's API was somewhat
inconsistent. Tracker::revert() would revert to the last checkpoint but
accept() would accept all changes.

This patch fixes this, and introduces `accept(bool AcceptAll)` and
`revert(bool RevertAll)`.
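
The difference between acting on the innermost checkpoint and on all of them can be sketched with a toy tracker over a single integer. This is not the SandboxIR implementation: `ToyTracker`, `save`, `set`, and `depth` are hypothetical, and the assumption that a non-all `accept` folds changes into the enclosing checkpoint is mine. Only the `accept(bool AcceptAll)` and `revert(bool RevertAll)` signatures come from the message.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of nested checkpoints; hypothetical, not SandboxIR's Tracker.
class ToyTracker {
  int &Value;
  std::vector<int> UndoLog;             // old values, oldest first
  std::vector<std::size_t> Checkpoints; // UndoLog size at each save()

public:
  explicit ToyTracker(int &V) : Value(V) {}

  // Open a new, possibly nested, checkpoint.
  void save() { Checkpoints.push_back(UndoLog.size()); }

  // Record the old value, then apply the change.
  void set(int NewValue) {
    assert(!Checkpoints.empty() && "set() outside of a checkpoint");
    UndoLog.push_back(Value);
    Value = NewValue;
  }

  // Undo back to the innermost checkpoint, or to the outermost one when
  // RevertAll is true.
  void revert(bool RevertAll) {
    assert(!Checkpoints.empty() && "nothing to revert");
    std::size_t Target = RevertAll ? Checkpoints.front() : Checkpoints.back();
    while (UndoLog.size() > Target) {
      Value = UndoLog.back();
      UndoLog.pop_back();
    }
    if (RevertAll)
      Checkpoints.clear();
    else
      Checkpoints.pop_back();
  }

  // Close the innermost checkpoint (its changes stay revertible through the
  // enclosing one), or close every checkpoint when AcceptAll is true.
  void accept(bool AcceptAll) {
    assert(!Checkpoints.empty() && "nothing to accept");
    if (AcceptAll)
      Checkpoints.clear();
    else
      Checkpoints.pop_back();
    if (Checkpoints.empty())
      UndoLog.clear();
  }

  std::size_t depth() const { return Checkpoints.size(); }
};
```

In this model, `revert(false)` after two nested `save()` calls undoes only the inner changes, while `revert(true)` restores the value seen at the first `save()`.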
DeltaFile
+54-0llvm/unittests/SandboxIR/TrackerTest.cpp
+18-6llvm/include/llvm/SandboxIR/Tracker.h
+13-4llvm/lib/SandboxIR/Tracker.cpp
+2-2llvm/include/llvm/SandboxIR/Context.h
+2-1llvm/docs/SandboxIR.md
+89-135 files

LLVM/project 3dd3b6fllvm/docs/AMDGPU AMDGPUAsmGFX950.rst

[AMDGPU][docs][NFC] Fix some instruction names in gfx950 doc (#199094)

In the GFX950 documentation, some instructions that should have the
_sdwa suffix were incorrectly given the _dpp suffix.
DeltaFile
+536-536llvm/docs/AMDGPU/AMDGPUAsmGFX950.rst
+536-5361 files

LLVM/project 5e7b1f5libc/config/baremetal config.json

Revert "Revert "[libc] Enable baremetal float printf using modular format" (#…"

This reverts commit 0f679999aae135b388c25fb1acbb030109c6418f.
DeltaFile
+3-3libc/config/baremetal/config.json
+3-31 files

LLVM/project 3212caautils/bazel/llvm-project-overlay/llvm BUILD.bazel

[bazel] Use `additional_compiler_inputs` to handle include scanning for TargetPassRegistry.inc (#199201)

Using `#include` with a macro breaks include scanning, for
example:

* `GET_PASS_REGISTRY` defined here:
https://github.com/llvm/llvm-project/blob/5c853423f4f9e7296b7596b7f3ccade481686bfd/llvm/lib/Target/AArch64/AArch64TargetMachine.cpp#L603
* `GET_PASS_REGISTRY` included here:
https://github.com/llvm/llvm-project/blob/5c853423f4f9e7296b7596b7f3ccade481686bfd/llvm/include/llvm/Passes/TargetPassRegistry.inc#L60

When include scanning is enabled, `PassRegistry.def` gets omitted
because the include scanner does not handle this case. Providing it
via `additional_compiler_inputs` ensures it is included even in that
case.
DeltaFile
+14-0utils/bazel/llvm-project-overlay/llvm/BUILD.bazel
+14-01 files

LLVM/project 46bb9d1clang/include/clang/AST DeclTemplate.h, clang/lib/AST DeclTemplate.cpp

[clang] fix getTemplateInstantiationArgs

This implements a new strategy for collecting the template arguments, by
relying on the qualifiers and template parameter lists to navigate the template
context of out-of-line definitions.

This greatly simplifies the signature of that function, by removing a bunch
of workarounds, and simplifying a couple that weren't removed yet.

Since this now relies on qualifiers and template parameter lists,
this patch expends most of its effort making sure these are placed,
transformed and propagated to template instantiations.

Also makes the explicit specialization AST nodes stop abusing the template
parameter lists by storing its own template parameter list, creating a
dedicated field for them, similar to partial specializations.
DeltaFile
+194-429clang/lib/Sema/SemaTemplateInstantiate.cpp
+257-164clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
+154-150clang/lib/Sema/SemaTemplate.cpp
+96-95clang/include/clang/AST/DeclTemplate.h
+59-129clang/lib/Sema/SemaConcept.cpp
+60-92clang/lib/AST/DeclTemplate.cpp
+820-1,05951 files not shown
+1,461-1,72157 files

LLVM/project 3ce7b40clang/lib/Basic FileManager.cpp, clang/lib/Lex HeaderSearch.cpp

Revert "[clang] Use FileError in FileManager::getFileRef, getDirectoryRef" (#199721)

Reverts llvm/llvm-project#199126

This caused a small compile time regression.
DeltaFile
+0-69clang/unittests/Basic/FileManagerTest.cpp
+10-13clang/lib/Basic/FileManager.cpp
+6-2clang/lib/Lex/HeaderSearch.cpp
+16-843 files

LLVM/project 593a238llvm/lib/Target/AMDGPU AMDGPUInstructionSelector.cpp SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU llvm.amdgcn.fma.legacy.ll llvm.amdgcn.sudot4.ll

[AMDGPU] Diagnose unsupported fma_legacy/sudot4/sudot8 intrinsics on some subtargets (#198464)

Add proper diagnostics for `llvm.amdgcn.fma.legacy`,
`llvm.amdgcn.sudot4` and `llvm.amdgcn.sudot8` on subtargets where they
are unsupported.
DeltaFile
+21-8llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
+9-0llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+9-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.fma.legacy.ll
+9-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sudot4.ll
+9-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sudot8.ll
+2-0llvm/lib/Target/AMDGPU/GCNSubtarget.h
+59-86 files

LLVM/project 7404f58llvm/lib/CodeGen MachineFunction.cpp, llvm/lib/MC MCDwarf.cpp

[AMDGPU][MC] Replace shifted registers in CFI instructions

Change-Id: I0d99e9fe43ec3b6fecac20531119956dca2e4e5c
DeltaFile
+67-67llvm/test/CodeGen/AMDGPU/sgpr-spill-overlap-wwm-reserve.mir
+33-0llvm/lib/MC/MCDwarf.cpp
+15-15llvm/test/CodeGen/AMDGPU/dwarf-multi-register-use-crash.ll
+10-0llvm/lib/CodeGen/MachineFunction.cpp
+4-4llvm/test/CodeGen/AMDGPU/debug-frame.ll
+2-2llvm/test/CodeGen/AMDGPU/pei-vgpr-block-spill-csr.mir
+131-885 files not shown
+143-9011 files

LLVM/project 93baf03llvm/lib/Target/AMDGPU SIFrameLowering.cpp SIMachineFunctionInfo.h, llvm/test/CodeGen/AMDGPU amdgpu-spill-cfi-saved-regs.ll

[AMDGPU] Implement -amdgpu-spill-cfi-saved-regs

These spills need special CFI anyway, so implementing them directly
where CFI is emitted avoids the need to invent a mechanism to track them
from ISel.

Change-Id: If4f34abb3a8e0e46b859a7c74ade21eff58c4047
Co-authored-by: Scott Linder <scott.linder at amd.com>
Co-authored-by: Venkata Ramanaiah Nalamothu <VenkataRamanaiah.Nalamothu at amd.com>
DeltaFile
+2,926-0llvm/test/CodeGen/AMDGPU/amdgpu-spill-cfi-saved-regs.ll
+12-0llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
+10-0llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h
+9-0llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+2-0llvm/lib/Target/AMDGPU/SIRegisterInfo.h
+2,959-05 files

LLVM/project dc7a7d4llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll gfx-callable-argument-types.ll

[AMDGPU] Implement CFI for CSR spills

Introduce new SPILL pseudos to allow CFI to be generated for only CSR
spills, and to make the information accurate at the ISA-instruction level.

Other targets either generate slightly incorrect information or rely on
conventions for how spills are placed within the entry block. The
approach in this change produces larger unwind tables, with the
increased size being spent on additional DW_CFA_advance_location
instructions needed to describe the unwinding accurately.

Change-Id: I9b09646abd2ac4e56eddf5e9aeca1a5bebbd43dd
Co-authored-by: Scott Linder <scott.linder at amd.com>
Co-authored-by: Venkata Ramanaiah Nalamothu <VenkataRamanaiah.Nalamothu at amd.com>
DeltaFile
+3,568-2,598llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+1,912-1,913llvm/test/CodeGen/AMDGPU/gfx-callable-argument-types.ll
+2,700-12llvm/test/CodeGen/AMDGPU/accvgpr-spill-scc-clobber.mir
+631-631llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+505-510llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+394-399llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+9,710-6,063108 files not shown
+14,819-9,521114 files

LLVM/project bd8d788llvm/test/CodeGen/AMDGPU accvgpr-spill-scc-clobber.mir pei-build-av-spill.mir

[AMDGPU] Implement CFI for non-kernel functions

This does not implement CSR spills other than those AMDGPU handles
during PEI. The remaining spills are handled in a subsequent patch.

Change-Id: I5e3a9a62cf9189245011a82a129790d813d49373
Co-authored-by: Scott Linder <scott.linder at amd.com>
Co-authored-by: Venkata Ramanaiah Nalamothu <VenkataRamanaiah.Nalamothu at amd.com>
DeltaFile
+5,568-0llvm/test/CodeGen/AMDGPU/accvgpr-spill-scc-clobber.mir
+3,000-96llvm/test/CodeGen/AMDGPU/pei-build-av-spill.mir
+2,208-72llvm/test/CodeGen/AMDGPU/pei-build-spill.mir
+2,196-0llvm/test/CodeGen/AMDGPU/eliminate-frame-index-s-mov-b32.mir
+2,136-0llvm/test/CodeGen/AMDGPU/vgpr-spill-scc-clobber.mir
+1,671-1llvm/test/CodeGen/AMDGPU/debug-frame.ll
+16,779-16993 files not shown
+22,925-1,04999 files

LLVM/project ab04794llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.960bit.ll

[AMDGPU] Use register pair for PC spill

Change-Id: Ibedeef926f7ff235a06de65a83087c151f66a416
DeltaFile
+4,331-4,331llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+1,742-1,740llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+1,562-1,560llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+1,462-1,460llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+1,238-1,236llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.832bit.ll
+1,030-1,028llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.768bit.ll
+11,365-11,35589 files not shown
+18,153-18,04495 files

LLVM/project d73760f

[AMDGPU] Emit entry function Dwarf CFI

Entry functions represent the end of unwinding, as they are the
outer-most frame. This implies they can only have a meaningful
definition for the CFA, which AMDGPU defines using a memory location
description with a literal private address space address. The return
address is set to undefined as a sentinel value to signal the end of
unwinding.

Change-Id: I21580f6a24f4869ba32939c9c6332506032cc654
Co-authored-by: Scott Linder <scott.linder at amd.com>
Co-authored-by: Venkata Ramanaiah Nalamothu <VenkataRamanaiah.Nalamothu at amd.com>
DeltaFile
+0-00 files

LLVM/project 83d53f3

[MC][Dwarf] Add custom CFI pseudo-ops for use in AMDGPU

While these can be represented with .cfi_escape, using these pseudo-cfi
instructions makes .s/.mir files more readable, and it is necessary to
support updating registers in CFI instructions (something that the
AMDGPU backend requires).

Change-Id: I763d0cabe5990394670281d4afb5a170981e55d0
DeltaFile
+0-00 files

LLVM/project 4358864

[Clang] Default to async unwind tables for amdgcn

To avoid codegen changes when enabling debug-info (see
https://bugs.llvm.org/show_bug.cgi?id=37240) we want to
enable unwind tables by default.

There is some pessimization in post-prologepilog scheduling, and a
general solution to the problem of CFI_INSTRUCTION-as-scheduling-barrier
should be explored.

Change-Id: I83625875966928c7c4411cd7b95174dc58bda25a
DeltaFile
+0-00 files

LLVM/project 3e7fb7b

[MIR] Error on signed integer in getUnsigned

Previously we effectively took the absolute value of the APSInt. Instead,
diagnose the unexpected negative value.
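
The behavioural change can be sketched on plain integers. The helpers below are hypothetical and are not the MIR parser's `getUnsigned`; they only contrast rejecting a negative value with silently keeping its magnitude. The return convention (true on success) is an assumption made for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper; not the MIR parser's getUnsigned.
// Returns false and leaves Out untouched when V is negative, instead of
// converting its magnitude.
bool toUnsignedChecked(int64_t V, uint64_t &Out) {
  if (V < 0)
    return false; // a real parser would emit a diagnostic here
  Out = static_cast<uint64_t>(V);
  return true;
}

// The earlier behaviour described in the message, for contrast: the sign
// is silently dropped and the magnitude is returned.
uint64_t toUnsignedMagnitude(int64_t V) {
  return V < 0 ? 0 - static_cast<uint64_t>(V) : static_cast<uint64_t>(V);
}
```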

Change-Id: I4efe961e7b29fdf1d5f97df12f8139aac12c9219
DeltaFile
+0-00 files

LLVM/project 13f9f46llvm/lib/Target/AMDGPU SIFrameLowering.cpp, llvm/test/CodeGen/AMDGPU debug-frame.ll eliminate-frame-index-v-add-u32.mir

[AMDGPU] Emit entry function Dwarf CFI (#183152)

Entry functions represent the end of unwinding, as they are the
outer-most frame. This implies they can only have a meaningful
definition for the CFA, which AMDGPU defines using a memory location
description with a literal private address space address. The return
address is set to undefined as a sentinel value to signal the end of
unwinding.

Change-Id: I21580f6a24f4869ba32939c9c6332506032cc654
Co-authored-by: Scott Linder <scott.linder at amd.com>
Co-authored-by: Venkata Ramanaiah Nalamothu <VenkataRamanaiah.Nalamothu at amd.com>
DeltaFile
+1,405-0llvm/test/CodeGen/AMDGPU/debug-frame.ll
+204-12llvm/test/CodeGen/AMDGPU/eliminate-frame-index-v-add-u32.mir
+134-6llvm/test/CodeGen/AMDGPU/eliminate-frame-index-v-add-co-u32.mir
+114-10llvm/test/CodeGen/AMDGPU/eliminate-frame-index-s-add-i32.mir
+42-5llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
+34-0llvm/test/CodeGen/AMDGPU/entry-function-cfi.mir
+1,933-3322 files not shown
+2,044-5028 files

LLVM/project 7a66e99llvm/lib/Target/AMDGPU AMDGPURegBankCombiner.cpp, llvm/test/CodeGen/AMDGPU global-saddr-load.ll

[AMDGPU][True16] Add regbank combiner cases to fix regression around G_SEXTLOAD (#198671)

Stack PRs:
https://github.com/llvm/llvm-project/pull/198669
https://github.com/llvm/llvm-project/pull/198670

See https://github.com/llvm/llvm-project/pull/195289 for previous
discussion
DeltaFile
+63-165llvm/test/CodeGen/AMDGPU/GlobalISel/load-d16.ll
+24-90llvm/test/CodeGen/AMDGPU/global-saddr-load.ll
+15-2llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp
+102-2573 files

LLVM/project 6519c04llvm/lib/CodeGen InlineSpiller.cpp

[1/3][RegAlloc][LiveRegMatrix] Fix inconsistency in HoistSpillHelper delegates (#197773)

HoistSpillHelper's LiveRangeEdit delegate callbacks did not keep the
LiveRegMatrix consistent when eliminateDeadDefs triggered interval
shrinking and splitting during spill hoisting.

Three issues:

1. No LRE_WillShrinkVirtReg override: when eliminateDeadDefs shrinks a
vreg's interval via shrinkToUses, the matrix was not updated. Add an
override that unassigns the vreg from the matrix and records it in
PendingReassignments for later re-assignment.

2. LRE_DidCloneVirtReg called VRM.assignVirt2Phys without
Matrix->assign: when splitSeparateComponents creates new vregs, the
clones got VRM entries but were never inserted into the matrix. Fix by
consuming PendingReassignments and properly assigning both Old (shrunk)
and New (split) intervals to the matrix.


    [15 lines not shown]
DeltaFile
+62-4llvm/lib/CodeGen/InlineSpiller.cpp
+62-41 files

LLVM/project 08e83a5.github/workflows release-tasks.yml

workflows/release-tasks: Remove template expansion (#199444)

https://github.com/llvm/llvm-project/security/code-scanning/1737
https://github.com/llvm/llvm-project/security/code-scanning/1738
https://github.com/llvm/llvm-project/security/code-scanning/1739
https://github.com/llvm/llvm-project/security/code-scanning/1740
https://github.com/llvm/llvm-project/security/code-scanning/1741
https://github.com/llvm/llvm-project/security/code-scanning/1742
DeltaFile
+7-5.github/workflows/release-tasks.yml
+7-51 files

LLVM/project ef59dbeflang/lib/Lower ConvertConstant.cpp ConvertVariable.cpp, flang/lib/Optimizer/Builder FIRBuilder.cpp IntrinsicCall.cpp

[flang][cuda] Lower c_devptr value arguments in bind(c) like c_ptr (#199316)

Treat `type(c_devptr), value` arguments in BIND(C) interfaces like
`type(c_ptr), value` by passing the nested raw address value instead of
the outer derived type ABI. This keeps call signatures consistent for
CUDA Fortran generic specifics that share a C binding label and avoids
argument misclassification at the x86_64 register/stack boundary.
DeltaFile
+31-16flang/lib/Lower/ConvertConstant.cpp
+8-19flang/lib/Optimizer/Builder/FIRBuilder.cpp
+22-0flang/test/HLFIR/c_devptr_byvalue.cuf
+10-11flang/lib/Lower/ConvertVariable.cpp
+6-6flang/lib/Lower/ConvertCall.cpp
+4-8flang/lib/Optimizer/Builder/IntrinsicCall.cpp
+81-605 files not shown
+90-7211 files

LLVM/project 84f75f7llvm/lib/Support UnicodeNameToCodepointGenerated.cpp, llvm/test/CodeGen/AArch64 bf16-v8-instructions.ll

Rebase

Created using spr 1.3.6-beta.1
DeltaFile
+23,873-20,923llvm/lib/Support/UnicodeNameToCodepointGenerated.cpp
+8,633-8,584llvm/test/CodeGen/Thumb2/mve-clmul.ll
+12,365-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.av.load.b128.ll
+1,243-8,768llvm/test/CodeGen/X86/vector-replicaton-i1-mask.ll
+7,616-740llvm/test/CodeGen/AArch64/bf16-v8-instructions.ll
+8,195-0llvm/test/MC/AMDGPU/gfx13_asm_vop3.s
+61,925-39,0159,609 files not shown
+597,775-258,3149,615 files

LLVM/project 29aabb6llvm/lib/Support UnicodeNameToCodepointGenerated.cpp, llvm/test/CodeGen/AArch64 bf16-v8-instructions.ll

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.6-beta.1

[skip ci]
DeltaFile
+23,873-20,923llvm/lib/Support/UnicodeNameToCodepointGenerated.cpp
+8,633-8,584llvm/test/CodeGen/Thumb2/mve-clmul.ll
+12,365-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.av.load.b128.ll
+1,243-8,768llvm/test/CodeGen/X86/vector-replicaton-i1-mask.ll
+7,616-740llvm/test/CodeGen/AArch64/bf16-v8-instructions.ll
+8,195-0llvm/test/MC/AMDGPU/gfx13_asm_vop3.s
+61,925-39,0159,609 files not shown
+597,775-258,3149,615 files

LLVM/project 9516040llvm/lib/Support UnicodeNameToCodepointGenerated.cpp, llvm/test/CodeGen/AArch64 bf16-v8-instructions.ll

Address review comments

Created using spr 1.3.6-beta.1
DeltaFile
+23,873-20,923llvm/lib/Support/UnicodeNameToCodepointGenerated.cpp
+8,633-8,584llvm/test/CodeGen/Thumb2/mve-clmul.ll
+12,365-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.av.load.b128.ll
+1,243-8,768llvm/test/CodeGen/X86/vector-replicaton-i1-mask.ll
+7,616-740llvm/test/CodeGen/AArch64/bf16-v8-instructions.ll
+8,195-0llvm/test/MC/AMDGPU/gfx13_asm_vop3.s
+61,925-39,0159,609 files not shown
+597,775-258,3149,615 files

LLVM/project 148cb0ellvm/lib/Support UnicodeNameToCodepointGenerated.cpp, llvm/test/CodeGen/AArch64 bf16-v8-instructions.ll

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.6-beta.1

[skip ci]
DeltaFile
+23,873-20,923llvm/lib/Support/UnicodeNameToCodepointGenerated.cpp
+8,633-8,584llvm/test/CodeGen/Thumb2/mve-clmul.ll
+12,365-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.av.load.b128.ll
+1,243-8,768llvm/test/CodeGen/X86/vector-replicaton-i1-mask.ll
+7,616-740llvm/test/CodeGen/AArch64/bf16-v8-instructions.ll
+8,195-0llvm/test/MC/AMDGPU/gfx13_asm_vop3.s
+61,925-39,0159,606 files not shown
+597,768-258,2989,612 files

LLVM/project 584b596.github/workflows release-doxygen.yml

workflows/release-doxygen: Remove template expansions (#199456)

https://github.com/llvm/llvm-project/security/code-scanning/1725
https://github.com/llvm/llvm-project/security/code-scanning/1726
https://github.com/llvm/llvm-project/security/code-scanning/1838
DeltaFile
+5-2.github/workflows/release-doxygen.yml
+5-21 files

LLVM/project 88fbc06llvm/lib/Transforms/Scalar DeadStoreElimination.cpp, llvm/test/Transforms/DeadStoreElimination merge-stores.ll

[DSE] Restrict partial-overlap store merging to matching orderings. (#199728)

Partial-overlap store merging folds the later killing store into the
earlier dead store and erases the killing store. That is invalid if the
killing store is volatile or has stronger-than-unordered atomic
ordering, because erasing it drops an observable write. It is also invalid
if the killing and dead stores have different atomic orderings, because
the bytes originally written by the killing store would inherit the dead
store's atomicity after the merge -- silently dropping (or adding)
atomicity for those bytes.

Require both stores to be unordered (i.e. non-volatile with ordering at
most unordered) and to share the same ordering. This preserves the
existing fold for two simple stores or two unordered-atomic stores
(e.g. simple.ll's test43a) while leaving volatile, ordered-atomic, and
atomicity-mismatched cases in place.
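
The stated requirement can be written as a small predicate. This is a sketch over hypothetical stand-in types (`Ordering`, `ToyStore`), not the code in DeadStoreElimination.cpp or LLVM's `StoreInst` API. It only encodes the rule from the message: both stores unordered, and both with the same ordering.

```cpp
#include <cassert>

// Hypothetical stand-ins for a store's properties; not LLVM types.
enum class Ordering { NotAtomic, Unordered, Monotonic, Release, SeqCst };

struct ToyStore {
  bool IsVolatile;
  Ordering Ord;
};

// "Unordered" in the message's sense: non-volatile, with ordering at most
// unordered.
bool isUnordered(const ToyStore &S) {
  return !S.IsVolatile &&
         (S.Ord == Ordering::NotAtomic || S.Ord == Ordering::Unordered);
}

// Merging is allowed only when both stores are unordered and share the same
// ordering, so no bytes gain or lose atomicity.
bool canMergePartialOverlap(const ToyStore &Dead, const ToyStore &Killing) {
  return isUnordered(Dead) && isUnordered(Killing) && Dead.Ord == Killing.Ord;
}
```

Under this predicate, two simple stores or two unordered-atomic stores may merge, while a simple store paired with an unordered-atomic, volatile, or release store may not.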

This bug was found by a large run of Opus 4.7 looking for bugs in LLVM.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply at anthropic.com>
DeltaFile
+51-36llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
+52-8llvm/test/Transforms/DeadStoreElimination/merge-stores.ll
+103-442 files

LLVM/project 5df91f6.github/workflows sycl-tests.yml

workflows/sycl-tests: Pin container image reference (#199466)

I pinned the image to the version that was used in the last successful
workflow run.

https://github.com/llvm/llvm-project/security/code-scanning/1808
https://docs.zizmor.sh/audits/#unpinned-images
DeltaFile
+1-1.github/workflows/sycl-tests.yml
+1-11 files

LLVM/project 9d751a2llvm/lib/Object OffloadBundle.cpp, llvm/test/tools/llvm-objdump/Offloading fatbin-coff.test fatbin-coff-compress.test

[llvm][Object] Add COFF support to extractOffloadBundleFatBinary (#199574)

Use PointerToRawData from the COFF section header to compute the section
offset, replacing the previous stub that returned an error for all COFF
object files.
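
As a rough illustration of the offset computation, the sketch below maps an offset inside a section to a file offset. `ToySectionHeader` and `sectionToFileOffset` are hypothetical and are not the code in OffloadBundle.cpp. The two fields mirror `PointerToRawData` and `SizeOfRawData` from the COFF section header, and the bounds check is an assumption added for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical subset of a COFF section header; only the two fields this
// sketch needs.
struct ToySectionHeader {
  uint32_t SizeOfRawData;    // size of the section's data in the file
  uint32_t PointerToRawData; // file offset of the section's data
};

// Map an offset within the section to an offset within the file.
// Returns false, leaving FileOffset untouched, when the offset lies outside
// the section's raw data.
bool sectionToFileOffset(const ToySectionHeader &S, uint64_t OffsetInSection,
                         uint64_t &FileOffset) {
  if (OffsetInSection >= S.SizeOfRawData)
    return false;
  FileOffset = static_cast<uint64_t>(S.PointerToRawData) + OffsetInSection;
  return true;
}
```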

This enables llvm-objdump --offloading and llvm-readobj --offloading to
work on COFF fatbins produced by HIP on Windows.

---------

Co-authored-by: James Henderson <James.Henderson at sony.com>
Co-authored-by: Matt Arsenault <arsenm2 at gmail.com>
DeltaFile
+38-0llvm/test/tools/llvm-objdump/Offloading/fatbin-coff.test
+28-0llvm/test/tools/llvm-objdump/Offloading/fatbin-coff-compress.test
+21-0llvm/test/tools/llvm-readobj/COFF/AMDGPU/offloading.test
+8-3llvm/lib/Object/OffloadBundle.cpp
+95-34 files