LLVM/project 0629650llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp, llvm/test/CodeGen/AMDGPU dagcombine-freeze-extract-subvector-loop.ll

[SelectionDAG] Fold extracts of subvector inserts

Fold extract_subvector(insert_subvector(...)) when the extraction is
outside the inserted subvector or the inserted subvector only amends
the extracted subvector.

In particular,
1. vA extract_subvector (vB insert_subvector(vB X, vC Y, C1), C2) =>
vA extract_subvector(X, C2) when [C2, C2 + A) intersect [C1, C1 + C)
is the empty set
2. ... => extract_subvector(Y, C2 - C1) if [C2, C2 + A) is a subset of
[C1, C1 + C) - an existing simplification
3. ... => vA insert_subvector(vA extract_subvector(vB X, C2), vC Y, C1 - C2)
if [C1, C1 + C) is a subset of [C2, C2 + A) - that is, if you're only
updating the extracted sub-part.
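
The three cases above can be checked on a toy model. This is only a sketch: vectors are plain Python lists, `fold_extract_of_insert` and `check_all` are hypothetical names, and the model ignores type legality and any index constraints the real SelectionDAG nodes carry.

```python
# Toy model: vectors are Python lists, indices are element offsets.
def insert_subvector(x, y, c1):
    """Return x with y written over elements [c1, c1 + len(y))."""
    assert 0 <= c1 and c1 + len(y) <= len(x)
    return x[:c1] + y + x[c1 + len(y):]


def extract_subvector(x, c2, a):
    """Return the a elements of x starting at element c2."""
    assert 0 <= c2 and c2 + a <= len(x)
    return x[c2:c2 + a]


def fold_extract_of_insert(x, y, c1, c2, a):
    """Hypothetical helper: apply one of the three folds, or return None."""
    c = len(y)
    if c2 + a <= c1 or c1 + c <= c2:      # case 1: ranges are disjoint
        return extract_subvector(x, c2, a)
    if c1 <= c2 and c2 + a <= c1 + c:     # case 2: extract lies inside the insert
        return extract_subvector(y, c2 - c1, a)
    if c2 <= c1 and c1 + c <= c2 + a:     # case 3: insert lies inside the extract
        return insert_subvector(extract_subvector(x, c2, a), y, c1 - c2)
    return None                           # partial overlap: no fold in this model


def check_all(n=8):
    """Compare every fold against the unfolded expression; return the count."""
    x = list(range(n))
    checked = 0
    for c in range(1, n + 1):
        y = [100 + i for i in range(c)]
        for c1 in range(n - c + 1):
            for a in range(1, n + 1):
                for c2 in range(n - a + 1):
                    folded = fold_extract_of_insert(x, y, c1, c2, a)
                    if folded is not None:
                        unfolded = extract_subvector(insert_subvector(x, y, c1), c2, a)
                        assert folded == unfolded
                        checked += 1
    return checked
```

Calling `check_all()` walks every index combination for an 8-element vector and asserts that each fold agrees with the unfolded expression.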

Adds a regression test for an infinite SelectionDAG cycle that is
fixed by a stack of commits that ends with this one.


    [3 lines not shown]
DeltaFile
+72-56llvm/test/CodeGen/X86/vector-interleaved-store-i64-stride-6.ll
+44-48llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-6.ll
+25-22llvm/test/CodeGen/X86/dagcombine-extract-insert.ll
+45-0llvm/test/CodeGen/AMDGPU/dagcombine-freeze-extract-subvector-loop.ll
+28-7llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+15-17llvm/test/CodeGen/X86/vector-replicaton-i1-mask.ll
+229-1502 files not shown
+237-1668 files

LLVM/project c1857e2llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp, llvm/test/CodeGen/AMDGPU dagcombine-freeze-extract-subvector-loop.ll

[SelectionDAG] Fold extracts of subvector inserts

Fold extract_subvector(insert_subvector(...)) when the extraction is
outside the inserted subvector or the inserted subvector only amends
the extracted subvector.

In particular,
1. vA extract_subvector (vB insert_subvector(vB X, vC Y, C1), C2) =>
vA extract_subvector(X, C2) when [C2, C2 + A) intersect [C1, C1 + C)
is the empty set
2. ... => extract_subvector(Y, C2 - C1) if [C2, C2 + A) is a subset of
[C1, C1 + C) - an existing simplification
3. ... => vA insert_subvector(vA extract_subvector(vB X, C2), vC Y, C1 - C2)
if [C1, C1 + C) is a subset of [C2, C2 + A) - that is, if you're only
updating the extracted sub-part.

Adds a regression test for an infinite SelectionDAG cycle that is
fixed by a stack of commits that ends with this one.


    [3 lines not shown]
DeltaFile
+72-56llvm/test/CodeGen/X86/vector-interleaved-store-i64-stride-6.ll
+44-48llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-6.ll
+25-22llvm/test/CodeGen/X86/dagcombine-extract-insert.ll
+45-0llvm/test/CodeGen/AMDGPU/dagcombine-freeze-extract-subvector-loop.ll
+32-7llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+15-17llvm/test/CodeGen/X86/vector-replicaton-i1-mask.ll
+233-1502 files not shown
+241-1668 files

LLVM/project bfdfeffllvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp, llvm/test/CodeGen/X86 dagcombine-insert-concat.ll

[SelectionDAG] Fold subvector inserts into concat operands

Push insert_subvector into the containing CONCAT_VECTORS operand when the insertion is wholly contained there.
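
As a rough illustration of "wholly contained", here is a list-based sketch. It assumes all concat operands have the same length; `fold_insert_into_concat` is a hypothetical name and not the function in DAGCombiner.cpp.

```python
def concat_vectors(ops):
    """Flatten a list of operand vectors into one vector."""
    return [elt for op in ops for elt in op]


def insert_subvector(x, y, idx):
    """Return x with y written over elements [idx, idx + len(y))."""
    assert 0 <= idx and idx + len(y) <= len(x)
    return x[:idx] + y + x[idx + len(y):]


def fold_insert_into_concat(ops, y, idx):
    """Push the insert into one concat operand, or return None.

    Returns the new operand list when [idx, idx + len(y)) falls entirely
    inside a single operand; otherwise the insert straddles operands and
    this sketch does not fold it.
    """
    width = len(ops[0])
    assert all(len(op) == width for op in ops)
    k, off = divmod(idx, width)
    if off + len(y) > width:
        return None
    new_ops = list(ops)
    new_ops[k] = insert_subvector(ops[k], y, off)
    return new_ops
```

For example, inserting two elements at index 6 into a concat of two 4-element operands only rewrites the second operand, while inserting at index 3 straddles both and is left alone.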

AI note: an LLM generated the code and the test, I've read them

Co-Authored-By: OpenAI Codex <codex at openai.com>
DeltaFile
+34-10llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+4-18llvm/test/CodeGen/X86/dagcombine-insert-concat.ll
+38-282 files

LLVM/project 8cbb390llvm/test/CodeGen/AArch64 sve-fixed-vector-llrint.ll sve-fixed-vector-lrint.ll, llvm/test/CodeGen/AMDGPU bf16.ll

[SelectionDAG] Fold extracts spanning concat operands

Factor the extract_subvector-of-CONCAT_VECTORS logic and handle
extracts that cover multiple whole concat operands by rebuilding a
smaller concat directly.
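
A small sketch of the "multiple whole operands" case, again on Python lists. It assumes equal-width operands, and `fold_extract_of_concat` is a hypothetical name used only for illustration.

```python
def concat_vectors(ops):
    """Flatten a list of operand vectors into one vector."""
    return [elt for op in ops for elt in op]


def fold_extract_of_concat(ops, idx, n):
    """Return the operands of a smaller concat, or None.

    Folds only when the extracted range [idx, idx + n) starts and ends on
    operand boundaries, so it covers a whole number of operands.
    """
    width = len(ops[0])
    assert all(len(op) == width for op in ops)
    if idx % width != 0 or n % width != 0:
        return None
    first = idx // width
    count = n // width
    if count == 0 or first + count > len(ops):
        return None
    return ops[first:first + count]
```

With four 2-element operands, extracting 4 elements at index 2 yields the middle two operands; an extract at index 1 is not operand-aligned and is left unfolded by this sketch.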

AI note: an LLM generated the code and the test, I've read them

Co-Authored-By: OpenAI Codex <codex at openai.com>
DeltaFile
+992-904llvm/test/CodeGen/AMDGPU/bf16.ll
+187-229llvm/test/CodeGen/AArch64/sve-fixed-vector-llrint.ll
+187-229llvm/test/CodeGen/AArch64/sve-fixed-vector-lrint.ll
+196-176llvm/test/CodeGen/X86/vector-interleaved-store-i8-stride-6.ll
+142-140llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-6.ll
+120-120llvm/test/CodeGen/X86/vector-interleaved-store-i32-stride-6.ll
+1,824-1,79812 files not shown
+2,204-2,27918 files

LLVM/project fe68411llvm/lib/CodeGen/SelectionDAG SelectionDAG.cpp, llvm/test/CodeGen/X86 dagcombine-freeze-select-demanded-elts.ll

[SelectionDAG] Track demanded select elements in noundef checks

Propagate demanded elements through to the two arms of a select, and
check the condition with or without demanded elements depending on if
it's a vector or not.

AI note: an LLM generated the code and the test, I've read them

Co-Authored-By: OpenAI Codex <codex at openai.com>
DeltaFile
+17-2llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+0-10llvm/test/CodeGen/X86/dagcombine-freeze-select-demanded-elts.ll
+17-122 files

LLVM/project a739b77llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp

[SelectionDAG] Fold nonzero extract-of-extract indices

Generalize the extract_subvector-of-extract_subvector fold to compose
nonzero indices instead of only handling an outer index of zero.
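
The index composition can be sketched on lists as follows. `fold_extract_of_extract` is a hypothetical name, and the sketch leaves out whatever type and index checks the real combine performs.

```python
def extract_subvector(x, idx, n):
    """Return the n elements of x starting at element idx."""
    assert 0 <= idx and idx + n <= len(x)
    return x[idx:idx + n]


def fold_extract_of_extract(x, inner_idx, inner_n, outer_idx, outer_n):
    """Replace two nested extracts with one extract at the summed index."""
    assert outer_idx + outer_n <= inner_n  # outer extract stays inside the inner one
    return extract_subvector(x, inner_idx + outer_idx, outer_n)
```

For a 16-element vector, extracting 4 elements at index 4 from the 8 elements at index 8 is the same as extracting 4 elements at index 12.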

AI note: an LLM generated the code and the test, I've read them

Co-Authored-By: OpenAI Codex <codex at openai.com>
DeltaFile
+8-8llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+8-81 files

LLVM/project fa98608llvm/lib/CodeGen/SelectionDAG SelectionDAG.cpp, llvm/test/CodeGen/X86 dagcombine-freeze-concat-demanded-elts.ll

[SelectionDAG] Track demanded concat elements in noundef checks

Teach isGuaranteedNotToBeUndefOrPoison to distribute fixed-length
demanded element masks across CONCAT_VECTORS operands. This is part of
the series of fixes needed to resolve a SelectionDAG hang by making it
possible to prove certain values don't need to be frozen.
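
To illustrate what distributing a demanded-element mask means, here is a sketch that uses a Python int as the bitmask (bit i set means element i is demanded) and `None` to model an undef/poison lane. Both function names are hypothetical.

```python
def split_demanded_elts(demanded, num_ops, op_width):
    """Slice a concat-wide demanded mask into one mask per operand."""
    op_mask = (1 << op_width) - 1
    return [(demanded >> (i * op_width)) & op_mask for i in range(num_ops)]


def concat_not_undef(ops, demanded):
    """True if every demanded lane of the concat is a defined value."""
    width = len(ops[0])
    assert all(len(op) == width for op in ops)
    for op, sub in zip(ops, split_demanded_elts(demanded, len(ops), width)):
        if sub == 0:
            continue  # nothing demanded from this operand, so it is not inspected
        if any(op[i] is None for i in range(width) if (sub >> i) & 1):
            return False
    return True
```

An operand that contributes no demanded lanes is skipped entirely, which is what lets an undef operand coexist with a "not undef" answer for the lanes that are actually used.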

AI note: an LLM generated the code and the test, I've read them

Co-Authored-By: OpenAI Codex <codex at openai.com>
DeltaFile
+23-0llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+1-4llvm/test/CodeGen/X86/dagcombine-freeze-concat-demanded-elts.ll
+24-42 files

LLVM/project 8c5f91fllvm/lib/CodeGen/SelectionDAG SelectionDAG.cpp, llvm/test/CodeGen/X86 dagcombine-freeze-bitcast-demanded-elts.ll

[SelectionDAG] Track bitcast demanded elements in noundef tests

Bitcasts preserve undef/poison status, but vector bitcasts can change
which source lanes cover a demanded result lane. Map the demanded
element mask through fixed-length vector bitcasts before checking the
source where possible.
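
The lane mapping can be sketched like this, assuming a little-endian layout in which lane 0 occupies the lowest bits. `map_demanded_through_bitcast` is a hypothetical name; masks are Python ints with bit i standing for lane i.

```python
def map_demanded_through_bitcast(demanded_dst, dst_lanes, dst_bits, src_bits):
    """Map a demanded mask on the bitcast result to a mask on its source.

    A source lane is demanded if it overlaps any demanded result lane.
    """
    total_bits = dst_lanes * dst_bits
    assert total_bits % src_bits == 0  # fixed-length vectors of equal total size
    demanded_src = 0
    for j in range(dst_lanes):
        if not (demanded_dst >> j) & 1:
            continue
        lo = j * dst_bits            # first bit covered by result lane j
        hi = (j + 1) * dst_bits      # one past the last bit covered
        for i in range(lo // src_bits, (hi - 1) // src_bits + 1):
            demanded_src |= 1 << i
    return demanded_src
```

Under these assumptions, demanding lane 1 of a v2i64 result demands lanes 2 and 3 of a v4i32 source, and demanding lane 2 of a v4i32 result demands lane 1 of a v2i64 source.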

AI note: an LLM generated the code and the test, I've read them

Co-Authored-By: OpenAI Codex <codex at openai.com>
DeltaFile
+41-0llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+1-4llvm/test/CodeGen/X86/dagcombine-freeze-bitcast-demanded-elts.ll
+42-42 files

LLVM/project 59e5e3allvm/lib/CodeGen/SelectionDAG TargetLowering.cpp, llvm/test/CodeGen/X86 dagcombine-freeze-undef-demanded-elts.ll pr91005.ll

[SelectionDAG] Look through freeze in undef demanded checks

There were cycles where the freeze combiner and the
demanded-elements simplification code would get into fights about
whether the operands to a shuffle or a concat should be
`freeze undef` or `undef` once the simplifier had concluded zero
elements were demanded from some operation. This PR prevents such
cases.

AI note: an LLM generated the code and the test, I've read them

Co-Authored-By: OpenAI Codex <codex at openai.com>
DeltaFile
+11-7llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+2-1llvm/test/CodeGen/X86/dagcombine-freeze-undef-demanded-elts.ll
+2-1llvm/test/CodeGen/X86/pr91005.ll
+15-93 files

LLVM/project 53ad1d1llvm/test/CodeGen/X86 dagcombine-extract-insert.ll dagcombine-freeze-select-demanded-elts.ll

[SelectionDAG] Pre-commit tests for dagcombine improvements

I've got a stack of dagcombine improvements that together fix an
infinite cycle relating to freeze insertion in vector-manipulation IR.
Here we have

- Handling freeze(undef) in demanded-elts for shufflevector
- Improvements to noundef checks for bitcast, concat, and select
- Improvements to extract(concat), extract(extract), and
  extract(insert) handling
DeltaFile
+51-0llvm/test/CodeGen/X86/dagcombine-extract-insert.ll
+51-0llvm/test/CodeGen/X86/dagcombine-freeze-select-demanded-elts.ll
+38-0llvm/test/CodeGen/X86/dagcombine-insert-concat.ll
+36-0llvm/test/CodeGen/X86/dagcombine-freeze-undef-demanded-elts.ll
+25-0llvm/test/CodeGen/X86/dagcombine-extract-concat.ll
+21-0llvm/test/CodeGen/X86/dagcombine-freeze-bitcast-demanded-elts.ll
+222-01 files not shown
+242-07 files

LLVM/project c7b4b4allvm/test/CodeGen/NVPTX lower-aggr-copies.ll

[NVPTX] Fix the build after ce465594e239. (#201268)

ce465594e239 (#201177) added sm_90 / PTX ISA 7.8 instructions to
lower-aggr-copies.ll, so we need to guard the RUN line appropriately.
DeltaFile
+1-1llvm/test/CodeGen/NVPTX/lower-aggr-copies.ll
+1-11 files

LLVM/project f91f589llvm/lib/Target/AMDGPU/AsmParser AMDGPUAsmParser.cpp, llvm/lib/Target/AMDGPU/MCTargetDesc AMDGPUMCExpr.cpp AMDGPUMCExpr.h

[AMDGPU] Added min operation for MCExprs (#199746)

The min operation is needed in MC Expressions for a future change that
caps the max number of registers used for indirect calls.

---------

Co-authored-by: JoshuaGrindstaff <jgrindst at amd.com>
DeltaFile
+106-0llvm/unittests/Target/AMDGPU/AMDGPUMCExprTest.cpp
+62-1llvm/test/MC/AMDGPU/mcexpr_amd.s
+16-1llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.cpp
+9-2llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.h
+11-0llvm/test/MC/AMDGPU/mcexpr_amd_err.s
+3-1llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+207-52 files not shown
+211-58 files

LLVM/project e0b580allvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp

[DAGCombiner] Remove untested visitVP_FADD and visitVP_FSUB. (#201247)

RISC-V no longer uses VP_FADD/FSUB in SelectionDAG.
DeltaFile
+116-168llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+116-1681 files

LLVM/project 866c39bllvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp

[DAGCombiner] Remove no longer tested VP_MUL handling. (#201238)

We no longer use VP_MUL in SelectionDAG on RISC-V so this code isn't
tested.

This effectively reverts db6de1a20f75cbfe1024f41e64ad39def91fa70f
DeltaFile
+45-62llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+45-621 files

LLVM/project 70bca0fllvm/lib/Target/RISCV RISCVInsertVSETVLI.cpp, llvm/test/CodeGen/RISCV/rvv xsfmm-insert-vsetvl-TMTK.mir

[RISCV] Make VSETTM/VSETTK not affect the VSETVL emit (#197890)

VSETTM/TK will modify VTYPE, but it only affects the TM/TK bits. This
modification is safe for other RVV operations. The TM/TK value will be
maintained in insertVSETMTK.
DeltaFile
+195-0llvm/test/CodeGen/RISCV/rvv/xsfmm-insert-vsetvl-TMTK.mir
+6-0llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
+201-02 files

LLVM/project a47bddcclang-tools-extra/clangd InlayHints.cpp, clang-tools-extra/clangd/unittests InlayHintTests.cpp

[clangd] Handle dependent call to function with explicit object parameter in InlayHintVisitor (#201264)

Dependent calls do not yet have the implicit object argument prepended
to the CallExpr's argument list, so the first argument should not be
expected to be present and dropped in this case.

Fixes https://github.com/llvm/llvm-project/issues/198588
DeltaFile
+15-0clang-tools-extra/clangd/unittests/InlayHintTests.cpp
+2-1clang-tools-extra/clangd/InlayHints.cpp
+17-12 files

LLVM/project 49bf4fallvm/lib/Target/LoongArch LoongArchISelLowering.cpp, llvm/test/CodeGen/LoongArch pr198339.ll

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
DeltaFile
+49-0llvm/test/CodeGen/LoongArch/pr198339.ll
+5-0llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+54-02 files

LLVM/project 6d7962dclang/docs ReleaseNotes.rst, clang/lib/Sema SemaOverload.cpp

[clang][CUDA] Avoid ambiguity in host/device template specializations (#201049)

This commit changes SemaOverload to resolve an otherwise diagnosed
ambiguity between addresses of template specializations of functions
that are overloaded for both device and host. Similar to how it works
for non-templated function overloads, these changes prioritize the
specialization that corresponds to the target of the owning function,
i.e. if compiling for host, the address of the host specialization takes
precedence over the device specialization and vice versa.

Fixes https://github.com/llvm/llvm-project/issues/199299

---------

Signed-off-by: Steffen Holst Larsen <sholstla at amd.com>
DeltaFile
+28-0clang/test/SemaCUDA/addr-of-overloaded-template-fn.cu
+3-3clang/lib/Sema/SemaOverload.cpp
+4-0clang/docs/ReleaseNotes.rst
+2-0clang/test/SemaCUDA/addr-of-overloaded-fn.cu
+37-34 files

LLVM/project ee20b10bolt/include/bolt/Core BinaryContext.h DIEBuilder.h, bolt/lib/Core DIEBuilder.cpp

[BOLT] Fix data race in multi-threaded DWP type unit processing and DWP type unit duplication (#197359)

## Summary
This PR fixes a race condition in LLVM BOLT's
DIEBuilder::buildTypeUnits() that is triggered when DWARF5 split-DWARF
(.dwo/.dwp) inputs are processed with multi-threaded CU processing.
Concurrent invocations from different worker threads share the same DWP
type-unit state, which results in duplicated DIE extraction, assertion
failures, and intermittent crashes. The fix serializes buildTypeUnits()
for DWP inputs via a function-local static std::mutex, leaving the
non-DWO fast path unchanged.
## Problem Description
When BOLT processes DWARF debug info with --debug-thread-count=4
--cu-processing-batch-size=4 on testcase
dwarf5-df-types-dup-dwp-input.test, multiple threads concurrently call
DIEBuilder::buildTypeUnits() on shared DWP type units. Since type units
within a DWP file are shared across compilation units, multiple threads
may attempt to extract DIEs from the same type unit simultaneously,
violating the assertion.

    [5 lines not shown]
DeltaFile
+76-1bolt/lib/Core/DIEBuilder.cpp
+26-0bolt/test/X86/dwarf5-dwp-tsan-data-race.test
+7-0bolt/include/bolt/Core/BinaryContext.h
+4-3bolt/test/X86/dwarf5-df-types-dup-dwp-input.test
+1-5bolt/lib/Rewrite/DWARFRewriter.cpp
+3-0bolt/include/bolt/Core/DIEBuilder.h
+117-91 files not shown
+118-107 files

LLVM/project 45c4ebblldb/docs conf.py

[lldb] Enable MyST colon_fence and deflist extensions (NFC) (#201250)

Enable the colon_fence and deflist MyST parser extensions in the LLDB
docs configuration. This is a preparatory step for converting the
remaining reStructuredText documentation pages to Markdown, where these
two extensions are needed to translate RST admonition directives
(:::{note}) and definition lists.

Context:
https://discourse.llvm.org/t/rfc-make-myst-markdown-the-llvm-docs-format-rip-rest/
DeltaFile
+1-1lldb/docs/conf.py
+1-11 files

LLVM/project 83318d0llvm/docs/tutorial/MyFirstLanguageFrontend LangImpl04.rst

[docs][Kaleidoscope] fix function name InitializeModuleAndManagers in Kaleidoscope (#199601)

### Description
Resolves #199477

The Kaleidoscope tutorial was not fully updated with the new Pass
Manager. This PR aligns the tutorial doc with the example code.

### Changes
- Use `InitializeModuleAndManagers` instead of
`InitializeModuleAndPassManager`.
- Remove `TheModule->setDataLayout(TheJIT->getDataLayout());` in line
141, as the `setDataLayout` was introduced later.
- Use `KaleidoscopeJIT` instead of `my cool jit` as the ModuleName, to
align with the final code.
DeltaFile
+11-13llvm/docs/tutorial/MyFirstLanguageFrontend/LangImpl04.rst
+11-131 files

LLVM/project 53938ballvm/test/CodeGen/RISCV/rvv vector-interleave.ll vector-interleave-fixed.ll

[RISCV] Remove experimental XRivosVizip support (#200761)

Remove experimental XRivosVizip support, which will no longer be
maintained by Rivos.
DeltaFile
+0-1,898llvm/test/CodeGen/RISCV/rvv/vector-interleave.ll
+0-682llvm/test/CodeGen/RISCV/rvv/vector-interleave-fixed.ll
+0-422llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-deinterleave2.ll
+0-318llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-int-interleave.ll
+0-278llvm/test/CodeGen/RISCV/rvv/vector-deinterleave.ll
+0-146llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-zipeven-zipodd.ll
+0-3,74417 files not shown
+23-4,23823 files

LLVM/project 8763a68llvm/lib/Target/M68k/AsmParser CMakeLists.txt

[M68k] Add to LINK_COMPONENTS to fix BUILD_SHARED_LIBS build (#201248)

Fixes: 6897c5e24ce5 ("[M68k][MC] Add MC support for PCI w/ base
displacement addressing mode (#200696)")
DeltaFile
+1-0llvm/lib/Target/M68k/AsmParser/CMakeLists.txt
+1-01 files

LLVM/project f48e6b8llvm/lib/Target/NVPTX NVVMIntrRange.cpp, llvm/test/CodeGen/NVPTX intr-range.ll

[NVPTX] NVVMIntrRange: Handle maxntid > UINT32_MAX. (#201245)

Previously we computed the overall maxntid and downcast it to unsigned
int.  This is not correct; it can be larger than UINT32_MAX.

This would cause reads of tid.xyz and ntid.xyz to have incorrect range
information.  Also if maxntid was an exact multiple of 2^32, we'd get an
ICE (because we'd incorrectly think that maxntid is 0).
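
A quick arithmetic sketch of the truncation problem. It assumes the overall maxntid is the product of the three per-dimension limits, which the message does not spell out; both function names are hypothetical.

```python
UINT32_MAX = 2**32 - 1


def overall_maxntid(x, y, z):
    """Overall thread-count bound kept at full width (assumed: product of dims)."""
    return x * y * z


def overall_maxntid_truncated(x, y, z):
    """The same bound after a downcast to a 32-bit unsigned integer."""
    return (x * y * z) & UINT32_MAX
```

With limits of 65536 x 65536 x 1 the full-width bound is 2^32, but the truncated value is 0, which matches the "exact multiple of 2^32" failure described above.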
DeltaFile
+47-1llvm/test/CodeGen/NVPTX/intr-range.ll
+7-6llvm/lib/Target/NVPTX/NVVMIntrRange.cpp
+54-72 files

LLVM/project 19c7fdbclang/lib/CIR/CodeGen CIRGenExpr.cpp CIRGenModule.cpp, clang/test/CIR/CodeGen global-temp-dtor.cpp self-ref-temporaries.cpp

[CIR] Implement destruction of TLS and static global references (#200227)

This implements destruction of lifetime-extended reference temporaries
used to initialize TLS or static duration reference variables.

Assisted-by: Cursor / claude-opus-4.7
DeltaFile
+265-0clang/test/CIR/CodeGen/global-temp-dtor.cpp
+48-6clang/lib/CIR/CodeGen/CIRGenExpr.cpp
+11-3clang/lib/CIR/CodeGen/CIRGenModule.cpp
+3-3clang/test/CIR/CodeGen/self-ref-temporaries.cpp
+4-2clang/lib/CIR/CodeGen/CIRGenModule.h
+2-2clang/test/CIR/CodeGenCXX/global-refs.cpp
+333-166 files

LLVM/project 45bddcaclang/lib/CIR/CodeGen CIRGenStmt.cpp CIRGenCleanup.cpp, clang/test/CIR/CodeGen switch-cleanup.cpp

[CIR] Fix insertion point tracking for switch with cleanups (#201210)

We had some problems where we would incorrectly maintain the insertion
point for switch statements that contained cleanup scopes. This resulted
in cir.scope statements without a terminator, tripping a verification
error.

This change adds a RunCleanupsScope RAII object for the switch statement
and adds a check inside popCleanup() to avoid moving the insertion point
to the point after the now-closed cleanup scope if the insertion point
had previously been somewhere other than inside the cleanup scope.

Assisted-by: Cursor / claude-opus-4.8
DeltaFile
+198-0clang/test/CIR/CodeGen/switch-cleanup.cpp
+11-5clang/lib/CIR/CodeGen/CIRGenStmt.cpp
+6-1clang/lib/CIR/CodeGen/CIRGenCleanup.cpp
+215-63 files

LLVM/project f132e92clang/lib/CIR/Dialect/Transforms/TargetLowering CIRABIRewriteContext.cpp, clang/test/CIR/Transforms/abi-lowering coerce-record-return-larger.cir coerce-int-to-record.cir

[CIR] Coerce Direct args and returns in CallConvLowering (#195879)

Fourth PR in the split of #192119/#192124. Implements the
Direct-with-coercion path in CallConvLowering.

Every Direct argument or return whose ABI type differs from its source
type is now coerced through a store/reload roundtrip via an entry-block
alloca, mirroring classic codegen's CreateCoercedLoad/CreateCoercedStore.
The temporary alloca uses max(srcAlign, dstAlign) from the DataLayout and
is hoisted into the entry block so it composes with HoistAllocas
regardless of pipeline order. When the coerced type is larger than the
source -- e.g. a 12-byte aggregate returned as { i64, i64 } -- the slot is
sized to the larger type and accessed through a source-typed view for the
store and a destination-typed view for the load, so neither side
over-reads.
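
The store/reload roundtrip can be mimicked with Python's `struct` module. This is only an analogy: `coerce_via_memory` is a hypothetical name, the format strings stand in for CIR types, and the slot is zero-filled here, whereas padding bytes in a real alloca would not hold defined values.

```python
import struct


def coerce_via_memory(src_fmt, src_values, dst_fmt):
    """Reinterpret src_values as dst_fmt through a temporary byte slot."""
    src_size = struct.calcsize(src_fmt)
    dst_size = struct.calcsize(dst_fmt)
    # Size the slot to the larger of the two types so neither access
    # runs past the end of the slot.
    slot = bytearray(max(src_size, dst_size))
    struct.pack_into(src_fmt, slot, 0, *src_values)   # source-typed store
    return struct.unpack_from(dst_fmt, slot, 0)       # destination-typed load
```

For a 12-byte aggregate of three 32-bit integers coerced to two 64-bit integers, `coerce_via_memory("<3i", (1, 2, 3), "<2q")` uses a 16-byte slot: the store writes 12 bytes and the load reads all 16.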

CallConvLowering is split into three phases (function-definition
coercion, call-site rewriting, and Ignore cleanup) because in-place
block-argument type changes from Direct-with-coerce otherwise confused the

    [3 lines not shown]
DeltaFile
+189-25clang/lib/CIR/Dialect/Transforms/TargetLowering/CIRABIRewriteContext.cpp
+63-0clang/test/CIR/Transforms/abi-lowering/coerce-record-return-larger.cir
+57-0clang/test/CIR/Transforms/abi-lowering/coerce-int-to-record.cir
+57-0clang/test/CIR/Transforms/abi-lowering/coerce-record-to-record-via-memory.cir
+56-0clang/test/CIR/Transforms/abi-lowering/coerce-record-to-int.cir
+42-0clang/test/CIR/Transforms/abi-lowering/coerce-vector-to-complex.cir
+464-253 files not shown
+486-309 files

LLVM/project 0e40e9eclang/test/OffloadTools/clang-sycl-linker basic.ll triple.ll, clang/tools/clang-sycl-linker ClangSYCLLinker.cpp

[clang-sycl-linker][test] Improve dry-run mode and tighten test coverage (#200513)

- Rework `--dry-run` in `clang-sycl-linker` so it skips all real output
  (writing bitcode, executing tools, etc.).
- The `link:`, `sycl-module-split:`, and a new `sycl-bundle:` summary
  line are now gated on `-v` alone.
- Tighten `sycl-bundle:` checks in `basic.ll`, `split-mode.ll`, and
  `triple.ll` to pin kind, triple, and arch (instead of just kind),
  and add `-NOT: {{.+}}` after fully-covered dry-run check groups.
- Replace the `clang-sycl-linker` + `llvm-objdump --offloading`
  round-trip with a single `--dry-run -v` invocation.
- Add a dedicated `non-dry-run` mode test to verify code paths not
  exposed in `dry-run`.

Assisted by Claude.
DeltaFile
+45-27clang/tools/clang-sycl-linker/ClangSYCLLinker.cpp
+22-11clang/test/OffloadTools/clang-sycl-linker/basic.ll
+6-6clang/test/OffloadTools/clang-sycl-linker/triple.ll
+6-0clang/test/OffloadTools/clang-sycl-linker/split-mode.ll
+79-444 files

LLVM/project a6745c9llvm/lib/CodeGen InlineSpiller.cpp, llvm/test/CodeGen/X86/apx memfold-origVNI-crash.ll

[X86][APX] Extend original LI to the same range as DstReg (#199182)

#189222 folds NDD+Load to non-NDD when the NDD memory variant is not
preferred. However, this changes DstReg from a regular def to an
early-clobber def, which causes a "corrupted sub-interval" error in
reMaterializeFor because OrigLI is not updated at the same time.

Fixes: https://godbolt.org/z/7n8ozz1EG

Assisted-by: Claude Sonnet 4.6
DeltaFile
+214-0llvm/test/CodeGen/X86/apx/memfold-origVNI-crash.ll
+14-0llvm/lib/CodeGen/InlineSpiller.cpp
+228-02 files

LLVM/project 243ddf6libc/src/__support freelist_heap.h block.h, libc/test/src/__support freelist_heap_test.cpp block_test.cpp

[libc] add shrink in-place support for reallocations (#200272)

This PR adds shrinking in-place for the freelist heap. This allows the
heap to reuse the freed space when a reallocation shrinks a block by
more than a minimal block unit.

Synthesized random-action tests show that this increases the heap
utilization rate from 87% to 97%, which roughly matches what is
expected of dlmalloc.
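
A rough sketch of the shrink decision, working on sizes only. The names and the `MIN_BLOCK` value are assumptions, the sketch treats a tail of exactly one minimal block as splittable, and it ignores block headers and alignment, so it is not the logic in freelist_heap.h.

```python
MIN_BLOCK = 16  # assumed minimal block unit in bytes, for illustration only


def shrink_in_place(block_size, new_size, min_block=MIN_BLOCK):
    """Return (kept_size, freed_tail_size) for a shrinking reallocation.

    The tail is split off and handed back to the free list only when it is
    big enough to form a block of its own; otherwise the block is kept as is.
    """
    assert 0 < new_size <= block_size
    tail = block_size - new_size
    if tail < min_block:
        return block_size, 0
    return new_size, tail
```

Shrinking a 128-byte block to 32 bytes frees a 96-byte tail for reuse, while shrinking it to 120 bytes leaves the block untouched because the 8-byte remainder is too small to track.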

Assisted-by: AI tools, manually checked.
DeltaFile
+46-1libc/test/src/__support/freelist_heap_test.cpp
+37-3libc/src/__support/freelist_heap.h
+8-2libc/test/src/__support/block_test.cpp
+5-4libc/src/__support/block.h
+2-0libc/src/__support/freestore.h
+1-0libc/test/src/__support/CMakeLists.txt
+99-106 files