LLVM/project 6519c04 llvm/lib/CodeGen InlineSpiller.cpp

[1/3][RegAlloc][LiveRegMatrix] Fix inconsistency in HoistSpillHelper delegates (#197773)

HoistSpillHelper's LiveRangeEdit delegate callbacks did not keep the
LiveRegMatrix consistent when eliminateDeadDefs triggered interval
shrinking and splitting during spill hoisting.

Three issues:

1. No LRE_WillShrinkVirtReg override: when eliminateDeadDefs shrinks a
vreg's interval via shrinkToUses, the matrix was not updated. Add an
override that unassigns the vreg from the matrix and records it in
PendingReassignments for later re-assignment.

2. LRE_DidCloneVirtReg called VRM.assignVirt2Phys without
Matrix->assign: when splitSeparateComponents creates new vregs, the
clones got VRM entries but were never inserted into the matrix. Fix by
consuming PendingReassignments and properly assigning both Old (shrunk)
and New (split) intervals to the matrix.


    [15 lines not shown]
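
The first two issues can be illustrated with a toy model. This is a minimal sketch, not the LLVM code: `ToyMatrix`, `ToyDelegate`, and the method names are hypothetical stand-ins for LiveRegMatrix, the HoistSpillHelper delegate, and its callbacks, and the third issue (elided above) is not covered.

```cpp
#include <map>
#include <set>

// Toy stand-ins (assumptions for illustration); none of these are LLVM classes.
using VReg = unsigned;
using PhysReg = unsigned;

struct ToyMatrix {
  std::set<VReg> Assigned; // vregs whose intervals currently occupy the matrix
  void assign(VReg V) { Assigned.insert(V); }
  void unassign(VReg V) { Assigned.erase(V); }
};

struct ToyDelegate {
  std::map<VReg, PhysReg> VRM; // vreg -> physreg, in the role of a VirtRegMap
  ToyMatrix Matrix;
  std::map<VReg, PhysReg> PendingReassignments;

  // Before a vreg's interval shrinks: remove the stale interval from the
  // matrix and remember which physreg it should get back later.
  void willShrinkVirtReg(VReg V) {
    auto It = VRM.find(V);
    if (It == VRM.end() || !Matrix.Assigned.count(V))
      return;
    Matrix.unassign(V);
    PendingReassignments[V] = It->second;
  }

  // After New was split off from Old: update the map and the matrix together,
  // and re-insert Old if it was waiting for re-assignment.
  void didCloneVirtReg(VReg New, VReg Old) {
    auto It = VRM.find(Old);
    if (It == VRM.end())
      return;
    PhysReg P = It->second;
    if (PendingReassignments.erase(Old))
      Matrix.assign(Old);
    VRM[New] = P;
    Matrix.assign(New);
  }

  // Invariant the fix is after: every mapped vreg is also in the matrix.
  bool consistent() const {
    for (const auto &KV : VRM)
      if (!Matrix.Assigned.count(KV.first))
        return false;
    return true;
  }
};
```

In this model the invariant is broken between the shrink callback and the clone callback and restored afterwards, which is the window PendingReassignments is meant to cover.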
Delta  File
+62 -4  llvm/lib/CodeGen/InlineSpiller.cpp
+62 -4  1 files

LLVM/project 08e83a5 .github/workflows release-tasks.yml

workflows/release-tasks: Remove template expansion (#199444)

https://github.com/llvm/llvm-project/security/code-scanning/1737
https://github.com/llvm/llvm-project/security/code-scanning/1738
https://github.com/llvm/llvm-project/security/code-scanning/1739
https://github.com/llvm/llvm-project/security/code-scanning/1740
https://github.com/llvm/llvm-project/security/code-scanning/1741
https://github.com/llvm/llvm-project/security/code-scanning/1742
Delta  File
+7 -5  .github/workflows/release-tasks.yml
+7 -5  1 files

LLVM/project ef59dbe flang/lib/Lower ConvertConstant.cpp ConvertVariable.cpp, flang/lib/Optimizer/Builder FIRBuilder.cpp IntrinsicCall.cpp

[flang][cuda] Lower c_devptr value arguments in bind(c) like c_ptr (#199316)

Treat `type(c_devptr), value` arguments in BIND(C) interfaces like
`type(c_ptr), value` by passing the nested raw address value instead of
the outer derived type ABI. This keeps call signatures consistent for
CUDA Fortran generic specifics that share a C binding label and avoids
argument misclassification at the x86_64 register/stack boundary.
Delta  File
+31 -16  flang/lib/Lower/ConvertConstant.cpp
+8 -19  flang/lib/Optimizer/Builder/FIRBuilder.cpp
+22 -0  flang/test/HLFIR/c_devptr_byvalue.cuf
+10 -11  flang/lib/Lower/ConvertVariable.cpp
+6 -6  flang/lib/Lower/ConvertCall.cpp
+4 -8  flang/lib/Optimizer/Builder/IntrinsicCall.cpp
+81 -60  5 files not shown
+90 -72  11 files

LLVM/project 84f75f7 llvm/lib/Support UnicodeNameToCodepointGenerated.cpp, llvm/test/CodeGen/AArch64 bf16-v8-instructions.ll

Rebase

Created using spr 1.3.6-beta.1
Delta  File
+23,873 -20,923  llvm/lib/Support/UnicodeNameToCodepointGenerated.cpp
+8,633 -8,584  llvm/test/CodeGen/Thumb2/mve-clmul.ll
+12,365 -0  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.av.load.b128.ll
+1,243 -8,768  llvm/test/CodeGen/X86/vector-replicaton-i1-mask.ll
+7,616 -740  llvm/test/CodeGen/AArch64/bf16-v8-instructions.ll
+8,195 -0  llvm/test/MC/AMDGPU/gfx13_asm_vop3.s
+61,925 -39,015  9,609 files not shown
+597,775 -258,314  9,615 files

LLVM/project 29aabb6 llvm/lib/Support UnicodeNameToCodepointGenerated.cpp, llvm/test/CodeGen/AArch64 bf16-v8-instructions.ll

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.6-beta.1

[skip ci]
Delta  File
+23,873 -20,923  llvm/lib/Support/UnicodeNameToCodepointGenerated.cpp
+8,633 -8,584  llvm/test/CodeGen/Thumb2/mve-clmul.ll
+12,365 -0  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.av.load.b128.ll
+1,243 -8,768  llvm/test/CodeGen/X86/vector-replicaton-i1-mask.ll
+7,616 -740  llvm/test/CodeGen/AArch64/bf16-v8-instructions.ll
+8,195 -0  llvm/test/MC/AMDGPU/gfx13_asm_vop3.s
+61,925 -39,015  9,609 files not shown
+597,775 -258,314  9,615 files

LLVM/project 9516040 llvm/lib/Support UnicodeNameToCodepointGenerated.cpp, llvm/test/CodeGen/AArch64 bf16-v8-instructions.ll

Address review comments

Created using spr 1.3.6-beta.1
Delta  File
+23,873 -20,923  llvm/lib/Support/UnicodeNameToCodepointGenerated.cpp
+8,633 -8,584  llvm/test/CodeGen/Thumb2/mve-clmul.ll
+12,365 -0  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.av.load.b128.ll
+1,243 -8,768  llvm/test/CodeGen/X86/vector-replicaton-i1-mask.ll
+7,616 -740  llvm/test/CodeGen/AArch64/bf16-v8-instructions.ll
+8,195 -0  llvm/test/MC/AMDGPU/gfx13_asm_vop3.s
+61,925 -39,015  9,609 files not shown
+597,775 -258,314  9,615 files

LLVM/project 148cb0e llvm/lib/Support UnicodeNameToCodepointGenerated.cpp, llvm/test/CodeGen/AArch64 bf16-v8-instructions.ll

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.6-beta.1

[skip ci]
Delta  File
+23,873 -20,923  llvm/lib/Support/UnicodeNameToCodepointGenerated.cpp
+8,633 -8,584  llvm/test/CodeGen/Thumb2/mve-clmul.ll
+12,365 -0  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.av.load.b128.ll
+1,243 -8,768  llvm/test/CodeGen/X86/vector-replicaton-i1-mask.ll
+7,616 -740  llvm/test/CodeGen/AArch64/bf16-v8-instructions.ll
+8,195 -0  llvm/test/MC/AMDGPU/gfx13_asm_vop3.s
+61,925 -39,015  9,606 files not shown
+597,768 -258,298  9,612 files

LLVM/project 584b596 .github/workflows release-doxygen.yml

workflows/release-doxygen: Remove template expansions (#199456)

https://github.com/llvm/llvm-project/security/code-scanning/1725
https://github.com/llvm/llvm-project/security/code-scanning/1726
https://github.com/llvm/llvm-project/security/code-scanning/1838
Delta  File
+5 -2  .github/workflows/release-doxygen.yml
+5 -2  1 files

LLVM/project 88fbc06 llvm/lib/Transforms/Scalar DeadStoreElimination.cpp, llvm/test/Transforms/DeadStoreElimination merge-stores.ll

[DSE] Restrict partial-overlap store merging to matching orderings. (#199728)

Partial-overlap store merging folds the later killing store into the
earlier dead store and erases the killing store. That is invalid if the
killing store is volatile or has stronger-than-unordered atomic
ordering, because erasing it drops an observable write. It is also invalid
if the killing and dead stores have different atomic orderings, because
the bytes originally written by the killing store would inherit the dead
store's atomicity after the merge -- silently dropping (or adding)
atomicity for those bytes.

Require both stores to be unordered (i.e. non-volatile with ordering at
most unordered) and to share the same ordering. This preserves the
existing fold for two simple stores or two unordered-atomic stores
(e.g. simple.ll's test43a) while leaving volatile, ordered-atomic, and
atomicity-mismatched cases in place.
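
The legality condition described above can be written as a small predicate. This is a sketch under stated assumptions: `ToyStore`, `Ordering`, and `canMergePartialOverlap` are hypothetical names for illustration, not the code in DeadStoreElimination.cpp.

```cpp
// Toy model of a store's relevant attributes (assumed for illustration).
enum class Ordering { NotAtomic, Unordered, Monotonic, Release, SeqCst };

struct ToyStore {
  bool IsVolatile;
  Ordering Ord;
};

// "Unordered" in the sense used by the commit message: non-volatile, with
// ordering at most unordered.
bool isUnordered(const ToyStore &S) {
  return !S.IsVolatile &&
         (S.Ord == Ordering::NotAtomic || S.Ord == Ordering::Unordered);
}

// Both stores must be unordered and must share the same ordering, so the
// merged bytes neither gain nor lose atomicity.
bool canMergePartialOverlap(const ToyStore &Killing, const ToyStore &Dead) {
  return isUnordered(Killing) && isUnordered(Dead) && Killing.Ord == Dead.Ord;
}
```

Under this model two simple stores or two unordered-atomic stores still merge, while a simple store paired with an unordered-atomic store does not.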

This bug was found by a large run of Opus 4.7 looking for bugs in LLVM.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply at anthropic.com>
Delta  File
+51 -36  llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
+52 -8  llvm/test/Transforms/DeadStoreElimination/merge-stores.ll
+103 -44  2 files

LLVM/project 5df91f6 .github/workflows sycl-tests.yml

workflows/sycl-tests: Pin container image reference (#199466)

I pinned the image to the version that was used in the last successful
workflow run.

https://github.com/llvm/llvm-project/security/code-scanning/1808
https://docs.zizmor.sh/audits/#unpinned-images
Delta  File
+1 -1  .github/workflows/sycl-tests.yml
+1 -1  1 files

LLVM/project 9d751a2 llvm/lib/Object OffloadBundle.cpp, llvm/test/tools/llvm-objdump/Offloading fatbin-coff.test fatbin-coff-compress.test

[llvm][Object] Add COFF support to extractOffloadBundleFatBinary (#199574)

Use PointerToRawData from the COFF section header to compute the section
offset, replacing the previous stub that returned an error for all COFF
object files.

This enables llvm-objdump --offloading and llvm-readobj --offloading to
work on COFF fatbins produced by HIP on Windows.

---------

Co-authored-by: James Henderson <James.Henderson at sony.com>
Co-authored-by: Matt Arsenault <arsenm2 at gmail.com>
Delta  File
+38 -0  llvm/test/tools/llvm-objdump/Offloading/fatbin-coff.test
+28 -0  llvm/test/tools/llvm-objdump/Offloading/fatbin-coff-compress.test
+21 -0  llvm/test/tools/llvm-readobj/COFF/AMDGPU/offloading.test
+8 -3  llvm/lib/Object/OffloadBundle.cpp
+95 -3  4 files

LLVM/project 010faf1 llvm/lib/TableGen TGParser.cpp

[TableGen] Add missing grammar comment for !cond(NFC) (#199663)
Delta  File
+3 -0  llvm/lib/TableGen/TGParser.cpp
+3 -0  1 files

LLVM/project fafc2b3 llvm/lib/TableGen TGParser.cpp

[TableGen] Fix wrong op name in a grammar comment(NFC) (#199661)
Delta  File
+1 -1  llvm/lib/TableGen/TGParser.cpp
+1 -1  1 files

LLVM/project 80207c2 flang/test/Lower/OpenMP metadirective-device-arch.f90

Fix device arch negative check coverage
Delta  File
+4 -2  flang/test/Lower/OpenMP/metadirective-device-arch.f90
+4 -2  1 files

LLVM/project 429bbe6 flang/lib/Lower/OpenMP OpenMP.cpp, flang/test/Lower/OpenMP metadirective-implementation.f90 metadirective-device-isa.f90

Support begin/end metadirective
Delta  File
+89 -0  flang/test/Lower/OpenMP/metadirective-implementation.f90
+55 -0  flang/test/Lower/OpenMP/metadirective-device-isa.f90
+41 -0  flang/test/Lower/OpenMP/metadirective-device-kind.f90
+36 -0  flang/test/Lower/OpenMP/metadirective-user.f90
+35 -0  flang/test/Lower/OpenMP/metadirective-construct.f90
+14 -7  flang/lib/Lower/OpenMP/OpenMP.cpp
+270 -7  6 files

LLVM/project f6633d5 flang/lib/Lower/OpenMP OpenMP.cpp, flang/test/Semantics/OpenMP metadirective-common.f90

Require a selector when lowering WHEN

This patch ensures that a missing selector is not treated as
unconditional during lowering, since `WHEN` requires a context selector.

Add a negative test to replace the positive test that exercised a
missing selector.
Delta  File
+8 -18  flang/lib/Lower/OpenMP/OpenMP.cpp
+5 -0  flang/test/Semantics/OpenMP/metadirective-common.f90
+13 -18  2 files

LLVM/project 94069ed flang/lib/Lower/OpenMP OpenMP.cpp

Fix for construct
Delta  File
+41 -15  flang/lib/Lower/OpenMP/OpenMP.cpp
+41 -15  1 files

LLVM/project 9e4dbb2 flang/lib/Lower/OpenMP Utils.cpp, flang/test/Lower/OpenMP metadirective-user.f90

Preserve metadirective user condition scores

The user-condition path returned before getTraitScore was called, so a
score on condition(...) was silently ignored during variant selection.

Extract the score before dispatching user-condition handling, pass it to
the condition traits, and add a test where a scored true condition wins
over an unscored candidate.
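
The shape of the fix can be sketched as follows. This is an illustration only: `ToySelector`, `ToyTrait`, and `buildTrait` are hypothetical names, and the real code in Utils.cpp works on the parser's selector types and calls getTraitScore.

```cpp
#include <optional>

// Hypothetical stand-ins; not the types used in flang/lib/Lower/OpenMP/Utils.cpp.
struct ToySelector {
  bool IsUserCondition;
  bool ConditionValue;      // only meaningful for user={condition(...)}
  std::optional<int> Score; // set when score(...) was written
};

struct ToyTrait {
  bool Matches;
  int Score;
};

ToyTrait buildTrait(const ToySelector &Sel) {
  // Extract the score before any dispatch so an early return cannot skip it.
  int Score = Sel.Score.value_or(0);
  if (Sel.IsUserCondition)
    return {Sel.ConditionValue, Score};
  return {true, Score};
}
```

With the score read up front, a scored true condition carries its score into variant selection and can outrank an unscored candidate.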
Delta  File
+9 -8  flang/lib/Lower/OpenMP/Utils.cpp
+15 -0  flang/test/Lower/OpenMP/metadirective-user.f90
+24 -8  2 files

LLVM/project e123d30 flang/lib/Lower/OpenMP Utils.cpp, flang/test/Lower/OpenMP metadirective-device-arch.f90 metadirective-device-isa.f90

Fix trait-property mapping and improve metadirective tests

- In processTraitProperties, restrict the device_isa___ANY fallback to
  only isa selectors. Unknown properties under arch, kind, or vendor
  now produce an invalid trait so the variant does not match. Previously,
  device={arch("neon")} would incorrectly match via ISA target-feature
  checking.
- Add metadirective-nothing tests for OpenMP version >= 5.1.
- Add explicit -triple flags to ISA tests so AArch64 features run
  under an aarch64 triple and x86 features under an x86_64 triple.
- Split device={arch()} tests into metadirective-device-arch.f90
- Add omp.terminator checks for begin/end metadirective match cases.
- Remove begin-metadirective.f90 TODO test (now supported).

Assisted with copilot
Delta  File
+103 -0  flang/test/Lower/OpenMP/metadirective-device-arch.f90
+21 -70  flang/test/Lower/OpenMP/metadirective-device-isa.f90
+62 -0  flang/test/Lower/OpenMP/metadirective-nothing.f90
+12 -22  flang/test/Lower/OpenMP/metadirective-implementation.f90
+16 -8  flang/lib/Lower/OpenMP/Utils.cpp
+0 -12  flang/test/Lower/OpenMP/Todo/begin-metadirective.f90
+214 -112  4 files not shown
+236 -127  10 files

LLVM/project 7425592 flang/lib/Lower/OpenMP OpenMP.cpp, flang/test/Lower/OpenMP metadirective-construct.f90

Use selected variants in metadirective construct context

Construct traits gathered only from PFT parents miss constructs introduced
by an enclosing selected begin-metadirective variant. This can cause an
inner construct selector to reject a matching variant.

Use already-emitted enclosing OpenMP operations as the effective context,
falling back to PFT parents when no such operation exists. Add a test for
an inner selector matching target plus an outer-selected parallel.
Delta  File
+19 -12  flang/lib/Lower/OpenMP/OpenMP.cpp
+19 -0  flang/test/Lower/OpenMP/metadirective-construct.f90
+38 -12  2 files

LLVM/project 5823414 flang/lib/Lower/OpenMP OpenMP.cpp

Remove redundant MetadirectiveVariant
Delta  File
+9 -12  flang/lib/Lower/OpenMP/OpenMP.cpp
+9 -12  1 files

LLVM/project be13726 flang/lib/Lower/OpenMP OpenMP.cpp, flang/test/Lower/OpenMP metadirective-implementation.f90

Fix metadirective implicit-nothing candidate ordering

Preserve whether a metadirective variant was explicitly
specified so selection can distinguish explicit nothing
from an omitted directive variant. Order explicit candidates
before implicit nothing candidates when invoking the OpenMP
context scorer, matching the metadirective tie-break rule.

Add standalone and begin/end metadirective regression tests
where an implicit nothing candidate appears before an
otherwise-tied explicit directive variant.

Reference:
OpenMP 5.0 [2.3.4] says that if multiple when clauses have
compatible context selectors with the same highest score, and
at least one of them specifies a directive variant, "the first
directive variant specified in the lexical order of those when
clauses" replaces the metadirective.
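
The ordering rule can be modeled with a stable partition followed by a first-best scan. This is a minimal sketch, assuming the scorer returns the first candidate with the highest score; `ToyCandidate` and `selectVariant` are hypothetical names, not the flang or OpenMP context-scorer API.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical candidate record (assumed for illustration).
struct ToyCandidate {
  std::string Directive; // "nothing" when the when clause omitted a variant
  int Score;
  bool Explicit;         // false for an implicit nothing candidate
};

// Returns the index, in the original lexical order, of the selected
// candidate, or -1 if there are no candidates.
int selectVariant(const std::vector<ToyCandidate> &Cands) {
  std::vector<int> Order(Cands.size());
  std::iota(Order.begin(), Order.end(), 0);
  // Explicit candidates first; stable, so lexical order is kept within
  // each group.
  std::stable_partition(Order.begin(), Order.end(),
                        [&](int I) { return Cands[I].Explicit; });
  int Best = -1;
  for (int I : Order)
    // Strict comparison keeps the earliest candidate among equal scores.
    if (Best < 0 || Cands[I].Score > Cands[Best].Score)
      Best = I;
  return Best;
}
```

In this model an implicit nothing that appears first lexically loses a tie to a later explicit variant, but still wins when its score is strictly higher.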
Delta  File
+48 -16  flang/lib/Lower/OpenMP/OpenMP.cpp
+23 -0  flang/test/Lower/OpenMP/metadirective-implementation.f90
+71 -16  2 files

LLVM/project 95948e1 flang/lib/Lower/OpenMP Utils.cpp OpenMP.cpp, flang/test/Lower/OpenMP metadirective-device-isa.f90 metadirective-implementation.f90

[flang][OpenMP] Support lowering of metadirective (part 1)

This patch implements the following metadirective features:
- implementation={vendor(...)}
- device={kind(...), isa(...), arch(...)}
- user={condition(<constant-expr>)}
- construct={parallel, target, teams}
- default, nothing, and otherwise clauses

Dynamic user conditions, target_device, and loop-associated
variants are deferred to follow-up patches.

This patch is part of the feature work for #188820.

Assisted with copilot and GPT-5.4
Delta  File
+213 -0  flang/test/Lower/OpenMP/metadirective-device-isa.f90
+210 -0  flang/lib/Lower/OpenMP/Utils.cpp
+186 -1  flang/lib/Lower/OpenMP/OpenMP.cpp
+121 -0  flang/test/Lower/OpenMP/metadirective-implementation.f90
+33 -0  flang/test/Lower/OpenMP/metadirective-user.f90
+30 -0  flang/test/Lower/OpenMP/metadirective-construct.f90
+793 -1  7 files not shown
+860 -19  13 files

LLVM/project 3aa913f libc/test/src/math/smoke NextTowardTest.h NextAfterTest.h

[libc][math] Temporarily disable exception tests for NextAfter and NextToward tests on Windows. (#199740)

They are a bit flaky on Windows.
See https://github.com/llvm/llvm-project/issues/199738
Delta  File
+13 -0  libc/test/src/math/smoke/NextTowardTest.h
+13 -0  libc/test/src/math/smoke/NextAfterTest.h
+26 -0  2 files

LLVM/project 282e907 .github/workflows libc-fullbuild-tests.yml

[libc][ci] Improve full build precommit CIs caching keys. (#199742)

Currently, the full build precommit CIs use only c_compiler as the cache
key, which is the same for many of the jobs listed in the matrix.
Change the key to the combination (target + build_type + c_compiler) so
that it uniquely distinguishes them, as well as the future gcc builds.
Delta  File
+1 -1  .github/workflows/libc-fullbuild-tests.yml
+1 -1  1 files

LLVM/project 92cfd0a clang-tools-extra/clang-doc Representation.cpp JSONGenerator.cpp, clang-tools-extra/clang-doc/tool ClangDocMain.cpp

[clang-doc] Add option for compact JSON (#190822)

By default all JSON is serialized "pretty" with whitespace. This patch
adds an option to serialize JSON without whitespace. This trims the size
of the JSON folder for clang from around 1.3 GB to 785 MB, which is a
39.6% decrease.
Delta  File
+31 -0  clang-tools-extra/test/clang-doc/compact.cpp
+7 -9  clang-tools-extra/clang-doc/Representation.cpp
+8 -2  clang-tools-extra/clang-doc/JSONGenerator.cpp
+3 -4  clang-tools-extra/test/clang-doc/json/nested-namespace.cpp
+5 -1  clang-tools-extra/clang-doc/tool/ClangDocMain.cpp
+2 -2  clang-tools-extra/test/clang-doc/json/inheritance.cpp
+56 -18  16 files not shown
+73 -34  22 files

LLVM/project 8d2f190 clang/include/clang/Basic DiagnosticSemaKinds.td, clang/lib/Sema SemaOpenACCClause.cpp

[clang][SemaOpenACC] Reject VLA reduction (#199178)

`GenerateReductionInitRecipeExpr` only handled `ConstantArrayType` when
walking the operand type to build an InitListExpr. A VariableArrayType
`(int arr[i+1])` fell through to the final else branch and tripped
`assert(Ty->isScalarType())`.

Rather than silently accepting VLAs (which have no reasonable lowering:
unlike pointers, there is an expectation of initialized values, but we
cannot statically enumerate elements), reject them outright in
`CheckVarType` with a new diagnostic, `err_acc_reduction_vla`. This is
consistent with the fact that neither codegen nor lowering currently
supports VLA reductions, and other compilers (GCC crashes, NVC++
silently ignores) do not meaningfully handle them either. The fix
upgrades the existing warning path for non-constant arrays in
`CheckVarType` to an error when the clause kind is Reduction, so VLAs
never reach `GenerateReductionInitRecipeExpr`.
Reproducer:


    [18 lines not shown]
Delta  File
+41 -0  clang/test/SemaOpenACC/compute-construct-reduction-vla.cpp
+26 -0  clang/test/SemaOpenACC/compute-construct-reduction-vla.c
+17 -9  clang/lib/Sema/SemaOpenACCClause.cpp
+2 -1  clang/include/clang/Basic/DiagnosticSemaKinds.td
+86 -10  4 files

LLVM/project 53d1fcd llvm/docs ProgrammersManual.rst ReleaseNotes.md, llvm/include/llvm/ADT DenseMap.h

Revert "[DenseMap] Invalidate iterators on erase (#199369)"

This reverts commit a225aafbd1a40be0dd9c31e2d0b0b7c42b9d36e3.
Delta  File
+7 -8  llvm/docs/ProgrammersManual.rst
+0 -11  llvm/unittests/ADT/DenseMapTest.cpp
+0 -11  llvm/unittests/ADT/DenseSetTest.cpp
+0 -5  llvm/docs/ReleaseNotes.md
+0 -2  llvm/include/llvm/ADT/DenseMap.h
+7 -37  5 files

LLVM/project ff24386 lld/MachO LTO.cpp, lld/test/MachO lto-object-path.ll

[lld][MachO] Fix SIGBUS crash in saveOrHardlinkBuffer (#198381)

This change removes a hardlink in saveOrHardlinkBuffer if the
hardlink already exists.

On Mac, -object_path_lto files are hardlinked to the cache when
possible. If the hardlink fails, the saveOrHardlinkBuffer method
falls back to saveBuffer instead.

saveBuffer() opens the file that is being written to as a
raw_fd_ostream object, which truncates a file when opening if the
file already exists.

Most of the time this is not an issue. However, if the hardlink
fails because it already exists, and it exists specifically between
the -object_path_lto file and the cache file, then opening and
truncating the output file also truncates the file we are trying to
read from.


    [8 lines not shown]
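
The failure mode and the remedy can be reproduced outside lld with std::filesystem. This is a sketch under assumptions, not the lld code: `saveOrHardlink` is a hypothetical helper, the buffer is an ordinary string rather than a memory-mapped cache file, and the details of the actual change are in the elided lines above.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not lld's saveOrHardlinkBuffer): link OutPath to the
// cached file when possible, otherwise write a private copy of Buffer.
void saveOrHardlink(const fs::path &CachePath, const fs::path &OutPath,
                    const std::string &Buffer) {
  std::error_code EC;
  fs::create_hard_link(CachePath, OutPath, EC);
  if (!EC)
    return;
  // The link failed, possibly because OutPath already exists as a link to
  // CachePath. Remove that name first: opening it for writing would truncate
  // the shared inode, and with it the cache file.
  std::error_code Ignored;
  fs::remove(OutPath, Ignored);
  std::ofstream Out(OutPath, std::ios::binary | std::ios::trunc);
  Out << Buffer;
}
```

Removing the stale name only unlinks one directory entry, so the cache file keeps its contents and the fallback write lands in a fresh file.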
Delta  File
+8 -1  lld/test/MachO/lto-object-path.ll
+5 -0  lld/MachO/LTO.cpp
+4 -0  lld/test/MachO/Inputs/large-lto-object.ll
+17 -1  3 files

LLVM/project 5fcf11c llvm/lib/CodeGen MachineFunction.cpp, llvm/lib/MC MCDwarf.cpp

[AMDGPU][MC] Replace shifted registers in CFI instructions

Change-Id: I0d99e9fe43ec3b6fecac20531119956dca2e4e5c
Delta  File
+67 -67  llvm/test/CodeGen/AMDGPU/sgpr-spill-overlap-wwm-reserve.mir
+33 -0  llvm/lib/MC/MCDwarf.cpp
+15 -15  llvm/test/CodeGen/AMDGPU/dwarf-multi-register-use-crash.ll
+10 -0  llvm/lib/CodeGen/MachineFunction.cpp
+4 -4  llvm/test/CodeGen/AMDGPU/debug-frame.ll
+2 -2  llvm/test/CodeGen/AMDGPU/pei-vgpr-block-spill-csr.mir
+131 -88  5 files not shown
+143 -90  11 files