LLVM/project 2e5f8b2libc/config/linux/x86_64 headers.txt, libc/include CMakeLists.txt

[libc] Add sys/ucontext.h header (#194329)

POSIX historically provided <sys/ucontext.h> as an alias for
<ucontext.h>. Some software still includes the sys/ path. Added the
header as a simple wrapper that includes <ucontext.h>, gated to x86_64
alongside the existing ucontext support.
DeltaFile
+14-0libc/include/sys/ucontext.h
+6-0libc/include/CMakeLists.txt
+1-0libc/config/linux/x86_64/headers.txt
+21-03 files

LLVM/project 800ea4dllvm/test/CodeGen/AMDGPU llvm.amdgcn.av.global.load.b128.ll llvm.amdgcn.av.global.store.b128.ll

add comments to tests
DeltaFile
+32-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.av.global.load.b128.ll
+8-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.av.global.store.b128.ll
+40-02 files

LLVM/project a4597a7llvm/docs AMDGPUUsage.rst, llvm/include/llvm/IR IntrinsicsAMDGPU.td

[AMDGPU] Add flat variants of av.load/store.b128 intrinsics

Add llvm.amdgcn.av.flat.load.b128 and llvm.amdgcn.av.flat.store.b128
intrinsics using flat_ptr_ty, following the established flat_/global_
naming convention used by other AMDGPU intrinsics (e.g., flat_prefetch
vs global_prefetch, flat_load_monitor vs global_load_monitor).

These select to FLAT_LOAD_DWORDX4 / FLAT_STORE_DWORDX4 with the same
cache policy bits as the global variants.

Assisted-By: Claude Opus 4.6 (1M context)
DeltaFile
+865-110llvm/test/CodeGen/AMDGPU/amdgcn-av-scopes.ll
+31-14llvm/docs/AMDGPUUsage.rst
+25-0llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+16-2llvm/test/Verifier/AMDGPU/intrinsics-av.ll
+11-0llvm/lib/Target/AMDGPU/FLATInstructions.td
+7-2llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+955-1282 files not shown
+961-1308 files

LLVM/project 4d6a575llvm/docs AMDGPUUsage.rst

simplify documentation
DeltaFile
+65-105llvm/docs/AMDGPUUsage.rst
+65-1051 files

LLVM/project ef18c25clang/include/clang/AST RecursiveASTVisitor.h ASTTypeTraits.h, clang/lib/AST DynamicRecursiveASTVisitor.cpp ASTTypeTraits.cpp

[Clang][RAV] Visit components of __builtin_offsetof designators (#194122)

`RecursiveASTVisitor` previously only traversed the type operand of an
`OffsetOfExpr`, ignoring the field/identifier/base/array-index
components of the designator. This meant tools built on RAV (clang-tidy,
clangd, indexers, ...) silently missed every field reference inside
`__builtin_offsetof(T, a.b.c)`.

Add a `TraverseOffsetOfNode` / `VisitOffsetOfNode` pair following the same
pattern used for `ConceptReference`, `ObjCProtocolLoc`, and friends. The
`DEF_TRAVERSE_STMT` for `OffsetOfExpr` now invokes
`TraverseOffsetOfNode` for each component; array index expressions
continue to be reached via the existing children() traversal. Default
visitation is a no-op, so the change is opt-in for consumers and
behavior-preserving otherwise.

Also expose `OffsetOfNode` as a `DynTypedNode` kind via `ASTTypeTraits`
so that downstream machinery (`SelectionTree`, parent maps, matchers)
can reference these nodes uniformly.
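
To make "components of the designator" concrete, here is a small sketch in plain C++ that does not use the Clang API. The struct names are made up for illustration. The designator `a.b.c` consists of three field components, and `arr[2]` consists of a field component followed by an array-index component; these are the pieces the commit message says were previously skipped by RAV.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical types, used only to build a nested designator.
struct Inner { int pad; int c; };
struct Middle { char tag; Inner b; };
struct Outer { long head; Middle a; int arr[4]; };

// Three field components: a, then b, then c.
constexpr std::size_t NestedOffset = __builtin_offsetof(Outer, a.b.c);

// One field component (arr) followed by one array-index component ([2]).
constexpr std::size_t IndexedOffset = __builtin_offsetof(Outer, arr[2]);

// The nested offset is the sum of the offsets of each step.
static_assert(NestedOffset == offsetof(Outer, a) + offsetof(Middle, b) +
                                  offsetof(Inner, c),
              "nested designator decomposes into per-field offsets");
```

A tool that indexes field references would want to see `a`, `b`, and `c` here, not just the type operand `Outer`.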

    [6 lines not shown]
DeltaFile
+162-0clang/unittests/Tooling/RecursiveASTVisitorTests/OffsetOfExpr.cpp
+22-4clang/include/clang/AST/RecursiveASTVisitor.h
+9-0clang/lib/AST/DynamicRecursiveASTVisitor.cpp
+6-0clang/include/clang/AST/ASTTypeTraits.h
+5-0clang/include/clang/AST/DynamicRecursiveASTVisitor.h
+3-0clang/lib/AST/ASTTypeTraits.cpp
+207-42 files not shown
+209-48 files

LLVM/project 39e73ebllvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp, llvm/test/CodeGen/AArch64 partial-reduction-sub-fp.ll

[DAGcombine] Recognize fneg on RHS for partial_reduce_fmla (#193994)

PR #186809 recognized the negation on the fmul() expression, but after
instcombine the fneg is moved to the RHS operand, so the combine added in
#186809 alone would not recognize the negation.
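
The two forms compute the same value, which is why the combine can treat them alike. The following is only a scalar model of that identity, not the SelectionDAG code; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Negation applied to the whole product: acc += -(a * b).
float accumulateNegProduct(float Acc, const std::vector<float> &A,
                           const std::vector<float> &B) {
  for (std::size_t I = 0; I < A.size(); ++I)
    Acc += -(A[I] * B[I]);
  return Acc;
}

// Negation moved onto the right-hand operand: acc += a * (-b).
float accumulateNegRHS(float Acc, const std::vector<float> &A,
                       const std::vector<float> &B) {
  for (std::size_t I = 0; I < A.size(); ++I)
    Acc += A[I] * (-B[I]);
  return Acc;
}
```

For example, with `A = {1, 2, 3}`, `B = {4, 5, 6}`, and a starting accumulator of 100, both functions subtract 4 + 10 + 18 = 32.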

https://godbolt.org/z/YfoYshz78
DeltaFile
+12-12llvm/test/CodeGen/AArch64/partial-reduction-sub-fp.ll
+9-1llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+21-132 files

LLVM/project 67deb54clang/include/clang/Basic arm_sve.td, clang/test/CodeGen/AArch64/sve2p3-intrinsics acle_sve2p3_svabal.c

[Clang][AArch64][SVE2p3][SME2p3] Add intrinsics for v9.7a Two-way signed/unsigned absolute difference sum and accumulate long ops (#188972)

Add the following new clang intrinsics based on the ACLE specification
https://github.com/ARM-software/acle/pull/428 (Add alpha support for 9.7
data processing intrinsics)

SABAL (Two-way signed absolute difference sum and accumulate long)
- svint16_t svabal[_s16](svint16_t, svint8_t, svint8_t) / svint16_t
svabal[_n_s16](svint16_t, svint8_t, int8_t)
- svint32_t svabal[_s32](svint32_t, svint16_t, svint16_t) / svint32_t
svabal[_n_s32](svint32_t, svint16_t, int16_t)
- svint64_t svabal[_s64](svint64_t, svint32_t, svint32_t) / svint64_t
svabal[_n_s64](svint64_t, svint32_t, int32_t)

UABAL (Two-way unsigned absolute difference sum and accumulate long)
- svuint16_t svabal[_u16](svuint16_t, svuint8_t, svuint8_t) / svuint16_t
svabal[_n_u16](svuint16_t, svuint8_t, uint8_t)
- svuint32_t svabal[_u32](svuint32_t, svuint16_t, svuint16_t) /
svuint32_t svabal[_n_u32](svuint32_t, svuint16_t, uint16_t)
- svuint64_t svabal[_u64](svuint64_t, svuint32_t, svuint32_t) /
svuint64_t svabal[_n_u64](svuint64_t, svuint32_t, uint32_t)
DeltaFile
+479-0clang/test/CodeGen/AArch64/sve2p3-intrinsics/acle_sve2p3_svabal.c
+138-0clang/test/Sema/AArch64/arm_sve_feature_dependent_sve_AND_LP_sve2p3_OR_sme2p3_RP___sme_AND_LP_sve2p3_OR_sme2p3_RP.c
+63-0clang/test/Sema/aarch64-sve2p3-intrinsics/acle_sve2p3.cpp
+58-0llvm/test/CodeGen/AArch64/sve2p3-intrinsics/sve2p3-intrinsics-abal.ll
+11-0clang/include/clang/Basic/arm_sve.td
+6-1llvm/lib/Target/AArch64/SVEInstrFormats.td
+755-12 files not shown
+759-38 files

LLVM/project 311dd5allvm/test/TableGen directive1.td directive2.td, llvm/utils/TableGen/Basic DirectiveEmitter.cpp

[TableGen] Emit constexpr versions of some directive/clause functions

Reland https://github.com/llvm/llvm-project/pull/176253 with a change
to reduce compile-time impact.

Several of the functions that TableGen emits into the .cpp files for
OpenACC or OpenMP could be constexpr. They can't simply be emitted into
the header files as constexpr in their current form, because they use
"assert" and "llvm_unreachable".
To preserve the existing functionality, this patch makes TableGen
emit constexpr variants that return the value as std::optional,
where std::nullopt indicates an error. The existing functions
invoke the constexpr versions and call assert/llvm_unreachable if
nullopt is returned. E.g.

```
// .h
constexpr std::optional<Association>
getDirectiveAssociationOpt(Directive D) {

    [20 lines not shown]
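
Since the excerpt above is truncated, here is a self-contained sketch of the pattern the message describes. Only the name `getDirectiveAssociationOpt` comes from the commit message; the enumerators, the wrapper name `getDirectiveAssociation`, and the function bodies are assumptions made for illustration and are not the generated code.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-ins for the TableGen-generated enums.
enum class Directive { Parallel, Barrier, Unknown };
enum class Association { Block, None };

// constexpr variant: reports an error as std::nullopt instead of asserting.
constexpr std::optional<Association>
getDirectiveAssociationOpt(Directive D) {
  switch (D) {
  case Directive::Parallel:
    return Association::Block;
  case Directive::Barrier:
    return Association::None;
  default:
    return std::nullopt;
  }
}

// Non-constexpr wrapper (assumed name): keeps the asserting behavior.
inline Association getDirectiveAssociation(Directive D) {
  std::optional<Association> A = getDirectiveAssociationOpt(D);
  assert(A && "Unexpected directive");
  return *A;
}

// The constexpr variant is usable at compile time.
static_assert(getDirectiveAssociationOpt(Directive::Parallel) ==
                  Association::Block,
              "usable in constant expressions");
```

Callers that need a compile-time answer use the `Opt` variant and handle `std::nullopt` themselves; existing callers keep using the wrapper unchanged.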
DeltaFile
+83-28llvm/utils/TableGen/Basic/DirectiveEmitter.cpp
+38-14llvm/test/TableGen/directive1.td
+38-14llvm/test/TableGen/directive2.td
+159-563 files

LLVM/project 8e8113fflang/lib/Semantics check-omp-structure.cpp, flang/test/Semantics/OpenMP workshare06.f90

[Flang][OpenMP] Allow Fortran BLOCK construct inside WORKSHARE region (#193352)

**Problem**
Flang incorrectly rejects Fortran BLOCK constructs inside OpenMP
WORKSHARE regions. This fixes the semantic check to recursively validate
the contents of BLOCK constructs instead of rejecting them.

The Fortran BLOCK construct (F2008) is a transparent scoping wrapper
that does not affect execution semantics. When a BLOCK appears inside a
WORKSHARE region, the restriction on allowed statements should apply to
the contents of the BLOCK, not the BLOCK construct itself.
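
The rule in the paragraph above can be modelled as a recursive check. This is a simplified sketch and not the Flang implementation: the `Stmt` type, the `StmtKind` enumerators, and the function name are hypothetical, and the allow-list is deliberately shorter than the one in the OpenMP specification.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, heavily simplified statement model.
enum class StmtKind { Assignment, Where, Forall, Block, Other };

struct Stmt {
  StmtKind Kind;
  std::vector<Stmt> Body; // only meaningful for StmtKind::Block
};

// Returns true if every statement is allowed in a WORKSHARE region.
// A BLOCK is transparent: its contents are checked, not the BLOCK itself.
bool allowedInWorkshare(const std::vector<Stmt> &Stmts) {
  for (const Stmt &S : Stmts) {
    switch (S.Kind) {
    case StmtKind::Assignment:
    case StmtKind::Where:
    case StmtKind::Forall:
      break; // allowed directly
    case StmtKind::Block:
      if (!allowedInWorkshare(S.Body))
        return false; // recurse into the BLOCK's contents
      break;
    case StmtKind::Other:
      return false; // anything else is rejected in this sketch
    }
  }
  return true;
}
```

A BLOCK that contains only assignments passes, while a BLOCK that wraps a disallowed statement is still rejected.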

**Fix**
The function CheckWorkshareBlockStmts (check-omp-structure.cpp) loops
through each statement inside a WORKSHARE region and checks if it's
allowed.

Before this fix, it only recognized:


    [15 lines not shown]
DeltaFile
+12-0flang/test/Semantics/OpenMP/workshare06.f90
+6-0flang/lib/Semantics/check-omp-structure.cpp
+18-02 files

LLVM/project f5bb397mlir/lib/Dialect/XeGPU/IR XeGPUDialect.cpp, mlir/test/Dialect/XeGPU propagate-layout-inst-data.mlir

[MLIR][XeGPU] Fix Layout collapse dims out of bounds (#193661)

Fix a bug in LayoutAttr::collapseDims() implementation.

---------

Co-authored-by: Claude Sonnet 4.5 <noreply at anthropic.com>
DeltaFile
+26-0mlir/test/Dialect/XeGPU/propagate-layout-inst-data.mlir
+1-1mlir/lib/Dialect/XeGPU/IR/XeGPUDialect.cpp
+27-12 files

LLVM/project f8bf224flang/include/flang/Parser parse-tree.h, flang/lib/Parser openmp-parsers.cpp

[flang][OpenMP] Frontend support for BEGIN/END METADIRECTIVE

This implements parsing of BEGIN/END METADIRECTIVE, plus a minimal
semantic check for the association of a directive in WHEN/OTHERWISE
clauses.

The same semantic checks for the context selectors apply here as in
the case of a standalone METADIRECTIVE.
DeltaFile
+84-24flang/lib/Parser/openmp-parsers.cpp
+80-0flang/test/Parser/OpenMP/begin-metadirective.f90
+49-0flang/lib/Semantics/check-omp-metadirective.cpp
+18-19flang/lib/Semantics/check-omp-structure.cpp
+13-2flang/lib/Semantics/resolve-directives.cpp
+7-2flang/include/flang/Parser/parse-tree.h
+251-474 files not shown
+267-4710 files

LLVM/project 6df5cedmlir/lib/Dialect/OpenMP/Transforms StackToShared.cpp

[MLIR][OpenMP] Fix sanitizer issue related to stack-to-shared pass (#194397)

The OpenMP dialect stack-to-shared pass could try to access attributes
from a deleted operation. This updates it to get that information from
the operation created to replace it.
DeltaFile
+3-2mlir/lib/Dialect/OpenMP/Transforms/StackToShared.cpp
+3-21 files

LLVM/project bd47069llvm/lib/Target/AMDGPU AMDGPUMCInstLower.cpp SIInstrInfo.cpp

Reapply "AMDGPU: Implement getInstSizeVerifyMode" (#194026) (#194362)

This reverts commit 72ca372fa7c9029d2b7a77c59a4cc24530e99e43.
DeltaFile
+0-22llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
+7-0llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+3-0llvm/lib/Target/AMDGPU/SIInstrInfo.h
+10-223 files

LLVM/project 77a3606llvm/lib/CodeGen Rematerializer.cpp

[CodeGen] Fix incorrect index in rematerialization tracking (#194387)

When deleting the last rematerialization of a register, we should delete
the entry in the rematerializer's remat tracking map that corresponds to
the index of the *original* register, not the rematerialized register.

The existing typo has no impact on correctness at the moment because
entries with rematerialized register indices are never created (so there
is nothing to erase), and having an empty set in a value does not break
any code invariant; it just wastes memory.
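
A toy model of the bookkeeping shows why the wrong key is harmless but wasteful. The `RematTracker` type and its members are hypothetical and are not taken from `Rematerializer.cpp`.

```cpp
#include <cassert>
#include <set>
#include <unordered_map>

using RegIdx = unsigned;

// Hypothetical tracker: maps an original register index to the set of
// indices of its rematerializations.
struct RematTracker {
  std::unordered_map<RegIdx, std::set<RegIdx>> Remats;

  void addRemat(RegIdx Orig, RegIdx Remat) { Remats[Orig].insert(Remat); }

  void deleteRemat(RegIdx Orig, RegIdx Remat) {
    auto It = Remats.find(Orig);
    if (It == Remats.end())
      return;
    It->second.erase(Remat);
    // Erase the entry keyed by the ORIGINAL register once its set is empty.
    // Erasing by Remat instead would find nothing and leave an empty set.
    if (It->second.empty())
      Remats.erase(It);
  }
};
```

After adding and then deleting the only rematerialization of a register, the map holds no entry for that register.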

Assisted-by: Claude Code
DeltaFile
+3-2llvm/lib/CodeGen/Rematerializer.cpp
+3-21 files

LLVM/project ec59f15clang/lib/Driver/ToolChains BareMetal.cpp, clang/test/Driver baremetal.cpp

[clang][Driver][BareMetal] Add profile library to the command line when needed (#191847)

Now that libclang_rt.profile.a supports bare-metal targets, follow other
drivers and add libclang_rt.profile.a in the BareMetal driver to the
command line automatically when needed, e.g. when
-fprofile-instr-generate is provided.
DeltaFile
+12-0clang/test/Driver/baremetal.cpp
+1-0clang/lib/Driver/ToolChains/BareMetal.cpp
+13-02 files

LLVM/project 4974ae5lldb/source/Plugins/Process/gdb-remote GDBRemoteCommunicationServerLLGS.cpp

[lldb-server] Fix constexpr-if-else static assert (#194394)

Some old compilers complained about the `static_assert(false)` pattern.
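
The commit message does not show the fix. One common workaround, sketched here as an assumption and not as the actual change, is to make the asserted condition depend on a template parameter so that older compilers evaluate it only when the branch is instantiated. The names `always_false_v` and `sizeClass` are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Always false, but dependent on T, so it is checked only on instantiation.
template <typename T> inline constexpr bool always_false_v = false;

template <typename T> int sizeClass() {
  if constexpr (std::is_integral_v<T>) {
    return 1;
  } else if constexpr (std::is_floating_point_v<T>) {
    return 2;
  } else {
    // A plain static_assert(false) here is rejected by some older compilers
    // even when this branch is never instantiated.
    static_assert(always_false_v<T>, "unsupported type");
  }
}
```

`sizeClass<int>()` and `sizeClass<double>()` compile and return 1 and 2; instantiating it with an unsupported type fails with the assertion message.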

Fixes https://lab.llvm.org/buildbot/#/builders/163/builds/39139
DeltaFile
+5-3lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp
+5-31 files

LLVM/project b629f86llvm/lib/Target/ARM ARMISelLowering.cpp ARMISelLowering.h, llvm/test/CodeGen/ARM vbits.ll

[ARM] hasAndNot in ARM supports vectors (#193614)

NEON and MVE have vector bic.
DeltaFile
+6-12llvm/test/CodeGen/Thumb2/mve-vselect-constants.ll
+11-0llvm/lib/Target/ARM/ARMISelLowering.cpp
+11-0llvm/test/CodeGen/ARM/vbits.ll
+2-0llvm/lib/Target/ARM/ARMISelLowering.h
+30-124 files

LLVM/project 6b81cdbllvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/X86 identity-reuses-with-poisons.ll

[SLP]Fix crash in getReorderingData on all-poison reuse-mask slice

When the reuse-shuffle mask is iterated in Sz-sized parts and a part is
entirely PoisonMaskElem, `Val` stays at PoisonMaskElem (-1) and the
subsequent `UsedVals.test(Val)` trips the SmallBitVector OOB assertion.
Bail out of reordering in that case.
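
The failure mode can be sketched with a loose model of the loop. This is not the code from `SLPVectorizer.cpp`: the function name is hypothetical, `std::vector<bool>` stands in for `SmallBitVector`, and the surrounding logic is simplified.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int PoisonMaskElem = -1;

// Walks the mask in Sz-sized parts and records the first non-poison value
// of each part. Returns false when reordering should be abandoned.
bool slicesAreReorderable(const std::vector<int> &Mask, std::size_t Sz) {
  if (Sz == 0 || Mask.size() % Sz != 0)
    return false;
  std::vector<bool> UsedVals(Mask.size(), false);
  for (std::size_t I = 0; I < Mask.size(); I += Sz) {
    int Val = PoisonMaskElem;
    for (std::size_t K = 0; K < Sz; ++K) {
      if (Mask[I + K] != PoisonMaskElem) {
        Val = Mask[I + K];
        break;
      }
    }
    // All-poison part: Val is still -1, so indexing UsedVals with it would
    // be out of bounds. Bail out instead.
    if (Val == PoisonMaskElem)
      return false;
    std::size_t Idx = static_cast<std::size_t>(Val);
    if (Idx >= UsedVals.size() || UsedVals[Idx])
      return false;
    UsedVals[Idx] = true;
  }
  return true;
}
```

With `Sz = 2`, the mask `{0, 0, 1, 1}` is accepted, while `{-1, -1, 1, 1}` takes the early bail-out because its first part is entirely poison.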

Fixes #194315

Reviewers: 

Pull Request: https://github.com/llvm/llvm-project/pull/194392
DeltaFile
+114-0llvm/test/Transforms/SLPVectorizer/X86/identity-reuses-with-poisons.ll
+3-3llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+117-32 files

LLVM/project c183492clang/lib/Format UnwrappedLineParser.cpp, clang/unittests/Format FormatTestComments.cpp TokenAnnotatorTest.cpp

[clang-format] Recognize more braced initializers (#192299)

new

```C++
a = {x * x, x * x};
```

old

```C++
a = {x * x, x *x};
```

Fixes #57442.

The patch makes the program treat a brace following an equals sign as a
braced initializer.


    [30 lines not shown]
DeltaFile
+12-15clang/unittests/Format/FormatTestComments.cpp
+3-3clang/lib/Format/UnwrappedLineParser.cpp
+6-0clang/unittests/Format/TokenAnnotatorTest.cpp
+5-1clang/unittests/Format/FormatTest.cpp
+26-194 files

LLVM/project 5e45150offload/test/offloading ctor_dtor.cpp

[offload][lit] Enable ctor_dtor.cpp on Intel GPUs (#194389)

It was fixed with https://github.com/llvm/llvm-project/pull/192725 and
https://github.com/llvm/llvm-project/pull/192730.

Signed-off-by: Nick Sarnie <nick.sarnie at intel.com>
DeltaFile
+0-1offload/test/offloading/ctor_dtor.cpp
+0-11 files

LLVM/project 6cbbea7lldb/include/lldb/Utility StringExtractorGDBRemote.h, lldb/packages/Python/lldbsuite/test/tools/lldb-server gdbremote_testcase.py

[lldb-server] Implement support for MultiBreakpoint packet

This is fairly straightforward, thanks to the helper functions created
in the previous commit.

https://github.com/llvm/llvm-project/pull/192910
DeltaFile
+66-0lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp
+2-0lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.h
+2-0lldb/source/Utility/StringExtractorGDBRemote.cpp
+0-1lldb/test/API/functionalities/multi-breakpoint/TestMultiBreakpoint.py
+1-0lldb/include/lldb/Utility/StringExtractorGDBRemote.h
+1-0lldb/packages/Python/lldbsuite/test/tools/lldb-server/gdbremote_testcase.py
+72-16 files

LLVM/project 8593524flang/lib/Lower OpenACC.cpp, flang/lib/Semantics resolve-directives.cpp canonicalize-acc.cpp

[flang][semantics][openacc] Allow collapse clauses on do concurrent (#192488)

This PR generalizes the semantic checking for collapse clauses to work
on `do concurrent` and fixes two bugs exposed along the way:
- The first was that `collapse (n)`, where n is less than the number of
nested loops, triggered an assertion violation.
- The second was that `do concurrent` index variables triggered an
assertion violation because they had not been declared before being
looked up.

The lowering is left as a TODO and will be implemented in a follow-up
diff.
DeltaFile
+91-0flang/test/Semantics/OpenACC/acc-loop.f90
+55-21flang/lib/Semantics/resolve-directives.cpp
+33-0flang/test/Lower/OpenACC/Todo/do-loops-to-acc-loops-todo.f90
+9-2flang/lib/Lower/OpenACC.cpp
+0-5flang/lib/Semantics/canonicalize-acc.cpp
+0-2flang/test/Semantics/OpenACC/acc-canonicalization-validity.f90
+188-306 files

LLVM/project 57754e0clang/test/CodeGen/AArch64 v9.7a-neon-mmla-intrinsics.c, clang/test/CodeGen/AArch64/sve-intrinsics acle_sve_mmla-f16.c acle_sve_mmla-bf16.c

[AArch64][clang][llvm] Add ACLE Armv9.7 matrix multiply-accumulate intrinsics (#193017)

Implement new ACLE matrix multiply-accumulate intrinsics for Armv9.7:

```c
  // 16-bit floating-point matrix multiply-accumulate.
  // Only if __ARM_FEATURE_SVE_B16MM
  // Variant also available for _f16 if (__ARM_FEATURE_SVE2p2 && __ARM_FEATURE_F16MM).
  svbfloat16_t svmmla[_bf16](svbfloat16_t zda, svbfloat16_t zn, svbfloat16_t zm);

  // Half-precision matrix multiply accumulating to single-precision instruction.
  // Requires the +f16f32mm architecture extension.
  float32x4_t vmmlaq_f32_f16(float32x4_t r, float16x8_t a, float16x8_t b);

  // Non-widening half-precision matrix multiply instruction.
  // Requires the +f16mm architecture extension.
  float16x8_t vmmlaq_f16_f16(float16x8_t r, float16x8_t a, float16x8_t b);
```
DeltaFile
+45-0clang/test/CodeGen/AArch64/v9.7a-neon-mmla-intrinsics.c
+32-0clang/test/CodeGen/AArch64/sve-intrinsics/acle_sve_mmla-f16.c
+32-0clang/test/Sema/AArch64/arm_sve_non_streaming_only_sve_AND_sve2p2_AND_f16mm.c
+32-0clang/test/Sema/AArch64/arm_sve_non_streaming_only_sve_AND_sve-b16mm.c
+32-0clang/test/CodeGen/AArch64/sve-intrinsics/acle_sve_mmla-bf16.c
+14-1clang/test/Sema/aarch64-neon-target.c
+187-112 files not shown
+275-1218 files

LLVM/project 86be9bcllvm/lib/Target/AMDGPU AMDGPUMCInstLower.cpp SIInstrInfo.cpp

Reapply "AMDGPU: Implement getInstSizeVerifyMode" (#194026)

This reverts commit 72ca372fa7c9029d2b7a77c59a4cc24530e99e43.
DeltaFile
+0-22llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
+7-0llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+3-0llvm/lib/Target/AMDGPU/SIInstrInfo.h
+10-223 files

LLVM/project c2ab7f2lldb/source/Plugins/Process/gdb-remote GDBRemoteCommunicationServerLLGS.cpp GDBRemoteCommunicationServerLLGS.h

[lldbremote][NFC] Factor out code handling breakpoint packets (#192915)

This commit extracts the code handling breakpoint packets into a helper
function that can be used by a future implementation of the
MultiBreakpointPacket.

It is meant to be purely NFC.

There are two functions handling breakpoint packets (`handle_Z` and
`handle_z`) with a lot of repeated code. This commit did not attempt to
merge the two, as that would make the diff much larger due to subtle
differences in the error message produced by the two. The only
deduplication done is in the code processing a GDBStoppointType, where a
helper struct (`BreakpointKind`) and function
(`std::optional<BreakpointKind> getBreakpointKind(GDBStoppointType
stoppoint_type)`) were created.
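
A possible shape for such a helper is sketched below. Only the names `BreakpointKind`, `getBreakpointKind`, and `GDBStoppointType` come from the commit message. The enum is redeclared locally with enumerator names modelled on LLDB's, and the fields of `BreakpointKind` are invented for illustration, so both should be treated as assumptions.

```cpp
#include <cassert>
#include <optional>

// Local stand-in for LLDB's enum (assumed enumerator names). The values
// 0..4 follow the type field of the GDB remote protocol z/Z packets.
enum GDBStoppointType {
  eStoppointInvalid = -1,
  eBreakpointSoftware = 0,
  eBreakpointHardware,
  eWatchpointWrite,
  eWatchpointRead,
  eWatchpointReadWrite
};

// Hypothetical fields; the real struct may look different.
struct BreakpointKind {
  bool IsBreakpoint; // false means watchpoint
  bool IsHardware;
  bool WatchRead;
  bool WatchWrite;
};

std::optional<BreakpointKind>
getBreakpointKind(GDBStoppointType stoppoint_type) {
  switch (stoppoint_type) {
  case eBreakpointSoftware:
    return BreakpointKind{true, false, false, false};
  case eBreakpointHardware:
    return BreakpointKind{true, true, false, false};
  case eWatchpointWrite:
    return BreakpointKind{false, true, false, true};
  case eWatchpointRead:
    return BreakpointKind{false, true, true, false};
  case eWatchpointReadWrite:
    return BreakpointKind{false, true, true, true};
  case eStoppointInvalid:
    break;
  }
  return std::nullopt; // unknown or invalid type
}
```

Both packet handlers could then share the classification step and differ only in how they report errors.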

The following PRs are related to the MultiBreakpoint feature:


    [7 lines not shown]
DeltaFile
+128-107lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp
+22-0lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.h
+150-1072 files

LLVM/project bc7e916llvm/lib/Target/AMDGPU VOP2Instructions.td, llvm/test/CodeGen/AMDGPU v_mac_f16-fpdp-rounding-mode.ll

AMDGPU: Address fixme for v_mac_f16 rounding mode (#194360)

This should use the f16/f64 rounding mode.
DeltaFile
+27-0llvm/test/CodeGen/AMDGPU/v_mac_f16-fpdp-rounding-mode.ll
+1-1llvm/lib/Target/AMDGPU/VOP2Instructions.td
+28-12 files

LLVM/project 78eccecmlir/include/mlir/Dialect/LLVMIR NVVMOps.td, mlir/lib/Dialect/LLVMIR/IR NVVMDialect.cpp

[MLIR][NVVM] Add `nvvm.log2` OP (#193789)

Implement `nvvm.log2` with ftz flag
DeltaFile
+16-2mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+14-0mlir/test/Dialect/LLVMIR/nvvm-transcendentals.mlir
+14-0mlir/test/Target/LLVMIR/nvvm/transcendentals.mlir
+10-0mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
+54-24 files

LLVM/project 5ee1495clang/docs ReleaseNotes.rst, clang/lib/Parse ParseDeclCXX.cpp

[Clang] fix parser recovery for invalid static_assert string messages (#187859)

Fixes #187690

--- 

This PR fixes parser recovery for invalid `static_assert` declarations
with string literal messages. The parser now stops the message lookahead
on `;` and `eof`, so invalid inputs are diagnosed as parse errors.
DeltaFile
+12-8clang/lib/Parse/ParseDeclCXX.cpp
+11-5clang/test/Parser/static_assert.cpp
+1-0clang/docs/ReleaseNotes.rst
+24-133 files

LLVM/project a94c116llvm/test/Transforms/GlobalOpt ctor-memset.ll pr54572.ll

[GlobalOpt] Regenerate test checks (NFC) (#194385)
DeltaFile
+12-12llvm/test/Transforms/GlobalOpt/ctor-memset.ll
+2-2llvm/test/Transforms/GlobalOpt/pr54572.ll
+14-142 files

LLVM/project 8e0011allvm/test/Transforms/FunctionAttrs nosync.ll

[FunctionAttrs] Remove declaration check lines (NFC) (#194384)

These are annoying, because they get dropped by UTC. We're not
inferring attributes on declarations anyway.
DeltaFile
+0-6llvm/test/Transforms/FunctionAttrs/nosync.ll
+0-61 files