[AMDGPU] Handle GFX1250 hazards between WMMA and VOPD (#183573)
Hazards between WMMA and VALU were handled in #149865 but this only
worked for regular VOP* VALU encodings, not for VOPD.
Fixes: #183546
[alpha.webkit.NoDeleteChecker] Check if each field is trivially destructive (#183711)
This PR fixes a bug where NoDeleteChecker and the trivial function
analysis did not detect non-trivial destruction of class member
variables.
When evaluating a delete expression or a direct destructor call for
triviality, check that each field in the class and in its base classes
is trivially destructible.
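As an illustration of the property involved (this is not the checker's code), the standard type traits show how a class with no user-written destructor can still be non-trivial to destroy because of a field or a base class. The type names and the `destroysTrivially` helper below are hypothetical.

```cpp
#include <string>
#include <type_traits>

// Hypothetical types, for illustration only.
struct PlainPoint { int X; int Y; };                      // only trivial fields
struct HasStringField { int Id; std::string Name; };      // field with a non-trivial destructor
struct DerivedFromString : HasStringField { int Extra; }; // inherits the non-trivial field

// Hypothetical helper: true when destroying a T runs no user-visible code.
template <typename T> constexpr bool destroysTrivially() {
  return std::is_trivially_destructible<T>::value;
}

static_assert(destroysTrivially<PlainPoint>(), "all fields are trivial");
static_assert(!destroysTrivially<HasStringField>(), "std::string field is not trivial");
static_assert(!destroysTrivially<DerivedFromString>(), "base class field is not trivial");
```

None of the three types declares a destructor, so looking only at the class's own destructor declaration would miss the last two cases.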
[libc][math] Refactor bf16sub family to header-only (#182115)
Refactors the bf16sub math family to be header-only.
Closes https://github.com/llvm/llvm-project/issues/182114
Target Functions:
- bf16sub
- bf16subf
- bf16subf128
[clang] stop error recovery in SFINAE for narrowing in converted constant expressions (#183614)
A narrowing conversion in a converted constant expression should produce
an invalid expression so that [temp.deduct.general]p7 is satisfied, by
stopping substitution at this point.
This regression was introduced in #164703, and this will be backported
to clang-22, so no release notes.
Fixes #167709
[flang] Fix explanatory messages for generic resolution error (#183565)
The compiler emits messages to explain why each of a generic procedure's
specific procedures is not a match for a given set of actual arguments.
In the case of specific procedures with PASS arguments in derived type
procedure bindings or procedure components, these explanatory messages
are often bogus, because the re-analysis didn't adjust the actual
arguments to account for the PASS argument. Fix.
[Driver][SYCL] Add tests for -Xarch_<arch> option forwarding to SYCL JIT compilation. (#178025)
This change adds test coverage to verify that options passed via
`-Xarch_<arch> <option>` are correctly forwarded to SYCL JIT
compilations.
[clang-format] Fix SpaceBeforeParens with explicit template instantiations (#183183)
This fixes explicit template instantiated functions not having spaces
added/removed based on the value of `SpaceBeforeParens`.
Attribution Note - I have been authorized to contribute this change on
behalf of my company: ArenaNet LLC
[CIR] Implement TryOp flattening (#183591)
This updates the FlattenCFG pass to add flattening for cir::TryOp in
cases where the TryOp contains catch or unwind handlers.
Substantial amounts of this PR were created using agentic AI tools, but
I have carefully reviewed the code, comments, and tests and made changes
as needed. I've left intermediate commits in the initial PR if you'd
like to see the progression.
[Clang][ItaniumMangle] Fix recursive mangling for lambda init-captures (#182667)
Mangle computation for lambda signatures can recurse when a call
operator type references an init-capture (for example via
decltype(init-capture)). In these cases, mangling can re-enter the
init-capture declaration and cycle back through operator() mangling.
Make lambda context publication explicit and independent from numbering
state, then use that context uniformly during mangling:
* Publish lambda `ContextDecl` in `Sema::handleLambdaNumbering()` before
numbering, so dependent type mangling can resolve the lambda context
without recursing through the call operator.
[19 lines not shown]
[InstCombine] Replace alloca with undef size with poison instead of null (#182919)
InstCombine previously replaced an alloca instruction with a null
pointer when the array size operand was undef. While this replacement
may be legal, it still caused invalid IR in cases where the original
alloca was used by `@llvm.lifetime` intrinsics.
The spec requires that the pointer operand of `@llvm.lifetime.*` must be
either:
- a pointer to an alloca instruction, or
- a poison value.
Replacing the pointer with null violated this requirement and triggered
verifier errors.
These new changes update InstCombine so that in this scenario the alloca
is replaced with poison instead of null.
[SystemZ] Emit external aliases for indirect function descriptors in the ADA section (#183443)
This is the last of three patches aimed at supporting indirect symbol
handling for the SystemZ backend.
An external alias is emitted for indirect function descriptors within
the ADA section, rather than a temporary alias, while also setting all
of the appropriate symbol attributes that are needed for the HLASM
streamer to emit the correct XATTR and ALIAS instructions for the
indirect symbols.
Moreover, this patch updates the
`CodeGen/SystemZ/zos-ada-relocations.ll` test as the ADA section is
currently the only user of indirect symbols on z/OS.
Depends on https://github.com/llvm/llvm-project/pull/183442.
[SLP] Reject duplicate shift amounts in matchesShlZExt reorder path (#183627)
In the reordered RHS path of matchesShlZExt, the code never checked that
each shift amount (0, Stride, 2×Stride, …) appears at most once. When
the same shift appeared in multiple lanes, it still filled Order,
producing a non-permutation (e.g. Order = [0,0,0,1]). That led to bad
shuffle masks and miscompilation (e.g. shuffles with poison).
The patch adds an explicit duplicate check: before setting Order[Idx] =
Pos, it ensures Pos has not been seen before, using a SmallBitVector
SeenPositions(VF). If a position is seen twice, the function returns
false and the optimization is not applied.
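The duplicate check described above can be sketched in standalone C++ as follows. This is a minimal sketch, not the code in the SLP vectorizer: `buildOrder` and its signature are hypothetical, and `std::vector<bool>` stands in for `SmallBitVector` so the example has no LLVM dependency.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: maps each lane's shift amount to a position
// (Shift / Stride) and fills Order only if the result is a permutation.
bool buildOrder(const std::vector<unsigned> &ShiftAmounts, unsigned Stride,
                std::vector<unsigned> &Order) {
  assert(Stride > 0 && "stride must be non-zero");
  const std::size_t VF = ShiftAmounts.size();
  Order.assign(VF, 0);
  std::vector<bool> SeenPositions(VF, false);
  for (std::size_t Idx = 0; Idx < VF; ++Idx) {
    unsigned Shift = ShiftAmounts[Idx];
    if (Shift % Stride != 0)
      return false; // not a multiple of the stride
    unsigned Pos = Shift / Stride;
    if (Pos >= VF)
      return false; // position outside the vector
    if (SeenPositions[Pos])
      return false; // duplicate shift amount: Order would not be a permutation
    SeenPositions[Pos] = true;
    Order[Idx] = Pos;
  }
  return true;
}
```

With a stride of 8, the shift amounts {0, 0, 0, 8} would have produced Order = [0,0,0,1] without the `SeenPositions` test; with it, the helper returns false on the second lane.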
[clang][ssaf] Add `JSONFormat` support for `TUSummaryEncoding`
This PR adds `JSONFormat` support for reading and writing
`TUSummaryEncoding`. The implementation exploits similarities in the
structures of `TUSummary` and `TUSummaryEncoding` by reusing existing
`JSONFormat` support for `TUSummary`. Test duplication is avoided by
parameterizing the test fixture so that all relevant read/write tests
run against both `TUSummary` and `TUSummaryEncoding`. This ensures that
the two serialization paths remain in lockstep.
[SlotIndexes] Further pack indices to improve spill placement time (#182640)
With this patch, inserting an instruction into the SlotIndexes analysis
renumbers the entire list when the list is otherwise densely packed.
This fixes a case we saw on AArch64 with a lot of spills, where every
single spill instruction insertion required renumbering most of the
instructions in a large function, making the operation approximately
quadratic.
This is not NFC as heuristics depend on the SlotIndex numbers, although
this should mostly be a wash as LRs should be extended ~equally.
[OpenMP] Enable internalization of 'ockl.bc' for OpenMP (#183685)
Fix linking of 'ockl.bc' for OpenMP by switching from
`-mlink-bitcode-file` to `-mlink-builtin-bitcode`.
[WebAssembly] Incorporate SCCs into WebAssemblyFixIrreducibleControlFlow (#181755)
Rather than mapping out full "reachability" between blocks in a region
to find loops and using `LoopBlocks` to find the bodies of said loops,
use SCCs (strongly-connected components) to provide this information.
This brings in LLVM's generic `SCCIterator` (which uses Tarjan's
algorithm) as the implementation for sorting the basic blocks of the CFG
into their SCCs.
This PR greatly reduces the compile-time footprint of the pass, making
memory use and time taken negligible where it might previously have
caused stalls and OOM (e.g. #47793,
usagi-coffee/tree-sitter-abl#114)
------
Supersedes #179722
[10 lines not shown]
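For the WebAssemblyFixIrreducibleControlFlow change above, the idea of grouping CFG blocks into SCCs can be sketched with a standalone Tarjan implementation. This is an illustrative sketch only: it does not use LLVM's `SCCIterator`, and the `Graph` alias, `findSCCs`, and `isLoop` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for a CFG: Succs[B] lists the successors of block B.
using Graph = std::vector<std::vector<int>>;

// Tarjan's algorithm: returns the strongly-connected components of the graph.
std::vector<std::vector<int>> findSCCs(const Graph &Succs) {
  const int N = static_cast<int>(Succs.size());
  std::vector<int> Index(N, -1), LowLink(N, 0);
  std::vector<bool> OnStack(N, false);
  std::vector<int> Stack;
  std::vector<std::vector<int>> SCCs;
  int NextIndex = 0;

  std::function<void(int)> Visit = [&](int V) {
    Index[V] = LowLink[V] = NextIndex++;
    Stack.push_back(V);
    OnStack[V] = true;
    for (int W : Succs[V]) {
      assert(W >= 0 && W < N && "successor out of range");
      if (Index[W] == -1) {
        Visit(W);
        LowLink[V] = std::min(LowLink[V], LowLink[W]);
      } else if (OnStack[W]) {
        LowLink[V] = std::min(LowLink[V], Index[W]);
      }
    }
    if (LowLink[V] == Index[V]) { // V is the root of a component
      std::vector<int> Component;
      int W;
      do {
        W = Stack.back();
        Stack.pop_back();
        OnStack[W] = false;
        Component.push_back(W);
      } while (W != V);
      std::sort(Component.begin(), Component.end());
      SCCs.push_back(Component);
    }
  };

  for (int V = 0; V < N; ++V)
    if (Index[V] == -1)
      Visit(V);
  return SCCs;
}

// A component forms a loop if it has several blocks or a single self-edge.
bool isLoop(const std::vector<int> &Component, const Graph &Succs) {
  if (Component.size() > 1)
    return true;
  int B = Component.front();
  return std::find(Succs[B].begin(), Succs[B].end(), B) != Succs[B].end();
}
```

Each block and edge is visited once, so the loop bodies fall out of a single linear pass instead of a full block-to-block reachability map.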
[mlir][LLVM] Let decomposeValue/composeValue handle aggregates (#183405)
This commit updates the LLVM::decomposeValue and LLVM::composeValue
methods to handle aggregate types (LLVM arrays and structs) and to
offer different behaviors for types, such as pointers, that can't be
bitcast to fixed-size integers. This allows the "any type" on
gpu.subgroup_broadcast to be more comprehensive - you can broadcast a
memref to a subgroup by decomposing it, for example.
(This branched off of getting an LLM to implement
ValueBoundsOpInterface on subgroup_broadcast, having it add handling
for the dimensions of shaped types, and realizing that there's no
fundamental reason you can't broadcast a memref or the like.)
---------
Co-authored-by: Claude Opus 4.6 <noreply at anthropic.com>