[LifetimeSafety] Fix false negative for GSL Owner methods inherited from a non-Owner base (#197864)
- Take the implicit object's actual type (e.g., the type before any
`DerivedToBase` cast) into account when checking for GSL Owner. Other
`isGslOwnerType` call sites with the same pattern (e.g.,
`isGslOwnerType(MCE->getImplicitObjectArgument()->getType())` in
`VisitCXXMemberCallExpr`) lack a real-world trigger today and are
deferred to a follow-up.
- Unify the GSL Owner checks inside `shouldTrackImplicitObjectArg` so
they share a single source of truth.
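
A minimal sketch of the code shape the title describes, assuming a hypothetical non-Owner base `BufferBase` that provides the accessor and a derived `OwnedBuffer` that carries the Owner annotation. All names here are illustrative and not taken from the patch or its tests; whether the analysis diagnoses this exact snippet is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// The gsl::Owner attribute is Clang-specific; other compilers get an empty macro.
#if defined(__clang__)
#define GSL_OWNER(T) [[gsl::Owner(T)]]
#else
#define GSL_OWNER(T)
#endif

// Hypothetical base class: NOT annotated as an Owner, but it declares data().
struct BufferBase {
  std::string Storage;
  const char *data() const { return Storage.c_str(); }
};

// Hypothetical derived class: this is the annotated Owner. data() is inherited,
// so a call through OwnedBuffer involves a derived-to-base conversion of the
// implicit object.
struct GSL_OWNER(char) OwnedBuffer : BufferBase {
  explicit OwnedBuffer(const char *S) { Storage = S; }
};

OwnedBuffer makeBuffer() { return OwnedBuffer("hello"); }

// Safe use: the owner B outlives the pointer obtained from it.
std::size_t safeLength() {
  OwnedBuffer B = makeBuffer();
  const char *P = B.data();
  assert(P != nullptr);
  return std::strlen(P);
}

// The problematic shape (left as a comment so nothing dangles at run time):
//   const char *P = makeBuffer().data(); // temporary dies at the semicolon
// The declaring class of data() is BufferBase, which is not an Owner; only the
// object's type before the conversion (OwnedBuffer) is.
```

Checking only the method's declaring class would miss the Owner here, which is why the type before the cast matters.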
Fixes: #188832
[VPlan] Collect FOR PHIs before sinking/hoisting recurrence users (#194671)
Avoid iterating over HeaderVPBB->phis() while potentially mutating the
underlying VPBasicBlock. Collect all VPFirstOrderRecurrencePHIRecipe
instances first, then process them in a separate loop.
This prevents iterator invalidation when sinking or hoisting recurrence
users, and makes the transformation more robust.
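
The collect-then-process pattern can be sketched with a standard container. This is an analogy using `std::list`, not VPlan code; `Item`, `IsRecurrencePhi`, and `moveMarkedToBack` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical stand-in for a recipe living in a block.
struct Item {
  int Id;
  bool IsRecurrencePhi;
};

// Moves every marked item to the end of Block.
// Phase 1 only reads the list; phase 2 mutates it. Moving elements while
// still walking the list would leave the loop iterator at the moved element's
// new position and silently skip the items in between.
void moveMarkedToBack(std::list<Item> &Block) {
  const std::size_t SizeBefore = Block.size();

  // Phase 1: collect iterators to the items of interest.
  std::vector<std::list<Item>::iterator> Worklist;
  for (auto It = Block.begin(); It != Block.end(); ++It)
    if (It->IsRecurrencePhi)
      Worklist.push_back(It);

  // Phase 2: mutate. std::list::splice keeps the saved iterators valid.
  for (auto It : Worklist)
    Block.splice(Block.end(), Block, It);

  // Splicing reorders elements but never adds or removes any.
  assert(Block.size() == SizeBefore);
}
```

For the input order 1, 2, 3, 4 with items 1 and 3 marked, the resulting order is 2, 4, 1, 3.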
Fixes: https://github.com/llvm/llvm-project/issues/194618
Fixes https://github.com/llvm/llvm-project/issues/198589
---------
Co-authored-by: Zile Xiong <xiongzile99 at gmail.com>
[LV] Move constant folding test to VPlan (NFC). (#198407)
Check the VPlan printout in the constant folding test. This makes it more
robust w.r.t. future cost-modeling changes.
[VPlan] Collect unit-stride predicates for making vector decisions. (#199568)
Split-off from approved https://github.com/llvm/llvm-project/pull/197276
to land separately.
Collect all unit-stride predicates once, up-front, ensuring that cost
decisions have predicates available, independent of processing order.
Reapply "[LV] Handle chained selects/blends when creating new rdx cha… (#199559)
This reverts commit ab1745439c7019d0753afc616c5fc5aef7b82fb6 & reapplies
#199443.
Recommit with an additional fix to handle other select-like
recipes, including VPWidenRecipe and VPReplicateRecipe.
Original message:
Make sure we recursively clone chains of selects/blends when re-creating
a reduction chain with new types.
Fixes https://github.com/llvm/llvm-project/issues/199406.
[SLP] Enable widening strided revectorization of vector stores (#198920)
This commit adds support for re-vectorization of vector stores into
widened strided stores. That is:
```
%p1 = getelementptr i16, ptr %p0, i64 16
store <4 x i16> zeroinitializer, ptr %p1, align 2
store <4 x i16> zeroinitializer, ptr %p0, align 2
```
can be further vectorized to:
```
call void @llvm.experimental.vp.strided.store.v2i64.p0.i64(<2 x i64> zeroinitializer, ptr align 2 %p0, i64 32, <2 x i1> splat (i1 true), i32 2)
```
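
The equivalence in the example above can be checked with plain byte arithmetic: each `<4 x i16>` store covers 8 bytes, i.e. one `i64` lane, and the second store sits 16 * 2 = 32 bytes past `%p0`, which is the byte stride. The following is a rough sketch that models both forms on a byte buffer; it is not SLP vectorizer code, and `twoVectorStores` and `stridedStore` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

using Buffer = std::array<unsigned char, 64>;

// Models the two original stores: <4 x i16> zero at %p0 + 16 elements and at %p0.
void twoVectorStores(Buffer &Buf) {
  const std::uint16_t Zeros[4] = {0, 0, 0, 0};
  std::memcpy(Buf.data() + 16 * sizeof(std::uint16_t), Zeros, sizeof(Zeros));
  std::memcpy(Buf.data(), Zeros, sizeof(Zeros));
}

// Models the strided store: 2 lanes of i64 zero, byte stride 32, all lanes active.
void stridedStore(Buffer &Buf) {
  const std::uint64_t Lanes[2] = {0, 0};
  const std::size_t StrideBytes = 32;
  for (std::size_t Lane = 0; Lane < 2; ++Lane) {
    assert(Lane * StrideBytes + sizeof(std::uint64_t) <= Buf.size());
    std::memcpy(Buf.data() + Lane * StrideBytes, &Lanes[Lane],
                sizeof(std::uint64_t));
  }
}
```

Both functions zero bytes 0..7 and 32..39 and leave every other byte untouched.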
clang-offload-bundler incorrectly errors on multi-CCOB binaries (#182579)
Issue: https://github.com/ROCm/llvm-project/issues/448
Objects can have multiple Clang Compressed Offload Bundles (CCOB) in the
.hip_fatbin section. This happens when there are multiple
translation/compilation units built and then linked together into an
Archive or Shared Object. The resulting .hip_fatbin section will have
multiple offload bundles delimited by the magic string "CCOB" (on a 4k
alignment boundary). The Clang Offload bundler API, when a List of
bundle entries is requested, was not properly iterating (looping) over
each separate bundle.
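
As an illustration of what iterating over every bundle means, here is a minimal sketch that scans a section for the "CCOB" magic at 4096-byte boundaries. It assumes every bundle starts exactly on such a boundary; `findBundleOffsets` is a hypothetical helper, not part of the Clang Offload Bundler API, and the real tool may locate bundles differently (for example from size fields in each header).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Returns the offset of every 4096-aligned position at which the "CCOB"
// magic string begins. Unaligned occurrences are ignored on purpose.
std::vector<std::size_t> findBundleOffsets(std::string_view Section) {
  constexpr std::string_view Magic = "CCOB";
  constexpr std::size_t Alignment = 4096; // assumed bundle alignment
  std::vector<std::size_t> Offsets;
  for (std::size_t Off = 0; Off + Magic.size() <= Section.size();
       Off += Alignment)
    if (Section.substr(Off, Magic.size()) == Magic)
      Offsets.push_back(Off);
  return Offsets;
}
```

A caller that stops after the first returned offset reproduces the symptom described above: only the first bundle in the section is seen.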
REPRODUCTION
Test File: librocblas.so.5 from ROCm 6.x distribution
.hip_fatbin section: 8,163,887 bytes containing 64 concatenated CCOBs
Extract the .hip_fatbin section with:
objcopy --dump-section .hip_fatbin=fatbin.bin binary
[19 lines not shown]
[clang] fix getTemplateInstantiationArgs
This implements a new strategy for collecting the template arguments, by
relying on the qualifiers and template parameter lists to navigate the template
context of out-of-line definitions.
This greatly simplifies the signature of that function, by removing a bunch
of workarounds, and simplifying a couple that weren't removed yet.
Since this now relies on qualifiers and template parameter lists,
this patch expends most of its effort making sure these are placed,
transformed and propagated to template instantiations.
Also makes the explicit specialization AST nodes stop abusing the template
parameter lists: they now store their own template parameter list in a
dedicated field, similar to partial specializations.
[VPlan] Add matcher for canonical VPWidenIntOrFpInductionRecipe (NFC). (#199539)
Add matcher for canonical VPWidenIntOrFpInductionRecipe to simplify some
matching.
Update transformations sensitive to signaling NaNs
Previously, exception handling behavior was used as an indicator of sNaN
support. With the introduction of a dedicated function attribute,
`signaling_nans`, the checks for sNaN support must use that attribute
rather than the exception behavior.
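
A toy model of the change in the check, under stated assumptions: the types and function names below are hypothetical stand-ins and not LLVM APIs, and the exact way the old check mapped exception behavior to sNaN support is assumed for illustration.

```cpp
#include <cassert>

// Hypothetical stand-ins; none of these are LLVM types.
enum class ExceptionBehavior { Ignore, MayTrap, Strict };

struct FunctionTraits {
  ExceptionBehavior Except;
  bool HasSignalingNaNsAttr; // models the `signaling_nans` function attribute
};

// Old style (assumed mapping): sNaN support is inferred from exception behavior.
bool supportsSNaNOld(const FunctionTraits &F) {
  return F.Except != ExceptionBehavior::Ignore;
}

// New style: only the dedicated attribute decides.
bool supportsSNaNNew(const FunctionTraits &F) {
  return F.HasSignalingNaNsAttr;
}
```

The two predicates disagree whenever the attribute and the exception behavior are set independently, which is the situation the updated transformations have to handle.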
[AtomicExpand] Support non-integer atomic loads. (#199310)
This is arguably an enhancement rather than a bugfix. But
AtomicExpandPass already tries to support some non-integer atomic ops
using cmpxchg by bitcasting to/from an integer type. We're just missing
this one path used by atomic load. Seems easy enough to support it.
This bug was found by a large run of Opus 4.7 looking for bugs in LLVM.
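
The bitcast-through-integer idea can be sketched at the source level. This is an analogy in portable C++ rather than what AtomicExpandPass emits; `loadFloatViaInteger` and `floatBits` are hypothetical helpers, and the sketch assumes `float` is 32 bits wide.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Reinterprets a float's bytes as a same-sized integer.
std::uint32_t floatBits(float Value) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &Value, sizeof(Bits));
  return Bits;
}

// Performs the atomic load on the integer type, then converts the bits back
// to float. The atomicity comes entirely from the integer load; the
// conversion is a plain byte copy.
float loadFloatViaInteger(const std::atomic<std::uint32_t> &Slot) {
  const std::uint32_t Bits = Slot.load(std::memory_order_seq_cst);
  float Value;
  std::memcpy(&Value, &Bits, sizeof(Value));
  return Value;
}
```

Storing `floatBits(1.5f)` into the atomic slot and reading it back through `loadFloatViaInteger` returns the same value, since the round trip only copies bytes.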