[clang] fix getTemplateInstantiationArgs
This implements a new strategy for collecting the template arguments, by
relying on the qualifiers and template parameter lists to navigate the template
context of out-of-line definitions.
This greatly simplifies the signature of that function, by removing a bunch
of workarounds, and simplifying a couple that weren't removed yet.
Since this now relies on qualifiers and template parameter lists,
this patch expends most of its effort making sure these are placed,
transformed and propagated to template instantiations.
Also makes the explicit specialization AST nodes stop abusing the template
parameter lists to store their own template parameter list, by creating a
dedicated field for it, similar to partial specializations.
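As an illustration of the source constructs involved, here is a minimal sketch of an out-of-line definition and an explicit specialization of a nested member. The names `Outer`, `Inner`, and `combinedSize` are hypothetical and are not taken from the patch or its tests; the sketch shows only the C++ being compiled, not Clang's internals.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example types, not from the patch.
template <class T>
struct Outer {
  template <class U>
  struct Inner {
    std::size_t combinedSize() const;  // declared here, defined out of line
  };
};

// Out-of-line definition: the qualifier `Outer<T>::Inner<U>::` names the
// template context, and there is one template parameter list per enclosing
// template (`T` for Outer, `U` for Inner).
template <class T>
template <class U>
std::size_t Outer<T>::Inner<U>::combinedSize() const {
  return sizeof(T) + sizeof(U);
}

// Explicit specialization of the member for Outer<char>::Inner<char>.
// Each `template <>` is an empty template parameter list.
template <>
template <>
std::size_t Outer<char>::Inner<char>::combinedSize() const {
  return 0;
}

// Usage: the generic definition and the explicit specialization are selected
// based on the template arguments of the enclosing contexts.
void exampleUsage() {
  assert(Outer<int>::Inner<char>{}.combinedSize() == sizeof(int) + sizeof(char));
  assert(Outer<char>::Inner<char>{}.combinedSize() == 0);
}
```

Resolving `T` and `U` in the out-of-line body requires pairing each qualifier component with its template parameter list, which is the information the commit message says the new strategy relies on.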
Update transformations sensitive to signaling NaNs
Previously, exception handling behavior was used as an indicator of sNaN
support. With the introduction of a special function attribute,
`signaling_nans`, the checks for sNaN support must be changed to use the
function attribute rather than the exception behavior.
[AtomicExpand] Support non-integer atomic loads. (#199310)
This is arguably an enhancement rather than a bugfix. But
AtomicExpandPass already tries to support some non-integer atomic ops
using cmpxchg by bitcasting to/from an integer type. We're just missing
this one path used by atomic load. Seems easy enough to support it.
This bug was found by a large run of Opus 4.7 looking for bugs in LLVM.
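A source-level analogy of the idea, as a minimal sketch: the load is expressed as a compare-exchange on a same-sized integer, and the result is reinterpreted as the non-integer type. The function name `loadFloatViaCmpXchg` is hypothetical, and this C++ only mirrors the technique described above; it is not the IR that AtomicExpandPass emits.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Hypothetical helper: atomically read a float whose bits live in an
// integer-typed atomic, using only compare-exchange.
inline float loadFloatViaCmpXchg(std::atomic<std::uint32_t> &bits) {
  // cmpxchg(ptr, 0, 0): if the value is 0 it is rewritten as 0 (no change);
  // otherwise the exchange fails and `expected` receives the current value.
  // Either way `expected` ends up holding the value that was in memory.
  std::uint32_t expected = 0;
  bits.compare_exchange_strong(expected, 0);

  // "Bitcast" the integer result back to the floating-point type.
  float value;
  std::memcpy(&value, &expected, sizeof value);
  return value;
}

// Hypothetical helper: store a float by bitcasting it to the integer type.
inline void storeFloatBits(std::atomic<std::uint32_t> &bits, float value) {
  std::uint32_t raw;
  std::memcpy(&raw, &value, sizeof raw);
  bits.store(raw);
}

// Usage: a value stored as bits is read back unchanged.
void exampleRoundTrip() {
  std::atomic<std::uint32_t> bits{0};
  storeFloatBits(bits, 1.5f);
  assert(loadFloatViaCmpXchg(bits) == 1.5f);
}
```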
Reland "[LifetimeSafety] Detect iterator invalidation through container aliases" (#197873)
This relands #195231, which was reverted in commit
7c9717848851f3a71908becab4312ddc2d8482b8.
The original crash from the reproducer no longer reproduces after
#196680, #197220, and #197604. I verified this with the original
`repro.cpp`: it no longer hits the lifetime-safety assertion.
Also added regression tests for the crash:
```cpp
struct SinkInteriorBorrow {
const char *dest_; // expected-note {{this field dangles}}
SinkInteriorBorrow(std::string *dest, int n) : dest_(dest->data()) { // expected-warning {{parameter which escapes to a field is later invalidated}}
if (n > 0)
dest->clear(); // expected-note {{invalidated here}}
}
[3 lines not shown]
[AMDGPU] Fix v_dot4_i32_i8 alias to set neg_lo modifiers (#197998)
Fixes the issue reported at https://github.com/ROCm/ROCm/issues/6126
The `v_dot4_i32_i8` assembly alias was not setting the `neg_lo` modifier
bits when converted to `v_dot4_i32_iu8`, which causes signed int8
operands to be treated as unsigned.
For example, with `q=[1,-1,1,-1], k=[1,1,1,1]` the expected result is 0, but
512 is returned. The instruction computes `1*1 + 255*1 + 1*1 + 255*1 = 512`,
treating `-1 (0xFF)` as `255`.
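The arithmetic above can be reproduced with a small host-side model. This is a sketch only: `lane` and `dot4` are hypothetical helpers, and the two boolean flags merely stand in for per-operand signedness; they do not model the actual instruction encoding.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: extract 8-bit lane `i` from a packed 32-bit operand,
// interpreting it as signed or unsigned.
inline std::int32_t lane(std::uint32_t packed, int i, bool isSigned) {
  std::int32_t byte = static_cast<std::int32_t>((packed >> (8 * i)) & 0xFFu);
  // For a signed lane, values 128..255 represent -128..-1.
  return (isSigned && byte >= 128) ? byte - 256 : byte;
}

// Hypothetical model of a 4-lane 8-bit dot product with accumulator.
inline std::int32_t dot4(std::uint32_t a, std::uint32_t b, std::int32_t acc,
                         bool aSigned, bool bSigned) {
  std::int32_t sum = acc;
  for (int i = 0; i < 4; ++i)
    sum += lane(a, i, aSigned) * lane(b, i, bSigned);
  return sum;
}

// Usage: q = [1,-1,1,-1] and k = [1,1,1,1], packed one byte per lane.
void exampleDot4() {
  const std::uint32_t q = 0xFF01FF01u;
  const std::uint32_t k = 0x01010101u;
  assert(dot4(q, k, 0, /*aSigned=*/true, /*bSigned=*/true) == 0);
  assert(dot4(q, k, 0, /*aSigned=*/false, /*bSigned=*/false) == 512);
}
```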
On AMD GFX11+, the native `v_dot4_i32_i8` instruction doesn't exist. The
hardware provides `v_dot4_i32_iu8` with `neg_lo` modifier bits to
control signedness of each operand. The compiler correctly lowers
`v_dot4_i32_i8` intrinsics by setting `neg_lo:[1,1,0]`, but inline
assembly using the `v_dot4_i32_i8` mnemonic bypasses this lowering and
goes directly to the assembler.
[10 lines not shown]