[LifetimeSafety] Introduce buildOriginFlowChain for use-after-scope (#199345)
After adding `buildOriginFlowChain`, we need to pick the simplest
possible diagnostic type to verify that the approach is feasible for
`Sema` diagnostics.
I did not choose the annotation suggestions described in
https://github.com/llvm/llvm-project/pull/188467/#issuecomment-4359071778
as the first target to implement, because it does not seem to occur
within a single CFG block. The `IssueFact` always resides in the block
preceding the `OriginEscapesFact`, which causes me to always get an
empty `OriginFlowChain`.
Since we use `buildOriginFlowChain`, we can directly trace distinct
assignment steps that occur within a single source-level expression. For
example:
```cpp
#include <vector>
[77 lines not shown]
[AMDGPU][test] Use mir test for regalloc issue
Use the newly introduced split-from flag to produce a more robust test case
for the hoistSpillInsideBB live-range update issue.
NFC
[IR] Make CanBeFreed calculation optional (NFC) (#203490)
Make the CanBeFreed argument of getPointerDereferenceableBytes() a
pointer, so that nullptr can be passed if we're not interested in
whether frees are possible or not.
Nearly all places don't actually care about frees, including BasicAA,
which is the hottest caller of this API. This improves compile-time when
deref-at-point semantics are enabled.
I've kept the argument required so that callers still have to make an
explicit choice to ignore frees. (I'd be open to making it optional
though, given that only a single caller actually cares...)
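
The required-but-nullable out-parameter pattern described above can be
sketched as follows. This is a minimal illustration only, not the LLVM
implementation: `FakePointer`, `computeMayBeFreed`, `getDerefBytes` and
`FreedQueries` are hypothetical names invented for the example, and the
real getPointerDereferenceableBytes() has a different signature.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a pointer-like value; not an LLVM type.
struct FakePointer {
  std::uint64_t DerefBytes; // bytes known to be dereferenceable
  bool MayBeFreed;          // result of the (assumed costly) free analysis
};

// Counts how often the free analysis actually runs.
static int FreedQueries = 0;

// Hypothetical helper standing in for the expensive part of the query.
static bool computeMayBeFreed(const FakePointer &P) {
  ++FreedQueries;
  return P.MayBeFreed;
}

// The out-parameter is a pointer without a default value, so every caller
// must either pass a bool to fill in or spell out nullptr to skip the work.
std::uint64_t getDerefBytes(const FakePointer &P, bool *CanBeFreed) {
  if (CanBeFreed)
    *CanBeFreed = computeMayBeFreed(P);
  return P.DerefBytes;
}

// One caller that ignores frees and one that asks about them.
inline void exampleUsage() {
  FakePointer P{16, true};
  std::uint64_t Bytes = getDerefBytes(P, /*CanBeFreed=*/nullptr);
  assert(Bytes == 16);
  bool Freed = false;
  Bytes = getDerefBytes(P, &Freed);
  assert(Bytes == 16 && Freed);
}
```

Keeping the parameter without a default means that ignoring frees stays
visible at each call site, which is the trade-off the message describes.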
Parse GROUP blocks in scst.conf test parser
The FC rendering in the tests now wraps LUNs in GROUP security_group.
Teach the test parser to recurse into GROUP blocks and update
assertions to match the nested structure.
[flang] Avoid invalid declare_value for promoted dummy-scope variables (#202498)
This fixes a verifier failure in mem2reg after inlining a CUDA device
procedure. When a promoted FIR alloca had an associated fir.declare with
a dummy_scope, mem2reg could create a fir.declare_value at a loop header
where the original dummy scope did not dominate.
Skip creating block-argument fir.declare_value ops for such
declarations, matching the existing replaced-value handling. Add a FIR
mem2reg regression test for the loop-header block argument case.
[X86] combineConcatVectorOps - concat(rotate(x,a),rotate(y,b)) -> rotate(concat(x,y),concat(a,b)) (#203553)
128/256-bit rotates are widened in tablegen, so we don't need to limit
these to VLX targets: any AVX512 target can perform them.
We already have test coverage to ensure 128-bit XOP rotates don't get
concatenated to 256-bit.
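
The fold in the title relies on rotates acting independently on each
lane. The sketch below checks that identity on plain arrays of 32-bit
lanes. It is an illustration under assumptions, not the DAG combine
itself: `Vec`, `rotl32`, `rotate`, `concat` and `concatOfRotatesMatches`
are hypothetical helpers, and per-lane variable rotate amounts are
assumed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

template <std::size_t N>
using Vec = std::array<std::uint32_t, N>;

// Rotate one 32-bit lane left; the amount is taken modulo 32.
inline std::uint32_t rotl32(std::uint32_t V, std::uint32_t Amt) {
  Amt &= 31;
  return Amt == 0 ? V : (V << Amt) | (V >> (32 - Amt));
}

// Lane-wise rotate of X by the matching lane of Amt.
template <std::size_t N>
Vec<N> rotate(const Vec<N> &X, const Vec<N> &Amt) {
  Vec<N> R{};
  for (std::size_t I = 0; I < N; ++I)
    R[I] = rotl32(X[I], Amt[I]);
  return R;
}

// Concatenate two N-lane vectors into one 2N-lane vector (Lo first).
template <std::size_t N>
Vec<2 * N> concat(const Vec<N> &Lo, const Vec<N> &Hi) {
  Vec<2 * N> R{};
  for (std::size_t I = 0; I < N; ++I) {
    R[I] = Lo[I];
    R[N + I] = Hi[I];
  }
  return R;
}

// concat(rotate(x,a),rotate(y,b)) == rotate(concat(x,y),concat(a,b))
inline bool concatOfRotatesMatches(const Vec<4> &X, const Vec<4> &A,
                                   const Vec<4> &Y, const Vec<4> &B) {
  return concat(rotate(X, A), rotate(Y, B)) ==
         rotate(concat(X, Y), concat(A, B));
}

// Small self-check on one set of inputs.
inline void exampleCheck() {
  Vec<4> X{{0x80000001u, 2u, 3u, 0xFFFF0000u}};
  Vec<4> A{{1u, 0u, 31u, 16u}};
  Vec<4> Y{{5u, 6u, 7u, 8u}};
  Vec<4> B{{4u, 8u, 12u, 32u}};
  assert(concatOfRotatesMatches(X, A, Y, B));
}
```

Because each output lane depends only on the matching input and amount
lanes, widening changes which instruction is used but not the lane values.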
[AArch64] Define GCS operations as SYS and SYSL aliases
Move the remaining `GCS` instructions from dedicated opcodes to `SYSxt/SYSLxt`
aliases, keeping a tied `SYSL` pseudo for codegen where `GCS` preserves the
input register when disabled at runtime.
Update `GCS` intrinsic selection, scheduling, disassembly aliases, and MC
coverage for the generic `SYS/SYSL` encodings.
[AArch64] Define APAS, BRB and TRCIT as SYS aliases
`APAS`, `BRB IALL/INJ` and `TRCIT` use `SYS` encodings, so define them
as aliases of `SYSxt` instead of separate instructions.
Check that the preferred architectural aliases are printed when their
features are enabled and that disassembly falls back to the generic SYS
spelling when not enabled.
Render security_group ACG for FC targets on non-HA systems
The non-HA branch of the FC rendering in scst.conf.mako emitted bare
TARGET blocks with the LUN at target level, ignoring any configured
initiator setting on the target. Non-HA users could not restrict FC
initiator access by WWPN - middleware accepted the configuration but
silently dropped it during rendering.
[CIR] Load flattened struct args from coerce slot
At the call site, a struct argument that flattens into scalar wire
arguments was coerced to the ABI struct as a whole value and then
decomposed with cir.extract_member. When the coercion goes through
memory, read each field from the coerced slot with cir.get_member +
cir.load instead, so the lowering takes pointers to the members it
wants rather than loading the entire structure and extracting from the
value. The shared memory half of the coercion is factored into
emitCoercionToMemory, which returns the destination-typed pointer to
the coerce slot; emitCoercion now builds on it and loads the whole
value, so its existing callers are unchanged. The no-coercion call
site (the operand already has the coerced type) keeps cir.extract_member
because that value has no backing slot to take member pointers from.
The remaining changes are mechanical: llvm::append_range and
SmallVector::append for the per-field loops, spelling out cir::RecordType
instead of auto at the getFlattenedCoercedType call sites, an enumerate
loop over the coerced members, and renaming the builder parameter from
[5 lines not shown]
[MIR] Serialize/Deserialize MachineInstr::LRSplit attribute
The LRSplit MachineInstr flag is set by SplitKit on copies inserted for
live-range splitting.
Until now the flag had no MIR-text representation.
This patch fixes that, making it easier to reproduce/capture issues
that involve SplitKit.
Round-trip coverage in
llvm/test/CodeGen/MIR/AMDGPU/lr-split-flag.mir.
AMDGPU/GlobalISel: RegBankLegalize rules for sched barriers intrinsics (#203425)
Add rules for sched barrier intrinsics. Note that there are regressions
due to AGPR results being copied back to VGPR unnecessarily. That will be
addressed in a follow-up patch.
libtest: simplify the protocol for test case setup/teardown.
Remove 'enum test_case_status': a true/false value suffices
to indicate setup and teardown failure.