[LoopInterchange] Bail out when outer loop latch PHI has non-PHI user (#201923)
When there are non-PHI instructions in the outer loop that use values
originating from the LCSSA PHIs of the inner loop, it becomes difficult
to adjust the wiring during the transformation. In fact, multiple issues
(#200819 and #201571) have been raised related to this pattern. #201059
tried to resolve the issue by modifying the transformation phase, but it
was insufficient.
Instead of spending effort in the transformation phase, this patch adds
an additional check in the legality check and rejects such cases. I
think the cases rejected by this additional check are not very
practical, so the impact on realistic cases should be low, and it is
simpler than adjusting the wiring in the transformation phase.
This patch also effectively reverts #201059, as it is no longer
necessary.
Fix #201571.
[X86] Don't assert on EFLAGS copies in unreachable blocks (#203208)
X86FlagsCopyLowering collects the EFLAGS copies to lower using a
ReversePostOrderTraversal, which only visits blocks reachable from the
entry. Its end-of-pass verification, however, iterated over every block
in the function, so an EFLAGS copy left in an unreachable block (e.g.
produced by ISel for an always-taken branch whose other edge is dead)
tripped the "Unlowered EFLAGS copy!" assertion.
Such copies are harmless: the unreachable block is removed by the
unreachable-block elimination pass that runs right after this one,
before register allocation, so the copy never reaches a pass that cannot
handle it. Restrict the verification to reachable blocks (depth_first
from the entry) to match the set of blocks actually processed.
Found via fuzzing (llvm-isel-fuzzer).
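
The mismatch between the two block sets can be illustrated with a toy CFG. This is only a sketch: `Block`, `reachableFrom`, `verifyAllBlocks`, `verifyReachableBlocks`, and `makeExampleCFG` are hypothetical names invented for illustration, not the pass's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a machine basic block (hypothetical, not LLVM's type):
// successor indices plus a flag for a leftover, unlowered EFLAGS copy.
struct Block {
  std::vector<std::size_t> succs;
  bool hasUnloweredCopy = false;
};

// Mark every block reachable from `entry` with an iterative depth-first walk.
std::vector<bool> reachableFrom(const std::vector<Block> &blocks,
                                std::size_t entry) {
  assert(entry < blocks.size() && "entry must name an existing block");
  std::vector<bool> seen(blocks.size(), false);
  std::vector<std::size_t> stack{entry};
  seen[entry] = true;
  while (!stack.empty()) {
    std::size_t b = stack.back();
    stack.pop_back();
    for (std::size_t s : blocks[b].succs) {
      if (!seen[s]) {
        seen[s] = true;
        stack.push_back(s);
      }
    }
  }
  return seen;
}

// Old-style check: inspects every block, reachable or not.
bool verifyAllBlocks(const std::vector<Block> &blocks) {
  for (const Block &b : blocks)
    if (b.hasUnloweredCopy)
      return false;
  return true;
}

// New-style check: inspects only blocks the lowering could have visited.
bool verifyReachableBlocks(const std::vector<Block> &blocks,
                           std::size_t entry) {
  std::vector<bool> seen = reachableFrom(blocks, entry);
  for (std::size_t i = 0; i < blocks.size(); ++i)
    if (seen[i] && blocks[i].hasUnloweredCopy)
      return false;
  return true;
}

// Hypothetical CFG: 0 -> 1, while block 2 is unreachable and still holds
// a copy that the lowering never saw.
std::vector<Block> makeExampleCFG() {
  std::vector<Block> blocks(3);
  blocks[0].succs = {1};
  blocks[2].hasUnloweredCopy = true;
  return blocks;
}
```

On this example the all-blocks check fails because of block 2, while the reachable-only check passes.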
Remove stale ARC graph names from reporting API
`reporting.get_data` accepted three graph names — `arcrate`, `arcactualrate`, `arcresult` — whose backing plugin classes were deleted during the ZFS netdata plugin rewrite. The Pydantic `Literal` and the in-memory `__graphs` dict drifted out of sync, so passing any of them crashed `netdata_get_data` with an uncaught `KeyError`.
Removed the dead names from `GraphIdentifier.name`'s `Literal` and docstring in both `v26_0_0/reporting.py` and `v27_0_0/reporting.py`. Added a `ReportingNetdataGetDataArgs.from_previous` on each so legacy WS clients walking the adapter chain get the dead entries silently filtered instead of a hard rejection at the final v27 boundary. Hardened the dispatch site in `plugins/reporting/graphs.py` to raise `CallError(ENOENT)` for any unknown name — mirroring what `netdata_graph` already does — so future schema/implementation drift surfaces as a clean RPC error rather than an unhandled exception.
[lldb-dap] Support loading core files through attachCommands (#202785)
The `attachCommands` attach option lets users bootstrap a session with
arbitrary LLDB commands, but a command that loaded a core (e.g. `target
create --core`) produced a broken session:
`ConfigurationDoneRequestHandler` would call `process.Continue()` on the
core and fail, because the non-live-session handling was keyed on the
`coreFile` attach argument rather than on the actual resulting process.
This teaches `AttachRequestHandler` to check, after the attach commands
run, whether the selected process was loaded from a core, using the
`SBProcess::IsLiveDebugSession()` API added in #203111. When it is a
core, the handler sets `stop_at_entry` and clears `is_live_session`,
mirroring what the `coreFile` key does.
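
A minimal sketch of the decision described above, using stand-in types rather than the real lldb-dap classes. `FakeProcess`, `SessionState`, `updateSessionAfterAttach`, and `shouldContinueOnConfigurationDone` are hypothetical; only the `IsLiveDebugSession()` name and the two flags come from the description, and the assumption that configuration-done continues only when `stop_at_entry` is unset is mine.

```cpp
// Hypothetical stand-in for the process selected after the attach
// commands have run. The real check uses SBProcess.
struct FakeProcess {
  bool live;
  bool IsLiveDebugSession() const { return live; }
};

// Hypothetical subset of the session state touched by the change.
struct SessionState {
  bool stop_at_entry = false;
  bool is_live_session = true;
};

// Key the non-live handling on the resulting process, not on which
// attach argument the client happened to pass.
void updateSessionAfterAttach(const FakeProcess &process,
                              SessionState &state) {
  if (!process.IsLiveDebugSession()) {
    state.stop_at_entry = true;     // a core cannot be resumed
    state.is_live_session = false;  // same effect as the coreFile key
  }
}

// Assumed behavior: the configuration-done step only resumes the
// process when the session is not stopping at entry.
bool shouldContinueOnConfigurationDone(const SessionState &state) {
  return !state.stop_at_entry;
}
```

With this shape, a core loaded through `attachCommands` and one loaded through `coreFile` end up in the same state.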
[StringMap] Invalidate iterators in remove() (#203249)
erase() bumps the epoch to invalidate iterators (#202237), but the
lower-level remove(), which detaches an entry without destroying it and
is used by ValueSymbolTable via Value::setName(), did not. Move the
incrementEpoch() into remove() so remove-while-iterating fails fast
under LLVM_ENABLE_ABI_BREAKING_CHECKS too.
Aided by Claude Opus 4.8
Reland after lldb fix #203035
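
The epoch idea can be sketched with a toy container. This is not StringMap's implementation: `EpochMap` and `Handle` are hypothetical names, the toy tracks only removals, and it reports staleness through `isValid()` where a checked build would assert.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Toy map (hypothetical) whose handles remember the epoch at which they
// were created. Any removal bumps the epoch and makes old handles stale.
class EpochMap {
  std::unordered_map<std::string, int> data_;
  std::size_t epoch_ = 0;

public:
  // Stand-in for an iterator: valid only while the epoch is unchanged.
  class Handle {
    const EpochMap *owner_;
    std::size_t epoch_;

  public:
    explicit Handle(const EpochMap &m) : owner_(&m), epoch_(m.epoch_) {}
    bool isValid() const { return owner_->epoch_ == epoch_; }
  };

  void insert(const std::string &key, int value) { data_[key] = value; }

  // Lower-level operation: detach the entry and hand its value back.
  // The epoch bump lives here, so every removal path is covered.
  bool remove(const std::string &key, int &out) {
    auto it = data_.find(key);
    if (it == data_.end())
      return false;
    out = it->second;
    data_.erase(it);
    ++epoch_;
    return true;
  }

  // Higher-level operation built on remove(); it inherits the bump.
  bool erase(const std::string &key) {
    int ignored = 0;
    return remove(key, ignored);
  }
};
```

Placing the bump in the lower-level operation means a caller that bypasses `erase()` still invalidates outstanding handles.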
[mlir][LangRef] Clarify terminator continuations (#201111)
Document that terminators may have no normal control-flow continuation,
such as ub.unreachable. Also clarify that no-return calls do not remove
the structural terminator requirement.
Assisted-by: Codex
[KnownBits] Fix add() SelfAdd assertion for bitwidths >= 512 (#202769)
`KnownBits::add()` with `SelfAdd=true` lowers `X+X` to `shl(X, 1)` using
a fixed 8-bit shift amount:
```cpp
KnownBits Amt = KnownBits::makeConstant(APInt(8, 1));
return KnownBits::shl(LHS, Amt, NUW, NSW, /*ShAmtNonZero=*/true);
```
The comment there claims the shift-amount bitwidth is independent of the
source bitwidth, but that is not true: `shl()`'s `getMaxShiftAmount()`
extracts `Log2_32(BitWidth)` bits from the shift amount's max value when
`BitWidth` is a power of two:
```cpp
static unsigned getMaxShiftAmount(const APInt &MaxValue, unsigned BitWidth) {
  if (isPowerOf2_32(BitWidth))
    return MaxValue.extractBitsAsZExtValue(Log2_32(BitWidth), 0);
```
[23 lines not shown]
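
To make the width mismatch in the quoted snippets concrete, here is a small standalone calculation. It models only the power-of-two branch shown above and does not use LLVM; the elided lines are not modeled. `log2Floor`, `bitsRequested`, and `extractionFits` are hypothetical helpers, and the sketch assumes the extraction is only valid when the requested bit count does not exceed the width of the shift-amount value.

```cpp
#include <cstdint>

// Floor of log2 for a non-zero 32-bit value (plays the role of Log2_32).
unsigned log2Floor(std::uint32_t v) {
  unsigned r = 0;
  while (v >>= 1)
    ++r;
  return r;
}

// Number of low bits the power-of-two branch asks the shift amount for.
unsigned bitsRequested(std::uint32_t bitWidth) { return log2Floor(bitWidth); }

// True when that many bits can be taken from a shift amount that is only
// `shAmtWidth` bits wide (8 in the SelfAdd lowering quoted above).
bool extractionFits(std::uint32_t bitWidth, unsigned shAmtWidth) {
  return bitsRequested(bitWidth) <= shAmtWidth;
}
```

For a 256-bit source the request is 8 bits, which still fits an 8-bit amount. For 512 bits it is 9, one more than the fixed amount provides.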
[LoopInterchange] Add tests for outer-variant inner IV step (NFC) (#202750)
Adds test cases for #202383 and #202401. Both have an induction variable
in the inner loop whose step value is not loop-invariant with respect to
the outer loop.
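
As an illustration of the loop shape described, and not a copy of the added tests, the following hypothetical nest has an inner induction variable whose step is fixed within the inner loop but changes with each outer iteration. `visitOuterVariantStep` is an invented name.

```cpp
#include <utility>
#include <vector>

// Records the (i, j) pairs visited by a nest whose inner step equals the
// outer induction variable: invariant inside the inner loop, but not
// invariant with respect to the outer loop.
std::vector<std::pair<int, int>> visitOuterVariantStep(int n, int m) {
  std::vector<std::pair<int, int>> visited;
  for (int i = 1; i <= n; ++i) {
    for (int j = 0; j < m; j += i) { // step depends on the outer IV
      visited.emplace_back(i, j);
    }
  }
  return visited;
}
```

Because the inner trip count depends on `i`, the iteration space is not rectangular, so the two loops cannot simply be swapped.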
[NFC][AMDGPU][InsertWaitCnts] Move some simple functions into Utils
Move trivial functions into helpers to declutter InsertWaitCnt a bit more.
I had to move HardwareLimits into a different header, but it's only used in InsertWaitCnt, so nothing else is affected.
[RFC][AMDGPU] Remove DebugCounter-based WaitCnt debugging
It's 8 years old, only used by a handful of tests, and has not been updated
in a while except for maintenance as far as I can see.
I don't mind keeping it in if there are users of it, but right now it
looks like a dead feature. If we want some more elaborate waitcnt debugging,
we should have a modern, generic system that works on any waitcnt, not
something specific to 3 GFX9 counters.
[AMDGPU][InsertWaitCnts] Move HWEvent analysis code
Builds on the previous RFC, if it is accepted:
Move the code that maps a MachineInstr to HWEventSet to a separate file.
This should be NFC.
[RFC][AMDGPU][InsertWaitCnt] Move WaitEventType into separate HWEvent header
I propose to move `WaitEventType` into its own header to start a new
component of the back-end targeted at analyzing and treating hardware events
fired by instructions. Right now this just moves code around and renames things
(NFCI) but over time, we should generalize the events so they can be reused
by other passes instead of being hyper-specialized for InsertWaitCnt.