[RegAlloc] Trace through COPYs to find rematerializable definitions (#190955)
After live range splitting, successful rematerialization in one split
interval can remove the original defining instruction, leaving only COPY
instructions in other split intervals. When attempting to rematerialize
uses in those intervals, the code fails to find the original definition
and gives up.
This patch traces backwards through COPY chains to recover the original
rematerializable definition instead of giving up.
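The backward walk can be sketched on a toy model. This is only an illustration, not the code from the patch: `Kind`, `Def`, `DefMap`, and `findRematSource` are hypothetical names, and the model assumes every virtual register has a single defining instruction.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <unordered_map>

// Toy stand-ins for machine instructions (hypothetical, not LLVM's API).
enum class Kind { Copy, RematDef, OtherDef };

struct Def {
  Kind kind;
  int srcReg; // Source register; only meaningful when kind == Kind::Copy.
};

// Maps each virtual register to its single defining instruction.
using DefMap = std::unordered_map<int, Def>;

// Follows COPY sources backwards from `reg` and returns the register whose
// definition is rematerializable, or nullopt if the chain ends elsewhere.
std::optional<int> findRematSource(const DefMap &defs, int reg) {
  // The step bound guards against malformed cyclic COPY chains.
  for (std::size_t steps = 0; steps <= defs.size(); ++steps) {
    auto it = defs.find(reg);
    if (it == defs.end())
      return std::nullopt; // No known definition: give up.
    switch (it->second.kind) {
    case Kind::RematDef:
      return reg; // Found a rematerializable definition.
    case Kind::OtherDef:
      return std::nullopt; // Chain ends in a non-rematerializable def.
    case Kind::Copy:
      reg = it->second.srcReg; // Keep tracing through the COPY.
      break;
    }
  }
  return std::nullopt;
}
```

In this model, a use of a register defined by `COPY` of a `COPY` of a rematerializable definition resolves to that definition, whereas a lookup that stops at the first `COPY` would give up.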
[VPlan] Add Type* and getType() to VPSymbolicValue (NFC) (#195183)
Add a Type* field to VPSymbolicValue, along with a getType() method to
query the stored scalar type.
This makes it easier to retrieve the type of various symbolic values,
and also simplifies VPTypeAnalysis construction.
PR: https://github.com/llvm/llvm-project/pull/195183
[analyzer] Clean up evalBind, fix bad logic (#196313)
This commit refactors `ExprEngine::evalBind` to eliminate the use of a
`NodeBuilder` and fix incorrect logic that was apparently introduced
because the `NodeBuilder` had obfuscated the underlying set operations.
In the special case when the engine is binding to an `Unknown` or
`Undefined` memory location, the old code invoked _either_ only the
`check::Bind` checkers _or_ only the pointer escape checkers on each
execution path. This commit ensures that on each execution path
_both_ the `check::Bind` checkers _and then_ the pointer escape checkers
get a chance to activate.
I'm pretty sure that the bad logic did not cause incorrect behavior of
the analyzer, because there are no `checkBind` checkers that generate
non-sink transitions when the location is `Unknown` or `Undefined`.
I also added an assertion that the location argument of `evalBind`
cannot be a `NonLoc`: this is a common-sense precondition that appears
to hold in practice, and it makes the behavior of this function easier
to reason about.
Reapply "[lldb] Do not refcount breakpoints in lldb-server" (#195858) (#196891)
This reapplies #195858 with a fix for 32-bit arm (and generally, any
architecture that uses software single-stepping). The problem was that
the temporary breakpoints used for single-stepping were interfering with
the breakpoints set by the client.
The fix is to check for existing breakpoints before setting the
temporary ones. To achieve this, I've separated the notion of "next PC
candidates for a thread" from "step breakpoints we've actually set".
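The separation can be illustrated with a minimal sketch. The names below (`Addr`, `stepBreakpointsToSet`) are hypothetical and are not taken from lldb-server; the sketch only assumes that breakpoints can be identified by address.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

using Addr = std::uint64_t;

// Given the addresses a thread may reach after one step (the "next PC
// candidates"), returns only those where a temporary step breakpoint
// actually has to be set. Candidates already covered by an existing
// breakpoint are skipped, so the temporary ones cannot interfere with them.
std::vector<Addr>
stepBreakpointsToSet(const std::vector<Addr> &nextPcCandidates,
                     const std::set<Addr> &existingBreakpoints) {
  std::vector<Addr> toSet;
  for (Addr pc : nextPcCandidates)
    if (existingBreakpoints.count(pc) == 0)
      toSet.push_back(pc);
  return toSet;
}
```

Keeping the returned list separate from the candidate list means that, after the step, only the breakpoints in the returned list need to be removed.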
The FreeBSD code had some software single-stepping code, but:
- this was [introduced](https://reviews.llvm.org/D95802) for mips64
support, which was
[removed](https://github.com/llvm/llvm-project/pull/179582) earlier this
year
- AFAICT, this never worked since the original patch only checked
`m_threads_stepping_with_breakpoint`, but never set it to anything.
[18 lines not shown]
[libc] Add some types to netinet/in.h (#196932)
Not including more types because I need to fix in_addr definition first.
This exposes stdint macros and types through the header, but POSIX
permits that behavior (and explicitly requires that we define uint8_t
and uint32_t).
No test as this is just adding a typedef, and I don't *think* we have
tests for that, but I can add a "check that type is defined" test if
that is desirable.
[GlobalISel] Recursively Optimise MatchTable Matchers
The core of this change is the additional call to `Matcher::optimize()` in the `optimizeRules` function,
which enables the match table optimization logic to recurse on the children of every GroupMatcher, forming
additional groups (which hoist more common predicates into a shared group).
To enable that, I had to update the `getFirstConditionAsRootType` implementation to support `GroupMatcher`.
I also included a small refactoring of the match table optimization pipeline that was identical between the
GlobalISel and GlobalISelCombiner emitters.
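The effect of recursing into each group can be sketched on a simplified model where a rule is just an ordered list of predicate names. This is not the TableGen emitter code: `Rule`, `Matcher`, `optimize`, and `countPredicates` here are hypothetical, and the sketch assumes only adjacent rules may be grouped so that rule order is preserved.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using Rule = std::vector<std::string>; // Ordered predicates of one rule.

struct Matcher {
  std::string hoisted;           // Shared predicate (empty for a leaf).
  Rule rest;                     // Remaining predicates (leaf only).
  std::vector<Matcher> children; // Non-empty only for a group.
};

// Groups adjacent rules that share their first predicate, hoists that
// predicate into the group, then recurses on what is left of each rule.
std::vector<Matcher> optimize(const std::vector<Rule> &rules) {
  std::vector<Matcher> out;
  std::size_t i = 0;
  while (i < rules.size()) {
    std::size_t j = i + 1;
    if (!rules[i].empty())
      while (j < rules.size() && !rules[j].empty() &&
             rules[j][0] == rules[i][0])
        ++j;
    if (j - i < 2) { // Nothing to share: keep the rule as a leaf.
      Matcher leaf;
      leaf.rest = rules[i];
      out.push_back(std::move(leaf));
    } else {
      Matcher group;
      group.hoisted = rules[i][0];
      std::vector<Rule> tails;
      for (std::size_t k = i; k < j; ++k)
        tails.emplace_back(rules[k].begin() + 1, rules[k].end());
      group.children = optimize(tails); // The recursive step.
      out.push_back(std::move(group));
    }
    i = j;
  }
  return out;
}

// Counts the predicate checks stored in the optimized tree.
std::size_t countPredicates(const std::vector<Matcher> &matchers) {
  std::size_t n = 0;
  for (const Matcher &m : matchers)
    n += (m.hoisted.empty() ? m.rest.size() : 1) + countPredicates(m.children);
  return n;
}
```

For the rules `{A,B,X}`, `{A,B,Y}`, `{A,C}`, `{D}`, the flat form stores 9 predicates, grouping on `A` alone stores 7, and recursing to also hoist `B` stores 6. The sketch ignores the per-group overhead that the real tables pay.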
This change yields up to a 25% size reduction for GlobalISel match tables.
There is a tiny increase (a few bytes) in a combiner table because we now create new groups
(which need up to 3 additional opcodes for the new `Try` and `Reject`) to hoist one predicate for only 2 rules, which
results in a small net negative change (one or two more ops).
I used a small bash script to compare all relevant files; this is the before/after:
```
FILE OLD NEW DIFF% SAME?
---- ------- ------- ----- -----
[8 lines not shown]
```
[GlobalISel][AMDGPU][AArch64] Fix GlobalISel copy propagation (#188781)
Disallow propagation of sub-registers after GlobalISel, as the current
code blindly drops any sub-register information. This also fixes
bugs in the AArch64 and AMDGPU back-ends that relied on the incorrect
behavior and would fail with the fix:
* Update `selectG_UNMERGE_VALUES` in AMDGPU so that, instead of generating
`hi16` for SGPR, it shifts the higher bits into the destination register
using `lshr`.
* Prevent the AArch64 back-end from generating a spurious `sub_32:gpr32all`
when selecting a copy.
* Test changes: `fpto[s/u]i-sat-vector.ll`: The correct number of
conversions is now generated because the higher 16 bits are handled
correctly; however, this introduces `lshr` instructions. This should be
resolved in #188287 by enabling `s_cvt_hi_*`.
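The shift-based unmerge can be modelled on plain integers. This is only an arithmetic illustration, not the selector code: `unmerge32` and `highHalfWithoutSubReg` are hypothetical names, and the second function assumes that losing the sub-register index means reading the low bits of the full register.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Splits a 32-bit value into (low, high) 16-bit halves. The high half is
// obtained with a logical shift right, mirroring the `lshr` approach.
std::pair<std::uint16_t, std::uint16_t> unmerge32(std::uint32_t value) {
  auto lo = static_cast<std::uint16_t>(value & 0xFFFFu);
  auto hi = static_cast<std::uint16_t>(value >> 16);
  return {lo, hi};
}

// Models the bug class (assumption): if the "high half" sub-register index
// is dropped, the copy ends up reading the low bits of the full register.
std::uint16_t highHalfWithoutSubReg(std::uint32_t value) {
  return static_cast<std::uint16_t>(value);
}
```

For `0xABCD1234`, `unmerge32` yields `0x1234` and `0xABCD`, while the modelled buggy path returns `0x1234` for the high half.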
[TableGen] Add submulticlass typechecking to template arg values (#197128)
Some typechecking was missing when parsing a submulticlass reference.
Add the CheckTemplateArgValues call in ParseSubMultiClassReference.
Resolves https://github.com/llvm/llvm-project/issues/84910.
[LifetimeSafety] Diagnose invalidated-field (#196680)
Teach lifetime safety invalidation diagnostics to handle origins that
escape through fields before the referenced object is invalidated.
Previously they were skipped.
Partially addresses https://github.com/llvm/llvm-project/issues/195706
[InstCombine] Relax the requirements for (X ^ C2) + C -> (C2 + C) - X (#196897)
If (C2 - X) has no borrow between bits, it is equivalent to (X ^ C2).
A borrow would occur when c2_bit=0 and x_bit=1.
It follows that c2_bit=1 or x_bit=0 means no borrow.
Remove an artificial condition that C2 must be a low bits mask.
Proof: https://alive2.llvm.org/ce/z/uNMsg_
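As a complement to the Alive2 proof, the identity can be checked exhaustively for 8-bit values. This is a standalone sketch, not InstCombine code: `noBorrow` and `foldHoldsForAll8Bit` are hypothetical names, and it tests concrete values of X, whereas the transform has to establish the condition for every possible X (presumably via known bits, which is an assumption here).

```cpp
#include <cassert>
#include <cstdint>

// True when no bit position has c2_bit = 0 and x_bit = 1, i.e. computing
// C2 - X never needs a borrow.
bool noBorrow(std::uint8_t x, std::uint8_t c2) {
  return (x & static_cast<std::uint8_t>(~c2)) == 0;
}

// Checks (X ^ C2) + C == (C2 + C) - X in wrapping 8-bit arithmetic for
// every X, C2, C that satisfies the no-borrow condition.
bool foldHoldsForAll8Bit() {
  for (unsigned c2 = 0; c2 < 256; ++c2)
    for (unsigned x = 0; x < 256; ++x) {
      if (!noBorrow(static_cast<std::uint8_t>(x),
                    static_cast<std::uint8_t>(c2)))
        continue;
      for (unsigned c = 0; c < 256; ++c) {
        auto lhs = static_cast<std::uint8_t>((x ^ c2) + c);
        auto rhs = static_cast<std::uint8_t>((c2 + c) - x);
        if (lhs != rhs)
          return false;
      }
    }
  return true;
}
```

For example, `C2 = 0b1010` is not a low-bits mask, yet with `X = 0b1000` both `X ^ C2` and `C2 - X` equal `0b0010`. With `X = 0b0100` the condition fails, and the two results differ (`0b1110` versus `0b0110`).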