[libclc] Base the build around `add_sources` instead of source list (#197034)
Summary:
The current build uses a curated + deduplicated source list. This PR
seeks to simplify this a little bit and canonicalize the behavior.
Now we create the targets, `clc` and `opencl`, up front. We add the
directories, which in turn add sources to these targets. We normalize the
architecture to the variants. We always add target-specific versions
first. When we add sources, we check if the file already exists and defer
to the architecture-specific one.
This normalizes the behavior. The directories are now laid out as
`clc/<arch>/<os>`. We normalize the architectures to `amdgpu`, `nvptx`,
and `spirv`. We use the OS for the newly created vulkan target. We now
control variants by checking whether the directory for that variant
exists, so it's nested more naturally.
Hopefully this makes more sense. The goal is to exercise the fact that
we have individual builds now. Previously this did not work because you
could not call `add_subdirectory` more than once.
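The "most specific directory wins" rule described above can be sketched roughly as follows. This is a Python model of the idea, not the CMake code from the PR: the `collect_sources` helper, the `clc/generic` directory, and the file names are all hypothetical, and keying on the relative path is an assumption about how "the file already exists" is decided.
```python
from pathlib import PurePosixPath


def collect_sources(dirs_most_specific_first):
    """Pick one source per relative file name.

    `dirs_most_specific_first` is a list of (directory, [relative file names])
    pairs, ordered so that target-specific directories come first.
    """
    chosen = {}
    for directory, files in dirs_most_specific_first:
        for name in files:
            # A file that was already added from a more specific directory
            # wins; later, more generic copies are skipped.
            chosen.setdefault(name, str(PurePosixPath(directory) / name))
    return sorted(chosen.values())


# Hypothetical layout: the amdgpu directory overrides one generic file.
sources = collect_sources([
    ("clc/amdgpu", ["math/sqrt.cl"]),
    ("clc/generic", ["math/sqrt.cl", "math/sin.cl"]),
])
```
Here `sources` keeps the `amdgpu` copy of `math/sqrt.cl` and falls back to the generic copy of `math/sin.cl`.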
[RegAlloc] Trace through COPYs to find rematerializable definitions (#190955)
After live range splitting, successful rematerialization in one split
interval can remove the original defining instruction, leaving only COPY
instructions in other split intervals. When attempting to rematerialize
uses in those intervals, the code fails to find the original definition
and gives up.
This patch traces backwards through COPY chains to recover the original
rematerializable definition instead of giving up.
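As a rough illustration of walking a COPY chain, here is a minimal Python sketch. It assumes a simplified model with one defining instruction per virtual register; the `Instr` record, the `remat` flag, and `find_remat_def` are hypothetical names and do not mirror the actual register allocator data structures.
```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Instr:
    opcode: str
    dst: str
    src: Optional[str] = None   # only meaningful for COPY
    remat: bool = False         # hypothetical "is rematerializable" flag


def find_remat_def(reg: str, defs: Dict[str, Instr]) -> Optional[Instr]:
    """Follow COPYs backwards from `reg` to a rematerializable definition."""
    seen = set()
    while reg in defs and reg not in seen:
        seen.add(reg)  # guard against cyclic COPY chains
        instr = defs[reg]
        if instr.opcode != "COPY":
            # Reached a real definition: usable only if rematerializable.
            return instr if instr.remat else None
        reg = instr.src
    return None  # chain ended without a known definition


defs = {
    "%0": Instr("MOV_IMM", "%0", remat=True),
    "%1": Instr("COPY", "%1", src="%0"),
    "%2": Instr("COPY", "%2", src="%1"),
}
origin = find_remat_def("%2", defs)
```
In this toy example, `origin` is the `MOV_IMM` that defines `%0`, reached through two COPYs.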
[VPlan] Add Type* and getType() to VPSymbolicValue (NFC) (#195183)
Add a Type* field to VPSymbolicValue, along with a getType() method to
query the stored scalar type.
This makes it easier to retrieve the type of various symbolic values,
and also simplifies VPTypeAnalysis construction.
PR: https://github.com/llvm/llvm-project/pull/195183
[X86] Manage atomic store of fp -> int promotion in DAG
When lowering atomic <1 x T> vector types with floats, selection can fail since
this pattern is unsupported. To support this, floats can be cast to
an integer type of the same size.
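The reason a same-size integer can stand in for the float is that the cast only reinterprets the bits. A small Python sketch of that idea for 32-bit values follows; it illustrates the bit-level equivalence only and says nothing about how the DAG lowering is written. The helper names are made up for this example.
```python
import struct


def float_bits_to_int(value: float) -> int:
    """Reinterpret a 32-bit float as a 32-bit unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def int_bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit unsigned integer as a 32-bit float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


bits = float_bits_to_int(1.0)
roundtrip = int_bits_to_float(float_bits_to_int(1.5))
```
Because no bits change, storing the integer and reading it back as a float yields the original value.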
[RFC][AMDGPU] Add BARRIER address space
Add a new BARRIER address space that is used for global variables that are used to represent the barrier IDs in GFX12.5.
These barrier addresses just have values corresponding 1-1 to barrier IDs. They are still implemented on top of LDS, but the offsetting happens during an addrspacecast to generic, not whenever the barrier GV is used.
The motivation for this is to make the relation between LDS and barrier GVs explicit in the compiler. It does add a bit more complexity, but that complexity was already there, just hidden by pretending barrier GVs were actual LDS.
[analyzer] Clean up evalBind, fix bad logic (#196313)
This commit refactors `ExprEngine::evalBind` to eliminate the use of a
`NodeBuilder` and fix incorrect logic that was apparently introduced
because the `NodeBuilder` had obfuscated the underlying set operations.
In the special case when the engine is binding to an `Unknown` or
`Undefined` memory location, with the old code on each execution path
_either_ only the `check::Bind` checkers _or_ only the pointer escape
checkers were invoked. This commit ensures that on each execution path
_both_ the `check::Bind` checkers _and then_ the pointer escape checkers
get a chance to activate.
I'm pretty sure that the bad logic did not cause incorrect behavior of
the analyzer, because there are no `checkBind` checkers that generate
non-sink transitions when the location is `Unknown` or `Undefined`.
I also added an assertion that the location argument of `evalBind`
cannot be a `NonLoc`, because this is a common sense precondition, seems
to be actually true and makes it easier to reason about the behavior of
this function.
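The intended ordering can be modeled as two stages applied in sequence to a set of nodes. The sketch below is a loose Python model, not the `ExprEngine` code: nodes are plain tuples, each checker is a hypothetical function that maps one node to a list of successors (an empty list standing in for a sink), and `run_stage` and `eval_bind_unknown_location` are invented names.
```python
def run_stage(nodes, checkers):
    """Feed every node through each checker in turn, collecting successors."""
    for checker in checkers:
        nodes = [succ for node in nodes for succ in checker(node)]
    return nodes


def eval_bind_unknown_location(pred, bind_checkers, escape_checkers):
    """Run the bind checkers first, then the pointer escape checkers."""
    after_bind = run_stage([pred], bind_checkers)
    # Every path that survives the bind checkers also reaches the
    # pointer escape checkers, instead of only one of the two groups.
    return run_stage(after_bind, escape_checkers)


def tag(label):
    """Hypothetical checker that records that it ran."""
    return lambda node: [node + (label,)]


result = eval_bind_unknown_location(("start",), [tag("bind")], [tag("escape")])
```
With one checker per stage, the single resulting path records both the bind step and the escape step, in that order.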
[NFCI][clang] Allow overriding any global variable address space
Allow the target to change the AS of a global variable at will, not just whenever Clang cannot assign one.
This enables the next patch that will specialize LDS GVs for barriers as a separate address space.
[NFC][AMDGPU] Generalize some LDS MemoryUtils
In preparation for upcoming work, I need some functions used by the LDS lowering
system to work on any GV. I removed the LDS specific queries inside these functions
and replaced them with functors passed by the caller, so these utility functions can be reused.
I also cleaned up a few things that weren't up to code, such as lowercase variable names.
[GlobalISel] Recursively Optimise MatchTable Matchers
The core of this change is the additional call to `Matcher::optimize()` in the `optimizeRules` function,
which enables the match table optimization logic to recurse on the children of every GroupMatcher, forming
additional groups (which hoist more common predicates into a shared group).
To enable that, I had to update the `getFirstConditionAsRootType` implementation to support `GroupMatcher`.
I also included a small refactoring of the match table optimization pipeline that was identical between the
GlobalISel and GlobalISelCombiner emitters.
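To make the recursion concrete, here is a minimal Python sketch of hoisting a shared leading predicate into a group and then repeating the process inside that group. It is a toy model under stated assumptions: rules are `(predicates, action)` pairs, only adjacent rules are grouped so that rule order is preserved, and the `optimize` function and tuple-based node shapes are hypothetical rather than the TableGen emitter's `Matcher` classes.
```python
def optimize(rules):
    """Group adjacent rules that share their first predicate, recursively.

    Returns a list of nodes, each either ("rule", predicates, action)
    or ("group", shared_predicate, children).
    """
    out = []
    i = 0
    while i < len(rules):
        preds, action = rules[i]
        j = i + 1
        if preds:
            # Extend the run while the next rule starts with the same predicate.
            while j < len(rules) and rules[j][0] and rules[j][0][0] == preds[0]:
                j += 1
        if j - i >= 2:
            # Hoist the shared predicate, then recurse on what remains.
            stripped = [(p[1:], a) for p, a in rules[i:j]]
            out.append(("group", preds[0], optimize(stripped)))
        else:
            out.append(("rule", preds, action))
        i = j
    return out


table = optimize([
    (["A", "B", "C"], 1),
    (["A", "B", "D"], 2),
    (["A", "E"], 3),
    (["F"], 4),
])
```
A single, non-recursive pass would stop after hoisting `A`. The recursive call is what additionally hoists `B` for the first two rules.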
The results of this change are up to a 25% size reduction for GlobalISel match tables.
There is a tiny increase (a few bytes) in a combiner table because we now create new groups
(which need up to 3 additional opcodes because of the new `Try` and `Reject` required) to hoist one predicate for only 2 rules, which
results in a small net negative change (one or two more ops).
I used a small bash script to compare all relevant files. This is the before/after:
```
FILE OLD NEW DIFF% SAME?
---- ------- ------- ----- -----
[8 lines not shown]
Reapply "[lldb] Do not refcount breakpoints in lldb-server" (#195858) (#196891)
This reapplies #195858 with a fix for 32-bit arm (and generally, any
architecture that uses software single-stepping). The problem was that
the temporary breakpoints used for single-stepping were interfering with
the breakpoints set by the client.
The fix is to check for existing breakpoints before setting the
temporary ones. To achieve this, I've separated the notion of "next PC
candidates for a thread" from "step breakpoints we've actually set".
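The split between "candidates" and "breakpoints actually set" can be sketched like this. It is a simplified Python model, not lldb-server code: the `Process` class, its fields, and both method names are hypothetical, and breakpoints are reduced to a set of addresses.
```python
class Process:
    def __init__(self):
        self.breakpoints = set()     # addresses that currently have a breakpoint
        self.step_breakpoints = {}   # thread id -> addresses set only for stepping

    def setup_single_step(self, tid, next_pc_candidates):
        """Set temporary breakpoints only where none exists yet."""
        newly_set = set()
        for addr in next_pc_candidates:
            if addr in self.breakpoints:
                continue  # already covered, e.g. by a client breakpoint
            self.breakpoints.add(addr)
            newly_set.add(addr)
        self.step_breakpoints[tid] = newly_set

    def finish_single_step(self, tid):
        """Remove only the breakpoints this step installed."""
        for addr in self.step_breakpoints.pop(tid, set()):
            self.breakpoints.discard(addr)


proc = Process()
proc.breakpoints.add(0x1000)                 # breakpoint set by the client
proc.setup_single_step(1, [0x1000, 0x1004])  # both are next-PC candidates
set_for_step = set(proc.step_breakpoints[1])
proc.finish_single_step(1)
```
Because only `0x1004` was recorded as a step breakpoint, finishing the step leaves the client's breakpoint at `0x1000` in place.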
The FreeBSD code had some software single-stepping code, but:
- this was [introduced](https://reviews.llvm.org/D95802) for mips64
support, which was
[removed](https://github.com/llvm/llvm-project/pull/179582) earlier this
year
- AFAICT, this never worked since the original patch only checked
`m_threads_stepping_with_breakpoint`, but never set it to anything.
[18 lines not shown]