[AMDGPU] Verify dominance when rewriting spills to registers
Rev1: Updated condition to check for "joint domination", i.e. no reload
is reachable from entry without reaching a store to the same slot. Still
working on reduced test or unit test.
When performing spill elimination in the AGPR copy rewrite pass it was
possible to see spill reloads that were not jointly dominated by any
store. This caused invalid MIR to be generated where vreg uses were not
dominated by defs. This patch adds a joint dominance check before
rewriting spills.
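
For illustration only, the joint dominance condition described above can be
sketched on a toy CFG. This is not the code from the patch: ToyCFG,
isJointlyDominated and the per-block HasStore flags are hypothetical names,
and the check works at block granularity, assuming that a store in the same
block as the reload precedes it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical toy CFG: Succs[B] lists the successor blocks of block B.
using ToyCFG = std::vector<std::vector<std::size_t>>;

// Returns true if every path from Entry to ReloadBlock passes through at
// least one block that stores to the slot, i.e. the reload cannot be
// reached from Entry once all storing blocks are treated as barriers.
bool isJointlyDominated(const ToyCFG &Succs, std::size_t Entry,
                        const std::vector<bool> &HasStore,
                        std::size_t ReloadBlock) {
  // Assumption: a store in the reload's own block comes before the reload.
  if (HasStore[ReloadBlock])
    return true;

  std::vector<bool> Visited(Succs.size(), false);
  std::vector<std::size_t> Worklist;
  if (!HasStore[Entry]) {
    Visited[Entry] = true;
    Worklist.push_back(Entry);
  }

  // Depth-first walk that never enters a block containing a store.
  while (!Worklist.empty()) {
    std::size_t B = Worklist.back();
    Worklist.pop_back();
    if (B == ReloadBlock)
      return false; // Found a store-free path from Entry to the reload.
    for (std::size_t S : Succs[B]) {
      if (Visited[S] || HasStore[S])
        continue;
      Visited[S] = true;
      Worklist.push_back(S);
    }
  }
  return true;
}
```

On a diamond CFG (0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3) with a reload in block 3,
stores in both 1 and 2 jointly dominate the reload even though neither store
dominates it alone. A store in block 1 only leaves the path through block 2
uncovered.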
[AMDGPU] Add amdgpu-lower-exec-sync pass to lower named-barrier globals (#165692)
This PR introduces the `amdgpu-lower-exec-sync` pass, which specifically
lowers named-barrier LDS globals introduced by #114550.
Changes include:
- Moving the logic of lowering named-barrier LDS globals from
`amdgpu-lower-module-lds` pass to this new pass.
- Adding the pass to the pipeline and removing the existing lowering logic
for named-barrier LDS globals from `amdgpu-lower-module-lds`.
See #161827 for discussion on this topic.
[ADT] Add roundUpNumBuckets to DenseMap (NFC) (#168301)
This patch adds roundUpNumBuckets, a helper function to compute the
number of buckets.
This is part of the effort outlined in #168255. This makes it easier
to move the core logic of grow() to DenseMapBase::grow().
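
As a rough sketch of what such a helper might look like, assuming it rounds
the requested count up to a power of two with a floor of 64 buckets (the
rounding rule and the name roundUpNumBucketsSketch are assumptions made for
illustration, not taken from the patch):

```cpp
#include <cstdint>

// Hypothetical stand-in for a bucket-count helper. Returns the smallest
// power of two that is >= MinNumBuckets, but never less than 64.
// Inputs above 2^31 are out of scope for this sketch: the result would
// not fit in an unsigned 32-bit value.
inline unsigned roundUpNumBucketsSketch(unsigned MinNumBuckets) {
  std::uint64_t N = 64;
  while (N < MinNumBuckets)
    N <<= 1;
  return static_cast<unsigned>(N);
}
```

Keeping the rounding in one helper means every caller that grows the table
applies the same policy.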
[NFC][Clang][Test] Drop calling convention check from address-space-conversions.cpp (#167261)
Calling convention is irrelevant to address space verification and adds
complexity for other target triples.
[Object] Add getRISCVVendorRelocationTypeName to render RISCV vendor-specific relocations to strings. (#168293)
This will be used in places like LLD to render them for error messages.
[AMDGPU] TableGen-erate SDNode descriptions (#168248)
This allows SDNodes to be validated against their expected type profiles
and reduces the number of changes required to add a new node.
Autogenerated node names start with "AMDGPUISD::", hence the changes in
the tests.
The few nodes defined in R600.td are *not* imported because TableGen
processes AMDGPU.td, which doesn't include R600.td. Ideally, we would have
two sets of nodes, but that would require careful reorganization of td
files since some nodes are shared between AMDGPU/R600. Not sure if it is
something worth looking into.
Some nodes fail validation; those are listed in
`AMDGPUSelectionDAGInfo::verifyTargetNode()`.
Part of #119709.
Pull Request: https://github.com/llvm/llvm-project/pull/168248
[ADT] Move initWithExactBucketCount to DenseMapBase (NFC) (#168283)
This patch moves initWithExactBucketCount and ExactBucketCount to
DenseMapBase to share more code.
Since SmallDenseMap::allocateBuckets always returns true,
initWithExactBucketCount is equivalent to:
  void initWithExactBucketCount(unsigned NewNumBuckets) {
    allocateBuckets(NewNumBuckets);
    initEmpty();
  }
for SmallDenseMap.
Note that ExactBucketCount is not used within DenseMapBase yet.
This moves us closer to the storage policy idea outlined in #168255.
[VPlan] Delegate to other VPInstruction constructors. (NFCI)
Update the VPInstruction constructor to delegate to the constructor with
more comprehensive checking and validation.
This required updating some unit tests, to make sure the constructed
VPInstructions are valid.
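
The general pattern can be sketched with a toy class. ToyInstruction, its
opcodes, and its operand-count rule are hypothetical and unrelated to the
real VPInstruction API. The point is only that the convenience constructor
forwards to the fully checked one, so validation lives in a single place.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical toy instruction, used only to illustrate delegation.
class ToyInstruction {
public:
  enum Opcode { Neg, Add };

  // Full constructor: the only place where operands are validated.
  ToyInstruction(Opcode Op, std::vector<int> Ops, unsigned Flags)
      : Op(Op), Operands(std::move(Ops)), Flags(Flags) {
    if (Operands.size() != expectedNumOperands(Op))
      throw std::invalid_argument("unexpected number of operands");
  }

  // Convenience constructor: delegates, so it gets the same checks.
  ToyInstruction(Opcode Op, std::vector<int> Ops)
      : ToyInstruction(Op, std::move(Ops), /*Flags=*/0) {}

  std::size_t numOperands() const { return Operands.size(); }
  unsigned flags() const { return Flags; }

private:
  static std::size_t expectedNumOperands(Opcode Op) {
    return Op == Neg ? 1 : 2;
  }

  Opcode Op;
  std::vector<int> Operands;
  unsigned Flags;
};
```

With this shape, a test that builds an instruction with the wrong number of
operands through the short constructor now fails validation too, which is
the kind of fallout that requires test updates.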
[CodeGen] Remove a redundant declaration (NFC) (#168285)
EnableFSDiscriminator is declared in DebugInfoMetadata.h.
Identified with readability-redundant-declaration.
[SLP]Do not consider split nodes, when checking parent PHI-based nodes
The compiler should not consider split vectorize nodes when checking
for non-schedulable PHI-based parent nodes. Only pure PHI nodes must be
considered, since only they can be treated as explicit users; split
nodes cannot.
Fixes #168268