[clang][bytecode] Optimize `interp::Record` a bit (#183494)
And things around it.
Remove the `FieldMap`, since we can use the field's index instead and
only keep an array around. `reserve()` the sizes and use
`emplace_back()`.
[TableGen] Complete the support for artificial registers
Artificial registers were added in eb0c510ecde667cd911682cc1e855f73f341d134
as a means of giving super-registers heavier weights than those

of their subregisters, even when they only contain a single
physical subregister.
Artificial registers thus do exist in code and participate in
register unit weight calculations, but are not supposed to be
available for register allocation.
This patch completes the support for artificial registers to:
- Ignore artificial registers when joining register unit uber
sets. Artificial registers may be members of classes that
together include registers and their sub-registers, making it
impossible to compute normalised weights for uber sets they
belong to.
[28 lines not shown]
[flang][OpenMP] Inline CheckNestedBlock, NFC (#181732)
CheckNestedBlock no longer calls itself, which was the primary reason
for the code to be in a separate function.
[AMDGPU] Hoist WMMA coexecution hazard V_NOPs from loops to preheaders (#176895)
On GFX1250, V_NOPs inserted for WMMA coexecution hazards are placed at
the use-site. When the hazard-consuming instruction is inside a loop and
the WMMA is outside, these NOPs execute every iteration even though the
hazard only needs to be covered once.
This patch hoists the V_NOPs to the loop preheader, reducing executions
from N iterations to 1.
```
Example (assuming a hazard requiring K V_NOPs):
Before:
bb.0 (preheader): WMMA writes vgpr0
bb.1 (loop): V_NOP xK, VALU reads vgpr0, branch bb.1
-> K NOPs executed per iteration
After:
bb.0 (preheader): WMMA writes vgpr0, V_NOP xK
[12 lines not shown]
[VPlan] Simplify ExitingIVValue and use for tail-folded IVs. (#182507)
Now that we have ExitingIVValue, we can also use it for tail-folded
loops; the only difference is that we have to compute the end value with
the original trip count instead of the vector trip count.
This allows removing the induction increment operand only used when
tail-folding.
PR: https://github.com/llvm/llvm-project/pull/182507
AMDGPU: Stop adding uniform-work-group-size=false
This is one of the string attributes that takes a boolean
value for no reason. There is no point in ever writing this
with an explicit false. Stop adding the noise and reporting
an unnecessary change.
[SCEV] Introduce SCEVUse wrapper type (NFC)
Add SCEVUse as a PointerIntPair wrapper around const SCEV * to prepare
for storing additional per-use information.
This commit contains the mechanical changes of adding an initial SCEVUse
wrapper and updating all relevant interfaces to take SCEVUse. Note that
currently the integer part is never set, and all SCEVUses are
considered canonical.
[DAG] Move (X +/- Y) & Y --> ~X & Y fold from visitAnd to SimplifyDemandedBits (#183270)
Add DemandedElts handling to allow better vector support.
To prevent RISCV from falling back to a mul call in known-never-zero.ll,
I've had to tweak the (mul step_vector(C0), C1) to (step_vector(C0 * C1))
fold to only occur if C0 is already non-power-of-2, C0 * C1 is a
power-of-2, or the target has good mul support.
[SPIRV] Simplify `selectPhi` and remove unreachable code (#183060)
Before, it created an `OpPhi` with a Type argument, only to immediately
remove this Type and change the opcode to `PHI`.
Only `TargetOpcode::PHI` gets to `SPIRVTargetLowering::finalizeLowering`.
The `TargetOpcode::PHI` gets lowered to `SPIRV::OpPhi` much later, by
`patchPhi` in the `SPIRVModuleAnalysis`.
`SPIRVModuleAnalysis` is requested by the
`SPIRVAsmPrinter` through `getAnalysisUsage` (which is ugly).
[lldb] Fix issues handling ANSI codes and Unicode in option help (#183314)
Fixes #177570, and a bunch of FIXMEs for other tests known to be
incorrect.
To do this, I have adapted code from the existing ansi::TrimAndPad. At
first I tried a wrapper function, but there are a few things we need to
handle that cannot be done with a simple wrapper.
We must only split at word boundaries. This requires knowing whether the
last adjustment, which may be the final adjustment, was made at, or just
before, a word boundary. Also it must check for single words wider than
the requested width (though this you could do with a wrapper).
For this reason, the new TrimAtWordBoundary has more special-case checks
and a more complex inner loop, though the core is the same split into
left, ANSI escape code, and right that TrimAndPad uses.
It is that splitting that implements the "bias" we need to print
[20 lines not shown]
[Mips] Remove NoNaNsFPMath uses (#183045)
Remove `NoNaNsFPMath` by using a `PatFrag`; we should only use `nnan`.
Duplicate tests in `CodeGen/Mips/llvm-ir/nan-fp-attr.ll` are removed.
[mlir][x86] Rename x86vector to x86 (#183311)
Renames 'x86vector' dialect to 'x86'.
This is the first PR in a series of cleanups around dialects targeting x86
platforms.
The new naming scheme is shorter, cleaner, and opens the possibility of
integrating other x86-specific operations that do not strictly fit a pure
vector representation. For example, the generalization will allow for a
future merger of the AMX dialect into the x86 dialect, creating a one-stop
collection of x86 operations and boosting discoverability.
[NFC][AArch64] Extract MOVaddr* expansion model into AArch64ExpandImm
This makes the expansion logic reusable by getInstSizeInBytes in a
follow-up patch.
Revert "[lldb][Process/FreeBSDKernelCore] Implement DoWriteMemory()" (#183485)
Reverts llvm/llvm-project#183237
This was landed without addressing review comments.