[RegAlloc] Fix the terminal rule check for interfere with DstReg (#168661)
This maybe a bug which is introduced by commit
6749ae36b4a33769e7a77cf812d7cd0a908ae3b9, and has been present ever
since.
In this case, `OtherReg` always overlaps with `DstReg` cause they from
the `Copy` all.
[TableGen] Use MVT instead of MVT::SimpleValueType. NFC (#169180)
This improves type safety and is less verbose. Use SimpleTy only where
an integer is needed like switches or emitting a VBR.
---------
Co-authored-by: Sergei Barannikov <barannikov88 at gmail.com>
[llvm] Use llvm::equal (NFC) (#169173)
While I am at it, this patch uses const l-value references for
std::shared_ptr. We don't need to increment the reference count by
passing std::shared_ptr by value.
Identified with llvm-use-ranges.
[flang][OpenMP] Canonicalize loops with intervening OpenMP constructs (#169191)
Example based on the gfortran test a.6.1.f90
```
do 100 i = 1,10
!$omp do
do 100 j = 1,10
call work(i,j)
100 continue
```
During canonicalization of label-DO loops, if the body of an OpenMP
construct ends with a label, treat the label as ending the construct
itself.
This will also allow handling of cases like
```
do 100 i = 1, 10
!$omp atomic write
[2 lines not shown]
[flang][OpenMP] Canonicalize loops with intervening OpenMP constructs
Example based on the gfortran test a.6.1.f90
```
do 100 i = 1,10
!$omp do
do 100 j = 1,10
call work(i,j)
100 continue
```
During canonicalization of label-DO loops, if the body of an OpenMP
construct ends with a label, treat the label as ending the construct
itself.
This will also allow handling of cases like
```
do 100 i = 1, 10
!$omp atomic write
[3 lines not shown]
[VPlan] Share PreservesUniformity logic between isSingleScalar and isUniformAcrossVFsAndUFs
Extract the PreservesUniformity logic from isSingleScalar into a shared
static helper function. Update isUniformAcrossVFsAndUFs to use this
logic for VPWidenRecipe and VPInstruction, so that any opcode that
preserves uniformity is considered uniform-across-vf-and-uf if its
operands are.
This unifies the uniformity checking logic and makes it easier to extend
in the future.
This should effectively by NFC currently.
[AMDGPU] Enable serializing of allocated preload kernarg SGPRs info (#168374)
- Support serialization of the number of allocated preload kernarg SGPRs
- Support serialization of the first preload kernarg SGPR allocated
Together they enable reconstructing correctly MIR with preload kernarg
SGPRs.
ELF: Use index 0 for unversioned undefined symbols (#168189)
The GNU documentation is ambiguous about the version index for
unversioned undefined symbols. The current specification at
https://sourceware.org/gnu-gabi/program-loading-and-dynamic-linking.txt
defines VER_NDX_LOCAL (0) as "The symbol is private, and is not
available outside this object."
However, this naming is misleading for undefined symbols. As suggested
in
discussions, VER_NDX_LOCAL should conceptually be VER_NDX_NONE and apply
to unversioned undefined symbols as well.
GNU ld has used index 0 for unversioned undefined symbols both before
version 2.35 (see https://sourceware.org/PR26002) and in the upcoming
2.46 release (see https://sourceware.org/PR33577). This change aligns
with GNU ld's behavior by switching from index 1 to index 0.
While here, add a test to dso-undef-extract-lazy.s that undefined
symbols of index 0 in DSO are treated as unversioned symbols.
[VPlan] Create resume phis in scalar preheader early. (NFC) (#166099)
Create phi recipes for scalar resume value up front in addInitialSkeleton during initial construction. This will allow moving the remaining code dealing with resume values to VPlan transforms/construction.
PR: https://github.com/llvm/llvm-project/pull/166099
[AArch64] Mark FMOVvXfY_ns as rematerializable, cheap
Otherwise, the register allocator may spill and reload constants that
can be rematerialized with a single instruction.
[DFAJumpThreading] Try harder to avoid cycles in paths. (#169151)
If a threading path has cycles within it then the transformation is not
correct. This patch fixes a couple of cases that create such cycles.
Fixes https://github.com/llvm/llvm-project/issues/166868
[flang][OpenMP] Implement loop nest parser (#168884)
Previously, loop constructs were parsed in a piece-wise manner: the
begin directive, the body, and the end directive were parsed separately.
Later on in canonicalization they were all coalesced into a loop
construct. To facilitate that end-loop directives were given a special
treatment, namely they were parsed as OpenMP constructs. As a result
syntax errors caused by misplaced end-loop directives were handled
differently from those cause by misplaced non-loop end directives.
The new loop nest parser constructs the complete loop construct,
removing the need for the canonicalization step. Additionally, it is the
basis for parsing loop-sequence-associated constructs in the future.
It also removes the need for the special treatment of end-loop
directives. While this patch temporarily degrades the error messaging
for misplaced end-loop directives, it enables uniform handling of any
misplaced end-directives in the future.