AMDGPU: Handle invariant loads when considering if a load can be scalar
Doesn't touch the GlobalISel version because the handling
there looks a bit broken.
Reapply "DAG: Allow select ptr combine for non-0 address spaces" (#168292) (#168786)
This reverts commit 6d5f87fc4284c4c22512778afaf7f2ba9326ba7b.
Previously this failed due to treating the unknown MachineMemOperand
value as known uniform.
[flang] Switch select-case-statement.f90 to new lowering (#168754)
test/Lower/select-case-statement.f90 was still using the old lowering.
Modified the test with FIR generated using the new lowering. Changed the
test to use flang_fc1 instead of bbc and added testing for -O0 and -O1,
since character comparison lowering is done differently at -O0 (uses
runtime function) and -O1 (inlines some cases). Use different FileCheck
prefixes for different optimization levels (CHECK-O0 for -O0, CHECK-O1
for -O1, CHECK for both).
[SDAG] Fix whitespace errors (NFC) (#168897)
To make life easier for future contributors. Note that formatting
changes are due to git clang-format on the touched whitespace-error
lines.
[profcheck] Exclude `naked`, asm-only functions from profcheck (#168447)
We can't do anything meaningful to such functions: they aren't optimizable, and even if inlined, they would bring no code open to optimization.
AMDGPU: Fix treating divergent loads as uniform (#168785)
Avoids regression which caused the revert 6d5f87fc42.
This is a hack on a hack. We currently have isUniformMMO,
which improperly treats unknown source value as known uniform.
This is a hack from before we had divergence information in the
DAG, and should be removed. This is the minimum change to avoid
the regression; removing the aggressive handling of the unknown
case (or dropping isUniformMMO entirely) are more involved fixes.
Reapply "[compiler-rt] Default to Lit's Internal Shell (#168232)" (#168760)
This reverts commit eb20b5392599996ce94e4c0392095cacaa33687c.
This relands the compiler-rt internal shell after XRay and Darwin tests
that were failing under the internal shell have been fixed.
[LoopPeel] Fix BFI when peeling last iteration without guard (#168250)
LoopPeel sometimes proves that, when reached, the original loop always
executes at least two iterations. LoopPeel then unconditionally executes
both the remaining loop's initial iteration and the peeled final
iteration. But that increases the latter's frequency above its frequency
in the original loop. To maintain the total frequency, this patch
compensates by decreasing the remaining loop's latch probability.
This is another step in issue #135812 and was discussed at
<https://github.com/llvm/llvm-project/pull/166858#discussion_r2528968542>.
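
To make the frequency argument concrete, here is a small worked sketch. It
assumes a simplified model in which the latch branches back with a constant
probability p, so the body runs 1/(1-p) times per loop entry on average. The
formula the patch actually uses may differ; the function names are
hypothetical.

```python
from fractions import Fraction


def expected_iterations(latch_prob):
    # Expected body executions per loop entry when the latch takes the
    # backedge with constant probability latch_prob (geometric model).
    return 1 / (1 - latch_prob)


def adjusted_latch_prob(latch_prob):
    # The peeled final iteration now runs exactly once per entry, so the
    # remaining loop should account for one iteration fewer than before.
    remaining = expected_iterations(latch_prob) - 1
    if remaining < 1:
        raise ValueError("model needs at least two expected iterations")
    # Solve 1 / (1 - p') == remaining for p'.
    return 1 - 1 / remaining


original = Fraction(9, 10)                # original loop: 10 iterations
adjusted = adjusted_latch_prob(original)  # remaining loop: 9 iterations
total = expected_iterations(adjusted) + 1 # plus the peeled iteration
```

With these inputs the adjusted probability is 8/9, slightly lower than the
original 9/10, and the remaining loop plus the peeled iteration again sum to
10 expected iterations.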
Fix build breakage when using modules (#168883)
Commit c9f573463ebd7b4e46da4877802f2364f700e54a removed the file
TargetLibraryInfo.def but did not remove it from the module map.
[LoongArch] TableGen-erate SDNode descriptions (#168129)
This allows SDNodes to be validated against their expected type profiles
and reduces the number of changes required to add a new node.
I had to split `VSHUF4I` into two variants (`VSHUF4I` and `VSHUF4I_D`)
since `loongarch_vshuf4i` and `loongarch_vshuf4i_d` have a different
number of operands, and this prevented the node from being imported.
There is just one node that currently fails validation, see
`LoongArchSelectionDAGInfo::verifyTargetNode()`.
Part of #119709.
Pull Request: https://github.com/llvm/llvm-project/pull/168129
[CMake] handle the AIX form of the lto cache dir option (#168868)
This handles the AIX form of the ThinLTO cache dir option, which gets
turned on when ThinLTO is enabled.
[mlir][spirv] Add support for SwitchOp (#168713)
The dialect implementation mostly copies that of `cf.switch`, but
aligns naming with the SPIR-V spec.
[CodeGen] Use MCRegister in MachineBasicBlock::liveout_iterator. NFC (#168834)
MachineBasicBlock::liveout_begin() calls this constructor with
MCRegisters so this removes an implicit cast.
VectorCombine/AMDGPU: Cleanup a test and add a new one (#168817)
The existing, recently added test contains a whole lot of noise in the
form of dead instructions. Also, prefer named values.
The new test isolates a separate issue with concatenating i8 vectors.
[flang][OpenMP] Better diagnostics for invalid or misplaced directives
Add two more AST nodes, one for a misplaced end-directive, and one for
an invalid string following the OpenMP sentinel (e.g. "!$OMP XYZ").
Emit error messages when either node is encountered in semantic analysis.
[flang][OpenMP] Implement loop nest parser
Previously, loop constructs were parsed in a piece-wise manner: the
begin directive, the body, and the end directive were parsed separately.
Later on in canonicalization they were all coalesced into a loop
construct. To facilitate that, end-loop directives were given special
treatment: they were parsed as OpenMP constructs. As a result,
syntax errors caused by misplaced end-loop directives were handled
differently from those caused by misplaced non-loop end directives.
The new loop nest parser constructs the complete loop construct,
removing the need for the canonicalization step. Additionally, it is
the basis for parsing loop-sequence-associated constructs in the future.
It also removes the need for the special treatment of end-loop
directives. While this patch temporarily degrades the error messaging
for misplaced end-loop directives, it enables uniform handling of any
misplaced end-directives in the future.
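
The single-pass idea can be sketched with a toy recursive-descent parser. This
is only an illustration over a made-up token stream (`BEGIN_DIR`, `DO`,
`STMT`, `END_DO`, `END_DIR`); it does not reflect flang's parser classes or
AST node names, and every identifier in it is hypothetical.

```python
def parse_do(tokens, pos):
    # Parse one DO ... END_DO loop, recursing into nested loops.
    if pos >= len(tokens) or tokens[pos] != "DO":
        raise SyntaxError("expected DO")
    pos += 1
    body = []
    while pos < len(tokens) and tokens[pos] != "END_DO":
        if tokens[pos] == "DO":
            inner, pos = parse_do(tokens, pos)
            body.append(inner)
        elif tokens[pos] == "STMT":
            body.append("STMT")
            pos += 1
        else:
            # A misplaced end-directive inside the nest lands here.
            raise SyntaxError(f"unexpected {tokens[pos]}")
    if pos >= len(tokens):
        raise SyntaxError("missing END_DO")
    return {"do": body}, pos + 1


def parse_loop_construct(tokens, pos=0):
    # Build the whole construct in one pass: begin directive, loop nest,
    # and an optional end directive. No later coalescing step is needed.
    if pos >= len(tokens) or tokens[pos] != "BEGIN_DIR":
        raise SyntaxError("expected BEGIN_DIR")
    nest, pos = parse_do(tokens, pos + 1)
    has_end = pos < len(tokens) and tokens[pos] == "END_DIR"
    if has_end:
        pos += 1
    return {"nest": nest, "has_end": has_end}, pos


tokens = ["BEGIN_DIR", "DO", "DO", "STMT", "END_DO", "END_DO", "END_DIR"]
construct, end_pos = parse_loop_construct(tokens)
```

Because the end directive is consumed as part of the construct, it needs no
separate representation as a standalone construct in this model.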