[CIR] Add AtomicFenceOp and signal/thread fence builtins and required helpers (#168346)
This PR adds the AtomicFenceOp and signal/thread fence builtins.
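For orientation, here is a minimal sketch of how such fence builtins are typically used from source code. It assumes the builtins in question include the GCC/Clang-style `__atomic_thread_fence` and `__atomic_signal_fence`; the PR text does not list them. The helper names (`publish`, `try_consume`, `compiler_only_barrier`, `fence_demo`) are made up for this example and are not part of the PR.

```cpp
#include <atomic>
#include <cassert>

int payload = 0;                 // plain (non-atomic) data being published
std::atomic<bool> ready{false};  // flag, accessed with relaxed operations only

// Writer side: the release fence orders the payload write before the flag store.
void publish(int value) {
  payload = value;
  __atomic_thread_fence(__ATOMIC_RELEASE);
  ready.store(true, std::memory_order_relaxed);
}

// Reader side: the acquire fence orders the flag load before the payload read.
bool try_consume(int &out) {
  if (!ready.load(std::memory_order_relaxed))
    return false;
  __atomic_thread_fence(__ATOMIC_ACQUIRE);
  out = payload;
  return true;
}

// A signal fence constrains compiler reordering only, e.g. with respect to a
// signal handler running on the same thread.
void compiler_only_barrier() {
  __atomic_signal_fence(__ATOMIC_SEQ_CST);
}

// Single-threaded smoke check of the helpers above (hypothetical usage).
void fence_demo() {
  int seen = 0;
  assert(!try_consume(seen));  // nothing published yet
  publish(42);
  compiler_only_barrier();
  assert(try_consume(seen) && seen == 42);
}
```

In a real program `publish` and `try_consume` would run on different threads; the single-threaded demo only shows that the calls compile and behave as expected.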
DAG: Use RuntimeLibcalls to legalize vector frem calls
This continues the replacement of TargetLibraryInfo uses in codegen
with RuntimeLibcallsInfo started in 821d2825a4f782da3da3c03b8a002802bff4b95c.
The series there handled all of the multiple-result calls. This
extends it to the other handled case, which happened to be frem.
For some reason the Libcalls for these are prefixed with "REM_", for
the instruction "frem", which maps to the libcall "fmod".
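To make the frem-to-fmod mapping concrete, the sketch below applies `std::fmod` lane by lane to a 4-element array. It only illustrates the semantics (the result takes the sign of the dividend). It is not the legalizer's code, and `frem_lanes` and `frem_demo` are hypothetical names; an actual vector libcall would compute all lanes in a single call.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for a 4-lane float frem: each lane is std::fmod(a, b).
std::array<float, 4> frem_lanes(const std::array<float, 4> &a,
                                const std::array<float, 4> &b) {
  std::array<float, 4> r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = std::fmod(a[i], b[i]);
  return r;
}

// Small usage check with exactly representable values.
void frem_demo() {
  const std::array<float, 4> a = {5.5f, -5.5f, 7.0f, 1.0f};
  const std::array<float, 4> b = {2.0f, 2.0f, 7.0f, 4.0f};
  const std::array<float, 4> r = frem_lanes(a, b);
  assert(r[0] == 1.5f);
  assert(r[1] == -1.5f);  // the sign follows the dividend
  assert(r[2] == 0.0f);
  assert(r[3] == 1.0f);
}
```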
[clang] Revert changes to prefer the toolchain-provided libc++.dylib
This patch reverts the change that made clang prefer the toolchain
libc++.dylib when there is one (#170303), and the subsequent test
workaround we landed to fix bots (#170912).
We are seeing some failures on macOS LLDB bots that need to be
investigated, and that will require more time than I can spare
before the end of today.
This reverts commits bad1a88963 and 190b8d0b.
[Profile] Fix debuginfod test with internal shell
The recent relanding of the internal shell broke one of the debuginfod
tests. The breakage went unnoticed because no upstream buildbot runs
that test, as it requires curl. Rewriting the test to not use subshells
is pretty simple.
[compiler-rt] Add CMake flag for AArch64 Linux with 39-bit VA. (#167028)
Sanitizers currently assume AArch64 Linux has 48-bit VA. Follow-up
patches will add checks for this flag to asan and hwasan.
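For a sense of scale, the following sketch computes the size of the address space for the two VA widths mentioned here. The helper `vaSpaceBytes` and the constants are hypothetical names used only for this arithmetic; nothing here reflects the actual CMake flag or sanitizer code.

```cpp
#include <cstdint>

// Size in bytes of an address space that is vaBits wide (vaBits < 64).
constexpr std::uint64_t vaSpaceBytes(unsigned vaBits) {
  return std::uint64_t{1} << vaBits;
}

constexpr std::uint64_t GiB = std::uint64_t{1} << 30;
constexpr std::uint64_t TiB = std::uint64_t{1} << 40;

// 39-bit VA covers 512 GiB; 48-bit VA covers 256 TiB, which is 512x larger.
static_assert(vaSpaceBytes(39) == 512 * GiB, "39-bit VA spans 512 GiB");
static_assert(vaSpaceBytes(48) == 256 * TiB, "48-bit VA spans 256 TiB");
static_assert(vaSpaceBytes(48) / vaSpaceBytes(39) == 512, "ratio is 2^9");
```

A memory layout sized for the 48-bit case therefore cannot be assumed to fit in a 39-bit address space.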
[VPlan] Use strict whitespace checks for VPlan printing test.
Use --strict-whitespace for vplan-printing.ll to catch stray
whitespace. The test updates show a few places where we currently emit
it.
[flang][NFC] Strip trailing whitespace from tests (11 of 14)
Only some fortran source files in flang/test/Semantics have been
modified. The remaining files will be cleaned up in subsequent commits.
[RISCV] LMUL lists for indexed and strided loads (#169756)
Create additional lists representing valid LMULs for strided and indexed
loads of particular element sizes.
[flang][OpenMP] Store list of expressions in InitializerT
The INITIALIZER clause holds a stylized expression that can be
instantiated with different types. Currently, the InitializerT
class only holds one expression, which happens to correspond to
the first type in the DECLARE_REDUCTION type list.
Change InitializerT to hold a list of expressions instead, one
for each type. Keep the lowering code unchanged by picking the
first expression from the list.
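The shape of this change can be sketched roughly as follows. This is a simplified, hypothetical stand-in: `InitializerSketch` and `exprForLowering` are invented names, and expressions are modeled as strings rather than the real expression type.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for InitializerT.
struct InitializerSketch {
  // One expression per type in the DECLARE_REDUCTION type list.
  std::vector<std::string> exprs;
};

// Mirrors the "keep the lowering code unchanged" step: lowering still
// consumes only the first expression in the list.
const std::string &exprForLowering(const InitializerSketch &init) {
  assert(!init.exprs.empty() && "expected at least one initializer expression");
  return init.exprs.front();
}
```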
AMDGPU/PromoteAlloca: Simplify how deferred loads work (#170510)
The second pass of promotion to vector can be quite simple. Reflect that
simplicity in the code for better maintainability.
[clang] Temporarily disable Darwin test for linking against libc++ on non-darwin systems (#170912)
Disable the test added in #170303, which breaks bots that don't use ld
as their linker. This is a temporary and narrow disablement of the test
until we can make it more general again, to get the bots green.
Co-authored-by: Louis Dionne <ldionne.2 at gmail.com>
[SPIRV] Add `<2 x half>` and `<4 x half>` atomics via `SPV_NV_shader_atomic_fp16_vector` (#170213)
This adds support for the `SPV_NV_shader_atomic_fp16_vector` extension,
and then uses it to enable lowering of atomic add, sub, min and max on 2
and 4 component vectors of FP16, which are rather common options in ML
workloads. Even though `bfloat16` also works in practice, we do not
enable it since it's not specified in the extension (which might need
updating / promoting to KHR at least). A `TODO` is also inserted in
`SPIRVModuleAnalysis.cpp` regarding the need to upgrade its ample usage
of `report_fatal_error`; I have a WiP patch for that, but it still needs
a bit of baking. Finally, a paired patch will be necessary in the
Translator, as it's not aware of the extension either - I'll update this
review to reference the PR once I create it.
[AArch64] Add isAppleMLike helper to check for M cores and aligned CPUs. (#170553)
Add a new isAppleMLike helper that returns true if the core is part of
the Apple M core family or Apple A14 or later. Used to apply cost
decisions consistently to those groups of cores.
The function is now a single place to update when new cores are added.
It also makes sure we apply unrolling decisions for newer Apple cores to
Apple A17.
PR: https://github.com/llvm/llvm-project/pull/170553
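A helper of this kind might look roughly like the sketch below. The enum `CoreSketch`, its members, and `isAppleMLikeSketch` are hypothetical and chosen only for illustration; they are not LLVM's actual processor-family list or the helper's real signature.

```cpp
// Hypothetical core list for illustration; not LLVM's real enum.
enum class CoreSketch {
  Generic,
  AppleA13,
  AppleA14,
  AppleA15,
  AppleA16,
  AppleA17,
  AppleM4,
};

// True for M-family cores and for A14 or later, per the description above.
// New cores would be added here, in one place.
constexpr bool isAppleMLikeSketch(CoreSketch core) {
  switch (core) {
  case CoreSketch::AppleA14:
  case CoreSketch::AppleA15:
  case CoreSketch::AppleA16:
  case CoreSketch::AppleA17:
  case CoreSketch::AppleM4:
    return true;
  default:
    return false;
  }
}

static_assert(isAppleMLikeSketch(CoreSketch::AppleA17), "A17 is covered");
static_assert(!isAppleMLikeSketch(CoreSketch::AppleA13), "A13 predates A14");
```

Callers making cost or unrolling decisions would then query the one predicate instead of repeating the core list at each site.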
[flang][OpenMP] Reject END DO on construct that crosses label-DO (#169714)
In a label-DO construct where two or more loops share the same
terminator, an OpenMP construct must enclose all the loops if an
end-directive is present. E.g.
```
do 100 i = 1,10
!$omp do
do 100 j = 1,10
100 continue
!$omp end do ! Error, but ok if this line is removed
```
Fixes https://github.com/llvm/llvm-project/issues/169536.