Reland yet again: [mlir] Add FP software implementation lowering pass: `arith-to-apfloat` (#167608)
Fix both the symbol visibility issue in the mlir_apfloat_wrappers lib and the linkage issue in ArithToAPFloat.
[clang][bytecode] Fix diagnosing subtraction of zero-size pointers (#167839)
We need to get the element type size at bytecode generation time to
check. We also need to diagnose this in the LHS == RHS case.
[libunwind] Ensure zaDisable() is called in jumpto/returnto (NFC) (#167674)
This is an NFC for now, as the SME checks for macOS platforms are not
implemented, so zaDisable() is a no-op, but both paths for resuming from
an exception should disable ZA.
This is a fixup for a recent change in #165066.
[libc++] Merge is_{,un}bounded_array.h into is_array.h (#167479)
These headers are incredibly simple and closely related, so this merges
them into a single one.
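As a rough illustration of why these traits fit comfortably in one header, here is a minimal sketch of the two array traits. The `sketch` namespace and the trait definitions are hypothetical stand-ins written for this example, not the libc++ implementation.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-ins for illustration only; not libc++'s code.
namespace sketch {

// True only for arrays with a known extent, e.g. int[3].
template <class T>
struct is_bounded_array : std::false_type {};
template <class T, std::size_t N>
struct is_bounded_array<T[N]> : std::true_type {};

// True only for arrays with an unknown extent, e.g. int[].
template <class T>
struct is_unbounded_array : std::false_type {};
template <class T>
struct is_unbounded_array<T[]> : std::true_type {};

} // namespace sketch

static_assert(sketch::is_bounded_array<int[3]>::value, "bounded");
static_assert(!sketch::is_bounded_array<int[]>::value, "not bounded");
static_assert(sketch::is_unbounded_array<int[]>::value, "unbounded");
static_assert(!sketch::is_unbounded_array<int>::value, "not an array");
// Both kinds are arrays as far as std::is_array is concerned.
static_assert(std::is_array<int[3]>::value && std::is_array<int[]>::value, "arrays");
```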
[libc++] Add an initial modulemap for the test support headers (#162800)
This should improve the time it takes to run the test suite a bit. Right
now there are only a handful of headers in the modulemap because we're
missing a lot of includes in the tests. New headers should be added
there from the start, and we should fill up the modulemap over time
until it contains all the test support headers.
[libc++] Implement our own is{,x}digit functions for the C locale (#165467)
The C locale is defined by the C standard, so we know exactly which
characters classify as (x)digits. Instead of going through the locale
base API, we can simply implement the classification functions
ourselves, which probably also improves codegen significantly.
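A minimal sketch of what such C-locale classifiers could look like follows. The names `sketch::is_digit` and `sketch::is_xdigit` are hypothetical and not the functions added by the patch, and the hex-letter range checks assume an ASCII-compatible execution character set.

```cpp
// Hypothetical helpers for illustration only; not libc++'s code.
namespace sketch {

// The C standard guarantees that '0'..'9' are contiguous.
constexpr bool is_digit(char c) { return c >= '0' && c <= '9'; }

// Assumption: 'a'..'f' and 'A'..'F' are contiguous, as in ASCII.
constexpr bool is_xdigit(char c) {
  return is_digit(c) || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
}

} // namespace sketch

static_assert(sketch::is_digit('7'), "digit");
static_assert(!sketch::is_digit('a'), "not a decimal digit");
static_assert(sketch::is_xdigit('a') && sketch::is_xdigit('F'), "hex digits");
static_assert(!sketch::is_xdigit('g'), "not a hex digit");
```

Because the checks are plain range comparisons with no locale lookup, a compiler can inline and fold them.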
Revert "[compiler-rt] [builtins] Remove unused/misnamed x86 chkstk functions"
This reverts parts of commit 885d7b759b5c166c07c07f4c58c6e0ba110fb0c2,
and adds verbose comments explaining all the variants of this
function, for clarity for future readers.
It turns out that those functions actually weren't misnamed or
unused after all: Apparently Clang doesn't match GCC when it comes
to what stack probe function is referenced on i386 mingw. GCC < 4.6
references a symbol named "___chkstk", with three leading underscores,
and GCC >= 4.6 references "___chkstk_ms".
Restore these functions, to allow linking object files built with
GCC with compiler-rt.
[clang-tidy][docs][NFC] Enforce 80 characters limit (1/N) (#167492)
Fix documentation in `abseil`, `android`, `altera`, `boost` and
`bugprone`.
This is part of the codebase cleanup described in
[#167098](https://github.com/llvm/llvm-project/issues/167098)
[CI] Fix misspelled runtimes_targets variable (#167696)
This was preventing check-compiler-rt from actually running when we
touched a project that was supposed to cause compiler-rt to be tested.
[LoongArch] Support memcmp expansion for vectors and combine for i128/i256 setcc
This commit enables memcmp expansion for lsx/lasx. As a result,
i128 and i256 loads are generated, which are illegal types on
LoongArch. Without further processing, they would be split into
legal scalar types.
So this commit also enables a combine for `setcc` that bitcasts
i128/i256 types to vector types before type legalization, so that
vector instructions are generated.
Inspired by x86 and riscv.
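For readers unfamiliar with memcmp expansion, the following source-level sketch shows the idea for a 16-byte equality check. It is only an analogy: the function names are hypothetical, and the second function models the scalar split described above (two 64-bit halves), which the new `setcc` combine is meant to replace with a single vector compare.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration only; not code from the patch.
namespace sketch {

// The library call that the expansion replaces for a fixed small size.
inline bool equal16_memcmp(const void* a, const void* b) {
  return std::memcmp(a, b, 16) == 0;
}

// Conceptually an i128 equality compare, split into two 64-bit halves.
inline bool equal16_wide(const void* a, const void* b) {
  const unsigned char* pa = static_cast<const unsigned char*>(a);
  const unsigned char* pb = static_cast<const unsigned char*>(b);
  std::uint64_t a_lo, a_hi, b_lo, b_hi;
  std::memcpy(&a_lo, pa, 8);      // low 8 bytes of each buffer
  std::memcpy(&b_lo, pb, 8);
  std::memcpy(&a_hi, pa + 8, 8);  // high 8 bytes of each buffer
  std::memcpy(&b_hi, pb + 8, 8);
  // All bits equal exactly when the OR of the XORs is zero.
  return ((a_lo ^ b_lo) | (a_hi ^ b_hi)) == 0;
}

// Small self-check showing that both forms agree.
inline void demo() {
  unsigned char x[16] = {0};
  unsigned char y[16] = {0};
  assert(equal16_memcmp(x, y) && equal16_wide(x, y));
  y[9] = 1;
  assert(!equal16_memcmp(x, y) && !equal16_wide(x, y));
}

} // namespace sketch
```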
[RISCV][GISel] Fallback to SelectionDAG for vleff intrinsics. (#167776)
Supporting this in GISel requires multiple changes to IRTranslator to
support aggregate returns containing scalable vectors and non-scalable
types. Falling back is the quickest way to fix the crash.
Fixes #167618
AMDGPU: Really use AV classes by default for vector classes
Update getRegClassFor to use AV classes in place of VGPRs for
gfx90a-gfx950. There are a handful of regressions. Most come from
enabling unprofitable rematerialization, which reduces the register
count by 1 but adds an unnecessary instruction.
[AMDGPU] Insert `s_wait_xcnt(0)` before atomics to work around write-combining miss hazard
This patch adds a workaround for a hazard on GFX1250, which inserts an `s_wait_xcnt(0)` instruction before any atomic operation that might write to memory.
Fixes SWDEV-543703.
AMDGPU: Start to use AV classes for unknown vector class
Use AGPR+VGPR superclasses for gfx90a+. The type used
for the class should be the broadest possible class, to
be contextually restricted later. InstrEmitter clamps these
to the common subclass of the context use instructions, so we're
best off using the broadest possible class for all types.
Note this does very little because we only use VGPR classes
for FP types (though this doesn't particularly make any sense),
and we legalize normal loads and stores to integer.
[clang-doc] lift Mustache template generation from HTML
To prepare for more backends to use Mustache templates, this patch lifts
the Mustache functionality from HTMLMustacheGenerator.cpp to
Generators.h. A MustacheGenerator interface is created to share code for
template creation.
[mlir][linalg] Fix Linalg runtime verification test (#167814)
This integration test has been broken for a while. This commit partially
fixes it.
- Use `CHECK` + `CHECK-NEXT` to ensure that the correct error lines are
matched together.
- Move all `CHECK-NOT` to the end. Having a `CHECK` with the same string
does not make sense after a `CHECK-NOT`.
- Add a missing `CHECK: ERROR` for one of the test cases.
- Deactivate `reverse_from_3`, which is broken, and put a TODO.