[AArch64][GlobalISel] Legalize F64 to BF16 fptruncates (#196077)
The two-step expansion of a bf16 fptrunc must be careful to avoid
double-rounding error. On AArch64 we can first convert with fcvtxn,
which performs round-to-odd, and then do a standard fp truncate to bf16
so that the final rounding is correct. This reuses the existing lowering
added for vector operations.
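
As a rough illustration of why the intermediate step rounds to odd, the
sketch below models the two conversions in software. It is not the
GlobalISel lowering itself: the helper names are hypothetical, NaN
inputs to the bf16 step are not handled, and the host is assumed to use
IEEE-754 binary32/binary64 with the default round-to-nearest-even mode.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: truncate an f32 to bf16 bits with
// round-to-nearest-even. NaN payloads are not handled in this sketch.
static uint16_t f32ToBF16Bits(float F) {
  uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  Bits += 0x7FFFu + ((Bits >> 16) & 1u); // ties go to the even bf16 value
  return static_cast<uint16_t>(Bits >> 16);
}

// Hypothetical software model of a round-to-odd f64 -> f32 conversion:
// truncate toward zero and, if the result is inexact, force the lowest
// mantissa bit to 1 so the later rounding step can still see the excess.
static float f64ToF32RoundToOdd(double D) {
  float F = static_cast<float>(D); // host conversion (nearest-even)
  if (std::isnan(D) || static_cast<double>(F) == D)
    return F; // NaN or exact: nothing to adjust
  uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  if (std::fabs(static_cast<double>(F)) > std::fabs(D))
    --Bits; // the host rounded away from zero; step back toward zero
  Bits |= 1u; // inexact result: make the mantissa odd
  std::memcpy(&F, &Bits, sizeof(F));
  return F;
}

// An input just above the midpoint between the bf16 values 1.0 and
// 1.0 + 2^-7. Naive f64 -> f32 -> bf16 rounding turns it into an exact
// tie and then rounds down; going through round-to-odd keeps it above
// the tie.
static double doubleRoundingExample() {
  return 1.0 + std::ldexp(1.0, -8) + std::ldexp(1.0, -30);
}
```

For this input, `f32ToBF16Bits(static_cast<float>(D))` yields 0x3F80
(1.0), whereas `f32ToBF16Bits(f64ToF32RoundToOdd(D))` yields 0x3F81,
which is the correctly rounded bf16 value.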
[Clang][Modules] Fix -Wunused-variable (#196577)
Mark some variables [[maybe_unused]] and inline others that do not have
side effects to avoid -Wunused-variable in non-assert builds.
[Object][Wasm] Fix off-by-one in data segment name index validation (#196338)
The check `Index > DataSegments.size()` in `parseNameSection()` allows
`Index == DataSegments.size()`, which is an out-of-bounds access.
In an assertions-disabled ASan build, a malformed wasm object with one
data segment and a data segment name entry using index 1 triggers a
heap-buffer-overflow READ in `WasmObjectFile::parseNameSection()`.
Fix by checking `Index >= DataSegments.size()` instead.
Also add a regression test that verifies the malformed input is rejected
with "invalid data segment name entry".
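
The difference between the two comparisons can be sketched in isolation.
The struct and function names below are hypothetical and only model the
bounds check, not the actual `WasmObjectFile` code.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed data segment.
struct SegmentSketch {
  std::string Name;
};

// Old check: rejects only Index > size(), so Index == size() slips
// through and would index one past the end of the vector.
static bool rejectsIndexOld(uint32_t Index,
                            const std::vector<SegmentSketch> &Segments) {
  return Index > Segments.size();
}

// Fixed check: valid indices are 0 .. size() - 1, so anything
// >= size() is rejected.
static bool rejectsIndexFixed(uint32_t Index,
                              const std::vector<SegmentSketch> &Segments) {
  return Index >= Segments.size();
}
```

With a single data segment, index 1 is accepted by the old check and
rejected by the fixed one, while index 0 remains valid under both.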
[libc] Fix op_tests Memcmp guard to require SSE4.1 (#196572)
The is_vector<__m128i> specialisation in op_x86.h is gated on
__SSE4_1__, but op_tests.cpp included generic::Memcmp<__m128i> under the
weaker __SSE2__ guard. On baseline x86-64 (where __SSE2__ is always
defined but __SSE4_1__ may not be), this caused a static_assert failure
in is_element_type_v.
Changed the guard from __SSE2__ to __SSE4_1__ to match the
specialisation requirement, consistent with how BcmpImplementations
already guards its __m128i entry.
Assisted-by: Automated tooling, human reviewed.
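
A minimal sketch of the guard mismatch, using hypothetical names
(`IsVectorSketch`, `MemcmpSketch`) rather than the real libc templates:
the specialisation and every use of it need to sit behind the same
feature macro.

```cpp
#include <type_traits>
#if defined(__SSE2__)
#include <emmintrin.h> // __m128i is available from SSE2 onwards
#endif

// Hypothetical trait; the primary template says "not a vector type".
template <typename T> struct IsVectorSketch : std::false_type {};

#if defined(__SSE4_1__)
// The specialisation exists only when SSE4.1 is enabled.
template <> struct IsVectorSketch<__m128i> : std::true_type {};
#endif

// Hypothetical user of the trait; instantiating it with an unsupported
// type fails the static_assert.
template <typename T> struct MemcmpSketch {
  static_assert(IsVectorSketch<T>::value, "T must be a supported vector");
  static constexpr bool Usable = true;
};

// The use must match the specialisation's guard. Guarding this with
// __SSE2__ instead would trip the static_assert on baseline x86-64.
#if defined(__SSE4_1__)
constexpr bool HasVec128Memcmp = MemcmpSketch<__m128i>::Usable;
#else
constexpr bool HasVec128Memcmp = false;
#endif
```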
[DAG] canCreateUndefOrPoison - ISD::FCEIL/FFLOOR/FTRUNC/FRINT/FNEARBYINT/FROUND/FROUNDEVEN can never create poison/undef (#196543)
Also add missing fold support for ftrunc(fround(x)) -> fround(x)
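
The fold is sound because fround already yields an integral value, so a
following ftrunc has nothing left to remove. A small host-side sketch of
that identity, using `std::round` and `std::trunc` as stand-ins for the
DAG nodes (the function name is hypothetical):

```cpp
#include <cmath>

// Returns true when trunc(round(X)) gives the same value as round(X).
// NaN is treated as matching NaN.
static bool truncOfRoundIsRound(double X) {
  double R = std::round(X); // rounds halfway cases away from zero
  double T = std::trunc(R); // R is already integral (or inf/NaN)
  return std::isnan(R) ? std::isnan(T) : T == R;
}
```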
clang: Consolidate -aux-triple handling
All of the offload languages were essentially doing the same thing,
with overcomplicated conditions that depended on the language.
[AMDGPU] Pre-commit unit test for RP tracking reset/advance behavior
This adds a new AMDGPU unit test file for testing the behavior of
`GCNRPTracker` and its related classes. The two tests showcase confusing
return value and behavioral semantics for variants of the advance and
reset functions, which will be clarified in a follow-up commit.
This also moves some common test helpers from other AMDGPU unit tests to
the `AMDGPUUnitTests` TU to avoid repetition between unit tests.
[CodeGen][AMDGPU] Move boilerplate unit test code to base class (NFC) (#196547)
This adds the `CodeGenTestBase` class to handle boilerplate code for
codegen unit tests and makes use of it wherever possible, in particular
in AMDGPU unit tests.
Furthermore, this makes all AMDGPU unit tests rely on GoogleTest's API
for "run once per test-suite" code, instead of re-implementing that
behavior using a `std::once_flag`. As a consequence, all TEST(...)
become TEST_F(...).
This patch enables the -fexec-charset option to control the execution
charset of string literals. It sets the default internal charset, system
charset, and execution charset for z/OS and UTF-8 for all other
platforms.
[flang][docs] Removed HighLevelFIR transition plan section (#196227)
Removed the "Transition Plan" section from flang/docs/HighLevelFIR.md,
since the transition was completed a long time ago and the legacy
lowering code is now being removed.
[X86] Remove tests for non-existent intrinsics. NFC (#196237)
There is no PSRAQ instruction until AVX512. The incorrect intrinsic
names were just being interpreted as a call to an external function.