[JITLink][COFF] Synthesize __imp_ IAT entries (#203906)
Adds a default COFF/x86_64 JITLink pass that synthesizes `__imp_` Import
Address Table (IAT) entries for dllimport references. This allows COFF
objects using dllimport to be JIT-linked without a hand-built import library or
a special generator.
On COFF, `__declspec(dllimport)` codegen emits indirect accesses through a named
`__imp_X` symbol (`callq *__imp_bar(%rip)`; `movq __imp_g(%rip)` for data),
with `__imp_X` left undefined. JITLink had no handling for this. The new pass —
the COFF counterpart of the ELF/Mach-O GOT builder — defines each undefined
external `__imp_X` over an 8-byte slot holding the address of `X`, and leaves `X`
as an ordinary external to be resolved normally (import library, dynamic-library
search generator, etc.). Both the call and data-access forms then resolve
indirectly through the slot.
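As a rough illustration of the slot-defining step described above, the following self-contained C++ sketch models it on plain symbol names. It does not use the JITLink API: `ImportPlan` and `planImports` are hypothetical names invented for this example, and a "slot" is represented only by the name of the symbol whose address it would hold.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical model, not the JITLink API.
struct ImportPlan {
  // "__imp_X" -> "X": the slot defined for __imp_X would hold the address of X.
  std::map<std::string, std::string> Slots;
  // Symbols that remain ordinary externals, to be resolved normally.
  std::set<std::string> Externals;
};

// For every undefined "__imp_X", plan a slot and keep X as an external.
// Every other undefined symbol is passed through unchanged.
ImportPlan planImports(const std::set<std::string> &Undefined) {
  const std::string Prefix = "__imp_";
  ImportPlan Plan;
  for (const std::string &Name : Undefined) {
    bool IsImp = Name.size() > Prefix.size() &&
                 Name.compare(0, Prefix.size(), Prefix) == 0;
    if (IsImp) {
      std::string Target = Name.substr(Prefix.size());
      Plan.Slots[Name] = Target;     // __imp_X is now defined over a slot
      Plan.Externals.insert(Target); // X itself stays external
    } else {
      Plan.Externals.insert(Name);
    }
  }
  return Plan;
}
```

In this model, `__imp_bar` stops being an undefined symbol once its slot is planned, while `bar` is still left for the usual resolution path.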
Rather than the `GOTTableManager` pattern (anonymous entry + edge redirection),
the pass defines the *named* `__imp_X` symbol over the slot. ELF GOT references
are nameless edge kinds, so that builder must create an anonymous entry and
[14 lines not shown]
CodeGenPassBuilder: Use cl::boolOrDefault directly in CGPassBuilderOption (#204196)
The current implementation, which uses std::optional<bool>, captures
cl::BOU_FALSE (for example -global-isel=0) as true. Explicitly setting an
option to 0 should mean false, that is, the option is forced off.
This could be fixed in place, but I find it cleaner to use boolOrDefault
directly, with the same logic as in TargetPassConfig.
The EnableIPRA and EnableGlobalISelAbort options are left as std::optional,
since whether they are set is checked explicitly using getNumOccurrences.
boolOrDefault already encodes the unset state.
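A minimal sketch of why the conversion goes wrong, assuming a local enum that mirrors the enumerator order of `cl::boolOrDefault` (`BOU_UNSET`, `BOU_TRUE`, `BOU_FALSE`). The helpers `toOptionalBuggy` and `isEnabled` are hypothetical and only illustrate the two approaches; they are not the code in `CGPassBuilderOption` or `TargetPassConfig`.

```cpp
#include <optional>

// Local stand-in with the same enumerator order as cl::boolOrDefault.
enum BoolOrDefault { BOU_UNSET, BOU_TRUE, BOU_FALSE };

// Hypothetical model of the problem: the enum converts implicitly to bool,
// and BOU_FALSE has a nonzero value, so it is stored as true.
std::optional<bool> toOptionalBuggy(BoolOrDefault V) {
  bool B = V; // implicit enum -> bool conversion: any nonzero value is true
  return B;
}

// Hypothetical model of the alternative: keep the tri-state value and
// decide explicitly, falling back to a default when the option is unset.
bool isEnabled(BoolOrDefault V, bool Default) {
  switch (V) {
  case BOU_TRUE:
    return true;
  case BOU_FALSE:
    return false;
  case BOU_UNSET:
    return Default;
  }
  return Default;
}
```

With these definitions, `toOptionalBuggy(BOU_FALSE)` holds `true`, whereas `isEnabled(BOU_FALSE, true)` returns `false`.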
[libc++][test] Migrate _BitInt probe to __BITINT_MAXWIDTH__ and fix latent test bugs (#203876)
`libcxx` tests gate `_BitInt` blocks on `TEST_HAS_EXTENSION(bit_int)`,
which is not a recognized Clang extension and returns 0 in every
language mode. The blocks have been compiling as dead code, hiding
latent bugs across 23 files.
Migrate to a `TEST_HAS_BITINT` helper backed by the standard
`__BITINT_MAXWIDTH__`. The latent bugs the activation surfaces are fixed
in the same commit:
- overflow-safe `min`;
- post-P4052R0 saturating-arithmetic renames plus a
`clang-21`/`apple-clang-21` skip for `saturating.bitint.pass.cpp` (Clang
21 asserts in constexpr eval on non-byte-aligned `_BitInt`);
- an `intcmp` syntax fix;
- `byteswap.verify` directive tightening;
- a missing `<climits>` include in `byteswap.pass` (only visible under
`-fmodules`);
- C++03-compatible `static_assert` form in `digits10`; gating
[13 lines not shown]
[clang][bytecode] Check const writes more thoroughly (#204529)
We used to only have a list of blocks under construction, but now we
have a list of pointers, which gives us more information.
Use this new list to diagnose a case we couldn't previously diagnose.
The test case is from `constant-expression-cxx14.cpp` and shows that a
write to a const member is invalid, even if the parent object is being
constructed right now.
[AMDGPU][doc] Refactor Barrier Execution Model
Remove everything that has to do with named barriers and put it in a series of model extensions specific to /sbarrier/named-barriers.
I had to change a few things to make it fit, in summary:
Base Model:
* Stylistic changes that make it easier to refer to specific rules. Each rule is in a rubric instead of a bullet point.
* (-) No longer defines `barrier-mutually-exclusive`
* (-) No longer defines barrier `join` and any associated rule.
New named barrier extensions
* Define "named barrier" as a sub-type of barrier objects. This makes barrier-mutually-exclusive redundant.
* Define barrier join as an op that can exclusively be done on `named barrier objects`.
* Define rules relating to join and its ordering with other barrier operations
Following these changes, the target tables changed a bit as well.
[2 lines not shown]
[AArch64] Add SVE shuffle optimization pass (#193951)
Add a pass to perform VLA shuffle optimizations for SVE.
First up is using tbl to replace deinterleave4+uunpk+zext/uitofp
by generating shuffle masks with index, exploiting the fact that
out-of-range indices in the mask produce zeroes in the result
vector. That way, we can easily zero-extend smaller elements
by using the destination type when generating the mask, and
having one index in range with several out-of-range for each
destination element.
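The masking idea can be sketched in plain C++ as a software model, without SVE intrinsics. This is an illustration only: `tblBytes`, `makeZextMask` and `deinterleaveZext` are hypothetical names, the choice of `0xFF` as the out-of-range index is an assumption of this sketch, and the fixed 8-bit to 32-bit widening with little-endian byte order is just one instance of the pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Byte-wise table lookup model: an out-of-range index produces zero.
std::vector<uint8_t> tblBytes(const std::vector<uint8_t> &Src,
                              const std::vector<uint8_t> &Mask) {
  std::vector<uint8_t> Out(Mask.size());
  for (std::size_t I = 0; I < Mask.size(); ++I)
    Out[I] = Mask[I] < Src.size() ? Src[Mask[I]] : 0;
  return Out;
}

// Per 32-bit destination element: one in-range index in the low byte and
// three out-of-range indices (0xFF, assumed out of range) in the upper bytes.
std::vector<uint8_t> makeZextMask(std::size_t NumDstElts, unsigned Lane) {
  std::vector<uint8_t> Mask(NumDstElts * 4, 0xFF);
  for (std::size_t I = 0; I < NumDstElts; ++I)
    Mask[I * 4] = static_cast<uint8_t>(I * 4 + Lane);
  return Mask;
}

// Select every 4th byte starting at Lane and zero-extend it to 32 bits,
// using a single table lookup.
std::vector<uint32_t> deinterleaveZext(const std::vector<uint8_t> &Src,
                                       unsigned Lane) {
  assert(Src.size() % 4 == 0 && Src.size() < 0xFF && Lane < 4);
  std::size_t N = Src.size() / 4;
  std::vector<uint8_t> Bytes = tblBytes(Src, makeZextMask(N, Lane));
  std::vector<uint32_t> Out(N);
  for (std::size_t I = 0; I < N; ++I)
    Out[I] = static_cast<uint32_t>(Bytes[I * 4]) |
             static_cast<uint32_t>(Bytes[I * 4 + 1]) << 8 |
             static_cast<uint32_t>(Bytes[I * 4 + 2]) << 16 |
             static_cast<uint32_t>(Bytes[I * 4 + 3]) << 24;
  return Out;
}
```

The upper three bytes of each result element come from out-of-range indices, so they are zero and the selected byte ends up zero-extended without a separate unpack step.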
[Delinearization] Narrow the scope of the term collection (#204145)
Parametric delinearization collects subexpressions whose SCEV
type is `SCEVUnknown` and uses them as candidates for the array
dimensions. When traversing these subexpressions, it may follow any kind
of expression. For example, if it follows a `sext` expression, this can
lead to type inconsistencies among the collected terms.
This patch fixes this issue by preventing traversal into subexpressions
other than `SCEVAddExpr` or `SCEVAddRecExpr`.
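The restricted traversal can be sketched on a toy expression tree. This is not the ScalarEvolution API: `Kind`, `Expr` and `collectTerms` are hypothetical names for this example, and the sketch follows only what the description above states (collect unknowns, descend only through add and add-recurrence nodes).

```cpp
#include <memory>
#include <string>
#include <vector>

// Toy stand-ins for a few SCEV expression kinds.
enum class Kind { Unknown, Add, AddRec, Mul, SExt };

struct Expr {
  Kind K;
  std::string Name; // only meaningful for Kind::Unknown
  std::vector<std::shared_ptr<Expr>> Ops;
};

// Collect the names of unknowns reachable through Add/AddRec nodes only.
// Any other node kind (for example SExt) stops the traversal, so terms
// hidden behind a type-changing expression are never collected.
void collectTerms(const Expr &E, std::vector<std::string> &Terms) {
  if (E.K == Kind::Unknown) {
    Terms.push_back(E.Name);
    return;
  }
  if (E.K != Kind::Add && E.K != Kind::AddRec)
    return;
  for (const std::shared_ptr<Expr> &Op : E.Ops)
    collectTerms(*Op, Terms);
}
```

For an expression such as `n + sext(m)`, this model collects only `n`, because the walk does not enter the `sext` operand.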
Note: I tried to minimize the test case, but this seems to be as far as
it can go.
Fix #204066.
[mlir][ExecutionEngine] Fix dead -Wno-c++98-compat-extra-semi guard (#204524)
`check_cxx_compiler_flag` stores its result in
`CXX_SUPPORTS_NO_CXX98_COMPAT_EXTRA_SEMI_FLAG`, but the guarding `if()`
checked `CXX_SUPPORTS_CXX98_COMPAT_EXTRA_SEMI_FLAG` (without `_NO_`),
which is never set. The condition was therefore always false and the
`-Wno-c++98-compat-extra-semi` suppression for `mlir_rocm_runtime` was
never applied.
The sibling flag checks in the same block (`-Wno-return-type-c-linkage`,
`-Wno-nested-anon-types`, `-Wno-gnu-anonymous-struct`) already use
matching variable names, so this aligns the typo'd guard with the
established pattern.
No test is included: this is a build-system-only (CMake) change to a
warning-suppression guard and is not unit-testable.
Signed-off-by: bogdan-petkovic <bpetkovi at amd.com>
[SPIR-V] Fix crash on void indirect call with aggregate argument (#204388)
removeAggregateTypesFromCalls named the call to key the type-restoration
metadata, which asserts for void-returning calls. Key the metadata via
instruction metadata on the call instead, which works for void results.
[Clang][NEON ACLE] Remove +bf16 requirement from opaque bfloat builtins. (#204201)
Builtins that only care about the size of the element type but not its
format (e.g. loads, stores and shuffles) do not require any special
instructions for code generation beyond those already available to +neon.
Fixes https://github.com/llvm/llvm-project/issues/203159
[AArch64] Combine undef UZP and NVCAST away.
These are used to lower insert_subvec nodes quite early in SDAG. After
DAG combines run, it's possible that the inputs to these AArch64 nodes
become UNDEF.
[AArch64][SDAG] Legalise nxv1 gather/scatter nodes (#204620)
This updates WidenVecRes_MGATHER and WidenVecOp_MSCATTER to support
scalable vector types.
[SPIR-V] Legalize G_PHI of oversized vectors via fewer-elements (#203993)
`G_PHI` on vectors wider than the SPIR-V max vector size previously
failed legalization. This PR adds a `fewerElementsIf` rule that splits
them down to `MaxVectorSize`, matching how other vector ops are handled
in `SPIRVLegalizerInfo.cpp`.
Adds the test
`llvm/test/CodeGen/SPIRV/instructions/phi-large-vector.ll`, covering
spirv32 and spirv64.