[MLIR][OpenMP] Add OpenMPToLLVMIRTranslation support for is_device_ptr
This PR adds support for the OpenMP is_device_ptr clause in the MLIR to LLVM IR translation for target regions. The is_device_ptr clause allows device pointers (allocated via OpenMP runtime APIs) to be used directly in target regions without implicit mapping.
[NFC] [DirectX] Update DirectX codegen test `CBufferAccess/gep-ce-two-uses.ll` due to changes to ReplaceConstant (#169848)
Fixes an LLVM DirectX codegen test after it broke due to #169141
The CBuffer loads and GEPs are no longer duplicated when there are two
or more accesses within the same basic block.
This PR removes the duplicate check for CBuffer load and GEP from the
original test function `@f` and adds a new test function `@g` which
places duplicate CBuffer loads into separate basic blocks.
[lld][WebAssembly] Fix SEGFAULT when importing wrapped symbol (#169656)
When wrapping a symbol `foo` via `-wrap=foo`, we create the symbol
`__wrap_foo` that replaces all mentions of `foo`. This feature was
implemented for wasm-ld in commit a5ca34e.
So far, no valid signature has been attached to the undefined symbol,
leading to a nullptr dereference in the logic for creating the import
section. This change adds the correct signature to the wrapped symbol,
enabling the generation of an import for it.
[AArch64] Use SVE for fixed-length bf16 operations with +sve-b16b16 (#169329)
This can avoid the promotion bf16 -> f32 -> bf16 round trip (or costly
expansions).
Revert "[ShrinkWrap] Modify shrink wrapping to accommodate functions terminated by no-return blocks" (#169852)
Reverts llvm/llvm-project#167548
As commented at
https://github.com/llvm/llvm-project/pull/167548#issuecomment-3587008602
this is causing miscompiles in two-stage RISC-V Clang/LLVM builds that
result in test failures on the builders.
[clang:ast] Avoid warning for unused var without assertions. (NFC) (#169822)
This PR avoids a compiler warning, which turns into an error with
`-Werror`, for a variable introduced in #169276 and only used in an
assertion (which is, thus, unused if compiled without assertions).
Signed-off-by: Ingo Müller <ingomueller at google.com>
Co-authored-by: Simon Pilgrim <llvm-dev at redking.me.uk>
[ubsan] Change "Type mismatch in operation" trap reason to "Alignment, null, or object-size error" (#169752)
I originally proposed this rewording when trap reasons were introduced
in
https://github.com/llvm/llvm-project/pull/145967#discussion_r2196212344.
This was not adopted because there was a counter-proposal to split the
enum; however, that work appears to have stalled
(https://github.com/llvm/llvm-project/pull/151243). In the meantime,
there has been an additional datapoint that the current wording is
confusing to users. Thus, let's reword it now to prevent further
confusion.
[AArch64] recognise zip1/zip2 with flipped operands (#167235)
Currently, the following two snippets get treated very differently from
each other (https://godbolt.org/z/rYGj9TGz6):
```LLVM
define <8 x i8> @foo(<8 x i8> %x, <8 x i8> %y) local_unnamed_addr #0 {
entry:
%0 = shufflevector <8 x i8> %x, <8 x i8> %y, <8 x i32>
<i32 0, i32 8, i32 1, i32 9, i32 2, i32 10, i32 3, i32 11>
ret <8 x i8> %0
}
define <8 x i8> @bar(<8 x i8> %x, <8 x i8> %y) local_unnamed_addr #0 {
entry:
%0 = shufflevector <8 x i8> %x, <8 x i8> %y, <8 x i32>
<i32 8, i32 0, i32 9, i32 1, i32 10, i32 2, i32 11, i32 3>
ret <8 x i8> %0
}
```
[39 lines not shown]
[libc][darwin] add syscall numbers from macos sdk (#166354)
This PR adds support to include syscall.h from MacOS sdk by explicitly including the path to the sdk via `xcrun`.
[BOLT][BTI] Skip inlining BasicBlocks containing indirect tailcalls (#168403)
In the Inliner pass, tailcalls are converted to calls in the inlined
BasicBlock. If the tailcall is indirect, the `BR` is converted to `BLR`.
These instructions require different BTI landing pads at their targets.
As the targets of indirect tailcalls are unknown, inlining such blocks
is unsound for BTI: they should be skipped instead.
[MLIR][SCF] Sink scf.if from scf.while before region into after region in scf-uplift-while-to-for (#165216)
When a `scf.if` directly precedes an `scf.condition` in the before
region of an `scf.while` and both share the same condition, move the if
into the after region of the loop. This helps simplify the control flow
to enable uplifting `scf.while` to `scf.for`.
[HIP][AMDGPU] Remove 't' from all __builtin_*_load_lds builtins
Allows for type checking depending on the builtin signature.
stack-info: PR: https://github.com/llvm/llvm-project/pull/165389, branch: users/jmmartinez/fix/load_lds_typesignature/3
[mlir:bazel] Fix build broken by #169670. (#169804)
This PR adds a dependency to the `BUILD` files overlay silently added by
#169670.
Signed-off-by: Ingo Müller <ingomueller at google.com>
[SPIRV][AMD] Disable SPV_KHR_float_control2 for AMD flavored SPIRV (#169659)
AMD uses the translator to recover LLVM-IR from SPIRV.
Currently, the translator doesn't implement the
`SPV_KHR_float_controls2` extension (I'm working on it).
If this extension is used by the SPIRV module, we cannot translate it
back to LLVM-IR.
I'm working on the extension, but in the meantime, lets just disable it
when the target triple's vendor is `amd`.