[AArch64] recognise zip1/zip2 with flipped operands (#167235)
Currently, the following two snippets get treated very differently from
each other (https://godbolt.org/z/rYGj9TGz6):
```LLVM
define <8 x i8> @foo(<8 x i8> %x, <8 x i8> %y) local_unnamed_addr #0 {
entry:
%0 = shufflevector <8 x i8> %x, <8 x i8> %y, <8 x i32>
<i32 0, i32 8, i32 1, i32 9, i32 2, i32 10, i32 3, i32 11>
ret <8 x i8> %0
}
define <8 x i8> @bar(<8 x i8> %x, <8 x i8> %y) local_unnamed_addr #0 {
entry:
%0 = shufflevector <8 x i8> %x, <8 x i8> %y, <8 x i32>
<i32 8, i32 0, i32 9, i32 1, i32 10, i32 2, i32 11, i32 3>
ret <8 x i8> %0
}
```
[39 lines not shown]
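As a rough illustration of the two masks above, here is a minimal Python model of the pattern being recognised. It is not the AArch64 backend code: the function name `classify_zip` and its return shape are hypothetical, and it only assumes that `zip1` interleaves the low halves of its operands and `zip2` the high halves.

```python
def classify_zip(mask, num_elts):
    """Classify a two-input shuffle mask as zip1/zip2.

    Returns (name, swapped), where swapped is True when the mask matches
    the instruction only after exchanging the two operands, or None if the
    mask is not a zip pattern. Hypothetical helper for illustration only.
    """
    half = num_elts // 2
    for name, base in (("zip1", 0), ("zip2", half)):
        straight = []
        flipped = []
        for i in range(half):
            # Indices < num_elts pick from the first operand,
            # indices >= num_elts pick from the second operand.
            straight += [base + i, num_elts + base + i]
            flipped += [num_elts + base + i, base + i]
        if mask == straight:
            return (name, False)
        if mask == flipped:
            return (name, True)
    return None


foo_mask = [0, 8, 1, 9, 2, 10, 3, 11]
bar_mask = [8, 0, 9, 1, 10, 2, 11, 3]
print(classify_zip(foo_mask, 8))
print(classify_zip(bar_mask, 8))
```

In this model the `@bar` mask is the `@foo` mask with the two operands exchanged, so both map to the same instruction.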
[libc][darwin] add syscall numbers from macos sdk (#166354)
This PR adds support for including `syscall.h` from the macOS SDK by explicitly including the path to the SDK via `xcrun`.
[BOLT][BTI] Skip inlining BasicBlocks containing indirect tailcalls (#168403)
In the Inliner pass, tailcalls are converted to calls in the inlined
BasicBlock. If the tailcall is indirect, the `BR` is converted to `BLR`.
These instructions require different BTI landing pads at their targets.
As the targets of indirect tailcalls are unknown, inlining such blocks
is unsound for BTI: they should be skipped instead.
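The skip rule can be pictured with a small Python sketch. It does not use BOLT's API: the `Inst` class, its fields, and both helper functions are hypothetical, and the model only captures "a block containing a tail call through `BR` is not an inlining candidate".

```python
from dataclasses import dataclass


@dataclass
class Inst:
    # Hypothetical, simplified instruction record (not a BOLT type).
    opcode: str
    is_tail_call: bool = False


def contains_indirect_tail_call(block):
    """True if any instruction in the block is a tail call through BR."""
    return any(inst.is_tail_call and inst.opcode == "BR" for inst in block)


def inlinable_blocks(blocks):
    """Keep only the blocks that are safe to inline under this rule."""
    return [b for b in blocks if not contains_indirect_tail_call(b)]


direct = [Inst("ADD"), Inst("B", is_tail_call=True)]
indirect = [Inst("LDR"), Inst("BR", is_tail_call=True)]
print(len(inlinable_blocks([direct, indirect])))
```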
[MLIR][SCF] Sink scf.if from scf.while before region into after region in scf-uplift-while-to-for (#165216)
When a `scf.if` directly precedes an `scf.condition` in the before
region of an `scf.while` and both share the same condition, move the if
into the after region of the loop. This helps simplify the control flow
to enable uplifting `scf.while` to `scf.for`.
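The effect of the rewrite can be modelled in plain Python. This is only a sketch of the control-flow shape, not MLIR and not the pass itself: both function names are hypothetical, and the exact legality conditions the pass checks are not modelled.

```python
def while_with_if_in_before(start, limit):
    # Models an scf.while whose before region contains an scf.if guarded
    # by the same value that is then passed to scf.condition.
    i = start
    visited = []
    while True:
        cond = i < limit
        if cond:                 # the scf.if in the before region
            visited.append(i)
            nxt = i + 1
        else:
            nxt = i
        if not cond:             # scf.condition on the same value
            return nxt, visited
        i = nxt                  # the after region only forwards the value


def while_with_if_in_after(start, limit):
    # Same loop once the body of the if has been moved into the after
    # region; the before region now only computes the condition.
    i = start
    visited = []
    while True:
        cond = i < limit
        if not cond:
            return i, visited
        visited.append(i)        # former "then" branch, now in the after region
        i = i + 1


print(while_with_if_in_before(0, 3) == while_with_if_in_after(0, 3))
```

The second form has the familiar "test, then body with induction update" shape, which is closer to what a counted loop looks like.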
[HIP][AMDGPU] Remove 't' from all __builtin_*_load_lds builtins
Allows for type checking depending on the builtin signature.
stack-info: PR: https://github.com/llvm/llvm-project/pull/165389, branch: users/jmmartinez/fix/load_lds_typesignature/3
[mlir:bazel] Fix build broken by #169670. (#169804)
This PR adds to the `BUILD` files overlay a dependency that was silently
added by #169670.
Signed-off-by: Ingo Müller <ingomueller at google.com>
[SPIRV][AMD] Disable SPV_KHR_float_control2 for AMD flavored SPIRV (#169659)
AMD uses the translator to recover LLVM-IR from SPIRV.
Currently, the translator doesn't implement the
`SPV_KHR_float_controls2` extension, so if this extension is used by the
SPIRV module, we cannot translate it back to LLVM-IR.
I'm working on the extension, but in the meantime, let's just disable it
when the target triple's vendor is `amd`.
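A minimal sketch of the vendor check, in Python rather than the backend's C++. The helper names and the example triples are assumptions made for illustration; the only thing taken from the description above is that the extension is dropped when the triple's vendor component is `amd`.

```python
def vendor_of(triple):
    """Return the vendor component of an arch-vendor-os style triple."""
    parts = triple.split("-")
    return parts[1] if len(parts) > 1 else ""


def enabled_extensions(triple, requested):
    """Filter out extensions that are disabled for the given triple."""
    blocked = set()
    if vendor_of(triple) == "amd":
        # The translator cannot yet turn this extension back into LLVM-IR.
        blocked.add("SPV_KHR_float_controls2")
    return [ext for ext in requested if ext not in blocked]


requested = ["SPV_KHR_float_controls2", "SPV_KHR_bit_instructions"]
print(enabled_extensions("spirv64-amd-amdhsa", requested))
print(enabled_extensions("spirv64-unknown-unknown", requested))
```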
[lldb] Fix CxxMethodName Parser on return type (#169652)
The simplified parser incorrectly assumes that if there is a context,
there is no return type.
Fixed the case where functions have both a context and a return type.
For example,
`int foo::bar::func()`
`Type<int> foo::bar::func()`
Also fixed the case where there is no space between the return type and
the context.
`std::vector<int>foo::bar()`
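The three examples above can be split with a small Python sketch. This is not lldb's parser: `split_method_name` is a hypothetical helper that ignores operators, function pointers, and other harder cases, and only shows how a return type and a context can coexist, with or without a separating space.

```python
def split_method_name(name):
    """Split a simple C++ method name into (return_type, context, basename)."""
    paren = name.find("(")
    head = name if paren == -1 else name[:paren]

    depth = 0   # nesting depth of template angle brackets
    split = 0   # index where the qualified name starts
    for i, ch in enumerate(head):
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
            nxt = head[i + 1] if i + 1 < len(head) else ""
            # A closing '>' directly followed by an identifier character
            # ends the return type even though there is no space.
            if depth == 0 and (nxt.isalnum() or nxt == "_"):
                split = i + 1
        elif ch == " " and depth == 0:
            split = i + 1

    return_type = head[:split].strip()
    context, _, basename = head[split:].rpartition("::")
    return return_type, context, basename


for example in ("int foo::bar::func()",
                "Type<int> foo::bar::func()",
                "std::vector<int>foo::bar()"):
    print(split_method_name(example))
```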
[DA][Delinearization] Move validation logic into Delinearization (#169047)
This patch moves the validation logic of delinearization results from DA
to Delinearization. It also calls the validation in `printDelinearization`
to test its behavior. The motivation is as follows:
- Almost the same code exists in `tryDelinearizeFixedSize` and
`tryDelinearizeParametricSize`. Consolidating it in Delinearization
avoids code duplication.
- Currently this validation logic is not well tested. Moving it to
Delinearization allows us to write regression tests easily.
This patch changes the test outputs and debug messages, but otherwise
NFCI.
[LoopUnroll] Introduce parallel accumulators when unrolling FP reductions. (#166630)
This builds on top of
https://github.com/llvm/llvm-project/pull/149470 and also introduces
parallel accumulator PHIs when the reduction is a floating-point
reduction, provided we have the reassoc flag. See also
https://github.com/llvm/llvm-project/pull/166353, which aims to
introduce parallel accumulators for reductions with vector instructions.
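To see why the reassoc flag matters, here is a small Python sketch of a sum unrolled by two with separate accumulators. It is a source-level analogy, not the IR the pass produces, and both function names are made up for the example.

```python
def sum_single(xs):
    # One accumulator: every add depends on the previous one.
    acc = 0.0
    for x in xs:
        acc += x
    return acc


def sum_two_accumulators(xs):
    # Unrolled by two with independent accumulators, combined at the end.
    acc0 = 0.0
    acc1 = 0.0
    even_len = len(xs) - len(xs) % 2
    for i in range(0, even_len, 2):
        acc0 += xs[i]
        acc1 += xs[i + 1]
    for x in xs[even_len:]:   # remainder iteration, if any
        acc0 += x
    return acc0 + acc1


small = [1.0, 2.0, 3.0, 4.0, 5.0]
print(sum_single(small), sum_two_accumulators(small))        # 15.0 15.0

tricky = [1e16, 1.0, -1e16, 1.0]
print(sum_single(tricky), sum_two_accumulators(tricky))      # 1.0 2.0
```

The two accumulators shorten the dependency chain, but they change the order in which values are added. Floating-point addition is not associative, so the results can differ, which is why the transformation is only legal when reassociation is permitted.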
[libc++] Merge the implementations of ranges::copy_n and std::copy_n and fix vector::insert to assign (#157444)
This slightly reduces the amount of code we have to maintain.
This also simplifies `vector` by using the internal API instead of
`#if`s to switch based on language dialect.
[libc++][C++03] Remove code in the C++03-specific tests that is guarded on the language version (#169354)
This is dead code, since `test/libcxx-03` is only ever executed with
`-std=c++03`.