[InstCombine] Don't sink if it would require dropping deref assumptions. (#166945)
Currently sinking assumes in instcombine drops assumes if they would
prevent sinking. Removing dereferenceable assumptions earlier on can
inhibit vectorization of early-exit loops in practice.
Special-case deferenceable assumptions so that they block sinking. This
can be combined with a separate change to drop dereferencebale
assumptions after vectorization: https://clang.godbolt.org/z/jGqcx3sbs
PR: https://github.com/llvm/llvm-project/pull/166945
[BOLT] Support restartable sequences in tcmalloc (#167195)
Add `RSeqRewriter` to detect code references from `__rseq_cs` section
and ignore function referenced from that section. Code references are
detected via relocations (static or dynamic).
Note that the abort handler is preceded by a 4-byte signature byte
sequence and we cannot relocate the handler without that the signature,
otherwise the application may crash. Thus we are ignoring the function,
i.e. making sure it's not separated from its signature.
[lldb] Disable TestLocationsAfterRebuild for remote targets (#167239)
#160199 broke buildbots `lldb-remote-linux-ubuntu` and
`lldb-remote-linux-win`.
This patch must make these buildbots green for now.
[AArch64] Decompose faddv with known zero elements
FADDV is matched into FADDPv4f32 + FADDPv2i32p but this can be relaxed
when one element (usually the 4th) or more are known to be zero.
Before:
movi d1, #0000000000000000
mov v0.s[3], v1.s[0]
faddp v0.4s, v0.4s, v0.4s
faddp s0, v0.2s
After:
mov s1, v0.s[2]
faddp s0, v0.2s
fadd s0, s0, s1
[SandboxIR] Remove tight-coupling with LLVM's SwitchInst::CaseHandle (#167093)
SandboxIR's SwitchInst CaseHandle was relying on LLVM IR's
SwitchInst::CaseHandleImpl template, which may call private functions of
SandboxIR's SwitchInst. This creates a dependency cycle which is against
the design principles of Sandbox IR.
The issue was exposed by:
https://github.com/llvm/llvm-project/pull/166842 Thanks to @aengelke for
raising the issue.
[AArch64][GlobalISel] Correct instructions for 64bit fneg constant vectors. (#166537)
This code was assuming that the vectors were 128bit. Add handling for
64bit vectors. Some of the tests do not apply yet due to not matching
non-splat vectors.
Fixes #166400
[mlir] Consolidate two implementations of meet (NFC) (#167208)
This patch consolidates two implementations of meet using
"if constexpr", migrating away from the SFINAE-based approach.
[SPIRV] Add support for `bfloat16` atomics via the `SPV_INTEL_16bit_atomics` extension (#166257)
This enables support for atomic RMW ops (add, sub, min and max to be
precise) with `bfloat16` operands, via the [SPV_INTEL_16bit_atomics
extension](https://github.com/intel/llvm/pull/20009). It's logically a
successor to #166031 (I should've used a stack), but I'm putting it up
for early review.
---------
Co-authored-by: Matt Arsenault <arsenm2 at gmail.com>
[clang-tidy] Fix `readability-container-data-pointer` check (#165636)
Fix issue in readability-container-data-pointer when the container
expression is a dereference (e.g., `&(*p)[0]`). The previous fix-it
suggested `*p.data()`, which changes semantics because `.` binds tighter
than `*`. The fix now correctly suggests `(*p).data()`.
Closes [#164852](https://github.com/llvm/llvm-project/issues/164852)
---------
Co-authored-by: Baranov Victor <bar.victor.2002 at gmail.com>
[AArch64][SVE] Avoid movprfx by reusing register for _UNDEF pseudos.
For predicated SVE instructions where we know that the inactive
lanes are undef, it is better to pick a destination register that
is not unique. This avoids introducing a movprfx to copy a unique
register to the destination operand, which would be needed to comply
with the tied operand constraints.
For example:
%src1 = COPY $z1
%src2 = COPY $z2
%dst = SDIV_ZPZZ_S_UNDEF %p, %src1, %src2
Here it is beneficial to pick $z1 or $z2 as the destination register,
because if it would have chosen a unique register (e.g. $z0) then
the pseudo expand pass would need to insert a MOVPRFX to expand
the operation into:
[10 lines not shown]
[NFCI][lldb][test][Recognizer] Fix mismatched C/C++ frontend subtitutions (#167220)
The explicit language specifications for Objective C/C++ don't seem necessary either so I've removed
them too.
I found these by using Clang frontend configuration files containing language-specific options for
both C and C++ (e.g. `-std=c2y` and `-std=c++26`).
Prior-art: 21041c9