[InstCombine] Fold constant byte stores to integer stores (#196740)
Byte constants are equivalent to integer constants when stored to
memory. Replacing them in store instructions reduces IR differences and
enables existing optimizations over integer constants.
[flang][OpenMP] Optionally get final symbol in Get(Argument|Object)Symbol
Originally these functions returned the ultimate symbol of the one
obtained from the argument or object. However, this can be somewhat
unintuitive and unexpected, so return the original symbol instead, and
add a flag to optionally return the ultimate one.
[llvm][RISCV] Optimize fcopysign for fixed vectors (#193802)
vfsgnj is not available on zvfhmin or zvfbfmin, so fcopysign is expected
to expand to integer operations instead of unrolling to scalar
operations. The generic expandFCOPYSIGN already handles this in most
cases, except for fixed vector types that are not promotable; we need to
find a better heuristic to gate this.
[llvm][RISCV] Optimize fabs for fixed vectors (#194554)
vfabs is not available on zvfhmin or zvfbfmin, so fabs is expected to
expand to integer operations instead of unrolling to scalar operations.
The generic expandFABS already handles this in most cases, except for
fixed vector types that are not promotable; we need to find a better
heuristic to gate this.
[llvm][RISCV] Optimize fneg for fixed vectors (#194555)
vfneg is not available on zvfhmin or zvfbfmin, so fneg is expected to
expand to integer operations instead of unrolling to scalar operations.
The generic expandFNEG already handles this in most cases, except for
fixed vector types that are not promotable; we need to find a better
heuristic to gate this.
[CIR][AArch64] Lower NEON vuzp intrinsics (#195591)
### Summary
Part of: https://github.com/llvm/llvm-project/issues/185382
Lower the `vuzp` intrinsics in:
https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#unzip-elements
This is a follow-up to: https://github.com/llvm/llvm-project/pull/195527
Lower `NEON::BI__builtin_neon_vuzp_v` and
`NEON::BI__builtin_neon_vuzpq_v` in CIRGenBuiltinAArch64.cpp by porting
the existing incubator logic
(clangir/clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp): two bitcasts
on the input vectors, then two rounds of cir.vec.shuffle generating the
deinterleave (even/odd) shuffle patterns with indices 2*i+vi, each
result stored via ptr_stride on the sret base pointer.
[IRBuilder] Split CreateAssumption to one with bundle and one with condition [NFC] (#196795)
Combining bundles and conditions is not possible after
https://github.com/llvm/llvm-project/pull/160460; reflect that in
CreateAssumption.
[clang-tidy] Rename hicpp-multiway-paths-covered to bugprone-unhandled-code-paths (#191625)
Part of the work in https://github.com/llvm/llvm-project/issues/183462.
Closes https://github.com/llvm/llvm-project/issues/183464.
Splitting the check into two more focused checks was considered during
discussion, but since clang-tidy does not support one-to-many aliases, a
single name covering both behaviors was chosen instead, one that is
clearer than `multiway-paths-covered`.
---------
Co-authored-by: Zeyi Xu <mitchell.xu2 at gmail.com>
[AArch64] Improve post-inc stores of SIMD/FP values (#151372)
Add patterns to match post-increment truncating stores from lane 0 of
wide integer vectors (v4i32/v2i64) to narrower types (i8/i16/i32). This
avoids transferring the value through a GPR when storing.
Also remove the pre-legalization early-exit in `combineStoreValueFPToInt`
as it prevented the optimization from applying in some cases.
[X86] Cast atomic vectors in IR to support floats
This commit casts floats to ints in an atomic load during AtomicExpand to
support floating-point types. The cast is also required to support
128-bit vectors in SSE/AVX.
[LifetimeSafety] Warn on incorrectly placed `[[clang::lifetimebound]]` attributes (#196144)
Adds a new warning that is emitted when a parameter is marked
`[[clang::lifetimebound]]` but is never returned in one way or another
(tracked via `OriginEscapeFact`).
Closes #182935
Revert "[VectorCombine] foldShuffleChainsToReduce - add support for partial vector reductions" (#196796)
Reverts llvm/llvm-project#195119 while reported assertions are investigated.
[lldb][Windows] Invalidate cached register values on thread stop (#192430)
Invalidate cached values in register context data structures on every
thread stop.
NativeRegisterContextRegisterInfo::InvalidateAllRegisters performs no
operation by default. Subclasses may override it to clear cached values
within their register context data structures whenever a thread stops.
This change intends to set up the necessary infrastructure to support
caching of the thread context in NativeRegisterContextWindows_arm64,
which will improve read performance. Currently, the thread context is
retrieved for every read or write operation.
[LoopPeel] Peel last iteration to enable load widening
In loops that contain multiple consecutive small loads (e.g., three
consecutive i8 loads covering 3 bytes), peeling the last iteration makes
it safe to read beyond the accessed region, enabling a single wider load
(e.g., i32) for the other N-1 iterations.
Patterns such as:
```
%a = load i8, ptr %p
%b = load i8, ptr %p+1
%c = load i8, ptr %p+2
...
%p.next = getelementptr i8, ptr %p, 3
```
can be transformed to:
```
%wide = load i32, ptr %p ; Read 4 bytes
[9 lines not shown]