[X86] Cast atomic vectors in IR to support floats
This commit casts floats to ints in an atomic load during AtomicExpand to support
floating point types. It is also required to support 128-bit vectors in SSE/AVX.
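As a rough scalar analogue in C++ (not the IR transformation itself), the idea of doing the atomic access on an integer of the same width and then reinterpreting the bits can be sketched as follows. The helper names `atomic_load_float_via_int` and `atomic_store_float_via_int` are hypothetical and exist only for this illustration; the sketch assumes a 32-bit `float`.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

// Hypothetical helpers, not LLVM code. The atomic operation is performed on
// an integer of the same width; the float only appears after a bit-for-bit
// reinterpretation, so no numeric conversion takes place.
static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

inline float atomic_load_float_via_int(const std::atomic<std::uint32_t> &slot) {
  std::uint32_t bits = slot.load(std::memory_order_acquire); // integer atomic load
  float value;
  std::memcpy(&value, &bits, sizeof(value)); // reinterpret bits as float
  return value;
}

inline void atomic_store_float_via_int(std::atomic<std::uint32_t> &slot,
                                       float value) {
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits)); // reinterpret float as bits
  slot.store(bits, std::memory_order_release); // integer atomic store
}
```

Storing a value with `atomic_store_float_via_int` and reading it back with `atomic_load_float_via_int` round-trips the same bit pattern.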
[LifetimeSafety] Warn on incorrectly placed `[[clang::lifetimebound]]` attributes (#196144)
Adds a new warning that is emitted when a parameter is marked as
`[[clang::lifetimebound]]` but is not returned in one way or another
(tracked via `OriginEscapeFact`).
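A minimal sketch of the kind of placement described above, under the assumption that a parameter which never reaches the return value counts as "not returned". The function `first_of` and the macro `SKETCH_LIFETIMEBOUND` are hypothetical; the exact diagnostic text and the flag that controls it are not shown here.

```cpp
#include <string>

// Keep the sketch portable: only Clang is given the attribute.
#if defined(__clang__)
#define SKETCH_LIFETIMEBOUND [[clang::lifetimebound]]
#else
#define SKETCH_LIFETIMEBOUND
#endif

// `a` is returned, so annotating it matches what the function does.
// `unused` is annotated as well but never escapes through the return value;
// this is the sort of placement the new warning is described as targeting.
inline const std::string &
first_of(const std::string &a SKETCH_LIFETIMEBOUND,
         const std::string &unused SKETCH_LIFETIMEBOUND) {
  (void)unused;
  return a;
}
```

Dropping the attribute from `unused` would make the annotations agree with what the function actually returns.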
Closes #182935
Revert "[VectorCombine] foldShuffleChainsToReduce - add support for partial vector reductions" (#196796)
Reverts llvm/llvm-project#195119 while reported assertions are investigated.
[lldb][Windows] Invalidate cached register values on thread stop (#192430)
Invalidate cached values in register context data structures on every
thread stop.
NativeRegisterContextRegisterInfo::InvalidateAllRegisters performs no
operation by default. Subclasses may override it to clear cached values
within their register context data structures whenever a thread stops.
This change intends to set up the necessary infrastructure to support
caching of the thread context in NativeRegisterContextWindows_arm64,
which will improve read performance. Currently, the thread context is
retrieved for every read or write operation.
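The hook-and-override arrangement can be sketched in isolation as follows. This is a simplified model, not lldb code: `RegisterContextBase`, `CachingRegisterContext`, `OnThreadStop`, and the single cached program counter are all hypothetical stand-ins.

```cpp
#include <cstdint>

// Hypothetical stand-in for the base class: the hook is a no-op by default.
class RegisterContextBase {
public:
  virtual ~RegisterContextBase() = default;
  virtual void InvalidateAllRegisters() {}
};

// Hypothetical subclass that caches an expensive-to-fetch thread context.
class CachingRegisterContext : public RegisterContextBase {
public:
  std::uint64_t ReadPC() {
    if (!m_valid) { // fetch only when the cache is stale
      m_cached_pc = FetchThreadContext();
      m_valid = true;
    }
    return m_cached_pc;
  }

  // Override of the hook: drop the cached value.
  void InvalidateAllRegisters() override { m_valid = false; }

  // Test helper that models the thread running and changing its state.
  void SimulateThreadRan(std::uint64_t new_pc) { m_live_pc = new_pc; }
  int fetch_count() const { return m_fetches; }

private:
  std::uint64_t FetchThreadContext() {
    ++m_fetches; // counts the expensive fetches
    return m_live_pc;
  }

  std::uint64_t m_live_pc = 0x1000;
  std::uint64_t m_cached_pc = 0;
  bool m_valid = false;
  int m_fetches = 0;
};

// Hypothetical stop handler: invalidates the cache on every thread stop.
inline void OnThreadStop(RegisterContextBase &ctx) {
  ctx.InvalidateAllRegisters();
}
```

In this model, repeated reads between stops hit the cache, and the first read after `OnThreadStop` fetches the context again.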
[AArch64] Improve post-inc stores of SIMD/FP values
Add patterns to match post-increment truncating stores from lane 0 of
wide integer vectors (v4i32/v2i64) to narrower types (i8/i16/i32).
This avoids transferring the value through a GPR when storing.
Also remove the pre-legalization early-exit in combineStoreValueFPToInt
as it prevented the optimization from applying in some cases.
[LoopPeel] Peel last iteration to enable load widening
In loops that contain multiple consecutive small loads (e.g., three
adjacent i8 loads), peeling the last iteration makes it safe to read
beyond the accessed region in the remaining iterations, enabling the use
of a wider load (e.g., i32) for the other N-1 iterations.
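A source-level analogue of the idea, written by hand rather than produced by the pass: the function `sum_records_peeled` is hypothetical and assumes the input holds `n` packed 3-byte records. For every record except the last, the fourth byte of the 4-byte read belongs to the next record, so the wide read stays in bounds; only the last record needs the narrow loads.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical example: sums the bytes of `n` packed 3-byte records at `p`.
inline std::uint32_t sum_records_peeled(const std::uint8_t *p, std::size_t n) {
  std::uint32_t total = 0;
  if (n == 0)
    return total;

  // First n-1 records: one 4-byte read each. The extra byte is the first
  // byte of the following record, so it is always inside the buffer.
  for (std::size_t i = 0; i + 1 < n; ++i) {
    std::uint8_t wide[4];
    std::memcpy(wide, p + 3 * i, sizeof(wide)); // single wide read
    total += wide[0] + wide[1] + wide[2];       // fourth byte is ignored
  }

  // Peeled last record: three narrow reads, nothing past the end is touched.
  const std::uint8_t *last = p + 3 * (n - 1);
  total += last[0] + last[1] + last[2];
  return total;
}
```

Without the peeled tail, the wide read for the final record would touch one byte past the end of the buffer.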
Patterns such as:
```
%a = load i8, ptr %p
%b = load i8, ptr %p+1
%c = load i8, ptr %p+2
...
%p.next = getelementptr i8, ptr %p, 3
```
Can be transformed to:
```
%wide = load i32, ptr %p ; Read 4 bytes
[9 lines not shown]
[DAG][GISel] Rename CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF/CTTZ_ELTS_ZERO_UNDEF -> CTTZ_ZERO_POISON/CTLZ_ZERO_POISON/CTTZ_ELTS_ZERO_POISON (#196732)
DAG/GISel are ambiguous about whether a zero input results in
UNDEF or POISON, unlike the rest of LLVM, which makes it clear it is POISON.
I've tried to clean this up once and for all by ensuring
SelectionDAG::canCreateUndefOrPoison does an includesPoison(Kind) check,
renaming the opcodes (including the VP variants) and updating as many
comments/tests as possible (I may still have missed some...).
[cmake] use target names instead of legacy variables (#185463)
Use the [name of the imported
targets](https://cmake.org/cmake/help/latest/module/CheckSymbolExists.html)
when testing the libraries during cmake configuration. This removes the
need to also set `CMAKE_REQUIRED_INCLUDES` and
`CMAKE_REQUIRED_DEFINITIONS` and reflects more modern CMake usage where
targets are preferred over variables.
This is already the case when checking libcurl in the same file.
[Clang] Transform lambda's constraints when instantiating parameter mapping (#195995)
This way we can remove a few workarounds of lambda expressions where
outer template arguments of concepts have to be preserved through
ImplicitConceptSpecializationDecls.
Fixes #193944