[MLIR][NVVM][NFC] Restructure NVVM dialect (#195811)
Moves the declarations of the NVVM dialect and some widely used enums
(`FPRoundingModeAttr` and `SaturationModeAttr`) to separate files to make
them easier to maintain and to reuse in the NVGPU dialect.
[AArch64][CostModel] Model sve costs for ctpop (#192428)
Targets that support SVE prefer it for ctpop on fixed-length vectors.
Update the cost model to reflect this.
[InstCombine][NFC] Change the order of checks in SliceUpIllegalIntegerPHI for faster compile time. (#183726)
SliceUpIllegalIntegerPHI searches for PHIs that have an illegal type and
are only used by trunc or trunc(lshr) operations. It bails out if it
encounters invoke or EH pad instructions.
Currently it first checks for invoke or EH pad instructions, which is
time consuming because it visits every instruction. Only then does it
check whether the PHI is used solely by trunc or trunc(lshr). The former
check is generally loose, while the latter is stricter. Switching the
order of the checks speeds up compilation.
Signed-off-by: XinlongZHANG-Bob <zhangxinlong.bob at bytedance.com>
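The reordering described above can be sketched roughly as follows. This
is a toy model, not the InstCombine code: `PhiSketch`, `UseKind`,
`onlyUsedByTruncs`, `hasInvokeOrEHPad` and `canSlice` are hypothetical
names, and the `Visited` counter only stands in for the cost of the
instruction scan.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not LLVM types.
enum class UseKind { Trunc, TruncOfLShr, Other };

struct PhiSketch {
  std::vector<UseKind> Users;  // how each user consumes the PHI
  int NumInstructionsToScan;   // work the invoke/EH-pad scan would do
  bool TouchesInvokeOrEHPad;   // result of that scan
};

// Strict and cheap: looks only at the PHI's direct users.
bool onlyUsedByTruncs(const PhiSketch &P) {
  for (UseKind K : P.Users)
    if (K != UseKind::Trunc && K != UseKind::TruncOfLShr)
      return false;
  return true;
}

// Loose and expensive: models a scan over many instructions.
// Visited accumulates how much scanning was performed.
bool hasInvokeOrEHPad(const PhiSketch &P, int &Visited) {
  assert(P.NumInstructionsToScan >= 0);
  Visited += P.NumInstructionsToScan;
  return P.TouchesInvokeOrEHPad;
}

// Run the strict, cheap check first so that most candidates are
// rejected before the expensive scan is ever started.
bool canSlice(const PhiSketch &P, int &Visited) {
  if (!onlyUsedByTruncs(P))
    return false;
  return !hasInvokeOrEHPad(P, Visited);
}
```

In this model a PHI with a non-trunc user is rejected without adding
anything to `Visited`, which is the effect the reordering aims for.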
[LV][NFC] Reshape pointer_iv_non_uniform_0 test to use distinct loads (#196494)
The follow-up [patch](https://github.com/llvm/llvm-project/pull/196080)
folds some idempotent binary ops. This test has a `sub x - x`
operation that is affected by the follow-up patch. This patch makes
the test immune to the fold.
[clang] implement CWG2064: ignore value dependence for decltype
The 'decltype' of a value-dependent (but non-type-dependent) expression
should be known, so this patch makes such types non-opaque.
This patch also implements what's necessary to allow overloading
on pure differences in instantiation dependence, making `std::void_t`
usable for SFINAE purposes.
This also re-adds a few test cases from da98651, which was a previous attempt
at resolving CWG2064.
Fixes #8740
Fixes #61818
Fixes #190388
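As a rough illustration of the two ideas mentioned above, the sketch
below shows a `decltype` whose operand is value-dependent but not
type-dependent, and the familiar `std::void_t` detection idiom. It is
not taken from the patch or its tests; `SizeType`, `HasValueType` and
`WithValueType` are hypothetical names, and the code assumes C++17 for
`std::void_t`.

```cpp
#include <cstddef>
#include <type_traits>

// sizeof(T) is value-dependent (its value varies with T) but not
// type-dependent: its type is std::size_t for every T.
template <class T>
struct SizeType {
  using type = decltype(sizeof(T));
};
static_assert(std::is_same<SizeType<int>::type, std::size_t>::value,
              "decltype(sizeof(T)) names std::size_t");

// Detection idiom: the partial specialization is viable only when
// T::value_type is a valid type; otherwise substitution fails quietly
// and the primary template is used.
template <class T, class = void>
struct HasValueType : std::false_type {};

template <class T>
struct HasValueType<T, std::void_t<typename T::value_type>>
    : std::true_type {};

struct WithValueType {
  using value_type = int;
};

static_assert(HasValueType<WithValueType>::value, "member type detected");
static_assert(!HasValueType<int>::value, "int has no value_type");
```

The detection idiom relies on `std::void_t<...>` staying dependent on its
arguments even though it always names `void`, which is the kind of
instantiation dependence the message refers to.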
[clang] Add arm64_neon.h wrapper on windows (#196014)
Add an MSVC-compatible <arm64_neon.h> resource header that forwards to
Clang's generated <arm_neon.h>. This lets ARM64 Windows code using the
MSVC header name lower NEON intrinsics through Clang builtins instead of
leaving external neon_* calls such as neon_ld1m4_q32.
Fixes #195683
[ADT] Avoid map storage for small SmallMapVector (#196473)
SmallMapVector previously used SmallDenseMap for its index, which still
initializes and maintains map storage even when the number of entries is
tiny.
Teach MapVector to support a vector-only small mode. While the entry
count stays within the configured small size, operations use the
underlying vector directly.
When the size grows past the threshold, the map index is built and
subsequent operations use the regular MapVector path.
This mirrors the small-size strategy used by SmallSetVector.
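The small-mode strategy described above can be sketched with standard
containers. This is a minimal illustration, not LLVM's MapVector: the
class `SmallModeMapVector`, its members, and the use of
`std::unordered_map` as the index are all assumptions made for the
example.

```cpp
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Illustrative only. N is the configured small size.
template <class K, class V, std::size_t N>
class SmallModeMapVector {
  std::vector<std::pair<K, V>> Entries;      // insertion-ordered storage
  std::unordered_map<K, std::size_t> Index;  // key -> position; empty in small mode

  bool isSmall() const { return Index.empty(); }

  // Returns the position of Key, or Entries.size() if it is absent.
  std::size_t findPos(const K &Key) const {
    if (isSmall()) {
      // Small mode: linear scan over the vector, no map storage touched.
      for (std::size_t I = 0; I < Entries.size(); ++I)
        if (Entries[I].first == Key)
          return I;
      return Entries.size();
    }
    auto It = Index.find(Key);
    return It == Index.end() ? Entries.size() : It->second;
  }

public:
  // Inserts (Key, Value) if Key is absent; returns true if inserted.
  bool insert(const K &Key, const V &Value) {
    if (findPos(Key) != Entries.size())
      return false;
    Entries.emplace_back(Key, Value);
    if (!isSmall()) {
      Index.emplace(Key, Entries.size() - 1);
    } else if (Entries.size() > N) {
      // Grew past the threshold: build the index once from the vector.
      for (std::size_t I = 0; I < Entries.size(); ++I)
        Index.emplace(Entries[I].first, I);
    }
    return true;
  }

  // Returns a pointer to the mapped value, or nullptr if Key is absent.
  const V *lookup(const K &Key) const {
    std::size_t Pos = findPos(Key);
    return Pos == Entries.size() ? nullptr : &Entries[Pos].second;
  }

  std::size_t size() const { return Entries.size(); }
  bool usesIndex() const { return !isSmall(); }
};
```

The sketch omits erasure, which would need to keep the stored positions
in sync once the index exists.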
[AtomicExpand] Add bitcasts when expanding load atomic vector
AtomicExpand fails for aligned `load atomic <n x T>` because it
does not find a compatible library call. This change adds appropriate
bitcasts so that the call can be lowered. It also adds support for
128-bit lowering in TableGen to support SSE/AVX.
[clang-tidy] comment braced and parenthesized init arguments (#180408)
Handle arguments like `{}`, `Type{}` and `Type()` in
`bugprone-argument-comment` and add coverage for `initializer_list` and
designated initializers.
Fixes: https://github.com/llvm/llvm-project/issues/171842
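For reference, the argument forms named above look like this at a call
site when annotated with `/*param=*/` comments. The functions and types
(`Options`, `configure`, `sum`, `callSites`) are hypothetical and are not
taken from the check's tests; the exact diagnostics depend on the check's
configuration and are not shown.

```cpp
#include <cassert>
#include <initializer_list>
#include <vector>

struct Options {
  int Depth;
};

int configure(Options Opts, std::vector<int> Values) {
  return Opts.Depth + static_cast<int>(Values.size());
}

int sum(std::initializer_list<int> Items) {
  int Total = 0;
  for (int I : Items)
    Total += I;
  return Total;
}

int callSites() {
  // Braced init: `{}` for both parameters.
  int A = configure(/*Opts=*/{}, /*Values=*/{});
  // `Type{}` and `Type()` forms.
  int B = configure(/*Opts=*/Options{}, /*Values=*/std::vector<int>());
  // initializer_list argument.
  int C = sum(/*Items=*/{1, 2, 3});
  assert(C == 6);
  return A + B + C;
}
```

Each comment names the parameter it annotates, which is the convention
the check compares against the callee's declaration.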
Add missing direct includes for bit.h/SwapByteOrder.h. NFC (#196843)
These translation units use llvm::endianness, llvm::byteswap,
llvm::has_single_bit, or sys::IsLittleEndianHost without explicitly
including the header that declares them. They currently compile only
because llvm/ADT/Hashing.h transitively pulls in
llvm/Support/SwapByteOrder.h (which includes llvm/ADT/bit.h).