[VectorCombine] Fix the PtrAdd offset in shrinkLoadForShuffles to account for element type size (#179001)
This PR fixes an [issue I pointed out regarding incorrect GEP
indices](https://github.com/llvm/llvm-project/pull/149093#discussion_r2748266079)
introduced by PR #149093.
Changes:
- Updated the pointer offset calculation in
`VectorCombine::shrinkLoadForShuffles` so that the offset is now
multiplied by the element size (`ElemSize`) when computing the new
pointer for loads
- Updated the GEP indices in
`llvm/test/Transforms/VectorCombine/load-shufflevector.ll` for the
correct byte offsets
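The arithmetic behind the fix can be sketched as follows. This is an illustration only, not the code in `VectorCombine`: the helper name `byteOffsetForElement` and its parameters are hypothetical, and the sketch assumes the new pointer is expressed as a byte offset from the original load address.

```cpp
#include <cstdint>

// Hypothetical helper: byte offset of the first element kept by a
// shrunken load. A byte-addressed pointer add needs the element index
// scaled by the element size; using the raw index is only correct for
// one-byte elements.
constexpr std::uint64_t byteOffsetForElement(std::uint64_t elemIndex,
                                             std::uint64_t elemSizeInBytes) {
  return elemIndex * elemSizeInBytes;
}
```

For example, starting a load at element 2 of a vector of 4-byte elements needs an offset of 8 bytes, not 2.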
[HLSL] Make Matrix types in `buildInitializerListImpl` index in row major order for initializer lists. (#178931)
Fixes #178930
- Changes the loop indexing order
- Updates the associated tests
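As a rough illustration of what row-major traversal means here, the sketch below lists the (row, column) positions in the order that consecutive initializer values would fill them. The function `rowMajorOrder` is hypothetical and is not taken from `buildInitializerListImpl`; it only shows the loop nesting, with rows in the outer loop and columns in the inner loop.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: returns matrix positions in row-major order,
// i.e. all columns of row 0, then all columns of row 1, and so on.
std::vector<std::pair<std::size_t, std::size_t>>
rowMajorOrder(std::size_t rows, std::size_t cols) {
  std::vector<std::pair<std::size_t, std::size_t>> order;
  order.reserve(rows * cols);
  for (std::size_t r = 0; r < rows; ++r)    // outer loop: rows
    for (std::size_t c = 0; c < cols; ++c)  // inner loop: columns
      order.emplace_back(r, c);
  return order;
}
```

Swapping the two loops would give column-major order instead, where the second initializer value lands at (1, 0) rather than (0, 1).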
[msan] Support Arm NEON usdot (#178982)
Handle the usdot dot-product intrinsic using the existing
handleVectorDotProductIntrinsic() instead of the default handler.
Restore unintentionally changed files
This restores files that were unintentionally added to commit
21a74f527839b5b8dd882e62a25093d980c79078, 'Revert "[lldb] Add FP
conversion instructions to IR interpreter (#175292)"'
[SPIRV][NFC] Merge Subgroup Reduce into uniform selector (#178802)
The ReduceMax, ReduceMin, and ReduceSum selectors were all doing the
same thing, differing only in which opcode they used.
This change unifies these implementations and picks the opcode via
a helper lambda.
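The shape of such a refactoring might look like the sketch below. It is not the SPIR-V backend code: `ReduceKind`, `ScalarKind`, and `selectSubgroupReduce` are hypothetical names, opcodes are represented as plain strings, and the mapping to `OpGroupNonUniform*` instructions is an assumption based on the SPIR-V instruction set rather than on this commit.

```cpp
#include <string>

enum class ReduceKind { Max, Min, Sum };
enum class ScalarKind { Float, SignedInt, UnsignedInt };

// Hypothetical unified selector: one code path for all three
// reductions, with a lambda choosing the opcode by scalar kind.
std::string selectSubgroupReduce(ReduceKind kind, ScalarKind scalar) {
  auto pickOpcode = [scalar](const char *floatOp, const char *signedOp,
                             const char *unsignedOp) -> std::string {
    switch (scalar) {
    case ScalarKind::Float:
      return floatOp;
    case ScalarKind::SignedInt:
      return signedOp;
    case ScalarKind::UnsignedInt:
      return unsignedOp;
    }
    return "";
  };

  switch (kind) {
  case ReduceKind::Max:
    return pickOpcode("OpGroupNonUniformFMax", "OpGroupNonUniformSMax",
                      "OpGroupNonUniformUMax");
  case ReduceKind::Min:
    return pickOpcode("OpGroupNonUniformFMin", "OpGroupNonUniformSMin",
                      "OpGroupNonUniformUMin");
  case ReduceKind::Sum:
    // Integer addition does not depend on signedness.
    return pickOpcode("OpGroupNonUniformFAdd", "OpGroupNonUniformIAdd",
                      "OpGroupNonUniformIAdd");
  }
  return "";
}
```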
[clang][driver][darwin] Tweak the use after scope fix in Darwin driver toolchain (#178981)
It's ever so slightly cleaner looking and less error prone to make the
SmallVector hold std::string instead of making a local just for the
version string.
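The ownership idea can be sketched with standard-library types. This assumes `std::vector<std::string>` as a stand-in for the `SmallVector`, and `formatVersion` and `buildComponents` are hypothetical names, not the driver's functions.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a call that returns a temporary string.
std::string formatVersion(unsigned major, unsigned minor) {
  return std::to_string(major) + "." + std::to_string(minor);
}

// Because the container owns its strings, the temporary returned by
// formatVersion() is moved into the vector, and nothing refers to
// storage that has already been destroyed. A container of non-owning
// views would instead need a named local that outlives it.
std::vector<std::string> buildComponents(unsigned major, unsigned minor) {
  std::vector<std::string> components;
  components.push_back("macosx");
  components.push_back(formatVersion(major, minor));
  return components;
}
```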
[compiler-rt][common] Don't try to unmap non-page aligned pointers
When the sanitizer hasn't mapped the alternate signal stack, but the
host program has (like LLVM), the stack's base pointer may not be
page-aligned if it was allocated via malloc, and thus wouldn't be safe
to unmap anyway. A solution that doesn't unmap the alternate stack unless
the sanitizer had mapped it in the first place will take more time to
design. For now, we can just avoid calling munmap on pointers without
the correct alignment.
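A minimal sketch of the alignment check, assuming the page size is a power of two. `isPageAligned` is a hypothetical name, not the compiler-rt helper, and the actual guard in the sanitizer runtime may be written differently.

```cpp
#include <cstdint>

// Hypothetical helper: true if addr is a multiple of pageSize.
// Assumes pageSize is a power of two, so the low bits can be masked.
constexpr bool isPageAligned(std::uintptr_t addr, std::uintptr_t pageSize) {
  return (addr & (pageSize - 1)) == 0;
}
```

A caller would skip the unmap when the check fails, for instance for a malloc-returned address such as 0x2010 with 4096-byte pages.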
[flang] Add support for additional compiler directive sentinel (#178941)
This patch allows setting additional compiler directive sentinels in
addition to the default `!dir$`. Some user code may use other
vendor-specific compiler directive sentinels, and this solution allows
adding them to the parser options.
[lld][MachO] Accept hex format for cstring hashes in order file (#178933)
Support both decimal and hexadecimal formats for cstring hashes in
the order file. Hex values must use the 0x prefix (case insensitive).
Examples:
CSTR;1234567890 (decimal)
CSTR;0x499602D2 (hex)
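The accepted formats can be sketched with a small parser. This is not lld's implementation: `parseCStringHash` is a hypothetical name, the 32-bit hash width is an assumption, and the sketch requires C++17 for `std::from_chars`.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical parser: accepts decimal digits, or hex digits after a
// case-insensitive "0x" prefix. Returns nullopt on malformed input.
std::optional<std::uint32_t> parseCStringHash(std::string_view text) {
  int base = 10;
  if (text.size() > 2 && text[0] == '0' &&
      (text[1] == 'x' || text[1] == 'X')) {
    base = 16;
    text.remove_prefix(2);
  }
  std::uint32_t value = 0;
  const char *end = text.data() + text.size();
  auto [ptr, ec] = std::from_chars(text.data(), end, value, base);
  // Reject parse errors and trailing characters.
  if (ec != std::errc() || ptr != end)
    return std::nullopt;
  return value;
}
```

With this sketch, `1234567890` and `0x499602D2` parse to the same value, matching the two example lines above.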
Co-authored-by: Sharon Xu <sharonxu at fb.com>
[clang][Driver] Fix use after scope in darwin driver (#178967)
`Version.getAsString()` returns a temporary `std::string`, and thus the
`StringRef` points to an invalid location when pushed into the
Components vector. This just keeps the temporary alive until the
new string has been generated, to fix the ASAN failure after #176541.
Attributor: Use anchor scope for SimplifyQuery context (#178958)
This was asserting in computeKnownFPClass when a dominator tree
check happened across functions.
Fixes #178954
[clang][Driver] Fix use after scope in darwin driver
`Version.getAsString()` returns a temporary `std::string`, and thus the
`StringRef` points to an invalid location when pushed into the
Components vector. This just keeps the temporary alive until the
new string has been generated, to fix the ASAN failure after #176541.
[AArch64] Convert CLS intrinsics to use ISD::CTLS (#178885)
This patch converts AArch64 CLS intrinsics (aarch64_neon_cls) to use the
generic ISD::CTLS node.
- aarch64_neon_cls: Lowered to ISD::CTLS, pattern-matched to CLS
instruction
- Set ISD::CTLS as Legal for NEON vector types (v8i8, v16i8, v4i16,
v8i16, v2i32, v4i32)
Also adds generic CTLS expansion support:
- ExpandIntRes_CTLS in LegalizeIntegerTypes for i64->i32 type expansion
- expandCTLS in TargetLowering for targets without native CLS
instruction
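For targets without a native instruction, one common way to compute a leading-sign-bit count is via count-leading-zeros. The sketch below assumes CTLS has the same semantics as the AArch64 CLS instruction (the number of bits after the sign bit that match it); it is not the code in `expandCTLS`, and both function names are hypothetical.

```cpp
#include <cstdint>

// Portable count-leading-zeros for 32-bit values; returns 32 for 0.
constexpr unsigned countLeadingZeros32(std::uint32_t v) {
  unsigned n = 0;
  for (std::uint32_t mask = 0x80000000u; mask != 0 && (v & mask) == 0;
       mask >>= 1)
    ++n;
  return n;
}

// Hypothetical expansion: flip all bits when the value is negative so
// that the run of sign bits becomes a run of leading zeros, count
// them, then subtract one to exclude the sign bit itself.
constexpr unsigned countLeadingSignBits32(std::int32_t x) {
  const std::uint32_t u = static_cast<std::uint32_t>(x);
  const std::uint32_t signMask = (u >> 31) ? 0xFFFFFFFFu : 0u;
  return countLeadingZeros32(u ^ signMask) - 1;
}
```

Under this definition, both 0 and -1 give 31, while the most negative 32-bit value gives 0.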
Part of: https://github.com/llvm/llvm-project/issues/174337