[IR] Inline remove_if in PMDataManager::removeNotPreservedAnalysis (#199571)
PR #198982 rewrote removeNotPreservedAnalysis to use DenseMap::remove_if
with one predicate shared across two call sites. The predicate is always
inlined; the cost is that two call sites cause DenseMapBase::remove_if<...>
itself to be emitted out of line instead of being inlined into the caller.
As this runs after every modifying codegen pass (legacy PM), it shows up as
a small instructions:u regression, most visibly at -O0 where the legacy
codegen PM is a large fraction of compile time:
https://llvm-compile-time-tracker.com/compare.php?from=69a5cf515fd317bcf918e48de9137dd8549870c5&to=6302439f5aaea6cb776d8ceb5c2ef9108fccf702&stat=instructions%3Au
Collect the maps into a SmallVector and prune them from a single remove_if
call site, so the instantiation is inlined again.
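A minimal sketch of the single-call-site shape described above, not the actual LLVM change. It uses std::unordered_map and std::vector as stand-ins for DenseMap and SmallVector; `removeIf`, `pruneAll`, `AnalysisMap`, and the "value == 0 means not preserved" rule are hypothetical names and conventions chosen for illustration. Whether the helper really gets inlined is still the compiler's decision.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

using AnalysisMap = std::unordered_map<int, int>;

// Generic erase-if helper, standing in for DenseMap::remove_if.
template <typename Map, typename Pred>
void removeIf(Map &M, Pred P) {
  for (auto It = M.begin(); It != M.end();) {
    if (P(*It))
      It = M.erase(It);
    else
      ++It;
  }
}

// Collect the maps first, then prune them in one loop, so that removeIf
// is called from exactly one place in this function.
inline void pruneAll(AnalysisMap &A, AnalysisMap &B) {
  std::vector<AnalysisMap *> Maps = {&A, &B};
  auto NotPreserved = [](const AnalysisMap::value_type &KV) {
    return KV.second == 0; // hypothetical "not preserved" marker
  };
  for (AnalysisMap *M : Maps) {
    assert(M && "map pointer must not be null");
    removeIf(*M, NotPreserved);
  }
}
```

With one call site the template instantiation has a single user, which is the situation in which inliners are typically most willing to fold it into the caller.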
[InferAttrs] Annotate math and basic string libcalls with `nosync` (#197761)
Math libcalls as well as some simple string ones do not create
synchronizes-with edges, thus `nosync` may be derived.
Co-authored-by: Johannes Doerfert <jdoerfert.llvm at gmail.com>
[RISCV] Use append TableGen feature in RISCVInstrInfoXqci.td (#199603)
This improves the readability of the file.
An AI came up with the patch; I reviewed it and ensured that the tests
pass.
[mlir][x86] Fix - multiple issues / F8 support for AMX dot-product lowering (#196984)
This patch fixes several issues and adds support for additional patterns in
AMX `dot-product` lowering:
1. Fixes an issue related to write-back to the `C` matrix,
2. Supports an additional lowering pattern where the cache tile sizes are
32,32,32,
3. Online packing: loop peeling is now based on the `step` size,
4. Extends support for `f8` lowering (`mx-fp8` lowering will be
supported after vector.contract has `mx` support).
[NFC] [clangd] [C++20] [Modules] Fix false duplicate module warning for equivalent paths (#199343)
When checking for multiple source files declaring the same module, the
comparison used raw string equality on file paths. This causes false
positives when the same file is represented by different but equivalent
path strings.
Use pathEqual(normalizePath(...), normalizePath(...)) instead to compare
canonical paths, consistent with how clangd handles path comparisons
elsewhere.
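To illustrate why raw string equality gives false positives, here is a small sketch using C++17 std::filesystem. `samePath` is a hypothetical helper and only an approximation: it does purely lexical normalization and is not clangd's normalizePath/pathEqual, whose exact behaviour is not shown in this message.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

// Hypothetical stand-in for pathEqual(normalizePath(A), normalizePath(B)).
// lexically_normal() collapses "." and ".." components without touching
// the file system, so symlinks are not resolved.
inline bool samePath(const std::string &A, const std::string &B) {
  namespace fs = std::filesystem;
  return fs::path(A).lexically_normal() == fs::path(B).lexically_normal();
}
```

Under this sketch, "/src/a/../mod.cppm" and "/src/mod.cppm" differ as strings but compare equal after normalization, so they would no longer be reported as two files declaring the same module.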
sanitizer_common: Fix build on MIPS with _TIME_BITS=64 (#199590)
When sanitizer_common is built with -D_TIME_BITS=64, the assert on
struct_kernel_stat_sz fails because struct stat has a different size.
[AArch64] Add support for MSVC-style mangling for SVE (#196738)
Fixes #196170
Recent MSVC toolchains added support for AArch64 SVE types and use
dedicated builtin manglings such as `$_CD` instead of the older
artificial `__clang` struct manglings.
Update Clang's Microsoft mangling implementation to match MSVC for
supported SVE builtin types.
Unsupported SVE types continue using the existing artificial tag
mangling until MSVC gains support for them.
Adds representative coverage for:
* scalar SVE types
* tuple/vector SVE types
* fallback manglings for unsupported types
[CodeGenPrepare] splitMergedValStore: don't split atomic stores. (#199592)
splitMergedValStore notices when you do e.g.
z = x | (y << 32)
store z
and may split this up into two 32-bit stores, of x and y, depending
e.g. on the type of x and y.
It skips this optimization for volatile stores, but currently does NOT
skip it for atomics (!!). So an atomic store can be split up into two
(non-atomic!) stores.
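The following sketch models the transformation in plain C++, outside of LLVM. `storeMerged` and `storeSplit` are hypothetical names; the byte offsets are picked at run time from the host's endianness. Both variants leave the same eight bytes in memory, which is why the split is fine for an ordinary store.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// One 64-bit store of z = x | (y << 32).
inline void storeMerged(unsigned char *Dst, uint32_t X, uint32_t Y) {
  uint64_t Z = static_cast<uint64_t>(X) | (static_cast<uint64_t>(Y) << 32);
  std::memcpy(Dst, &Z, sizeof(Z));
}

// The split form: two independent 32-bit stores of x and y.
inline void storeSplit(unsigned char *Dst, uint32_t X, uint32_t Y) {
  // Detect host endianness to decide which half holds the low 32 bits.
  const uint16_t Probe = 1;
  unsigned char FirstByte;
  std::memcpy(&FirstByte, &Probe, 1);
  const bool LittleEndian = FirstByte == 1;
  std::memcpy(Dst + (LittleEndian ? 0 : 4), &X, sizeof(X));
  std::memcpy(Dst + (LittleEndian ? 4 : 0), &Y, sizeof(Y));
}
```

The final memory contents match, but between the two 32-bit stores another thread could observe a value with only one half updated. An atomic 64-bit store rules that out, so the split must be skipped for atomics just as it is for volatile stores.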
This bug was found by a large run of Opus 4.7 looking for bugs in LLVM.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply at anthropic.com>
[RISCV][P-ext] Fold bitcast of v4i8/v2i16 const splat to scalar on RV64 (#199513)
clang lowers `int8x4_t f(int8x4_t a) { return ~a; }` to a
bitcast-wrapped vector xor with splat(-1). v4i8/v2i16 aren't legal on
RV64, so the xor gets scalarized to i64 with the constant still wrapped
in BITCAST:
`i64 = xor X, (bitcast (v8i8 splat -1))`
The scalar `not` PatFrag (xor X, -1) requires a literal constant and
can't see through BITCAST, so XORI -1 (= `not`) misses and we emit `li
-1; xor` (2 insns). The v8i8/v4i16/v2i32 paths stay at vector level and
match the bitcast-aware vector `vnot` td-pat, so they're fine; only the
widened-from-v4i8/v2i16 path falls through to scalar `not`.
Fix it by folding the bitcast of a v4i8/v2i16 constant splat to a scalar
i32 constant pre-legalization. Type promotion sign-extends to i64 -1 and
XORI matches.
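The arithmetic behind the fold can be checked with a small host-side sketch. This is not SelectionDAG code; `splatV4I8AsI64` is a hypothetical helper that mimics "bitcast a v4i8 splat to i32, then sign-extend to i64".

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret four identical i8 lanes as one i32 (the bitcast), then
// sign-extend the result to i64 (what type promotion does on RV64).
// Because all lanes are equal, the result does not depend on endianness.
inline int64_t splatV4I8AsI64(int8_t Elt) {
  int8_t Lanes[4] = {Elt, Elt, Elt, Elt};
  int32_t Scalar;
  std::memcpy(&Scalar, Lanes, sizeof(Scalar));
  return static_cast<int64_t>(Scalar);
}
```

For a splat of -1 the scalar is the i64 constant -1, so `xor X, C` is exactly `not X`, which is the form the scalar `not` pattern needs in order to match.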
[X86] FixupInstTuning: ProcessShiftLeftToAdd should return true after mutating. (#199589)
I think this is almost NFC, though it should affect some of the
compilation statistics like "number of instrs changed by X pass".
This bug was found by a large run of Opus 4.7 looking for bugs in LLVM.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply at anthropic.com>
[SelectionDAG] Handle CSE in PromoteIntOp_VP_STRIDED. (#199562)
If UpdateNodeOperands triggers CSE, we need to handle result
replacement ourselves because a strided load has 2 results.
[DA] Consolidate accumulating GCD functions (NFCI) (#197936)
This patch consolidates two functions `accumulateCoefficientsGCD` and
`analyzeCoefficientsForGCD` by merging the latter into the former. These
two functions are very similar, and keeping both of them does not make
much sense.