[CodeGenPrepare] splitMergedValStore: don't split atomic stores. (#199592)
splitMergedValStore notices when you do e.g.
z = x | (y << 32)
store z
and may split this up into two 32-bit stores, of x and y, depending
e.g. on the type of x and y.
It skips this optimization for volatile stores, but currently does NOT
skip it for atomics (!!). So an atomic store can be split up into two
(non-atomic!) stores.
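To make the hazard concrete, here is a minimal C++ sketch of the merged value and of the two halves that the split form would store separately. It is not the CodeGenPrepare code; `mergeHalves`, `lowHalf` and `highHalf` are hypothetical names used only for illustration.

```cpp
#include <cstdint>

// Hypothetical helper: builds z = x | (y << 32) from two 32-bit values.
constexpr uint64_t mergeHalves(uint32_t x, uint32_t y) {
  return static_cast<uint64_t>(x) | (static_cast<uint64_t>(y) << 32);
}

// The two values that the split form would write with separate 32-bit stores.
constexpr uint32_t lowHalf(uint64_t z) { return static_cast<uint32_t>(z); }
constexpr uint32_t highHalf(uint64_t z) {
  return static_cast<uint32_t>(z >> 32);
}
```

A single atomic 64-bit store publishes both halves at once. Two independent 32-bit stores leave a window in which a concurrent reader can see the new x together with the old y, which is why the split must not be applied to atomic stores.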
This bug was found by a large run of Opus 4.7 looking for bugs in LLVM.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply at anthropic.com>
[RISCV][P-ext] Fold bitcast of v4i8/v2i16 const splat to scalar on RV64 (#199513)
clang lowers `int8x4_t f(int8x4_t a) { return ~a; }` to a
bitcast-wrapped vector xor with splat(-1). v4i8/v2i16 aren't legal on
RV64, so the xor gets scalarized to i64 with the constant still wrapped
in BITCAST:
`i64 = xor X, (bitcast (v8i8 splat -1))`
The scalar `not` PatFrag (xor X, -1) requires a literal constant and
can't see through BITCAST, so XORI -1 (= `not`) misses and we emit `li
-1; xor` (2 insns). The v8i8/v4i16/v2i32 paths stay at vector level and
match the bitcast-aware vector `vnot` td-pat, so they're fine; only the
widened-from-v4i8/v2i16 path falls through to scalar `not`.
Fix it by folding the bitcast of a v4i8/v2i16 constant splat to a scalar
i32 constant pre-legalization. Type promotion sign-extends to i64 -1 and
XORI matches.
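The arithmetic behind the fold can be checked with a small C++ sketch. It only models the bit patterns and is not the SelectionDAG code; `splatV4I8` and `promoteToI64` are hypothetical helpers, and the sign-extension step assumes the usual two's-complement conversion.

```cpp
#include <cstdint>

// Hypothetical helper: replicate one 8-bit element across four lanes and view
// the result as a 32-bit scalar, as a bitcast of a v4i8 splat would.
constexpr uint32_t splatV4I8(uint8_t elt) { return 0x01010101u * elt; }

// Hypothetical helper: model promotion of the i32 constant to i64 by
// sign extension.
constexpr int64_t promoteToI64(uint32_t bits) {
  return static_cast<int64_t>(static_cast<int32_t>(bits));
}
```

With a splat of -1 (0xFF in each lane) the scalar is 0xFFFFFFFF, which sign-extends to the i64 constant -1, so `x ^ promoteToI64(splatV4I8(0xFF))` equals `~x`.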
[X86] FixupInstTuning: ProcessShiftLeftToAdd should return true after mutating. (#199589)
I think this is almost NFC, though it should affect some of the
compilation statistics like "number of instrs changed by X pass".
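As an illustration of the convention involved, the sketch below shows a rewrite helper that reports whether it changed anything and a driver that accumulates that result. It is a simplified model, not the X86 pass: the `Opcode` enum, the lowercase `processShiftLeftToAdd`, and `runOnInstructions` are hypothetical, and the assumption that the rewrite turns a shift-left-by-one into an add is inferred from the function name.

```cpp
#include <vector>

// Hypothetical stand-in for a machine instruction's opcode.
enum class Opcode { ShlBy1, AddSelf, Other };

// Rewrites a shift-left-by-one into x + x. Returns true only when the
// instruction was actually mutated.
bool processShiftLeftToAdd(Opcode &Op) {
  if (Op != Opcode::ShlBy1)
    return false;
  Op = Opcode::AddSelf;
  return true; // Reporting the mutation keeps the caller's bookkeeping right.
}

// Hypothetical driver: the overall result is the OR of the per-instruction
// results, so a helper that forgets to return true under-reports changes.
bool runOnInstructions(std::vector<Opcode> &Insts) {
  bool Changed = false;
  for (Opcode &Op : Insts)
    Changed |= processShiftLeftToAdd(Op);
  return Changed;
}
```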
This bug was found by a large run of Opus 4.7 looking for bugs in LLVM.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply at anthropic.com>
[SelectionDAG] Handle CSE in PromoteIntOp_VP_STRIDED. (#199562)
If UpdateNodeOperands triggers CSE, we need to handle result
replacement ourselves because strided load has 2 results.
[DA] Consolidate accumulating GCD functions (NFCI) (#197936)
This patch consolidates two functions `accumulateCoefficientsGCD` and
`analyzeCoefficientsForGCD` by merging the latter into the former. These
two functions are very similar, and keeping both of them does not make
much sense.
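For readers unfamiliar with the idea, accumulating a GCD over a set of coefficients can be sketched as below. This is not the DependenceAnalysis implementation or its signature; `accumulateGCD` is a hypothetical standalone helper built on `std::gcd` from C++17.

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper: fold std::gcd over all coefficients. Starting from 0
// works because gcd(0, c) == |c|, so an empty input yields 0.
int64_t accumulateGCD(const std::vector<int64_t> &Coeffs) {
  int64_t G = 0;
  for (int64_t C : Coeffs)
    G = std::gcd(G, C);
  return G;
}
```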
[mlir][IR] Require token producer and consumer traits
Add marker traits for operations that intentionally produce or consume the
builtin token type. The verifier now rejects token results without
TokenProducerTrait, token operands without TokenConsumerTrait, token entry
block arguments whose parent op does not produce tokens, and token block
arguments outside entry blocks.
Extend the Test dialect token ops to cover valid opt-in cases and each
verifier rejection path.
Assisted-by: Codex
[mlir][async] Lazily create the coroutine destroy-cleanup block
`setupCoroMachinery` previously emitted a `cleanupForDestroy` block
unconditionally, alongside the normal `cleanup` block. That block is
only ever used as the "destroy" successor of an `async.coro.suspend`,
so for coroutines that never suspend (e.g. an `async.func` whose body
contains no `async.await`) it ended up unreachable in the lowered CFG.
Make `cleanupForDestroy` mirror the existing `setError` pattern and
materialize it lazily via a new `setupCleanupForDestroyBlock` helper,
called only from the two places (`outlineExecuteOp` and the
`async.await` lowering) that actually wire it up. Store the coroutine
id on `CoroMachinery` so the helper can rebuild the block contents
without keeping the original `async.coro.id` op around.
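The lazy-materialization pattern described above can be sketched in plain C++ as follows. It does not use MLIR types: `CoroMachinerySketch`, its fields, and `setupCleanupForDestroy` are hypothetical stand-ins, with a string playing the role of the block.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for CoroMachinery: the id is stored so the block can
// be built later, and the block itself stays empty until first requested.
struct CoroMachinerySketch {
  int coroId;
  std::optional<std::string> cleanupForDestroy;
  int blocksBuilt = 0;
};

// Builds the destroy-cleanup block on first use and returns the cached one on
// later calls, so coroutines that never suspend never get the block.
std::string &setupCleanupForDestroy(CoroMachinerySketch &Coro) {
  if (!Coro.cleanupForDestroy) {
    Coro.cleanupForDestroy =
        "cleanupForDestroy(id=" + std::to_string(Coro.coroId) + ")";
    ++Coro.blocksBuilt;
  }
  return *Coro.cleanupForDestroy;
}
```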
Assisted-by: Opus 4.7
[InstrProf] Do not emit metadata for zero values with zero counts (#199380)
If we have an indirect call site with a profile that has VP information
for said callsite that only contains zero values with zero counts, we
would start to emit invalid profile information after
1d146967d51ba76b8379d9e12961aa23e5745701. VP metadata in this case is at
best redundant with BFI, so we emit the metadata only when there are
enough values for the VP metadata to be valid.
Revert "[compiler-rt][ASan] Add function copying annotations (#91702)" (#194204)
This reverts commit c76045d9bf3bd1c7a381dc85d1db63a38fd69aa4.
It does not look like this has been used anywhere since it was
implemented. I see no uses of it in LLVM, anywhere in our internal
monorepo, or across the entirety of GitHub outside of other copies of
LLVM tests. Given that, remove it. The intended use case around SSO ASan
string annotations is also likely to be significantly reworked soon.