[mlir][scf] Fix FoldTensorCastOfOutputIntoForallOp write order bug (#189162)
`FoldTensorCastOfOutputIntoForallOp` incorrectly updated the
destinations of `tensor.parallel_insert_slice` ops in the `in_parallel`
block by zipping `getYieldingOps()` with `getRegionIterArgs()`
positionally. This assumed that the i-th yielding op writes to the i-th
shared output, which is not required by the IR semantics. When slices
are written to shared outputs in non-positional order, the
canonicalization would silently reverse the write targets, producing
incorrect output.
Fix by replacing the positional zip with a per-destination check: for
each yielding op's destination operand, if it is a `tensor.cast` result
whose source is one of the new `scf.forall` region iter args (i.e., a
cast we introduced to bridge the type change), replace the destination
with the cast's source directly. This correctly handles all orderings.
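The per-destination check can be illustrated with a small standalone model. This is a sketch only and does not use the MLIR API: `ToyValue`, `ToyInsertSlice`, and `retargetDests` are hypothetical names standing in for SSA values, `tensor.parallel_insert_slice` ops, and the rewrite step, respectively.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model, not the MLIR API. A value is either a plain value (for example
// a new region iter arg) or a cast, in which case CastSource points at the
// value being cast.
struct ToyValue {
  const ToyValue *CastSource = nullptr; // non-null only for casts
};

// Stand-in for a yielding op: it only records its destination operand.
struct ToyInsertSlice {
  const ToyValue *Dest;
};

// For each yielding op, inspect its own destination. If the destination is a
// cast whose source is one of the new iter args, write to that iter arg
// directly. The position of the op in the list is never consulted.
void retargetDests(std::vector<ToyInsertSlice> &YieldingOps,
                   const std::vector<const ToyValue *> &IterArgs) {
  for (ToyInsertSlice &Op : YieldingOps) {
    assert(Op.Dest && "yielding op must have a destination");
    const ToyValue *Src = Op.Dest->CastSource;
    if (!Src)
      continue; // not a cast: leave the destination untouched
    if (std::find(IterArgs.begin(), IterArgs.end(), Src) != IterArgs.end())
      Op.Dest = Src;
  }
}
```

In this model, a first op that writes to a cast of the second iter arg is retargeted to the second iter arg. A positional zip would have pointed it at the first one.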
Add a regression test that exercises the multi-result case where
`parallel_insert_slice` ops write to shared outputs in non-sequential
[4 lines not shown]
[MLIR][SparseTensor] Fix fingerprint changes in SparseFuncAssembler (#188958)
SparseFuncAssembler::matchAndRewrite was calling funcOp.setName(),
funcOp.setPrivate(), and funcOp->removeAttr() directly without notifying
the rewriter, causing "operation fingerprint changed" errors under
MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS.
Wrap all in-place funcOp mutations with rewriter.modifyOpInPlace.
Assisted-by: Claude Code
Fix a failure present with MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS=ON.
Co-authored-by: Claude Sonnet 4.6 <noreply at anthropic.com>
[MLIR][SparseTensor] Fix domination violation in co-iteration for dense iterators (#188959)
In exitWhileLoop, random-accessible (dense) iterators were being located
using whileOp.getResults().back() while the insertion point was still
inside the while loop's after block. This caused a domination violation:
the ADDI created by locate() was inside the after block, but it was
later used (via derefImpl's SUBI) after the while loop exits.
Move the locate() calls for random-accessible iterators to after
builder.setInsertionPointAfter(whileOp), where the while results are
properly in scope.
Fixes 10 failing tests under MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS.
Assisted-by: Claude Code
Co-authored-by: Claude Sonnet 4.6 <noreply at anthropic.com>
[AArch64][NFC] Move `isZExtLoad/isSExtLoad` from `AArch64FastISel` to `AArch64InstrInfo` (#189486)
Move the static helper functions `isZExtLoad` and `isSExtLoad` from
`AArch64FastISel` into `AArch64InstrInfo` so that other passes can reuse
them.
[SandboxVec][VecUtils] Lane Enumerator (#188355)
This patch introduces an iterator that helps us iterate over lane-value
pairs in a range. For example, given a container `(i32 %v0, <2 x i32>
%v1, i32 %v2)` we get:
```
Lane  Value
0     %v0
1     %v1
3     %v2
```
We use this iterator to replace the lane counting logic in
BottomUpVec.cpp.
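
The lane numbering in the example above follows from each value occupying as many lanes as its type has elements, so `%v1` covers lanes 1 and 2 and `%v2` starts at lane 3. A minimal standalone sketch of that counting is shown below. It is not the SandboxVec implementation: `ToyValue` and `enumerateLanes` are hypothetical names, and the lane count is stored explicitly instead of being derived from an IR type.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an IR value: a name plus the number of lanes its
// type occupies (1 for a scalar, N for an <N x ty> vector).
struct ToyValue {
  std::string Name;
  unsigned NumLanes;
};

// Returns (starting lane, value) pairs for the given range. The running lane
// counter advances by the lane count of each value visited.
std::vector<std::pair<unsigned, const ToyValue *>>
enumerateLanes(const std::vector<ToyValue> &Vals) {
  std::vector<std::pair<unsigned, const ToyValue *>> Result;
  unsigned Lane = 0;
  for (const ToyValue &V : Vals) {
    assert(V.NumLanes >= 1 && "every value occupies at least one lane");
    Result.emplace_back(Lane, &V);
    Lane += V.NumLanes;
  }
  return Result;
}
```

Calling `enumerateLanes` on three values with 1, 2, and 1 lanes yields starting lanes 0, 1, and 3, matching the table in the commit message.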