[X86][APX] Remove NF entries in X86CompressEVEXTable (#189308)
NF (No-Flags) instructions should not compress to non-NF instructions,
as this would incorrectly modify flags behavior. The compression table
is only intended for encoding optimizations that preserve semantics.
This removes the incorrect NF entries that could have led to
miscompilation if the compression logic were applied.
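The hazard can be sketched in plain C++ (all names here are illustrative, not LLVM's): an APX `{nf}` add leaves the flags untouched, while the legacy encoding updates them, so compressing one into the other clobbers flag state a later branch may still read.

```cpp
#include <cassert>
#include <cstdint>

struct Flags { bool zero = false; };

// Hypothetical model of an APX {nf} add: no flag updates.
int64_t addNF(int64_t a, int64_t b, Flags &f) { return a + b; }

// Hypothetical model of the legacy add: updates ZF as a side effect.
int64_t addLegacy(int64_t a, int64_t b, Flags &f) {
  int64_t r = a + b;
  f.zero = (r == 0);
  return r;
}
```

A consumer that set ZF via an earlier compare and still expects it after the addition sees different behavior from the two forms, which is exactly why the table entries were wrong.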
[CodeGenPrepare][NFC] Reland: Update the dominator tree instead of rebuilding it (#179040)
The original differential revision is https://reviews.llvm.org/D153638
Reverted in
https://github.com/llvm/llvm-project/commit/f5b5a30858f32e237636acd296b6d0f87c1dfe97
because it caused a clang crash.
This patch relands it with the crash fixed: call `DTU->flush()` in each
iteration of the `while (MadeChange)` loop to flush all pending BasicBlock
deletions and prevent iterator invalidation.
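A minimal standalone sketch of that flush discipline (the `DeferredDeleter` here is a hypothetical stand-in for LLVM's `DomTreeUpdater`, which likewise queues block deletions until `flush()`):

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for DomTreeUpdater: deletions are queued and only
// applied when flush() is called.
struct DeferredDeleter {
  std::vector<int> pending;
  void applyDelete(int id) { pending.push_back(id); }
  void flush(std::set<int> &blocks) {
    for (int id : pending)
      blocks.erase(id);
    pending.clear();
  }
};

// Delete all odd-numbered "blocks", flushing once per outer iteration so
// the next pass never walks blocks already scheduled for deletion.
std::set<int> simplifyAll(std::set<int> blocks) {
  DeferredDeleter dtu;
  bool madeChange = true;
  while (madeChange) {
    madeChange = false;
    std::vector<int> snapshot(blocks.begin(), blocks.end());
    for (int id : snapshot) {
      if (id % 2 == 1) {
        dtu.applyDelete(id);
        madeChange = true;
      }
    }
    dtu.flush(blocks); // mirrors calling DTU->flush() each iteration
  }
  return blocks;
}
```

If the flush were deferred past the loop instead, the next iteration could iterate over blocks that are logically dead, which is the iterator-invalidation shape the reland avoids.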
[RISCV] Check hasVInstructions() rather than hasStdExtZbb() for UMAX/UMIN/SMAX/SMIN combines. (#189506)
The combines are related to combining min/max with vector reductions. I
don't think it matters if Zbb is enabled.
I did not merge this with the other hasVInstructions() checks because I
have a P extension patch coming after this that will need them separate.
[lldb] Fix copy-paste error in SetPrivateRunLockToRunning (#189322)
SetPrivateRunLockToRunning incorrectly delegated to
SetPrivateRunLockToStopped instead of SetPrivateRunLockToRunning,
causing the private run lock to never transition to the running state on
process resume.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply at anthropic.com>
[CodeGen] Add additional params to `TargetLoweringBase::getTruncStoreAction` (#187422)
The truncating store analogue of #181104.
Adds `Alignment` and `AddrSpace` parameters to
`TargetLoweringBase::getTruncStoreAction` and dependents, and introduces
a `getCustomTruncStoreAction` hook for targets to customize legalization
behavior using this new information.
This change is fully backwards compatible from the target's point of
view, with `setTruncStoreAction` having identical functionality. The
change is purely additive.
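A rough standalone model of the extended query (all names here are illustrative, loosely patterned on `TargetLoweringBase`, not the actual LLVM signatures): the legacy table answer can now be refined by a custom hook that sees the store's alignment and address space.

```cpp
#include <cassert>

enum class LegalizeAction { Legal, Expand, Custom };

// Hypothetical mini target-lowering: the table entry is what
// setTruncStoreAction would have recorded; the custom hook may override it
// using the new Alignment / AddrSpace information.
struct MiniTargetLowering {
  LegalizeAction TableEntry = LegalizeAction::Legal;

  // Example policy: this imaginary target expands under-aligned truncating
  // stores and anything in address space 3.
  LegalizeAction getCustomTruncStoreAction(unsigned AlignBytes,
                                           unsigned AddrSpace) const {
    if (AddrSpace == 3 || AlignBytes < 4)
      return LegalizeAction::Expand;
    return TableEntry;
  }

  LegalizeAction getTruncStoreAction(unsigned AlignBytes,
                                     unsigned AddrSpace) const {
    return getCustomTruncStoreAction(AlignBytes, AddrSpace);
  }
};
```

A target that never overrides the hook sees exactly the table answer, which matches the "purely additive, backwards compatible" claim above.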
[Clang] More aggressively mark this* dead_on_return in destructors (#183347)
Now also mark the `this` pointer `dead_on_return` for classes with a
non-zero number of base classes. We saw only a limited number of failures
internally due to this change, so it does not seem to cause many problems
in real-world deployment.
[AArch64][llvm] Gate some `tlbip` insns with +tlbid or +d128
Change the gating of the `tlbip` instructions containing `*E1IS*`, `*E1OS*`,
`*E2IS*`, or `*E2OS*` so they can be used with either `+tlbid` or `+d128`.
This is because the 2025 Armv9.7-A MemSys specification says:
```
All TLBIP *E1IS*, TLBIP*E1OS*, TLBIP*E2IS* and TLBIP*E2OS* instructions
that are currently dependent on FEAT_D128 are updated to be dependent
on FEAT_D128 or FEAT_TLBID
```
[AArch64][llvm] Separate TLBI-only feature gating from TLBIP aliases (part 2) (#189503)
(This is the change message for 306e86be5, which GitHub unhelpfully discarded
when I enabled "auto-merge" once all tests had passed. It merged my change
but discarded the detailed commit message which I had carefully written(!)
So this is an NFC change, for those in the future looking for the lost message,
which explains the changes in 306e86be5)
-------------
Correct the `TLBI` system operand definitions so the emitted aliases are
generated from a single data table per feature group. The TLBI data is now
written once as anonymous `TLBI<...>` entries inside a defvar list, and we
iterate over it with a foreach to define those entries.
Hopefully this is clearer and more future-proof, since it is a complex set of
interactions between `tlbi`/`tlbip` with `*nXS` variants, and differing gating.
[7 lines not shown]
[NFC][LLVM] Rename `IITDescriptor` fields to conform to LLVM CS (#189448)
Rename fields of `IITDescriptor` to conform to LLVM coding standards
naming conventions.
[libc][NFC] Add LIBC_INLINE and cleanup wchar internals (#188856)
Some of the functions were missing LIBC_INLINE and some of the variable
names were less descriptive than I liked. This PR fixes both and also
cleans up dependencies.
[AArch64][llvm] Separate TLBI-only feature gating from TLBIP aliases (part 2)
(This is the change message for 306e86be5, which GitHub unhelpfully lost
when I enabled "auto-merge" when all tests had passed. It merged my change
and didn't merge the commit message. So this is an NFC change, for those
in the future looking for the lost message, which explains the actual change)
-------------
Correct the `TLBI` system operand definitions so the emitted aliases are
generated from a single data table per feature group. The TLBI data is now
written once as anonymous `TLBI<...>` entries inside a defvar list, and we
iterate over it with a foreach to define those entries.
Hopefully this is clearer and more future-proof, since it is a complex set of
interactions between `tlbi`/`tlbip` with `*nXS` variants, and differing gating.
The gating was incorrect before. The gating is now:
- `FeatureTLB_RMI`, `FeatureRME`, and `FeatureTLBIW` gate only TLBI aliases
[6 lines not shown]
[MLIR][SCF] Fix scf.index_switch lowering to preserve large case values (#189230)
`IndexSwitchLowering` stored case values as `SmallVector<int32_t>`,
which silently truncated any `int64_t` case value larger than INT32_MAX
(e.g. `4294967296` became `0`). The `cf.switch` flag was also created
via `arith.index_cast index -> i32`, losing the upper 32 bits on 64-bit
platforms.
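The truncation is easy to reproduce in isolation (the helper below is purely illustrative, not the pass's actual code): pushing a 64-bit case value through an `int32_t`, like the old `SmallVector<int32_t>` storage or the `index -> i32` cast did, keeps only the low 32 bits.

```cpp
#include <cassert>
#include <cstdint>

// The failure mode: a 64-bit case value stored in a 32-bit slot silently
// drops the high bits, so 2^32 collapses to 0.
int32_t storeAsInt32(int64_t caseValue) {
  return static_cast<int32_t>(caseValue);
}
```

With `APInt` at 64-bit width and an `i64` flag, both the stored values and the switch operand keep their full range.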
Fix: store case values as `SmallVector<APInt>` with 64-bit width, cast
the index argument to `i64`, and use the `ArrayRef<APInt>` overload of
`cf::SwitchOp::create` so the resulting switch correctly uses `i64` case
values and flag type.
Fixes #111589
Assisted-by: Claude Code
[mlir][scf] Fix FoldTensorCastOfOutputIntoForallOp write order bug (#189162)
`FoldTensorCastOfOutputIntoForallOp` incorrectly updated the
destinations of `tensor.parallel_insert_slice` ops in the `in_parallel`
block by zipping `getYieldingOps()` with `getRegionIterArgs()`
positionally. This assumed that the i-th yielding op writes to the i-th
shared output, which is not required by the IR semantics. When slices
are written to shared outputs in non-positional order, the
canonicalization would silently reverse the write targets, producing
incorrect output.
Fix by replacing the positional zip with a per-destination check: for
each yielding op's destination operand, if it is a `tensor.cast` result
whose source is one of the new `scf.forall` region iter args (i.e., a
cast we introduced to bridge the type change), replace the destination
with the cast's source directly. This correctly handles all orderings.
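The two strategies can be contrasted with a small standalone model (the string-based "ops" and "outputs" below are hypothetical stand-ins for yielding ops and shared outputs, not MLIR types):

```cpp
#include <map>
#include <string>
#include <vector>

// i-th entry names the output that the i-th yielding op writes to.
using Writes = std::vector<std::string>;

// Buggy rewrite: pair op i with output i positionally, ignoring which
// output the op actually targets.
Writes rewritePositionally(const Writes &ops,
                           const std::vector<std::string> &outputs) {
  Writes result;
  for (size_t i = 0; i < ops.size(); ++i)
    result.push_back(outputs[i]);
  return result;
}

// Fixed rewrite: keep each op's own destination, only translating it
// through a (here trivial) cast-to-source map.
Writes rewritePerDestination(const Writes &ops,
                             const std::map<std::string, std::string> &castSource) {
  Writes result;
  for (const auto &dest : ops) {
    auto it = castSource.find(dest);
    result.push_back(it != castSource.end() ? it->second : dest);
  }
  return result;
}
```

When the ops write `{out1, out0}` in that order, the positional rewrite hands them `{out0, out1}`, i.e. it swaps the write targets, while the per-destination rewrite leaves each op pointing at the output it meant to write.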
Add a regression test that exercises the multi-result case where
`parallel_insert_slice` ops write to shared outputs in non-sequential
[4 lines not shown]
[MLIR][SparseTensor] Fix fingerprint changes in SparseFuncAssembler (#188958)
SparseFuncAssembler::matchAndRewrite was calling funcOp.setName(),
funcOp.setPrivate(), and funcOp->removeAttr() directly without notifying
the rewriter, causing "operation fingerprint changed" errors under
MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS.
Wrap all in-place funcOp mutations with rewriter.modifyOpInPlace.
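The shape of the fix, modeled standalone (this `MiniRewriter` is a hypothetical stand-in for MLIR's `PatternRewriter`; the counter stands in for the fingerprint bookkeeping the expensive checks rely on): routing the mutation through a callback lets the rewriter observe it instead of having the op change behind its back.

```cpp
#include <cassert>
#include <string>

// Hypothetical rewriter: mutations routed through modifyOpInPlace are
// accounted for, so a later fingerprint comparison does not misfire.
struct MiniRewriter {
  int notifiedChanges = 0;
  template <typename Callable>
  void modifyOpInPlace(std::string &op, Callable fn) {
    ++notifiedChanges; // rewriter is told the op is about to mutate
    fn(op);
  }
};
```

Calling `funcOp.setName()`-style mutators directly would change the op with `notifiedChanges` still at zero, which is the mismatch the expensive-checks mode reports.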
Assisted-by: Claude Code
Fix a failure present with MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS=ON.
Co-authored-by: Claude Sonnet 4.6 <noreply at anthropic.com>