[AMDGPU][Scheduler] Fix non-monotonic SlotIndex after schedule revert (#192039)
modifyRegionSchedule restores the original instruction order by splicing
MIs before RegionEnd. When an MI is already at the expected position
(MII == RegionEnd) its SlotIndex was left unchanged, even though earlier
splices may have shifted neighboring indices. This could leave a stale,
lower-numbered slot on a non-moved MI, breaking SlotIndex monotonicity
and corrupting LiveIntervals.
The corruption surfaced as a "register isn't live" assertion in
GCNDownwardRPTracker when PreRARematStage's finalizeGCNSchedStage
globally reverted regions that were already locally reverted by
checkScheduling.
Fix by calling LIS->handleMove for non-moved MIs whose SlotIndex has
become non-monotonic (PrevIdx >= MI_Idx). Additionally, track whether
checkScheduling already reverted a region and skip the redundant global
revert in finalizeGCNSchedStage.
Assisted-by: Claude Opus
Reapply "[cmake] Add support for statically linking libxml2" (#192088)
This applies a fix for windows not discovering libxml
This reverts commit 2a9c32496b5e8e63844597f638bdf67e4732fd35.
Merge tag 'sound-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound updates from Takashi Iwai:
"Nothing too thrilling here, but we see lots of driver updates and bug
fixes, including quirk additions and refactoring works, while there
have been little changes in the core functionality. Here are some
highlights:
Core:
- Add validation for the control API put callback
- Fixes in compress-offload API timestamp handling
- Continued ASoC core API cleanups
ASoC:
- Add support for bus keepers (for Apple devices in future)
- Enhancements to the SDCA support, including retaskable jacks
- Test improvements for Cirrus Logic drivers
- Lots of fixes for the NXP, nVidia and Qualcomm
- Support for AMD RPL DMIC, Cirrus Logic CS42L43 and CS47L47, nVidia
[40 lines not shown]
5012 mpt_sas fails to issue writes >1MB
Reviewed by: Albert Lee <trisk at forkgnu.org>
Reviewed by: Klaus Ziegler <klausz at haus-gisela.de>
Approved by: Robert Mustacchi <rm+illumos at fingolfin.org>
[MLIR][Vector] Fix WarpOpScfForOp and WarpOpScfIfOp leaving invalid ops after region moves (#188951)
WarpOpScfForOp::matchAndRewrite called mergeBlocks() to move forOp's
body block into the inner WarpOp. mergeBlocks() erases the source block,
leaving forOp with an empty body region (0 blocks). Since scf.for
requires exactly 1 body block, IR verification fails with "region with 1
blocks" after the pattern succeeds. Additionally, when forOp had no init
args, the pattern was missing the scf.yield terminator in the new ForOp.
WarpOpScfIfOp::matchAndRewrite had the same issue: takeBody() emptied
the ifOp's then/else regions, leaving scf.if with 0 blocks.
Fix:
- Restore the conditional scf.yield creation (only when newForOp has
results).
- After merging/taking the regions, replace the remaining op's results
with ub.poison and erase the now-invalid op from the new WarpOp's body.
Assisted-by: Claude Code
Fix a failure present with MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS=ON.
[mlir][OpenMP] Fix teams reduction assert with sibling distributes (#191475)
Avoid double-mapping teams reduction block arguments when multiple
sibling `omp.distribute` ops appear under the same `omp.teams`.
Replace `teamsReductionContainedInDistribute` with
`getDistributeCapturingTeamsReduction`, which returns the unique
`omp::DistributeOp` containing all non-debug uses of the teams reduction
block arguments, or null if no such distribute exists. In that case,
reduction setup falls back to the teams op.
This ensures only the owning distribute handles reduction setup and
prevents `moduleTranslation.mapValue()` from being called multiple times
for the same block arguments.
Fixes https://github.com/llvm/llvm-project/issues/191470
Assisted with Copilot and GPT-5.4
[CIR][OpenMP] Add OpenMP-to-LLVM type conversion for CIR-to-LLVM lowering (#190063)
Register OpenMP conversion legality and patterns in the CIR-to-LLVM pass
so that OpenMP operations (e.g. omp.map.info, omp.target) have their CIR
types converted to LLVM types during lowering. Without this,
the conversion leaves behind unrealized_conversion_cast ops that cause
translation to LLVM IR to fail.
Also registers omp::PointerLikeType on cir::PointerType so that CIR
pointers are accepted as operands in OpenMP map operations.
Assised-by: Cursor / Claude Opus 4.6
[CIR] Handle full expression cleanups in conditional branches (#191479)
This adds CIR support for handling full expression cleanups in
conditional branches. Because CIR uses structured control flow, it was
necessary to handle these cleanups differently than is done in classic
codegen. CIR speculatively creates a cleanup scope when an
ExprWithCleanups contains a conditional operator and maintains a
dedicated stack of these deferred cleanups, which is added to the
cleanup scope at the end of the full expression with an active flag used
to control whether the cleanup should be executed based on any branches
that may have been taken during the conditional expression evaluation.
This is similar to the mechanism used for lifetime extended cleanups,
but the timing of when the cleanups are moved to the main EH stack is
different, so we need to maintain two different pending cleanup stacks.
We are able to use the same PendingCleanupEntry class for both.
Assisted-by: Cursor / claude-4.6-opus-high
[CIR] Handle irregular partial init destroy (#192158)
This implements CIR handling for the case where an array of a destructed
type is being initialized using an initializer list and therefore the
address of the last successfully constructed element must be loaded from
a temporary variable before the partial array destroy loop can begin.
In classic codegen this shares its implementation with the general array
ctor/dtor handling, but in CIR we have dedicated operations to abstract
array ctors/dtors, so the implementation of the irregular partial
destruction happens in a different place in codegen.
Assisted-by: Cursor / claude-4.6-opus-high
arm64: Define the .iplt section placement.
Ensure that the .plt and .ipld sections are in the executable memory segment.
MFC after: 1 week
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D56403