[MLIR][XeGPU] XeGPU Layout adds support for fractional-subgroup-size vector (#183434)
This PR enhances layout assignment for XeGPU load/store operations to
handle vectors smaller than the subgroup size. For example, a vector[4]
is assigned lane_data=[1], lane_layout=[4], and inst_data=[4].
Fractional-subgroup-size vector support is required for the
cross-subgroup reduction case. The number of subgroups participating in
a reduction can be small, so each subgroup ends up reducing a small
vector, often a fraction of the subgroup size.
Most layout-based subgroup distribution patterns support
fractional-subgroup-size vectors without change. The exceptions are
reduction, insert/extract, and constant. ND operations (such as
load_nd/store_nd/dpas) are not expected to accept
fractional-subgroup-size vectors.
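A minimal sketch of the layout rule implied by the vector[4] example. `Layout1D` and `assignLayout1D` are hypothetical names, not XeGPU APIs, and the rule itself (one element per lane, never more lanes than elements) is an assumption extrapolated from that single example:

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for a 1-D layout; not the real XeGPU attribute class.
struct Layout1D {
  int64_t laneLayout; // number of lanes that hold data
  int64_t laneData;   // contiguous elements owned by each lane
  int64_t instData;   // elements covered by one instruction
};

// Assumed rule: one element per lane, and no more lanes than elements,
// so a vector smaller than the subgroup only occupies some of the lanes.
inline Layout1D assignLayout1D(int64_t vectorSize, int64_t subgroupSize) {
  int64_t lanes = std::min(vectorSize, subgroupSize);
  return Layout1D{lanes, /*laneData=*/1, /*instData=*/lanes};
}
```

Under this assumption, `assignLayout1D(4, 16)` yields lane_layout=[4], lane_data=[1], inst_data=[4], matching the example in the description.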
Revert "[mlir-tblgen] Remove `namespace {}` around OpDocGroup (#182721)" (#183458)
Reverts #182721, which is no longer needed after #183457.
It was a workaround for #182720.
This reverts commit a0f344f69d7eb5d87dd78c628a196a3a7440e792.
[SafeStack] Allow -fsanitize-minimal-runtime with -fsanitize=safestack (#183644)
SafeStack does not require a full sanitizer runtime, so it should be
compatible with the minimal runtime flag.
[mlir][vector] Fix fold result for empty vector.mask with no results (#180345)
This PR fixes `foldEmptyMaskOp` to return `failure` when folding an
empty vector.mask whose terminator has no operands. Previously this case
returned success without producing any folded results, which violates
the folding contract. Fixes #177825.
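The folding contract can be illustrated with a small standalone model. This is a sketch and not the MLIR implementation: `Value` and `foldEmptyMask` are hypothetical, and `std::nullopt` stands in for `failure`:

```cpp
#include <optional>
#include <vector>

// Hypothetical model: a value is just an integer id.
using Value = int;

// Models a fold hook. std::nullopt means "failure"; a vector means
// "success, replace the op's results with these values".
inline std::optional<std::vector<Value>>
foldEmptyMask(const std::vector<Value> &terminatorOperands) {
  // With no terminator operands there is nothing to fold to, so report
  // failure instead of claiming success with no replacement values.
  if (terminatorOperands.empty())
    return std::nullopt;
  return terminatorOperands;
}
```

The point of the model is that a successful fold always carries replacement values, so a caller never sees success with an empty result list.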
[DenseMap] Add memory barrier for sanitizers in getInlineBuckets/getLargeRep (#183457)
Add a compiler memory barrier to prevent optimizations from triggering
false positives on partially poisoned buckets in (HW)ASan.
Fixes #182720.
Pull Request: https://github.com/llvm/llvm-project/pull/183457
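A minimal sketch of a compiler-only memory barrier placed in a storage accessor. `SmallStore` is a hypothetical container, not `llvm::DenseMap`, and `std::atomic_signal_fence` is used here as one standard way to express such a barrier; the actual patch may use a different mechanism:

```cpp
#include <atomic>

// Hypothetical container with inline storage; not llvm::DenseMap.
struct SmallStore {
  int inlineBuckets[4] = {0, 0, 0, 0};

  int *getInlineBuckets() {
    // Compiler-only fence: it issues no hardware fence instruction, but
    // the compiler may not reorder memory accesses across it.
    std::atomic_signal_fence(std::memory_order_seq_cst);
    return inlineBuckets;
  }
};
```

Because the fence only constrains the compiler, it should cost nothing at run time beyond the optimizations it blocks around the accessor.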
[Sema] Fix crash on invalid operator template-id (#181404)
Add checks in GetNameFromUnqualifiedId to handle invalid TemplateId
cases safely. This avoids a crash when handling an invalid template-id
during error recovery and allows normal error reporting to continue.
Fixes #177549
[LoopUnrollAndJam] Update test unroll-and-jam.ll (NFC) (#183520)
The test `unroll-and-jam.ll` has the following issues:
- Some functions use `%i` and `%I` as variable names, which UTC fails to
distinguish, causing it to update the assertions incorrectly.
- Some tests use parameters for loop bounds, which means they will start
failing in the near future due to the ongoing changes in DA.
To address these issues, this patch updates the test as follows:
- Renames certain variables to avoid the naming conflict.
- For the tests that will be affected by the DA changes, adds variants
with constant loop bounds.
[MLIR][Presburger][NFC] Don't add empty regions when unioning PWMA functions (#182468)
This will prevent exponential behaviour in lexicographic maximum
computation, where the `tiebreak` predicate is very likely to return
empty regions.
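The idea can be sketched on a toy piecewise function over integer intervals. `Piece`, `PiecewiseFn`, and `unionInPlace` are hypothetical and much simpler than the Presburger library's types; the sketch only shows empty regions being dropped at union time:

```cpp
#include <vector>

// Hypothetical piece: a half-open interval [lo, hi) mapped to a constant.
struct Piece {
  int lo, hi, value;
  bool isEmpty() const { return lo >= hi; }
};

using PiecewiseFn = std::vector<Piece>;

// Appends the pieces of rhs to lhs, skipping pieces whose domain is empty
// so that later operations never have to iterate over them.
inline void unionInPlace(PiecewiseFn &lhs, const PiecewiseFn &rhs) {
  for (const Piece &p : rhs)
    if (!p.isEmpty())
      lhs.push_back(p);
}
```

If every union kept its empty pieces, repeated unions would grow the piece count without adding information, which is the kind of blow-up the commit describes.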
[LoopUnrollAndJam] Update test dependencies.ll (NFC) (#183509)
Ongoing work to fix correctness issues in DA will affect some existing
regression tests for passes that rely on it. As a result, the original
intent of several tests would be lost.
This patch updates `dependencies.ll` to avoid such issues and preserve
its intent. Specifically, it changes the loop bounds from parameters to
constants, which allows SCEV to infer no-wrap flags for the addrecs. It
also makes other minor fixes to the test, such as adding pseudocode and
removing some `nuw` flags to avoid UB.
[CIR] Remove branch through cleanup fixups (#182953)
Because we are using a structured representation of cleanups in CIR, we
don't need to handle branching through cleanups during codegen. These
branches are created during CFG flattening instead. However, we had
already committed some code that copied the classic codegen behavior for
branching through cleanups. This change deletes that unneeded code.
The most significant change here is that when we encounter a return
statement we emit the return directly in the current location.
The coroutine implementation still creates a return block in the current
lexical scope and branches to that block. Cleaning up that
representation is left as future work.
The popCleanupBlock handling still has a significant amount of logic
that is carried over from the classic codegen implementation. It is left
in place until we can be sure we won't need it.
[MLIR][Python] Support op adaptor for Python-defined operations (#183528)
Previously, in #177782, we added support for dialect conversion and
generated an `OpAdaptor` subtype for every ODS-defined operation. In
this PR, we also generate `OpAdaptor` subtypes for Python-defined
operations, so that they can be applied in dialect conversion as well.
[mlir][arith-to-spirv] Fix null dereference when converting trunci/extui with tensor types (#183654)
`getScalarOrVectorConstInt` only handles `VectorType` and `IntegerType`,
returning `nullptr` for any other type (e.g., a `RankedTensorType` that
slips through after type emulation maps `tensor<Nxi16>` to
`tensor<Nxi32>` with the same destination type). The callers in
`TruncIPattern` and `ExtUIPattern` passed this null value directly to
`spirv::BitwiseAndOp::create`, causing a null-pointer dereference in
`OperandStorage`.
Similarly, the signed-extension pattern passes the result of
`getScalarOrVectorConstInt` as a shift amount to
`ShiftLeftLogicalOp::create` without a null check.
Add `if (!mask)` / `if (!shiftSize)` guards that return a match
failure in all three cases, converting the crash into a proper
legalization failure.
Fixes #178214
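A standalone sketch of the guard pattern described above. `TypeKind`, `makeConstInt`, and `rewriteTrunc` are hypothetical stand-ins for the MLIR types and patterns, with `std::optional` modelling the null result and `false` modelling a match failure:

```cpp
#include <optional>

enum class TypeKind { Integer, Vector, Tensor };

// Hypothetical helper modelled on the description of
// getScalarOrVectorConstInt: it handles only integer and vector types and
// returns an empty result for anything else.
inline std::optional<int> makeConstInt(TypeKind kind, int value) {
  if (kind == TypeKind::Integer || kind == TypeKind::Vector)
    return value;
  return std::nullopt;
}

// Returns false (a "match failure") instead of using a missing constant.
inline bool rewriteTrunc(TypeKind kind, int input, int &result) {
  std::optional<int> mask = makeConstInt(kind, 0xFFFF);
  if (!mask)
    return false; // unsupported type: bail out before touching the mask
  result = input & *mask;
  return true;
}
```

With the guard in place, an unsupported type makes the pattern decline to match rather than pass an invalid operand downstream.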