[Polly] Avoid __builtin_assume circular context reasoning (#189350)
The conversion of SCEVs to isl::pw_aff may only be valid under
conditions that have to be confirmed via RTC. This also happens with
__builtin_assume. These user-added assumptions are then added to
ScopInfo::Context. However, the conclusion in ScopInfo::Context is then
also used to simplify ("gist") its own RTC preconditions in
ScopInfo::AssumedContext and ScopInfo::InvalidContext away.
Avoid this by adding user assumptions with preconditions to
ScopInfo::DefinedBehaviourContext instead, which is not used to simplify
AssumedContext/InvalidContext.
Fixes #187922
Thanks @thapgua for the report
[BasicBlockUtils] Fixed LoopInfo update in UpdateAnalysisInformation() (#177147)
SplitLandingPadPredecessors() results in an irreducible loop
and makes LoopInfo invalid. Verification results in a crash:
Assertion `CB != OutsideLoopPreds[i] && "Loop has multiple entry
points!"' failed.
Created a new test with a broken LoopInfo after
SplitLandingPadPredecessors().
After SplitBlockPredecessors(catch_dest, { loop }, "", DT, LI), the test
@split-lp-predecessors-test() changes to the following IR, where the
loop {%catch_dest} turns into the irreducible loop
{%catch_dest.split-lp, %catch_dest}:
```
define void @split-lp-predecessors-test() personality ptr null {
entry:
invoke void @foo()
to label %loop unwind label %catch_dest.split-lp
[32 lines not shown]
[InstCombine] Fold X * ldexp(1.0, Y) -> ldexp(X, Y). (#188493)
This would avoid the FMUL in sequences such as
[these](https://godbolt.org/z/xhqfe5sb1).
[clang][x86] Fix the return type of the cvtpd2dq builtin (#189254)
The CVTPD2DQ instruction converts packed 64-bit floating-point values to
packed 32-bit signed integer values. This patch fixes the return type of
the corresponding builtin, which previously returned a vector of two
64-bit signed integers. The new behavior is in line with the return type
of the CVTTPD2DQ builtin.
[DAG] Fix incorrect ForSigned handling in computeConstantRange calls (#188889)
Fix two places where ForSigned was incorrectly passed to
computeConstantRange, causing wrong signed/unsigned range computation.
In computeConstantRangeIncludingKnownBits (DemandedElts overload),
the call omitted ForSigned, so Depth (unsigned) was implicitly
converted to bool for the ForSigned parameter. Introduced in
a6a66a4e6915.
In visitIMINMAX, the call always passed ForSigned=false, even when
folding SMAX/SMIN which query signed bounds from the resulting range.
[MLIR][Vector] Fix direct operand.set() bypassing rewriter in WarpOpScfIfOp/ForOp (#188948)
In WarpOpScfIfOp and WarpOpScfForOp, the walk that updates users of
escaping values (after moving them to the inner WarpOp) was calling
operand.set() directly, bypassing the rewriter API. This causes the
MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS fingerprint check to fail.
Fix by wrapping the operand updates with rewriter.modifyOpInPlace().
Assisted-by: Claude Code
Fix a failure present with MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS=ON.
[MLIR][MPI] Fix direct getRefMutable().assign() bypassing rewriter in FoldCast (#188943)
The FoldCast canonicalization pattern was calling
op.getRefMutable().assign(src) directly, bypassing the rewriter. This
violates the pattern API contract and causes fingerprint change failures
when MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS is enabled. Wrap the
modification with b.modifyOpInPlace() to properly notify the rewriter of
the changes.
Assisted-by: Claude Code
Fix a failure present with MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS=ON.
[AArch64][llvm] Redefine some isns as an alias of `SYS`
Some instructions are not currently defined as an alias of `SYS`
when they should be, so they don't disassemble back into the
native instruction, but instead disassemble into `SYS`.
Fix these cases and add an additional test case.
Note that I've left `GCSPUSHM` unchanged because of its `mayStore`,
`GCSSS1` and `GCSSS2` because they're used in AArch64ISelDAGToDAG.cpp,
and `GCSPOPM` because it has an intrinsic pattern in AArch64InstrInfo.td.
They will still disassemble correctly, as they use `InstAlias`.
[DA] Consolidate the logic for checking overlap at the boundary (NFCI) (#189341)
In the Weak Crossing SIV test, there were two places where we checked
the dependency at the boundary: one at the first iteration and the
other at the last iteration. Now the former can be merged into the
latter. There used to be an edge case when the coefficient is zero, and
we had an explicit check for that. This patch removes that check as
well, by moving the boundary check after the assertion that ensures the
(maybe negated) coefficient is positive.
`#pragma redefine_extname`: warn only if conflicting ID is at TU scope. (#188256)
As an example, this should keep warning:
```
static void foo();
```
because here, the identifier `foo` won't be affected. In fact, it now
becomes (mostly) impossible to even declare anything later that would
get affected, thus the new definition is in active conflict with the
`#pragma`.
This, however, will no longer warn:
```
namespace blargh {
static void foo();
}
[30 lines not shown]
Serialize `#pragma redefine_extname` into precompiled headers. (#186755)
Also deserialize them back again on reading.
The implementation is based on the existing implementation of `#pragma
weak` serialization.
Fixes issue #186742.
---------
Co-authored-by: Chuanqi Xu <yedeng.yd at linux.alibaba.com>
clang: Return Triple from OffloadArchToTriple instead of a string
Also stop bothering to call normalizeOffloadTriple. The triple is
produced by code which should always produce normalized triples.
[DA] Stop negating Delta in the Weak Zero SIV test (#188212)
This patch removes the variable `NewDelta`, which was calculated as the
negation of `Delta`, along with its uses. `NewDelta` is now referenced
in only one place, and that code is effectively dead because more
general analysis with ConstantRange is performed at an earlier stage.
Also, the test using `NewDelta` is not correct when `Delta` is a signed
minimum value, as negating it yields the same value as the original.
This patch also fixes the correctness issue in such a situation.
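The signed-minimum corner case can be illustrated outside of LLVM. The following is a minimal sketch, assuming 64-bit two's-complement values; `wrappingNeg` and `negationIsSafe` are hypothetical helper names and are not part of DependenceAnalysis.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: two's-complement negation computed in unsigned
// arithmetic, so the wrap-around is well defined instead of being UB.
inline int64_t wrappingNeg(int64_t X) {
  return static_cast<int64_t>(uint64_t{0} - static_cast<uint64_t>(X));
}

// Hypothetical helper: negation changes the sign as expected for every
// value except the signed minimum, which maps back to itself.
inline bool negationIsSafe(int64_t X) {
  return X != std::numeric_limits<int64_t>::min();
}
```

For ordinary values `wrappingNeg` flips the sign, but for the signed minimum the result is still negative, so any reasoning of the form "the negated value is positive" breaks down in exactly that case.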
[mlir][affine] Add ValueBounds-based simplification for delinearize(linearize) pairs (#187245)
Existing patterns for `affine.delinearize_index` of `affine.linearize_index` pairs
(`CancelDelinearizeOfLinearizeDisjointExactTail`) only match when basis
elements are exactly equal as `OpFoldResult` values. This means they
cannot simplify cases where dynamic basis products are semantically
equal but represented by different SSA values or affine expressions.
This patch adds a new pass `affine-simplify-with-bounds` with two
rewrite patterns that use `ValueBoundsConstraintSet` to prove equality
of basis products:
- **`SimplifyDelinearizeOfLinearizeDisjointManyToOneTail`**: matches
when multiple consecutive linearize dimensions have a product equal to a
single delinearize dimension (many-to-one).
- **`SimplifyDelinearizeOfLinearizeDisjointOneToManyTail`**: matches
when a single linearize dimension equals the product of multiple
consecutive delinearize dimensions (one-to-many).
[8 lines not shown]
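The two cases can be checked by hand on plain integers. The following is a minimal sketch of the arithmetic only, assuming row-major linearization with in-bounds ("disjoint") indices; `linearize` and `delinearize` are hypothetical stand-ins for the affine ops, not the MLIR API, and the outermost basis element is treated as a bound that does not affect the result.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of affine.linearize_index: row-major flattening of
// Idx with respect to Basis. Basis[0] only bounds Idx[0].
int64_t linearize(const std::vector<int64_t> &Idx,
                  const std::vector<int64_t> &Basis) {
  int64_t Result = 0;
  for (std::size_t D = 0; D < Idx.size(); ++D)
    Result = Result * Basis[D] + Idx[D];
  return Result;
}

// Hypothetical model of affine.delinearize_index: inverse of linearize
// for in-bounds inputs. The outermost result takes whatever remains.
std::vector<int64_t> delinearize(int64_t Linear,
                                 const std::vector<int64_t> &Basis) {
  std::vector<int64_t> Idx(Basis.size());
  for (std::size_t D = Basis.size(); D-- > 1;) {
    Idx[D] = Linear % Basis[D];
    Linear /= Basis[D];
  }
  if (!Idx.empty())
    Idx[0] = Linear;
  return Idx;
}
```

With a linearize basis of (5, 2, 3) and a delinearize basis of (5, 6), the trailing product 2 * 3 equals 6, so the last delinearize result is just the linearization of the last two indices (many-to-one). Swapping the two bases gives the one-to-many case, where the last index is split by the trailing basis (2, 3).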
[mlir][tosa] Harden folds/canonicalizations for unranked and dynamic shapes (#188188)
This MR fixes #188187 and #187974. Tighten TOSA constant folding and
identity-style folds so they do not produce invalid or type-incorrect
results when the op’s result type is unranked, rank-dynamic, or
otherwise not a static `RankedTensorType`. Several paths previously
assumed ranked/static shapes or folded through to the operand without
checking that the result type matched the value being returned.
`DenseElementsAttr::get`, `SplatElementsAttr::get` and similar builders
need a static shape; folding with `tensor<*xT>` or dynamic dims must not
fabricate dense attributes with the wrong shape.
Returning the operand from a “no-op” fold is only valid when
`operand.getType() == op.getType()`; otherwise the folder would change
the IR’s type semantics (e.g. ranked → unranked), which in the bigger
pipeline is supposed to be handled by `-tosa-infer-shapes`.
Assisted-by: CLion code completion, GPT 5.3 - Codex
[3 lines not shown]
[DA] Fix overflow of calculation in weakCrossingSIVtest (#188450)
This patch fixes a correctness issue where integer overflow in the
upper bound calculation of weakCrossingSIVtest caused the pass to
incorrectly prove independence.
The previous logic used `SCEV::getMulExpr` to calculate
`2 * ConstCoeff * UpperBound` and compared it to `Delta` using
`isKnownPredicate`. In the presence of overflow, this could yield
unsafe results.
This change replaces the SCEV arithmetic with `ConstantRange` to
work around calculation overflows, and ensures that we conservatively
assume a dependence if the bounds cannot be proven safe.
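The failure mode can be reproduced with plain 64-bit integers. The following is a minimal sketch, not the code from the patch: it substitutes a checked multiplication for `ConstantRange`, assumes a non-negative coefficient and upper bound, and simplifies the independence condition to `Delta > 2 * Coeff * UpperBound`. All function names are hypothetical.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: multiplication that wraps like fixed-width IR
// arithmetic, done in unsigned to avoid signed-overflow UB.
inline int64_t wrappingMul(int64_t A, int64_t B) {
  return static_cast<int64_t>(static_cast<uint64_t>(A) *
                              static_cast<uint64_t>(B));
}

// Hypothetical helper: multiply two non-negative values, returning
// nullopt if an operand is negative or the product does not fit.
inline std::optional<int64_t> checkedMulNonNeg(int64_t A, int64_t B) {
  if (A < 0 || B < 0)
    return std::nullopt;
  if (A != 0 && B > std::numeric_limits<int64_t>::max() / A)
    return std::nullopt;
  return A * B;
}

// Unsafe variant: the bound may wrap, so the comparison can claim
// independence that does not hold.
inline bool naiveProvesIndependence(int64_t Delta, int64_t Coeff,
                                    int64_t UpperBound) {
  return Delta > wrappingMul(wrappingMul(2, Coeff), UpperBound);
}

// Conservative variant: if the bound cannot be computed exactly,
// report "not proven independent", i.e. assume a dependence.
inline bool conservativeProvesIndependence(int64_t Delta, int64_t Coeff,
                                           int64_t UpperBound) {
  std::optional<int64_t> TwoCoeff = checkedMulNonNeg(2, Coeff);
  if (!TwoCoeff)
    return false;
  std::optional<int64_t> Bound = checkedMulNonNeg(*TwoCoeff, UpperBound);
  if (!Bound)
    return false;
  return Delta > *Bound;
}
```

With a coefficient of 2^61 and an upper bound of 4, the product 2 * 2^61 * 4 wraps to 0, so the naive check accepts a `Delta` of 1 as proof of independence, while the conservative check declines to prove anything.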
---------
Signed-off-by: Ruoyu Qiu <cabbaken at outlook.com>
[lldb] Use AppendMessageWithFormatv instead of AppendMessageWithFormat (#185634)
Part 4. This converts all the remaining simple uses (the ones that ended
with a newline).
What remains in tree are the outliers that expect multiple ending
newlines, or are building a message in pieces.
[VPlan] Generalize noalias-licm-check to replicate regions (NFC) (#187017)
In order to use the cannotHoistOrSinkWithNoAlias check in use-sites
after replicate regions are created, generalize it to work with
replicate regions.