[LV] Add test for cost modeling wide calls with mixed return types (NFC) (#195177)
Add missing test coverage for the case of multiple calls with different
return types.
[VPlan] Dissolve replicate regions with vector live-outs. (#189022)
Remove the scalar VF restriction and properly handle replicate regions
with vector live outs.
After unrolling the replicate regions, we end up with a set of scalar
VPPhis. This patch post-processes them and converts them to
a chain of InsertElement + VPWidenPHIRecipes to match the original codegen
as closely as possible.
An alternative would be to keep the phis scalar and combine them with
BuildVector at the end, but that would result in quite different
codegen.
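The two shapes discussed above can be compared on a toy model. This is only a sketch in plain C++, not VPlan code: `Lane`, `Vec`, `buildByInsertChain` and `buildAtEnd` are hypothetical names, `std::nullopt` stands in for a poison lane, and a per-lane mask stands in for the replicate region's predicate.

```cpp
#include <array>
#include <cstddef>
#include <optional>

constexpr std::size_t VF = 4;
using Lane = std::optional<int>;  // std::nullopt models a poison lane
using Vec = std::array<Lane, VF>;

// Chain form: each lane contributes one insert, and a select between the
// previous vector and the updated one plays the role of the widened phi.
Vec buildByInsertChain(const std::array<bool, VF> &mask,
                       const std::array<int, VF> &scalars) {
  Vec acc{};  // every lane starts out as poison
  for (std::size_t lane = 0; lane < VF; ++lane) {
    Vec inserted = acc;
    inserted[lane] = scalars[lane];     // "insertelement" into this lane
    acc = mask[lane] ? inserted : acc;  // "phi" of old and new vector
  }
  return acc;
}

// Alternative form: keep one scalar phi per lane and combine them once.
Vec buildAtEnd(const std::array<bool, VF> &mask,
               const std::array<int, VF> &scalars) {
  std::array<Lane, VF> scalarPhis{};
  for (std::size_t lane = 0; lane < VF; ++lane)
    if (mask[lane])
      scalarPhis[lane] = scalars[lane];
  Vec result{};
  for (std::size_t lane = 0; lane < VF; ++lane)
    result[lane] = scalarPhis[lane];  // single "build vector" step
  return result;
}
```

Both functions yield the same lanes in this model; they differ only in the shape of the intermediate values, which is the codegen difference the message refers to.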
Now that ::execute for replicate regions is dead, clean up
VPTransformState::Lane and various ::execute that relied on it.
Depends on https://github.com/llvm/llvm-project/pull/186252
PR: https://github.com/llvm/llvm-project/pull/189022
[DebugInfo] Fix crash in declare-to-assign when memcpy writes to scalable-vector alloca (#194107)
## Problem
`declare-to-assign` (`AssignmentTrackingPass`) crashes with a fatal error when a fixed-size `memcpy` writes into a scalable-vector alloca (e.g. an RVV `vint32m1_t`):
Cannot implicitly convert a scalable size to a fixed-width size in TypeSize::operator ScalarTy()
**PS**: Compiler Explorer always adds `-g` implicitly; passing `-g0` makes the crash disappear: https://riscvc.godbolt.org/z/dEqhc4EoE
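The error above comes from treating a size that is a multiple of a runtime `vscale` as a plain integer. A minimal sketch of the distinction, using a hypothetical `ToySize` type and helper names rather than LLVM's `TypeSize` API, and not the change made by this commit (the 64-bit minimum is illustrative):

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a size that may be scalable.
struct ToySize {
  std::uint64_t knownMinBits;  // size in bits when vscale == 1
  bool scalable;               // true: real size is knownMinBits * vscale
};

// The exact size exists only for non-scalable sizes.
std::optional<std::uint64_t> fixedBits(ToySize s) {
  if (s.scalable)
    return std::nullopt;  // caller must handle this instead of converting
  return s.knownMinBits;
}

// Whether a fixed-size write can be compared against the destination size.
bool canDescribeWrite(ToySize destSize, std::uint64_t writeBits) {
  std::optional<std::uint64_t> fixed = fixedBits(destSize);
  return fixed.has_value() && writeBits <= *fixed;
}
```

Under this model, a 128-bit `memcpy` into a scalable destination is reported as "cannot describe" rather than forcing a conversion that has no exact answer.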
**Reproducer** (clang `-target riscv64-unknown-linux-gnu -march=rv64gcv -O1 -g`):
```c
#include <string.h>
#include <riscv_vector.h>

vint32m1_t get_i32x4(int* v) {
  vint32m1_t r;
  memcpy(&r, v, 16);
  return r;
}
```
[13 lines not shown]
[llubi] Fix inconsistent intrinsic argument retrieval (#195499)
This PR fixes inconsistent intrinsic argument retrieval by making all
intrinsics fetch their arguments from `Args`. This change is a
prerequisite for handling parameter attributes in `enterCall`.
workflows/release-binaries: Remove extra dependencies for Arm64 Windows (#195222)
The python modules these were needed for were removed in
cdc41818e3bd9e8cb7788d59365e39fe6433159e.
[LifetimeSafety] Detect iterator invalidation through container aliases (#195231)
The previous heuristic in `handleInvalidatingCall` is too conservative.
Ideally it would be removed entirely, but that would introduce
~10 regressions in the existing test cases.
This commit replaces the filter with a narrower guard that only skips
direct field accesses (AccessPath currently lacks field granularity and
cannot distinguish `s.v1` from `s.v2`).
Closes https://github.com/llvm/llvm-project/issues/193044
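For illustration, the sketch below shows the kind of source pattern involved. It is an assumption about the shape of such code, not taken from the linked issue or the test suite; `firstAfterGrowth`, `S` and `fieldsAreIndependent` are hypothetical names. The invalidating use is left commented out so the example itself stays well-defined.

```cpp
#include <cassert>
#include <vector>

// Mutating a container through an alias invalidates iterators obtained
// through the original name.
int firstAfterGrowth(std::vector<int> &v) {
  assert(!v.empty());
  std::vector<int> &alias = v;  // second name for the same container
  auto it = v.begin();
  alias.push_back(42);          // may reallocate: `it` is invalid from here
  // return *it;                // use after invalidation (what a checker flags)
  it = v.begin();               // re-acquire the iterator before using it
  return *it;
}

// Two fields of one object are independent containers, so growing `v2`
// leaves iterators into `v1` valid.
struct S {
  std::vector<int> v1, v2;
};

int fieldsAreIndependent(S &s) {
  assert(!s.v1.empty());
  auto it = s.v1.begin();
  s.v2.push_back(1);  // does not touch v1
  return *it;         // still valid
}
```

The second function is why a path representation that cannot tell `s.v1` from `s.v2` would produce false positives if field accesses were not skipped.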
[RFC][IR] Support vector splats in `ConstantPointerNull`
This PR allows `ConstantPointerNull` to represent both scalar pointer nulls and
fixed or scalable vector splats of pointer nulls. This change first aligns with
the native splat behavior of `ConstantInt` and `ConstantFP`, and second, makes
it easier to eventually change the semantics of `ConstantPointerNull` to
represent a semantic null pointer instead of a zero value, which is what it
represents today.
[NFC][TableGen] Drop OperandInfo::addField/fields() wrappers and use OperandInfo::Fields instead (#195489)
Fields is already a public member; the wrappers added no semantic value
beyond a thin storage indirection (and ArrayRef-typed reads). Use Fields
directly at all call sites for consistency with the rest of the struct's
plain-data style.
Assisted by Claude.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply at anthropic.com>
[LV] Modernize as_cast.ll test. (NFC) (#195481)
Update as_cast.ll to cover both loop-invariant and varying address space
casts, and auto-generate the checks.
[VPlan] Set predecessor of DispatchVPBB early (NFC). (#195480)
This allows finding the containing plan earlier, which helps when trying
to print DispatchVPBB in a debugger.
[CIR] Use declarative TableGen constraints for overflow flag verification
Replace hand-written C++ verifiers with PredOpTrait-based constraints
(FlagRequiresIntType, HasAtMostOneOfAttrs). Introduce CIR_SaturatableBinaryOp
base class and use append/prepend ODS directives to compose arguments, format,
and traits across the op hierarchy. Fix HasAtMostOneOfAttrsPred to use
accessor methods instead of dollar-sign references. Add Commutative trait
to AddOp and MulOp.
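The two constraint kinds named above can be read as simple predicates. The following plain C++ sketch is not CIR or TableGen code: `ToyArithOp`, its flag names, and the choice of which flags exclude each other are all assumptions made for illustration.

```cpp
#include <initializer_list>

// Hypothetical op with three optional flags and an operand-type property.
struct ToyArithOp {
  bool isIntType = true;
  bool nsw = false;
  bool nuw = false;
  bool sat = false;
};

// "At most one of": no more than one flag in the group may be set.
bool hasAtMostOneOf(std::initializer_list<bool> flags) {
  int count = 0;
  for (bool f : flags)
    count += f ? 1 : 0;
  return count <= 1;
}

// "Flag requires int type": a set flag is only valid on integer operands.
bool flagRequiresIntType(bool flag, bool isIntType) {
  return !flag || isIntType;
}

bool verify(const ToyArithOp &op) {
  return flagRequiresIntType(op.nsw, op.isIntType) &&
         flagRequiresIntType(op.nuw, op.isIntType) &&
         flagRequiresIntType(op.sat, op.isIntType) &&
         hasAtMostOneOf({op.nsw, op.sat}) &&  // assumed exclusive pair
         hasAtMostOneOf({op.nuw, op.sat});    // assumed exclusive pair
}
```

Expressing each rule as a small reusable predicate is what lets several ops share the same checks, instead of repeating them in a hand-written verifier per op.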
[X86][GlobalISel] Support fp80 for G_FPTRUNC and G_FPEXT (#141611)
Introduce `G_FPEXTLOAD` and `G_FPTRUNCSTORE` for extending load and
truncating store of a floating point value.
* Introduce `IfFPExtend` and `IfFPTrunc` into `GINodeEquiv` to dispatch
SDAG patterns to the newly introduced opcodes similarly to `G_SEXTLOAD`
and `G_ZEXTLOAD`.
* Added narrowing and widening for the opcodes, although nothing uses
them yet.
* Supported lowering of `G_FPEXTLOAD` and `G_FPTRUNCSTORE` for X86 by
using X87.
* Added `lowerFPExtAndTruncMem` as default lowering for `G_FPTRUNC` and
`G_FPEXT` using memory.
* Dropped autogenerated line from `legalizer-info-validation.mir` as
scripts can't update them anymore.
* Updated `match-table-cxx.td` with regexps. This is not the first PR
that updates the whole test after opcode introduction.
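As a rough picture of what an extending load and a truncating store mean, the sketch below folds the conversion into the memory access. It uses `float` in memory and `double` in registers as portable stand-ins for the fp80 case, and the function names are hypothetical; it is not GlobalISel code.

```cpp
#include <cassert>
#include <cstring>

// Extending load: read a narrow value from memory and widen it in one step.
double fpExtLoad(const void *addr) {
  assert(addr != nullptr);
  float narrow;
  std::memcpy(&narrow, addr, sizeof narrow);
  return static_cast<double>(narrow);
}

// Truncating store: narrow a wide value and write it to memory in one step.
void fpTruncStore(double value, void *addr) {
  assert(addr != nullptr);
  float narrow = static_cast<float>(value);
  std::memcpy(addr, &narrow, sizeof narrow);
}
```

A store followed by a load through these helpers round-trips any value that is exactly representable in the narrow type, for example `1.5`.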
[CIR] Extract CIR_ClassCastOp base class for BaseClassAddrOp and DerivedClassAddrOp
Both ops have identical structure (arguments, results, assembly format)
and differ only in mnemonic and description. Extract a shared TableGen
base class to eliminate the duplication. Also improve the assembly format
to print nonnull before the operand and place the type after the offset.
[AMDGPU] Make v2f32 legal for G_FNEG and G_FABS and pattern update (#195419)
G_FNEG and G_FABS were made legal for v2f32 when packed fp32 instructions were implemented,
but that legalization was never upstreamed. This change makes v2f32 legal for
G_FNEG and G_FABS, and updates a few TableGen patterns so that the instructions are
selected correctly.
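For reference, negation and absolute value on a two-lane f32 vector are per-lane sign-bit operations. The sketch below models that in plain C++, assuming `float` is IEEE-754 binary32; `V2F32`, `fnegV2` and `fabsV2` are hypothetical names and say nothing about which AMDGPU instructions get selected.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes 32-bit float");

using V2F32 = std::array<float, 2>;

static std::uint32_t bitsOf(float f) {
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  return bits;
}

static float floatOf(std::uint32_t bits) {
  float f;
  std::memcpy(&f, &bits, sizeof f);
  return f;
}

// fneg: flip the sign bit of each lane.
V2F32 fnegV2(V2F32 v) {
  for (float &lane : v)
    lane = floatOf(bitsOf(lane) ^ 0x80000000u);
  return v;
}

// fabs: clear the sign bit of each lane.
V2F32 fabsV2(V2F32 v) {
  for (float &lane : v)
    lane = floatOf(bitsOf(lane) & 0x7fffffffu);
  return v;
}
```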
[SelectionDAG] Move VSelect sign pattern check from AArch64 to general SelectionDAG (#151840)
The check already exists there, but for some reason it bails out. Doing the
transform in SelectionDAG has no negative effect.
[dsymutil] Update module-warnings.test to run with both linkers (#195474)
The classic linker emits a combined .debug_macinfo table and warns about
MacroLists it has to drop because no compile unit references them. The
parallel linker emits .debug_macinfo per compile unit, so unreferenced
lists are never emitted and have no corresponding warning.