LLVM/project ef2815dllvm/test/Transforms/LoopVectorize/AArch64 call-costs.ll

[LV] Add test for cost modeling wide calls with mixed return types (NFC) (#195177)

Add missing test coverage for loops with multiple calls that have
different return types.
DeltaFile
+69-7llvm/test/Transforms/LoopVectorize/AArch64/call-costs.ll
+69-71 files

LLVM/project 9eb57b6mlir/include/mlir/Dialect/LLVMIR NVVMOps.td

[MLIR][NVVM] Add `NVVM_F32UnaryApproxOp` Base Class (NFC) (#194378)

Add `NVVM_F32UnaryApproxOp` tablegen class to unify implementation
DeltaFile
+15-26mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+15-261 files

LLVM/project efd429fllvm/lib/Transforms/Vectorize VPlanUnroll.cpp VPlanRecipes.cpp, llvm/test/Transforms/LoopVectorize hoist-predicated-loads-with-predicated-stores.ll struct-return-replicate.ll

[VPlan] Dissolve replicate regions with vector live-outs. (#189022)

Remove the scalar VF restriction and properly handle replicate regions
with vector live outs.

After unrolling the replicate regions, we end up with a set of scalar
VPPhis. The current patch post-processes them and converts them to
a chain of InsertElement + VPWidenPHIRecipes to match the original codegen
as closely as possible.

An alternative would be to keep the phis scalar and combine them with
BuildVector at the end, but that would result in quite different
codegen.

Now that ::execute for replicate regions is dead, clean up
VPTransformState::Lane and various ::execute that relied on it.
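
As a rough illustration of the InsertElement + widened-phi chain described above, here is a minimal standalone C++ sketch. It is not LLVM code: `Vec4`, `insertElement`, and `mergeLanes` are hypothetical names, the VF is fixed at 4, and the assumption is that each predicated lane either inserts its scalar or carries the previous vector through.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a fixed-width vector value with VF = 4.
using Vec4 = std::array<int, 4>;

// Models one insertelement: returns a copy of Vec with lane Idx replaced.
Vec4 insertElement(Vec4 Vec, int Scalar, std::size_t Idx) {
  Vec[Idx] = Scalar;
  return Vec;
}

// Rebuilds a vector live-out from per-lane scalars, one chain link per lane.
Vec4 mergeLanes(const Vec4 &Scalars, const std::array<bool, 4> &Mask) {
  Vec4 Result{}; // stand-in for the initial vector (zero here for determinism)
  for (std::size_t Lane = 0; Lane < Result.size(); ++Lane) {
    Vec4 Inserted = insertElement(Result, Scalars[Lane], Lane);
    // Models the widened phi after each predicated block: take the
    // inserted vector if the lane was active, else carry the old one.
    Result = Mask[Lane] ? Inserted : Result;
  }
  return Result;
}
```

Calling `mergeLanes` with an all-true mask reproduces the input scalars lane by lane; inactive lanes keep whatever the chain held before them.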


Depends on https://github.com/llvm/llvm-project/pull/186252 

PR: https://github.com/llvm/llvm-project/pull/189022
DeltaFile
+120-200llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll
+50-96llvm/test/Transforms/LoopVectorize/X86/cost-conditional-branches.ll
+64-80llvm/test/Transforms/LoopVectorize/hoist-predicated-loads-with-predicated-stores.ll
+106-28llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp
+26-108llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+39-43llvm/test/Transforms/LoopVectorize/struct-return-replicate.ll
+405-55543 files not shown
+600-95349 files

LLVM/project 5d79fb0llvm/lib/IR DebugInfo.cpp, llvm/test/DebugInfo/Generic/assignment-tracking/declare-to-assign scalable-vector-memcpy.ll

[DebugInfo] Fix crash in declare-to-assign when memcpy writes to scalable-vector alloca (#194107)

## Problem

`declare-to-assign` (`AssignmentTrackingPass`) crashes with a fatal error when a fixed-size `memcpy` writes into a scalable-vector alloca (e.g. an RVV `vint32m1_t`):

Cannot implicitly convert a scalable size to a fixed-width size in TypeSize::operator ScalarTy()
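
To make the failure mode concrete, here is a minimal standalone C++ sketch of the general idea: a size that may be scalable cannot be turned into a fixed number without first checking. This is not the actual fix in `DebugInfo.cpp` (that diff is not shown here); `SizeInBits`, `getFixedSize`, and `writeKnownToFit` are hypothetical names used only for illustration.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical, simplified model of a size that may be scalable:
// the real size is MinBits * vscale when Scalable is true.
struct SizeInBits {
  std::uint64_t MinBits;
  bool Scalable;
};

// Returns the exact size only when it is known at compile time.
std::optional<std::uint64_t> getFixedSize(SizeInBits S) {
  if (S.Scalable)
    return std::nullopt; // size depends on the runtime vscale
  return S.MinBits;
}

// Decides whether a fixed-size write of WriteBits provably fits the alloca.
// For scalable allocas the answer is conservatively "not known" (false).
bool writeKnownToFit(SizeInBits AllocaSize, std::uint64_t WriteBits) {
  std::optional<std::uint64_t> Fixed = getFixedSize(AllocaSize);
  return Fixed && WriteBits <= *Fixed;
}
```

The design point is that the scalable case is handled explicitly and conservatively instead of being converted implicitly.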

**PS**: Compiler Explorer always implicitly adds the `-g` option; adding `-g0` makes the crash disappear: https://riscvc.godbolt.org/z/dEqhc4EoE

**Reproducer** (clang `-target riscv64-unknown-linux-gnu -march=rv64gcv -O1 -g`):
```c
#include <string.h>
#include <riscv_vector.h>
vint32m1_t get_i32x4(int* v) {
  vint32m1_t r;
  memcpy(&r, v, 16);
  return r;
}
```

    [13 lines not shown]
DeltaFile
+89-0llvm/test/DebugInfo/Generic/assignment-tracking/declare-to-assign/scalable-vector-memcpy.ll
+2-1llvm/lib/IR/DebugInfo.cpp
+91-12 files

LLVM/project b6fd155mlir/lib/Dialect/AMDGPU/Transforms FoldMemRefsOps.cpp

nits

Signed-off-by: Eric Feng <Eric.Feng at amd.com>
DeltaFile
+44-39mlir/lib/Dialect/AMDGPU/Transforms/FoldMemRefsOps.cpp
+44-391 files

LLVM/project d184d9allvm/tools/llubi/lib Interpreter.cpp

[llubi] Fix inconsistent intrinsic argument retrieval (#195499)

This PR fixes inconsistent intrinsic argument retrieval by making all
intrinsics fetch their arguments from `Args`. This change is a
prerequisite for handling parameter attributes in `enterCall`.
DeltaFile
+60-50llvm/tools/llubi/lib/Interpreter.cpp
+60-501 files

LLVM/project d873c55mlir/lib/Dialect/AMDGPU/IR AMDGPUOps.cpp, mlir/test/Dialect/AMDGPU invalid.mlir

nits

Signed-off-by: Eric Feng <Eric.Feng at amd.com>
DeltaFile
+11-15mlir/lib/Dialect/AMDGPU/IR/AMDGPUOps.cpp
+24-0mlir/test/Dialect/AMDGPU/invalid.mlir
+35-152 files

LLVM/project 9cf37dcmlir/include/mlir/Dialect/AMDGPU/IR AMDGPUOps.td, mlir/lib/Conversion/AMDGPUToROCDL AMDGPUToROCDL.cpp

polish

Signed-off-by: Eric Feng <Eric.Feng at amd.com>
DeltaFile
+12-28mlir/lib/Dialect/AMDGPU/IR/AMDGPUOps.cpp
+0-22mlir/test/Dialect/AMDGPU/invalid.mlir
+5-5mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp
+4-4mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPUOps.td
+21-594 files

LLVM/project 7501fde.github/workflows release-binaries.yml

workflows/release-binaries: Remove extra dependencies for Arm64 Windows (#195222)

The python modules these were needed for were removed in
cdc41818e3bd9e8cb7788d59365e39fe6433159e.
DeltaFile
+0-7.github/workflows/release-binaries.yml
+0-71 files

LLVM/project b561bdbclang/lib/Analysis/LifetimeSafety FactsGenerator.cpp, clang/test/Sema warn-lifetime-safety-invalidations.cpp

[LifetimeSafety] Detect iterator invalidation through container aliases (#195231)

The previous heuristic in `handleInvalidatingCall` is too conservative.
The ideal way would be completely removing this, but it would introduce
~10 regressions in the existing testcases.

This commit replaces the filter with a narrower guard that only skips
direct field accesses (AccessPath currently lacks field granularity and
cannot distinguish `s.v1` from `s.v2`).
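
For context, the kind of code this targets looks roughly like the following standalone C++ sketch, where the invalidating call goes through a reference that aliases the container. The function names are hypothetical, and the sketch re-acquires the iterator so that it stays well-defined; it is not taken from the test file.

```cpp
#include <vector>

// Appends through a reference; the caller's iterators into the same
// vector may be invalidated if the vector reallocates.
void appendValue(std::vector<int> &Alias, int Value) { Alias.push_back(Value); }

int firstAfterAppend(std::vector<int> &V) {
  std::vector<int> &Alias = V; // Alias and V name the same container
  auto It = V.begin();
  appendValue(Alias, 42);      // invalidating call made through the alias
  // Using *It here would be the bug: It may dangle after push_back.
  It = V.begin();              // re-acquire a valid iterator instead
  return *It;
}
```

Dereferencing `It` between the `appendValue` call and the reassignment is the pattern a lifetime checker would want to flag.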

Closes https://github.com/llvm/llvm-project/issues/193044
DeltaFile
+65-21clang/test/Sema/warn-lifetime-safety-invalidations.cpp
+5-3clang/lib/Analysis/LifetimeSafety/FactsGenerator.cpp
+70-242 files

LLVM/project 5d2dbb1llvm/lib/Target/SystemZ SystemZOperands.td

Convert the last PatLeaf
DeltaFile
+4-1llvm/lib/Target/SystemZ/SystemZOperands.td
+4-11 files

LLVM/project 8f81288llvm/lib/IR Constants.cpp, llvm/test/Assembler aggregate-constant-values.ll

[RFC][IR] Support vector splats in `ConstantPointerNull`

This PR allows `ConstantPointerNull` to represent both scalar pointer nulls and
fixed or scalable vector splats of pointer nulls. This change first aligns with
the native splat behavior of `ConstantInt` and `ConstantFP`, and second, makes
it easier to eventually change the semantics of `ConstantPointerNull` to
represent a semantic null pointer instead of a zero value, which is what it
represents today.
DeltaFile
+31-31llvm/test/Transforms/RewriteStatepointsForGC/base-vector.ll
+30-30llvm/test/Transforms/LoopVectorize/X86/masked_load_store.ll
+52-4llvm/lib/IR/Constants.cpp
+28-0llvm/unittests/IR/ConstantsTest.cpp
+18-9llvm/test/Transforms/RewriteStatepointsForGC/base-inference.ll
+24-0llvm/test/Assembler/aggregate-constant-values.ll
+183-7463 files not shown
+354-22269 files

LLVM/project d3fb3ddllvm/include/llvm/Transforms/IPO Instrumentor.h InstrumentorUtils.h, llvm/lib/Transforms/IPO Instrumentor.cpp InstrumentorConfigFile.cpp

Fix review comments
DeltaFile
+23-31llvm/lib/Transforms/IPO/Instrumentor.cpp
+25-20llvm/include/llvm/Transforms/IPO/Instrumentor.h
+2-4llvm/include/llvm/Transforms/IPO/InstrumentorUtils.h
+4-1llvm/lib/Transforms/IPO/InstrumentorConfigFile.cpp
+1-1llvm/include/llvm/Transforms/IPO/InstrumentorConfigFile.h
+55-575 files

LLVM/project a5306abllvm/lib/Target/SystemZ SystemZOperands.td

Remove dead code
DeltaFile
+0-5llvm/lib/Target/SystemZ/SystemZOperands.td
+0-51 files

LLVM/project 46c83c9llvm/lib/Target/SystemZ SystemZOperands.td

Convert another PatLeaf
DeltaFile
+5-5llvm/lib/Target/SystemZ/SystemZOperands.td
+5-51 files

LLVM/project 37e0109llvm/utils/TableGen DecoderEmitter.cpp, llvm/utils/TableGen/Common InstructionEncoding.cpp InstructionEncoding.h

[NFC][TableGen] Drop OperandInfo::addField/fields() wrappers and use OperandInfo::Fields instead (#195489)

Fields is already a public member; the wrappers added no semantic value
beyond a thin storage indirection (and ArrayRef-typed reads). Use Fields
directly at all call sites for consistency with the rest of the struct's
plain-data style.
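
A minimal standalone sketch of the resulting style, assuming simplified shapes for the types involved (the real definitions live in `InstructionEncoding.h`; `EncodingField`'s members and `totalWidth` are hypothetical here):

```cpp
#include <vector>

// Hypothetical, simplified field description.
struct EncodingField {
  unsigned Base;
  unsigned Width;
};

struct OperandInfo {
  std::vector<EncodingField> Fields; // public, plain-data member
};

// Call sites append to and iterate over Fields directly,
// with no addField()/fields() wrappers in between.
unsigned totalWidth(const OperandInfo &Op) {
  unsigned Total = 0;
  for (const EncodingField &F : Op.Fields)
    Total += F.Width;
  return Total;
}
```

A caller would write `Op.Fields.push_back(...)` where it previously went through a wrapper.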

Assisted by Claude.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply at anthropic.com>
DeltaFile
+6-4llvm/utils/TableGen/Common/InstructionEncoding.cpp
+4-4llvm/utils/TableGen/DecoderEmitter.cpp
+0-6llvm/utils/TableGen/Common/InstructionEncoding.h
+10-143 files

LLVM/project d6ffa06llvm/lib/IR Constants.cpp, llvm/test/Assembler aggregate-constant-values.ll

[RFC][IR] Support vector splats in `ConstantPointerNull`

This PR allows `ConstantPointerNull` to represent both scalar pointer nulls and
fixed or scalable vector splats of pointer nulls. This change first aligns with
the native splat behavior of `ConstantInt` and `ConstantFP`, and second, makes
it easier to eventually change the semantics of `ConstantPointerNull` to
represent a semantic null pointer instead of a zero value, which is what it
represents today.
DeltaFile
+31-31llvm/test/Transforms/RewriteStatepointsForGC/base-vector.ll
+30-30llvm/test/Transforms/LoopVectorize/X86/masked_load_store.ll
+52-4llvm/lib/IR/Constants.cpp
+28-0llvm/unittests/IR/ConstantsTest.cpp
+18-9llvm/test/Transforms/RewriteStatepointsForGC/base-inference.ll
+24-0llvm/test/Assembler/aggregate-constant-values.ll
+183-7451 files not shown
+333-20357 files

LLVM/project 3e2eb32llvm/lib/IR Constants.cpp, llvm/test/Assembler aggregate-constant-values.ll

[RFC][IR] Support vector splats in `ConstantPointerNull`

This PR allows `ConstantPointerNull` to represent both scalar pointer nulls and
fixed or scalable vector splats of pointer nulls. This change first aligns with
the native splat behavior of `ConstantInt` and `ConstantFP`, and second, makes
it easier to eventually change the semantics of `ConstantPointerNull` to
represent a semantic null pointer instead of a zero value, which is what it
represents today.
DeltaFile
+31-31llvm/test/Transforms/RewriteStatepointsForGC/base-vector.ll
+30-30llvm/test/Transforms/LoopVectorize/X86/masked_load_store.ll
+51-4llvm/lib/IR/Constants.cpp
+28-0llvm/unittests/IR/ConstantsTest.cpp
+18-9llvm/test/Transforms/RewriteStatepointsForGC/base-inference.ll
+24-0llvm/test/Assembler/aggregate-constant-values.ll
+182-7450 files not shown
+328-20356 files

LLVM/project 5bc46b1llvm/test/Transforms/LoopVectorize as_cast.ll

[LV] Modernize as_cast.ll test. (NFC) (#195481)

Update as_cast.ll to cover both loop-invariant and varying address space
casts, and auto-generate the checks.
DeltaFile
+93-27llvm/test/Transforms/LoopVectorize/as_cast.ll
+93-271 files

LLVM/project ecbd653llvm/lib/Transforms/Utils CodeExtractor.cpp, llvm/test/Transforms/HotColdSplit stale-funcretval-after-sever.ll

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
DeltaFile
+32-0llvm/test/Transforms/HotColdSplit/stale-funcretval-after-sever.ll
+3-0llvm/lib/Transforms/Utils/CodeExtractor.cpp
+35-02 files

LLVM/project 1f34e4bllvm/lib/Transforms/Vectorize VPlanTransforms.cpp

[VPlan] Set predecessor of DispatchVPBB early (NFC). (#195480)

This allows finding the containing plan earlier, which helps when trying
to print DispatchVPBB in a debugger.
DeltaFile
+1-1llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+1-11 files

LLVM/project ff94721llvm/lib/Transforms/Vectorize LoopVectorizationPlanner.h VPlanTransforms.cpp

[VPlan] Add VPBuilder methods to create (First|Last)ActiveLane (NFC). (#195479)

Add dedicated helpers to the builder to slightly simplify code at use sites.
DeltaFile
+14-0llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
+5-8llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+1-2llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
+20-103 files

LLVM/project 13371e2compiler-rt/lib/ubsan ubsan_diag.cpp

[compiler-rt][UBSan][NFC] Reorder includes in ubsan_diag.cpp (#195435)
DeltaFile
+3-1compiler-rt/lib/ubsan/ubsan_diag.cpp
+3-11 files

LLVM/project 1cd649cclang/include/clang/CIR/Dialect/IR CIROps.td, clang/lib/CIR/Dialect/IR CIRDialect.cpp

[CIR] Use declarative TableGen constraints for overflow flag verification

Replace hand-written C++ verifiers with PredOpTrait-based constraints
(FlagRequiresIntType, HasAtMostOneOfAttrs). Introduce CIR_SaturatableBinaryOp
base class and use append/prepend ODS directives to compose arguments, format,
and traits across the op hierarchy. Fix HasAtMostOneOfAttrsPred to use
accessor methods instead of dollar-sign references. Add Commutative trait
to AddOp and MulOp.
DeltaFile
+49-35clang/include/clang/CIR/Dialect/IR/CIROps.td
+0-39clang/lib/CIR/Dialect/IR/CIRDialect.cpp
+4-4clang/test/CIR/CodeGen/size-of-vla.cpp
+4-4clang/test/CIR/CodeGen/vla.c
+2-2clang/test/CIR/CodeGen/delete-array.cpp
+59-845 files

LLVM/project 3b32d6ellvm/include/llvm/CodeGen/GlobalISel GenericMachineInstrs.h, llvm/lib/CodeGen/GlobalISel LegalizerHelper.cpp

[X86][GlobalISel] Support fp80 for G_FPTRUNC and G_FPEXT (#141611)

Introduce `G_FPEXTLOAD` and `G_FPTRUNCSTORE` for extending load and
truncating store of a floating point value.

* Introduce `IfFPExtend` and `IfFPTrunc` into `GINodeEquiv` to dispatch
SDAG patterns to the newly introduced opcodes similarly to `G_SEXTLOAD`
and `G_ZEXTLOAD`.
* Added narrowing and widening for the opcodes. However, they aren't used
anywhere.
* Supported lowering of `G_FPEXTLOAD` and `G_FPTRUNCSTORE` for X86 by
using X87.
* Added `lowerFPExtAndTruncMem` as default lowering for `G_FPTRUNC` and
`G_FPEXT` using memory.
* Dropped autogenerated line from `legalizer-info-validation.mir` as
scripts can't update them anymore.
* Updated `match-table-cxx.td` with regexps. This is not the first PR
that updates the whole test after opcode introduction.
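
The semantics of the two opcodes, and the idea of lowering an extension or truncation through memory, can be sketched in standalone C++ as below. This is only an illustration under assumptions: it uses `float`/`double` instead of fp80 for portability, all function names are hypothetical, and the exact sequence emitted by `lowerFPExtAndTruncMem` is assumed, not taken from the patch.

```cpp
#include <cstring>

// Models an extending load: read a narrow float from memory, widen it.
double fpExtLoad(const void *Addr) {
  float Narrow;
  std::memcpy(&Narrow, Addr, sizeof(Narrow));
  return static_cast<double>(Narrow);
}

// Models a truncating store: narrow the value, then write it to memory.
void fpTruncStore(double Wide, void *Addr) {
  float Narrow = static_cast<float>(Wide);
  std::memcpy(Addr, &Narrow, sizeof(Narrow));
}

// Extension lowered through memory: plain store + extending load.
double fpExtViaMemory(float Value) {
  unsigned char Slot[sizeof(float)]; // models a stack temporary
  std::memcpy(Slot, &Value, sizeof(Value));
  return fpExtLoad(Slot);
}

// Truncation lowered through memory: truncating store + plain load.
float fpTruncViaMemory(double Value) {
  unsigned char Slot[sizeof(float)];
  fpTruncStore(Value, Slot);
  float Result;
  std::memcpy(&Result, Slot, sizeof(Result));
  return Result;
}
```

Values that are exactly representable in the narrow type, such as 1.5, round-trip unchanged through both paths.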
DeltaFile
+259-0llvm/test/CodeGen/X86/isel-fptrunc-fpext.ll
+66-66llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-cxx.td
+77-3llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
+27-27llvm/test/TableGen/RegClassByHwMode.td
+39-5llvm/include/llvm/CodeGen/GlobalISel/GenericMachineInstrs.h
+32-0llvm/test/MachineVerifier/test_g_fptruncstore.mir
+500-10119 files not shown
+688-12825 files

LLVM/project 124d442clang/include/clang/CIR/Dialect/IR CIROps.td, clang/lib/CIR/Lowering/DirectToLLVM LowerToLLVM.cpp

[CIR] Extract CIR_ClassCastOp base class for BaseClassAddrOp and DerivedClassAddrOp

Both ops have identical structure (arguments, results, assembly format)
and differ only in mnemonic and description. Extract a shared TableGen
base class to eliminate the duplication. Also improve the assembly format
to print nonnull before the operand and place the type after the offset.
DeltaFile
+22-33clang/include/clang/CIR/Dialect/IR/CIROps.td
+18-18clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
+8-8clang/test/CIR/CodeGen/vtt.cpp
+5-5clang/test/CIR/CodeGen/derived-to-base.cpp
+4-4clang/test/CIR/CodeGen/inherited-ctors.cpp
+3-3clang/test/CIR/IR/vtt-addrpoint.cir
+60-7114 files not shown
+90-10120 files

LLVM/project c4fc27cllvm/lib/Transforms/Vectorize VPlanConstruction.cpp LoopVectorize.cpp

[VPlan] Strip pred-block check in inLoopReductions (NFC) (#194086)

A VPInstruction will only have a mask if the block needs predication.
DeltaFile
+3-8llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
+1-7llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+1-3llvm/lib/Transforms/Vectorize/VPlanTransforms.h
+5-183 files

LLVM/project 8ee61adllvm/lib/Target/AMDGPU SIInstructions.td AMDGPULegalizerInfo.cpp, llvm/test/CodeGen/AMDGPU fneg-fabs-v2f32.ll

[AMDGPU] Make v2f32 legal for G_FNEG and G_FABS and pattern update (#195419)

G_FNEG and G_FABS were made legal for v2f32 when packed fp32 instructions were implemented.
For unknown reasons, this legalization was never upstreamed. This work makes v2f32 legal for
G_FNEG and G_FABS, and updates a few tablegen patterns to ensure instructions can be correctly
selected.
DeltaFile
+256-0llvm/test/CodeGen/AMDGPU/fneg-fabs-v2f32.ll
+17-6llvm/lib/Target/AMDGPU/SIInstructions.td
+4-14llvm/test/CodeGen/AMDGPU/GlobalISel/strict_fma.f32.ll
+7-5llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+284-254 files

LLVM/project 5d98710llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp, llvm/lib/Target/AArch64 AArch64ISelLowering.cpp

[SelectionDAG] Move VSelect sign pattern check from AArch64 to general SelectionDAG (#151840)

For some reason the check is already there, but it bails out. Doing the
transform in SelDAG has no negative effect.
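
The commit message does not spell out the pattern. Assuming it is the familiar sign-select identity, `select(x > -1, 1, -1)` rewritten as `(x >> (bits - 1)) | 1` with an arithmetic shift, a scalar standalone C++ sketch of the equivalence looks like this (function names are hypothetical; the real combine operates on vector DAG nodes):

```cpp
#include <cstdint>

// Reference form: select(x > -1, 1, -1).
std::int32_t signSelect(std::int32_t X) { return X > -1 ? 1 : -1; }

// Branch-free form: (sign mask of x) | 1.
std::int32_t signViaShiftOr(std::int32_t X) {
  // All-ones when X is negative, zero otherwise; written with unsigned
  // arithmetic so it doesn't rely on the behavior of signed right shifts.
  std::uint32_t SignMask = 0u - (static_cast<std::uint32_t>(X) >> 31);
  return static_cast<std::int32_t>(SignMask | 1u);
}
```

Both functions agree for every 32-bit input, which is what makes the rewrite safe to do generically.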
DeltaFile
+1,003-0llvm/test/CodeGen/X86/cmp-select-sign.ll
+0-30llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+7-2llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+1,010-323 files

LLVM/project d20a3c0llvm/test/tools/dsymutil/X86 module-warnings.test

[dsymutil] Update module-warnings.test to run with both linkers (#195474)

The classic linker emits a combined .debug_macinfo table and warns about
MacroLists it has to drop because no compile unit references them. The
parallel linker emits .debug_macinfo per compile unit, so unreferenced
lists are never emitted and have no corresponding warning.
DeltaFile
+11-3llvm/test/tools/dsymutil/X86/module-warnings.test
+11-31 files