LLVM/project ea174f6llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll, llvm/test/CodeGen/RISCV clmul.ll

Merge branch 'main' into users/c8ef/fold_left_first
DeltaFile
+84,317-78,372llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+25,751-24,782llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+23,663-20,281llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+21,867-18,577llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+25,051-14,920llvm/test/CodeGen/RISCV/clmul.ll
+13,685-22,906llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+194,334-179,8387,548 files not shown
+769,605-433,3387,554 files

LLVM/project 0af2d43clang/include/clang/Basic DiagnosticSemaKinds.td, clang/lib/Sema SemaDeclAttr.cpp

[Clang] Warn if both of `dllexport`/`dllimport` and `exclude_from_explicit_instantiation` are specified (#183515)

The attributes `exclude_from_explicit_instantiation` and
`dllexport`/`dllimport` serve opposite purposes.
Therefore, if an entity has both attributes, drop one with a warning,
depending on the context of the declaration.
In a template context, the `exclude_from_explicit_instantiation`
attribute takes precedence over the `dllexport` or `dllimport`
attribute. Conversely, in a non-template context, the `dllexport` and
`dllimport` attributes take priority.
DeltaFile
+172-0clang/test/SemaCXX/attr-exclude_from_explicit_instantiation.ignore-dllattr.cpp
+35-0clang/lib/Sema/SemaDeclAttr.cpp
+9-0clang/include/clang/Basic/DiagnosticSemaKinds.td
+216-03 files

LLVM/project 5cf09a6llvm/lib/Target/AArch64 AArch64ISelLowering.cpp, llvm/test/CodeGen/AArch64 clmul-fixed.ll clmul.ll

[AArch64][ISel] Use vector register for scalar CLMUL (#183282)

Even though there are only v8i8 and v1i64 variants for pmul/pmull, using
them is faster than the current implementation for scalar CLMUL.
DeltaFile
+3,105-3,034llvm/test/CodeGen/AArch64/clmul-fixed.ll
+1,223-1,171llvm/test/CodeGen/AArch64/clmul.ll
+39-2llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+4,367-4,2073 files

LLVM/project 98ed417llvm/test/Transforms/LoopVectorize early_exit_with_stores.ll early_exit_store_legality.ll, llvm/test/Transforms/LoopVectorize/AArch64 early_exit_with_stores.ll

[LV] Transform tests for early-exit with stores (#183288)

Precommit of transform tests for #178454
DeltaFile
+1,126-0llvm/test/Transforms/LoopVectorize/early_exit_with_stores.ll
+1,125-0llvm/test/Transforms/LoopVectorize/RISCV/early_exit_with_stores.ll
+1,082-0llvm/test/Transforms/LoopVectorize/AArch64/early_exit_with_stores.ll
+281-45llvm/test/Transforms/LoopVectorize/early_exit_store_legality.ll
+78-0llvm/test/Transforms/LoopVectorize/VPlan/early_exit_with_stores_vplan.ll
+3,692-455 files

LLVM/project 697054dllvm/lib/Target/SPIRV SPIRVInstructionSelector.cpp

Review
DeltaFile
+17-14llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
+17-141 files

LLVM/project b935faallvm/lib/Target/RISCV RISCVISelLowering.cpp RISCVInstrInfo.cpp, llvm/test/CodeGen/RISCV/rvv fixed-vectors-sad.ll

[𝘀𝗽𝗿] initial version

Created using spr 1.3.6-beta.1
DeltaFile
+64-1llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+10-14llvm/test/CodeGen/RISCV/rvv/fixed-vectors-sad.ll
+4-0llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+78-153 files

LLVM/project 802562allvm/lib/Target/RISCV RISCVInstrInfo.cpp

[𝘀𝗽𝗿] changes to main this commit is based on

Created using spr 1.3.6-beta.1

[skip ci]
DeltaFile
+4-0llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+4-01 files

LLVM/project 8bb41c9llvm/lib/Target/AMDGPU/Utils AMDGPUBaseInfo.cpp

AMDGPU: Fix copy of Triple (#184594)

DeltaFile
+2-2llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+2-21 files

LLVM/project 8b3a725llvm/lib/Target/RISCV RISCVInstrInfo.cpp

[𝘀𝗽𝗿] initial version

Created using spr 1.3.6-beta.1
DeltaFile
+4-0llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+4-01 files

LLVM/project 095e169clang/lib/AST ExprConstant.cpp, clang/lib/Driver/ToolChains Darwin.cpp

[clang] Turn misc copy-assign to move-assign (#184144)

This is an automated patch generated by the clang-tidy check
performance-use-std-move, as a follow-up to #184136.
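
As a rough illustration of the kind of rewrite described above, the sketch below turns a copy-assignment from a local that is never used again into a move-assignment. `Holder` and `fill` are hypothetical names made up for this example; they are not taken from the patch.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical holder type, used only for illustration.
struct Holder {
  std::vector<std::string> Names;
};

// Builds a local vector and stores it in H. The local is not read after
// the assignment, so it can be moved instead of copied.
void fill(Holder &H) {
  std::vector<std::string> Tmp;
  Tmp.push_back("alpha");
  Tmp.push_back("beta");
  // Before: H.Names = Tmp;       (copy-assign, duplicates the buffer)
  H.Names = std::move(Tmp);    // After: move-assign, steals the buffer
}
```

The change only pays off when the source object is not used afterwards, which is the situation such a check is meant to detect.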
DeltaFile
+2-2clang/lib/Sema/SemaLookup.cpp
+2-2clang/lib/AST/ExprConstant.cpp
+2-2clang/lib/Parse/ParseDecl.cpp
+1-1clang/lib/Driver/ToolChains/Darwin.cpp
+1-1clang/lib/Lex/LiteralSupport.cpp
+1-1clang/lib/Lex/PPExpressions.cpp
+9-914 files not shown
+23-2320 files

LLVM/project c2784e1flang/lib/Semantics resolve-directives.cpp, flang/test/Semantics/OpenMP resolve05.f90

[Flang][OpenMP] DEFAULT(NONE) error checking on implicit references (#182214)

A variable with an unspecified data-sharing attribute under a
DEFAULT(NONE) clause only emits an error if the variable is explicitly
referenced in the body of the construct with DEFAULT(NONE).

Ex:

```
!$omp parallel default(none)
!$omp task
a = 1
!$omp end task
!$omp end parallel
end
```
gfortran will error with `‘a’ not specified in enclosing ‘parallel’` on
the above. flang doesn't error.

The fix moves the error check to `CreateImplicitSymbols` and checks the
variable for a violation in any of its enclosing contexts.
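
A minimal model of "check every enclosing context", written in C++ and not based on the flang sources. `Context` and `violatesDefaultNone` are hypothetical names, and the model stops at the first context that lists the variable explicitly, which simplifies the real data-sharing rules.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of one OpenMP construct on the context stack.
struct Context {
  bool DefaultNone;                // construct carries DEFAULT(NONE)
  std::set<std::string> Explicit;  // variables named in data-sharing clauses
};

// Stack is ordered outermost first. Walks from the innermost construct
// outwards and reports a violation if a DEFAULT(NONE) context is reached
// before any context that lists Var explicitly.
bool violatesDefaultNone(const std::vector<Context> &Stack,
                         const std::string &Var) {
  for (auto It = Stack.rbegin(); It != Stack.rend(); ++It) {
    if (It->Explicit.count(Var))
      return false; // assumption: an explicit listing ends the search
    if (It->DefaultNone)
      return true;
  }
  return false;
}
```

For the reproducer above, the stack is `parallel default(none)` followed by `task`. Looking only at the innermost construct finds nothing, while walking outwards reaches the `parallel` and flags `a`.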
DeltaFile
+33-22flang/lib/Semantics/resolve-directives.cpp
+25-0flang/test/Semantics/OpenMP/resolve05.f90
+58-222 files

LLVM/project 1f4074bmlir/lib/Dialect/LLVMIR/IR LLVMMemorySlot.cpp, mlir/test/Dialect/LLVMIR sroa.mlir

[mlir][llvm] Fix SROA crash on empty LLVM struct types (#184596)

When SROA runs on an alloca of an empty struct type (llvm.struct<()>),
it crashes with:

  Assertion `!subelementIndexMap->empty()' failed.

The root cause is in LLVMStructType::getSubelementIndexMap(): for an
empty struct (no body fields), the loop doesn't execute and an empty
DenseMap is returned as a non-null optional. Later, getTypeAtIndex()
asserts the map is non-empty, triggering the crash.

Fix this by returning std::nullopt for empty structs, indicating they
cannot be destructured. This is consistent with how LLVMArrayType
handles the zero-element case.
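
The shape of the fix can be sketched without MLIR. In the model below, `StructModel` and `subelementIndexMapFor` are hypothetical stand-ins, and field types are plain strings; the point is only that an empty body yields `std::nullopt` instead of an empty map wrapped in an engaged optional.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a struct type: a list of field type names.
struct StructModel {
  std::vector<std::string> Body;
};

// Returns an index -> field type map, or std::nullopt when the struct has
// no fields and therefore cannot be destructured.
std::optional<std::map<unsigned, std::string>>
subelementIndexMapFor(const StructModel &S) {
  if (S.Body.empty())
    return std::nullopt; // the early return that avoids the empty-map case
  std::map<unsigned, std::string> Map;
  for (unsigned I = 0; I < S.Body.size(); ++I)
    Map[I] = S.Body[I];
  return Map;
}
```

Callers that already treat `std::nullopt` as "not destructurable" then never see an engaged but empty map.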

Fixes #108366
DeltaFile
+15-0mlir/test/Dialect/LLVMIR/sroa.mlir
+3-0mlir/lib/Dialect/LLVMIR/IR/LLVMMemorySlot.cpp
+18-02 files

LLVM/project 4a46bdbllvm/lib/Target/AMDGPU/Utils AMDGPUBaseInfo.cpp

Update llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp

Co-authored-by: Juan Manuel Martinez Caamaño <jmartinezcaamao at gmail.com>
DeltaFile
+1-1llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+1-11 files

LLVM/project 0a1e395clang/test lit.cfg.py, clang/test/Analysis missing-z3-nocrash.c

[nfc][analyzer][test][z3] Replace "REQUIRES: no-z3" with "UNSUPPORTED: z3" (#184349)

Fixing D120325, continuing #183724

The lit feature "no-z3" is the opposite of "z3", so requiring "no-z3" is
the same as marking "z3" as unsupported.
DeltaFile
+1-1clang/test/Analysis/missing-z3-nocrash.c
+0-2clang/test/lit.cfg.py
+1-32 files

LLVM/project 8260200llvm/lib/Target/AMDGPU/Utils AMDGPUBaseInfo.cpp

AMDGPU: Fix copy of Triple
DeltaFile
+1-1llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+1-11 files

LLVM/project ee81845flang/lib/Lower PFTBuilder.cpp, flang/test/Lower mixed_loops.f90 while_loop.f90

Revert "[flang] make lowering to scf.while default" (#184592)

Reverts llvm/llvm-project#184234

This is breaking SPEC and other tests.

Reproducer:

```
subroutine foo()
  logical :: l1, l2
  do while (l1())
    if (l2()) then
      call bar()
    endif
  enddo
end
```


    [9 lines not shown]
DeltaFile
+36-25flang/test/Lower/mixed_loops.f90
+18-15flang/test/Lower/while_loop.f90
+3-12flang/test/Lower/OpenMP/target.f90
+5-5flang/test/Lower/pre-fir-tree02.f90
+10-0flang/lib/Lower/PFTBuilder.cpp
+3-3flang/test/Lower/loops.f90
+75-601 files not shown
+76-617 files

LLVM/project 9c2829fmlir/lib/Dialect/Func/IR FuncOps.cpp, mlir/test/Dialect/Func invalid.mlir

[mlir][Func] Use getMutableSuccessorOperands() in FuncOp verifier (#184589)

When verifying return-like terminators, use
getMutableSuccessorOperands() instead of getNumOperands() so that only
the operands passed to the parent region are checked against the
function result types. This handles terminators that implement
RegionBranchTerminatorOpInterface and carry additional operands for
other successor regions (e.g. loop back-edges).

Add tests using test.loop_block_term, which has both an iter operand
(passed back to the region) and an exit operand (passed to the parent).
DeltaFile
+12-13mlir/lib/Dialect/Func/IR/FuncOps.cpp
+22-0mlir/test/Dialect/Func/invalid.mlir
+14-0mlir/test/IR/test-region-branch-op-verifier.mlir
+48-133 files

LLVM/project 7f04494mlir/include/mlir/Dialect/Arith/IR ArithOps.td, mlir/include/mlir/Dialect/Vector/IR VectorOps.td

[MLIR][Arith][Vector] Reject i0 integer type in arith and vector ops (#183589)

Add ODS type constraints that exclude zero-bitwidth integers (i0) from
operations in the arith and vector dialects.  i0 has no meaningful
arithmetic representation and operations on it can trigger undefined
behavior (e.g. bitwidth calculations assuming non-zero width).

Changes:
- Add `AnyNonZeroBitwidthSignlessInteger` (as a `ConfinedType` over
  `AnySignlessInteger`) and `AnyNonZeroBitwidthSignlessIntegerOrIndex`
  to CommonTypeConstraints.td.
- Introduce `Arith_SignlessIntegerOrIndexLike` in ArithOps.td that wraps
  `AnyNonZeroBitwidthSignlessIntegerOrIndex` via
  `TypeOrValueSemanticsContainer`, and update
  `SignlessFixedWidthIntegerLike` to use
  `AnyNonZeroBitwidthSignlessInteger`.  Replace all uses of the
  shared `SignlessIntegerOrIndexLike` in ArithOps.td with the new
  dialect-local constraint.
- Update `IndexCastTypeConstraint` to use

    [23 lines not shown]
DeltaFile
+147-8mlir/test/Dialect/Arith/invalid.mlir
+0-81mlir/test/Dialect/Arith/canonicalize.mlir
+34-24mlir/include/mlir/Dialect/Arith/IR/ArithOps.td
+56-0mlir/test/Dialect/Vector/invalid.mlir
+23-12mlir/include/mlir/Dialect/Vector/IR/VectorOps.td
+11-0mlir/include/mlir/IR/CommonTypeConstraints.td
+271-1251 files not shown
+275-1257 files

LLVM/project ee92ac2mlir/lib/Dialect/NVGPU/Transforms OptimizeSharedMemory.cpp, mlir/test/Dialect/NVGPU optimize-shared-memory.mlir

[mlir][nvgpu] Fix crash in optimize-shared-memory pass with vector element types (#179111)

The `--nvgpu-optimize-shared-memory` pass crashed when processing
memrefs with vector element types (e.g., `memref<16x1xvector<16xf16>,
3>`). This occurred because getElementTypeBitWidth() calls
getIntOrFloatBitWidth(), which asserts the element type must be an
integer or float.
Thus, this PR adds an early-exit guard to return failure() when the
memref's element type is not a scalar int or float.

I wasn't sure if we should support vector types (by multiplying element
bit width by vector length) or just reject them. For now, I've
implemented it to return failure on non-scalar types.
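
The guard can be modelled outside MLIR roughly as follows. `Kind`, `ElemType` and `scalarBitWidth` are hypothetical names for this sketch; the real pass works on MLIR types and returns `failure()` rather than an empty optional.

```cpp
#include <cassert>
#include <optional>

// Hypothetical classification of a memref element type.
enum class Kind { Integer, Float, Vector };

struct ElemType {
  Kind K;
  unsigned BitWidth;    // width of the scalar (or of one vector lane)
  unsigned NumElements; // 1 for scalars
};

// Returns the bit width for scalar int/float element types and
// std::nullopt otherwise, mirroring the "reject non-scalar types" choice.
std::optional<unsigned> scalarBitWidth(const ElemType &T) {
  if (T.K != Kind::Integer && T.K != Kind::Float)
    return std::nullopt; // early exit instead of asserting later
  return T.BitWidth;
}
```

The alternative mentioned above, supporting vectors, would presumably return `BitWidth * NumElements` for the vector case instead of bailing out.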

Fixes #177823

Co-authored-by: rebel-jueonpark <jueonpark at rebellions.ai>
DeltaFile
+11-0mlir/test/Dialect/NVGPU/optimize-shared-memory.mlir
+5-0mlir/lib/Dialect/NVGPU/Transforms/OptimizeSharedMemory.cpp
+16-02 files

LLVM/project 943eb6fllvm/lib/Transforms/Vectorize VPlanConstruction.cpp, llvm/test/Transforms/LoopVectorize find-last.ll

[LV] Use make_early_inc_range in handleFindLastReductions (#184340)

Fixes #182152
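
For context, `llvm::make_early_inc_range` advances the underlying iterator before the loop body runs, so the body may erase the current element. The sketch below shows the same idea by hand with `std::list`; it is not the code from `handleFindLastReductions`, and `eraseEvens` is a made-up example.

```cpp
#include <cassert>
#include <list>

// Erases every even value while iterating. The iterator is advanced
// before the current element is inspected, so erasing that element does
// not invalidate the iterator the loop continues with.
void eraseEvens(std::list<int> &Values) {
  for (auto It = Values.begin(), End = Values.end(); It != End;) {
    auto Cur = It++; // early increment
    if (*Cur % 2 == 0)
      Values.erase(Cur);
  }
}
```

A plain range-based `for` over the same list would increment an already erased iterator, which is the class of bug the early-increment pattern avoids.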
DeltaFile
+68-0llvm/test/Transforms/LoopVectorize/find-last.ll
+2-1llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
+70-12 files

LLVM/project 72f4ddcllvm/utils/TableGen/Common CodeGenRegisters.h

Give the new key flag a comment.
DeltaFile
+5-0llvm/utils/TableGen/Common/CodeGenRegisters.h
+5-01 files

LLVM/project b2a6530flang/lib/Lower PFTBuilder.cpp, flang/test/Lower mixed_loops.f90 while_loop.f90

Revert "[flang] make lowering to scf.while default (#184234)"

This reverts commit 62144f48d43fc33d7e73fff68aea31dd287d5349.
DeltaFile
+36-25flang/test/Lower/mixed_loops.f90
+18-15flang/test/Lower/while_loop.f90
+3-12flang/test/Lower/OpenMP/target.f90
+5-5flang/test/Lower/pre-fir-tree02.f90
+10-0flang/lib/Lower/PFTBuilder.cpp
+3-3flang/test/Lower/loops.f90
+75-601 files not shown
+76-617 files

LLVM/project b7ab5bfllvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll

Merge branch 'main' into users/kosarev/fix-artifical-regs
DeltaFile
+84,317-78,372llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+25,751-24,782llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+23,663-20,281llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+21,867-18,577llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+13,685-22,906llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+19,112-16,445llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.832bit.ll
+188,395-181,3633,332 files not shown
+449,882-325,0253,338 files

LLVM/project d0f50d5llvm/docs AMDGPUUsage.rst, llvm/lib/Target/AMDGPU AMDGPU.td

[AMDGPU] Remove DX10_CLAMP and IEEE bits from gfx1170 (#182107)

Add the `DX10ClampAndIEEEMode` feature and set it for every subtarget
prior to gfx1170.
DeltaFile
+100-0llvm/test/CodeGen/AMDGPU/amdpal-msgpack-dx10-clamp-on.ll
+16-8llvm/lib/Target/AMDGPU/AMDGPU.td
+18-3llvm/test/CodeGen/AMDGPU/amdpal-msgpack-dx10-clamp.ll
+9-5llvm/docs/AMDGPUUsage.rst
+8-4llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp
+9-1llvm/test/CodeGen/AMDGPU/amdpal-msgpack-ieee.ll
+160-2120 files not shown
+201-6026 files

LLVM/project dac78e9llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll

Merge remote-tracking branch 'external-upstream/main' into users/mariusz-sikora-at-amd/add-flat-offset-bits-feature
DeltaFile
+84,317-78,372llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+25,751-24,782llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+23,663-20,281llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+21,867-18,577llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+13,685-22,906llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+19,112-16,445llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.832bit.ll
+188,395-181,3632,486 files not shown
+371,847-299,6842,492 files

LLVM/project de5e081flang/test/Lower/Intrinsics associated.f90 all.f90

[flang][NFC] Converted five tests from old lowering to new lowering (part 23) (#184533)

Tests converted from test/Lower/Intrinsics: adjustr.f90, all.f90,
any.f90, asinpi.f90, associated.f90
DeltaFile
+44-45flang/test/Lower/Intrinsics/associated.f90
+15-19flang/test/Lower/Intrinsics/all.f90
+15-19flang/test/Lower/Intrinsics/any.f90
+11-9flang/test/Lower/Intrinsics/adjustr.f90
+7-8flang/test/Lower/Intrinsics/asinpi.f90
+92-1005 files

LLVM/project 8ac00bamlir/lib/Conversion/SCFToEmitC SCFToEmitC.cpp, mlir/test/Conversion/SCFToEmitC scf-to-emitc-failed.mlir

[mlir][SCFToEmitC] Fix crash when scf.while carries a memref loop variable (#183944)

When a scf.while op has a loop-carried value whose type converts to
emitc::ArrayType (e.g. memref<1xf64>), the WhileLowering pattern
unconditionally called emitc::LValueType::get(arrayType), which
triggered an assertion because LValueType cannot wrap an array type.

Fix by returning a match failure in createVariablesForResults and
createVariablesForLoopCarriedValues when the converted type is an
emitc::ArrayType. This converts the crash into a proper legalization
failure.

Fixes #182649
DeltaFile
+24-0mlir/test/Conversion/SCFToEmitC/scf-to-emitc-failed.mlir
+7-0mlir/lib/Conversion/SCFToEmitC/SCFToEmitC.cpp
+31-02 files

LLVM/project f1aa7c3mlir/lib/Dialect/ControlFlow/IR ControlFlowOps.cpp, mlir/test/Dialect/ControlFlow canonicalize.mlir

[mlir][cf] Canonicalize block args with uniform incoming values (#183966)

Add a canonicalization pattern that replaces block arguments with a
common SSA value when all predecessors pass the same value for that
argument. This allows the block argument to be removed by dead code
elimination. This is a first iteration.

Idea from #182711
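
The core check can be sketched on plain data. In this model, SSA values are integers and `uniformIncoming` is a hypothetical helper; it ignores details a real pattern has to handle, such as a predecessor passing the block argument back to itself.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

using ValueId = int; // hypothetical stand-in for an SSA value

// Incoming[P][I] is the value predecessor P passes for block argument I.
// For each argument, returns the common value if all predecessors agree,
// and std::nullopt otherwise.
std::vector<std::optional<ValueId>>
uniformIncoming(const std::vector<std::vector<ValueId>> &Incoming,
                std::size_t NumArgs) {
  std::vector<std::optional<ValueId>> Result(NumArgs);
  if (Incoming.empty())
    return Result; // no predecessors: nothing to fold
  for (std::size_t I = 0; I < NumArgs; ++I) {
    ValueId First = Incoming[0][I];
    bool Same = true;
    for (const auto &Operands : Incoming) {
      if (Operands[I] != First) {
        Same = false;
        break;
      }
    }
    if (Same)
      Result[I] = First; // every predecessor passes the same value
  }
  return Result;
}
```

Arguments with a value in the result are the ones whose uses could be replaced by that value, leaving the argument itself dead.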
DeltaFile
+98-0mlir/test/Dialect/ControlFlow/canonicalize.mlir
+80-4mlir/lib/Dialect/ControlFlow/IR/ControlFlowOps.cpp
+178-42 files

LLVM/project f702ee8llvm/lib/Transforms/Vectorize VPlan.h

[VPlan] Fix partially uninitialized accesses after 17aaa0e590a7. (#184583)

17aaa0e590a7 adjusted how parts of the union members are managed. Make
sure the full union is initialized, to fix MSan failure in
https://lab.llvm.org/buildbot/#/builders/164/builds/19313.
DeltaFile
+34-17llvm/lib/Transforms/Vectorize/VPlan.h
+34-171 files

LLVM/project 2aab31allvm/test/CodeGen/X86 combine-fcopysign.ll

[X86] combine-fcopysign.ll - extend test coverage to all x86-64/x86-64-v2/x86-64-v3/x86-64-v4 levels (#184579)

DeltaFile
+172-95llvm/test/CodeGen/X86/combine-fcopysign.ll
+172-951 files