LLVM/project 76e7e9fllvm/test/CodeGen/LoongArch/lasx and-not-combine.ll, llvm/test/CodeGen/LoongArch/lsx and-not-combine.ll

[LoongArch][NFC] Add tests for combining vand(vnot) (#160830)

DeltaFile
+424-2llvm/test/CodeGen/LoongArch/lasx/and-not-combine.ll
+347-2llvm/test/CodeGen/LoongArch/lsx/and-not-combine.ll
+771-42 files

LLVM/project b53e46fmlir/include/mlir/Dialect/X86Vector/TransformOps X86VectorTransformOps.td, mlir/lib/Dialect/X86Vector/TransformOps X86VectorTransformOps.cpp

[mlir][x86vector] Lower vector.contract to FMA or packed type dot-product (#168074)

A `transform` pass to lower `vector.contract` to (a) `vector.fma` for
`F32`, (b) `x86vector.avx512.dot` for `BF16`, (c) `x86vector.avx.dot.i8`
for `Int8` packed types.

The lowering applies only when the `m`, `batch`, and `k` dims are `1`,
and the `vnni` dim is `2` for `bf16` or `4` for `int8`.
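
As a rough illustration of what each target operation computes per output lane, the sketch below models the cases in plain Python. The function names are hypothetical and numeric types are not modeled (Python numbers stand in for `F32`, `BF16`, and `Int8`). It only aims to show why the `vnni` dim is `2` for `bf16` and `4` for `int8`; it is not how the pass itself is implemented.

```python
def fma_lanes(acc, a, b):
    """Elementwise multiply-add per lane: acc[i] + a[i] * b[i]."""
    return [c + x * y for c, x, y in zip(acc, a, b)]


def packed_dot_lanes(acc, a, b, vnni):
    """Packed dot-product: lane i accumulates `vnni` adjacent products.

    vnni=2 models the bf16 pairing, vnni=4 models the int8 grouping.
    """
    assert len(a) == len(b) == len(acc) * vnni
    out = []
    for i, c in enumerate(acc):
        lo = i * vnni  # first packed element feeding lane i
        out.append(c + sum(a[lo + j] * b[lo + j] for j in range(vnni)))
    return out
```

With `vnni=2`, two accumulator lanes consume four packed inputs; with `vnni=4`, a single lane consumes all four.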

**The lowering pattern**: `batch_reduce.matmul` (input) ->
register-tiling(M, N) -> Vectorization (to `vector.contract`) ->
`unroll` vector.contract (`unit` dims) -> `hoisting` transformation
(move `C` loads/store outside batch/k loop) -> apply `licm`,
`canonicalization`, and `bufferize`.
DeltaFile
+681-0mlir/test/Dialect/X86Vector/vector-contract-to-packed-type-dotproduct.mlir
+344-0mlir/test/Dialect/X86Vector/vector-contract-to-fma.mlir
+301-0mlir/lib/Dialect/X86Vector/Transforms/VectorContractToPackedTypeDotProduct.cpp
+143-0mlir/lib/Dialect/X86Vector/Transforms/VectorContractToFMA.cpp
+64-0mlir/lib/Dialect/X86Vector/TransformOps/X86VectorTransformOps.cpp
+43-0mlir/include/mlir/Dialect/X86Vector/TransformOps/X86VectorTransformOps.td
+1,576-08 files not shown
+1,647-014 files

LLVM/project 13a39eallvm/lib/Transforms/Vectorize VPlanTransforms.cpp

[Sema] Fix Wunused-but-set-variable warning(NFC) (#169220)

Fix warning: 
llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:1455:23: warning:
variable 'Store' set but not used [-Wunused-but-set-variable]
DeltaFile
+1-0llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+1-01 files

LLVM/project 4df3450llvm/lib/Target/AMDGPU SIISelLowering.cpp

Hardcoding bit value.
DeltaFile
+4-7llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+4-71 files

LLVM/project 2e98cbcmlir/lib/Conversion/ArithToAPFloat ArithToAPFloat.cpp, mlir/lib/ExecutionEngine APFloatWrappers.cpp

[mlir][arith] Add support for `fptosi`, `fptoui` to `ArithToAPFloat`
DeltaFile
+58-0mlir/lib/Conversion/ArithToAPFloat/ArithToAPFloat.cpp
+26-0mlir/test/Conversion/ArithToApfloat/arith-to-apfloat.mlir
+14-0mlir/lib/ExecutionEngine/APFloatWrappers.cpp
+10-0mlir/test/Integration/Dialect/Arith/CPU/test-apfloat-emulation.mlir
+108-04 files

LLVM/project 202d784mlir/lib/Dialect/IRDL IRDLLoading.cpp, mlir/test/Dialect/IRDL variadics.mlir

[MLIR][IRDL] Support camelCase segment size attributes in IRDL verifier (#168836)

Two years ago, `operand_segment_sizes` and `result_segment_sizes` were
renamed to `operandSegmentSizes` and `resultSegmentSizes` (check related
commits, e.g.
https://github.com/llvm/llvm-project/commit/363b655920c49a4bcb0869f820ed40aac834eebd).

However, the op verifiers in the IRDL loading phase are still using the
old attributes `operand_segment_sizes` and `result_segment_sizes`,
which causes conflicts, e.g. they are not compatible with the OpView
builder in the MLIR python bindings (which generates camelCase segment
attributes).

This PR adds support for camelCase segment size attributes in the IRDL
verifier. Note that support for `operand_segment_sizes` and
`result_segment_sizes` is dropped.
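
A minimal sketch of the kind of check involved, assuming the verifier only needs to find the segment-size attribute by name and confirm that the segments add up to the operand count. The function name and the dictionary representation of attributes are hypothetical, and this is not the logic in `IRDLLoading.cpp`.

```python
def verify_operand_segments(attrs, num_operands, num_groups):
    """Return an error string, or None if the segment sizes are consistent.

    `attrs` is a plain dict standing in for an op's attribute dictionary.
    """
    sizes = attrs.get("operandSegmentSizes")  # camelCase name only
    if sizes is None:
        return "missing operandSegmentSizes attribute"
    if len(sizes) != num_groups:
        return "expected one segment size per operand group"
    if sum(sizes) != num_operands:
        return "segment sizes do not add up to the operand count"
    return None
```

Under this sketch an op that still carries only `operand_segment_sizes` is rejected, which mirrors the dropped support noted above.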

I found this issue while working on a new IRDL wrapper in the MLIR
python bindings.
DeltaFile
+34-34mlir/test/Dialect/IRDL/variadics.mlir
+2-2mlir/lib/Dialect/IRDL/IRDLLoading.cpp
+36-362 files

LLVM/project 23f7e74mlir/lib/Conversion/ArithToAPFloat ArithToAPFloat.cpp, mlir/lib/ExecutionEngine APFloatWrappers.cpp

[mlir][arith] Add support for `extf`, `truncf` to `ArithToAPFloat`
DeltaFile
+80-19mlir/lib/Conversion/ArithToAPFloat/ArithToAPFloat.cpp
+22-0mlir/test/Conversion/ArithToApfloat/arith-to-apfloat.mlir
+16-1mlir/lib/ExecutionEngine/APFloatWrappers.cpp
+11-4mlir/test/Integration/Dialect/Arith/CPU/test-apfloat-emulation.mlir
+129-244 files

LLVM/project 7851b8allvm/lib/Target/RISCV RISCVISelLowering.cpp, llvm/test/CodeGen/RISCV/rvv fixed-vectors-vector-splice.ll vector-splice.ll

[RISCV] Combine vslide{up,down} x, poison -> x (#169013)

The motivation for this is that it would be useful to express a
vslideup/vslidedown in a target independent way e.g. from the loop
vectorizer.

We can do this today with @llvm.vector.splice by setting one operand to
poison:

- A slide down can be achieved with @llvm.vector.splice(%x, poison,
slideamt)
- A slide up can be done by @llvm.vector.splice(poison, %x, -slideamt)

E.g.:

    splice(<a,b,c,d>, poison, 3) = <d,poison,poison,poison>
    splice(poison, <a,b,c,d>, -3) = <poison,poison,poison,a>

These splices get lowered to a vslideup + vslidedown pair with one of
the vs2s being poison. We can optimize this away so that we are just
left with a single slideup/slidedown.
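
The splice semantics used in the examples above can be modeled in a few lines of Python. This is only a sketch of the intrinsic's element selection, with `POISON` as a placeholder string and `vector_splice` as a hypothetical helper name; it says nothing about how the RISC-V lowering is implemented.

```python
POISON = "poison"


def vector_splice(v1, v2, imm):
    """Model of llvm.vector.splice on equal-length lists.

    Concatenate v1 and v2, then take len(v1) elements starting at `imm`
    (imm >= 0) or at len(v1) + imm (imm < 0, i.e. the trailing -imm
    elements of v1 come first).
    """
    n = len(v1)
    assert len(v2) == n and -n <= imm < n
    concat = list(v1) + list(v2)
    start = imm if imm >= 0 else n + imm
    return concat[start:start + n]


x = ["a", "b", "c", "d"]
slide_down = vector_splice(x, [POISON] * 4, 3)   # slide down by 3
slide_up = vector_splice([POISON] * 4, x, -3)    # slide up by 3
```

Here `slide_down` and `slide_up` reproduce the two vectors shown in the example.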
DeltaFile
+91-0llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vector-splice.ll
+60-0llvm/test/CodeGen/RISCV/rvv/vector-splice.ll
+5-0llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+156-03 files

LLVM/project ee4f647llvm/lib/Target/AMDGPU AMDGPUISelLowering.cpp, llvm/test/CodeGen/AMDGPU si-split-load-store-alias-info.ll

[AMDGPU] Propagate AA info in vector load/store splitting. (#168871)

Fixes a bug in `AMDGPUISelLowering` where alias analysis info is not
propagated to split loads and stores.

This is required for #161375

---------

Co-authored-by: Leon Clark <leoclark at amd.com>
DeltaFile
+35-0llvm/test/CodeGen/AMDGPU/si-split-load-store-alias-info.ll
+11-11llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+46-112 files

LLVM/project acab67bcompiler-rt/cmake builtin-config-ix.cmake, compiler-rt/lib/builtins CMakeLists.txt

[M68k][compiler-rt] Allow compiler-rt builtins to be built for M68k (#169256)

I've tested this locally, and the builtins build proceeds without a
hitch for m68k-none-none. This is part of a larger effort to establish a
working m68k baremetal toolchain.
DeltaFile
+2-1compiler-rt/cmake/builtin-config-ix.cmake
+2-0compiler-rt/lib/builtins/CMakeLists.txt
+4-12 files

LLVM/project c33e50bllvm/test/Transforms/GlobalOpt/X86 apx.ll

[GlobalOpt] Use `x86-registered-target` to fix Buildbot failures, 2nd try (#169266)

DeltaFile
+2-4llvm/test/Transforms/GlobalOpt/X86/apx.ll
+2-41 files

LLVM/project 9ae33a6llvm/lib/Target/LoongArch LoongArchISelLowering.cpp, llvm/test/CodeGen/LoongArch/lasx/ir-instruction shuffle-broadcast.ll

[LoongArch] Legalize broadcasting the first element of 256-bit vector using `xvreplve0`
DeltaFile
+19-0llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+6-10llvm/test/CodeGen/LoongArch/lasx/ir-instruction/shuffle-broadcast.ll
+25-102 files

LLVM/project e71f243llvm/utils/TableGen/Common CodeGenDAGPatterns.h

[TableGen] Simplify MachineValueTypeSet::iterator::find_from_pos. NFC (#169227)

Merge the SkipBits!=0 handling into the first iteration of the word
loop. This is the same code structure used by BitVector::find_first_in.
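
The loop shape described above can be sketched as follows. This is a Python model rather than the C++ in `CodeGenDAGPatterns.h`; the 64-bit word size and the list-of-ints representation are assumptions made for illustration.

```python
WORD_BITS = 64  # assumed word size for this sketch
WORD_MASK = (1 << WORD_BITS) - 1


def find_from_pos(words, pos):
    """Return the index of the first set bit at position >= pos, or None."""
    skip_words, skip_bits = divmod(pos, WORD_BITS)
    for i in range(skip_words, len(words)):
        word = words[i]
        if i == skip_words:
            # First iteration only: clear the bits below `pos`.
            # With skip_bits == 0 the mask keeps the whole word, so no
            # separate SkipBits != 0 branch is needed before the loop.
            word &= WORD_MASK & ~((1 << skip_bits) - 1)
        if word:
            lowest = (word & -word).bit_length() - 1  # index of lowest set bit
            return i * WORD_BITS + lowest
    return None
```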
DeltaFile
+9-15llvm/utils/TableGen/Common/CodeGenDAGPatterns.h
+9-151 files

LLVM/project 78783aemlir/lib/Transforms RemoveDeadValues.cpp, mlir/test/Transforms remove-dead-values.mlir

[mlir][Transforms] Fix crash in `-remove-dead-values` for private functions
DeltaFile
+38-0mlir/lib/Transforms/RemoveDeadValues.cpp
+11-0mlir/test/Transforms/remove-dead-values.mlir
+49-02 files

LLVM/project bcd302ellvm/test/CodeGen/LoongArch/lasx/ir-instruction shuffle-broadcast.ll

[LoongArch][NFC] Add tests for 256-bit vector broadcast
DeltaFile
+179-0llvm/test/CodeGen/LoongArch/lasx/ir-instruction/shuffle-broadcast.ll
+179-01 files

LLVM/project c4254cdclang/lib/Basic/Targets SPIR.h, clang/test/CodeGenOpenCL __bf16.cl

[Clang] Support __bf16 type for SPIR/SPIR-V (#169012)

SPIR/SPIR-V are generic targets. Assume they support __bf16.
DeltaFile
+31-0clang/test/CodeGenOpenCL/__bf16.cl
+4-5clang/lib/Basic/Targets/SPIR.h
+4-3clang/test/SemaSYCL/bf16.cpp
+39-83 files

LLVM/project ca6c590clang/test/Sema/AArch64 arm_sve_feature_dependent_sve___sme.c, llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll

Merge branch 'main' into users/ylzsx/precommit-andn-combine
DeltaFile
+53,205-51,210llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+18,277-15,993llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+19,255-3,889llvm/test/CodeGen/RISCV/atomic-rmw.ll
+19,470-0clang/test/Sema/AArch64/arm_sve_feature_dependent_sve___sme.c
+5,981-8,885llvm/test/CodeGen/AMDGPU/shufflevector.v4p0.v4p0.ll
+5,981-8,885llvm/test/CodeGen/AMDGPU/shufflevector.v4i64.v4i64.ll
+122,169-88,86219,165 files not shown
+1,568,838-592,88919,171 files

LLVM/project 0482234mlir/include/mlir/Dialect/XeGPU/IR XeGPUOps.td, mlir/lib/Dialect/XeGPU/IR XeGPUOps.cpp

adding documentation
DeltaFile
+154-38mlir/include/mlir/Dialect/XeGPU/IR/XeGPUOps.td
+10-10mlir/lib/Dialect/XeGPU/Transforms/XeGPUPropagateLayout.cpp
+1-1mlir/lib/Dialect/XeGPU/IR/XeGPUOps.cpp
+165-493 files

LLVM/project fe56f5cmlir/include/mlir/Pass Pass.h, mlir/lib/Pass Pass.cpp

[mlir][Pass] Fix crash when applying a pass to an optional interface (#169262)

Interfaces can be optional: whether an op implements an interface or not
can depend on the state of the operation.

```
// An optional code block for adding additional "classof" logic. This can
// be used to better enable "optional" interfaces, where an entity only
// implements the interface if some dynamic characteristic holds.
// `$_attr`/`$_op`/`$_type` may be used to refer to an instance of the
// interface instance being checked.
code extraClassOf = "";
```

The current `Pass::canScheduleOn(RegisteredOperationName)` is
insufficient. This commit adds an additional overload to inspect
`Operation *`.

This commit fixes a crash when scheduling an `InterfacePass` for an

    [3 lines not shown]
DeltaFile
+10-0mlir/test/Pass/invalid-unsupported-operation.mlir
+8-0mlir/include/mlir/Pass/Pass.h
+3-3mlir/lib/Pass/Pass.cpp
+1-1mlir/test/Dialect/Transform/test-pass-application.mlir
+1-1mlir/test/Pass/pipeline-invalid.mlir
+23-55 files

LLVM/project 25c2cc4llvm/test/Transforms/GlobalOpt/X86 apx.ll

[GlobalOpt] Use `target triple` to fix Buildbot failures, NFCI (#169260)

This is supposed to fix the LLVM Buildbot failures after #164768, though
I don't have the environment to verify it.
DeltaFile
+4-1llvm/test/Transforms/GlobalOpt/X86/apx.ll
+4-11 files

LLVM/project c715615mlir/test/Pass invalid-unsupported-operation.mlir pipeline-invalid.mlir

fix tests
DeltaFile
+1-1mlir/test/Pass/invalid-unsupported-operation.mlir
+1-1mlir/test/Pass/pipeline-invalid.mlir
+2-22 files

LLVM/project d51988emlir/include/mlir/Pass Pass.h, mlir/lib/Pass Pass.cpp

[mlir][Pass] Fix crash when applying a pass to an optional interface (#168499)

Interfaces can be optional: whether an op implements an interface or not
can depend on the state of the operation.

```
  // An optional code block for adding additional "classof" logic. This can
  // be used to better enable "optional" interfaces, where an entity only
  // implements the interface if some dynamic characteristic holds.
  // `$_attr`/`$_op`/`$_type` may be used to refer to an instance of the
  // interface instance being checked.
  code extraClassOf = "";
```

The current `Pass::canScheduleOn(RegisteredOperationName)` is
insufficient. This commit adds an additional overload to inspect
`Operation *`.

This commit fixes a crash when scheduling an `InterfacePass` for an
optional interface on an operation that does not actually implement the
interface.
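
The difference between the two checks can be sketched in Python. All names below are hypothetical stand-ins (including the op name and the `dynamic_condition` field that plays the role of `extraClassOf`); the sketch only illustrates why a name-based check alone can be too permissive.

```python
class Op:
    def __init__(self, name, dynamic_condition):
        self.name = name
        # Stands in for the dynamic `extraClassOf` condition.
        self.dynamic_condition = dynamic_condition


# Op names whose definition declares the (optional) interface.
OPS_DECLARING_INTERFACE = {"test.maybe_interface"}


def can_schedule_on_name(op_name):
    """Name-based check: only knows what the op definition declares."""
    return op_name in OPS_DECLARING_INTERFACE


def can_schedule_on_op(op):
    """Instance-based check: also consults the state of this operation."""
    return can_schedule_on_name(op.name) and op.dynamic_condition


op = Op("test.maybe_interface", dynamic_condition=False)
```

For `op`, the name-based check accepts while the instance-based check rejects, which is the situation the crash fix addresses.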
DeltaFile
+10-0mlir/test/Pass/invalid-unsupported-operation.mlir
+8-0mlir/include/mlir/Pass/Pass.h
+3-3mlir/lib/Pass/Pass.cpp
+1-1mlir/test/Dialect/Transform/test-pass-application.mlir
+1-1mlir/test/Pass/pipeline-invalid.mlir
+23-55 files

LLVM/project a6cec3fllvm/test/CodeGen/AMDGPU local-atomicrmw-fsub.ll local-atomicrmw-fmin.ll, llvm/test/CodeGen/NVPTX atomics-b128.ll

Reland "[RegAlloc] Fix the terminal rule check for interfere with DstReg (#168661)" (#169219)

Reland d5f3ab8ec97786476a077b0c8e35c7c337dfddf2, fix testcases.
DeltaFile
+1,378-1,420llvm/test/CodeGen/AMDGPU/local-atomicrmw-fsub.ll
+1,231-1,259llvm/test/CodeGen/AMDGPU/local-atomicrmw-fmin.ll
+1,231-1,259llvm/test/CodeGen/AMDGPU/local-atomicrmw-fmax.ll
+1,156-1,182llvm/test/CodeGen/AMDGPU/local-atomicrmw-fadd.ll
+96-92llvm/test/CodeGen/X86/i128-mul.ll
+75-75llvm/test/CodeGen/NVPTX/atomics-b128.ll
+5,167-5,28749 files not shown
+5,925-6,04255 files

LLVM/project 28eee72llvm/docs LangRef.rst, llvm/include/llvm/Analysis TargetTransformInfo.h

[GlobalOpt] Add TTI interface useFastCCForInternalCall for FASTCC (#164768)

Background: The X86 APX feature adds 16 registers within the same 64-bit
mode. PR #164638 is trying to make these registers available to FASTCC.
However, a blocking issue is that a calling convention cannot change
depending on whether a feature is enabled.

The solution is to disable FASTCC if APX is not ready. This is an NFC
change to the final code generation, because X86 doesn't define an
alternative ABI for FASTCC in 64-bit mode. We can solve the potential
compatibility issue of #164638 with this patch.
DeltaFile
+54-0llvm/test/Transforms/GlobalOpt/X86/apx.ll
+16-0llvm/lib/Target/X86/X86TargetTransformInfo.cpp
+9-6llvm/lib/Transforms/IPO/GlobalOpt.cpp
+8-5llvm/docs/LangRef.rst
+4-0llvm/lib/Analysis/TargetTransformInfo.cpp
+4-0llvm/include/llvm/Analysis/TargetTransformInfo.h
+95-112 files not shown
+99-118 files

LLVM/project 3c3e2a2orc-rt/include/orc-rt WrapperFunction.h, orc-rt/unittests DirectCaller.h

[orc-rt] Remove unused Session argument from WrapperFunction::call. (#169255)

DeltaFile
+2-2orc-rt/include/orc-rt/WrapperFunction.h
+1-1orc-rt/unittests/DirectCaller.h
+3-32 files

LLVM/project b73a281llvm/utils/gn/secondary/llvm/lib/Target/LoongArch BUILD.gn

[gn] port b5812c0cf789aa4cb (LoongArch SDNodeInfo)
DeltaFile
+8-0llvm/utils/gn/secondary/llvm/lib/Target/LoongArch/BUILD.gn
+8-01 files

LLVM/project ded1311llvm/include/llvm/ExecutionEngine/Orc WaitingOnGraph.h

[ORC] Fix typo in comment.
DeltaFile
+2-2llvm/include/llvm/ExecutionEngine/Orc/WaitingOnGraph.h
+2-21 files

LLVM/project 65cf047clang/test/CodeGen memprof-pgho-thinlto.cpp

fix test

Created using spr 1.3.8-beta.1
DeltaFile
+2-2clang/test/CodeGen/memprof-pgho-thinlto.cpp
+2-21 files

LLVM/project 4996645clang-tools-extra/clang-tidy/objc AssertEqualsCheck.cpp, clang/docs LibASTMatchersReference.html

Revert "[ASTMatchers] Make isExpandedFromMacro accept llvm::StringRef… (#167060)" (#169238)

This reverts commit a52e1af7f766e26a78d10d31da98af041dd66410.

That commit reverted a change (making isExpandedFromMacro take a
std::string) that was explicitly added to avoid lifetime issues. We ran
into issues with some internal matchers due to this, and it probably is
not an uncommon downstream use case. This patch restores the original
functionality and adds a test to ensure that the functionality is
preserved.

https://reviews.llvm.org/D90303 contains more discussion.
DeltaFile
+13-0clang/unittests/ASTMatchers/ASTMatchersNarrowingTest.cpp
+3-3clang/docs/LibASTMatchersReference.html
+1-1clang/include/clang/ASTMatchers/ASTMatchers.h
+1-1clang-tools-extra/clang-tidy/objc/AssertEqualsCheck.cpp
+18-54 files

LLVM/project f7ed15butils/bazel/llvm-project-overlay/clang BUILD.bazel, utils/bazel/llvm-project-overlay/lldb BUILD.bazel

[bazel] Fully port 3773bbe9e7916ec89fb3e3cd02e29c54cabac82b (#169247)

e5edb512072bc040face27ed6c9e92f4a5f1e910 attempted to port this, but
seemed to miss a couple things that still showed up on CI. This patch
fixes up the missing pieces.
DeltaFile
+1-1utils/bazel/llvm-project-overlay/lldb/BUILD.bazel
+1-0utils/bazel/llvm-project-overlay/lldb/source/Plugins/BUILD.bazel
+1-0utils/bazel/llvm-project-overlay/clang/BUILD.bazel
+3-13 files