LLVM/project e5d2358polly/include/polly ScopDetectionDiagnostic.h ScopDetection.h, polly/lib/Analysis ScopDetectionDiagnostic.cpp ScopDetection.cpp

[Polly] Reject scalable vector types (#177871)

Polly currently does not consider types without fixed length, which can
be encountered if an input source uses e.g. ARM SVE builtins. Such
programs have already been optimized manually. Non-fixed type lengths
also add to the difficulty of dependency analysis. Skip such types
entirely for now.
 
Fixes: #177859
DeltaFile
+95-0polly/test/ScopDetectionDiagnostics/ReportIncompatibleType.ll
+32-0polly/lib/Analysis/ScopDetectionDiagnostic.cpp
+28-0polly/include/polly/ScopDetectionDiagnostic.h
+17-0polly/lib/Analysis/ScopDetection.cpp
+4-0polly/include/polly/ScopDetection.h
+176-05 files

LLVM/project 14bdd06mlir/lib/Dialect/SCF/Transforms LoopSpecialization.cpp, mlir/lib/Dialect/Utils StaticValueUtils.cpp

[mlir][DialectUtils] Fix 0 step handling in `constantTripCount` (#177329)

A step size of "zero" does not indicate "zero iterations". It may
indicate an infinite number of iterations.

This commit makes some transformations more conservative. We used to
fold away some loops with step size 0 and that's now no longer the case.
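
The distinction can be illustrated with a minimal sketch. This is not the MLIR implementation: `constant_trip_count` is a hypothetical helper, and both the empty-range shortcut and the treatment of negative steps are assumptions made for this example. The point is only that a zero step yields "unknown" rather than 0.

```python
from typing import Optional


def constant_trip_count(lb: int, ub: int, step: int) -> Optional[int]:
    """Trip count of `for (i = lb; i < ub; i += step)`, or None if unknown.

    Hypothetical helper for illustration; not MLIR's `constantTripCount`.
    """
    if ub <= lb:
        # Assumption of this sketch: an empty range never executes the
        # body, whatever the step is.
        return 0
    if step == 0:
        # The induction variable never advances, so the loop may run
        # forever. Report "unknown" instead of claiming zero iterations.
        return None
    if step < 0:
        # Assumption of this sketch: negative steps are not modeled.
        return None
    # Ceiling division for a positive step.
    return (ub - lb + step - 1) // step
```

A caller that folds loops away should do so only when the result is exactly `0`, and must treat `None` as "leave the loop alone".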

Related discussion:
https://discourse.llvm.org/t/infinite-loops-and-dead-code/89530
DeltaFile
+11-3mlir/lib/Dialect/Utils/StaticValueUtils.cpp
+8-2mlir/test/Dialect/SCF/for-loop-peeling.mlir
+4-3mlir/test/Dialect/SCF/canonicalize.mlir
+3-0mlir/lib/Dialect/SCF/Transforms/LoopSpecialization.cpp
+26-84 files

LLVM/project 544c300llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp

DAG: Use poison instead of undef in some vector combines (#177612)

Use poison for the unused or out of bounds vector components.
DeltaFile
+48-48llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+48-481 files

LLVM/project 0666a77llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/X86 vec_list_bias-inseltpoison.ll

[SLP]Support for tree throttling in SLP graphs with gathered loads

Gathered loads form a DAG rather than a tree in the SLP vectorizer. The
throttling analysis for such graphs needs to account for partially
matched gathered-load DAG nodes, as well as the extract and/or gather
operations and their costs.
The patch adds this analysis and allows cutting off expensive
sub-graphs with gathered loads.

Reviewers: hiraditya, RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/177855
DeltaFile
+99-14llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+12-13llvm/test/Transforms/SLPVectorizer/X86/vec_list_bias-inseltpoison.ll
+111-272 files

LLVM/project 73ebadaclang/docs ReleaseNotes.rst, clang/include/clang/Sema Overload.h

[clang] Don't assert on perfect overload match with _Atomic (#176619)

An assertion incorrectly treated a difference in _Atomic qualification
as a difference in type when verifying a perfect match in overload
resolution in C++.

Fixes #170433
DeltaFile
+16-0clang/test/SemaCXX/crash-GH170433.cpp
+2-1clang/include/clang/Sema/Overload.h
+1-0clang/docs/ReleaseNotes.rst
+19-13 files

LLVM/project 9d6f011llvm/include/llvm/IR PatternMatch.h, llvm/lib/Transforms/Vectorize VectorCombine.cpp

[VectorCombine] Fold vector.reduce.OP(F(X)) == 0 -> OP(X) == 0 (#173069)

This commit introduces a pattern to do the following fold:

  vector.reduce.OP f(X_i) == 0 -> vector.reduce.OP X_i == 0

In order to decide on this fold, we use the following properties:

  1.  OP X_i == 0 <=> \forall i \in [1, N] X_i == 0
  1'. OP X_i == 0 <=> \exists j \in [1, N] X_j == 0
  2.  f(x) == 0 <=> x == 0

From 1 and 2 (or 1' and 2), we can infer that

  OP f(X_i) == 0 <=> OP X_i == 0.

For some of the OP's and f's, we need to have domain constraints on X to
ensure properties 1 (or 1') and 2.
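
The argument can be checked by brute force on small bit widths. The sketch below is illustrative only: the choice of OP (bitwise-or for property 1, unsigned min for property 1') and of f (population count) is an assumption made for this example, not a list of what the patch handles, and `fold_is_sound` is a hypothetical helper.

```python
from functools import reduce
from itertools import product
from operator import or_


def popcount(x: int) -> int:
    """Number of set bits; popcount(x) == 0 exactly when x == 0 (property 2)."""
    return bin(x).count("1")


def fold_is_sound(reduce_op, f, width: int = 4, lanes: int = 3) -> bool:
    """Exhaustively check  OP f(X_i) == 0  <=>  OP X_i == 0.

    Enumerates every vector of `lanes` unsigned values of `width` bits.
    """
    values = range(1 << width)
    return all(
        (reduce(reduce_op, map(f, xs)) == 0) == (reduce(reduce_op, xs) == 0)
        for xs in product(values, repeat=lanes)
    )
```

With these choices, `fold_is_sound(or_, popcount)` and `fold_is_sound(min, popcount)` both hold, while `fold_is_sound(or_, lambda x: x & 1)` fails because `x & 1` violates property 2 (it maps 2 to 0).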


    [52 lines not shown]
DeltaFile
+672-0llvm/test/Transforms/VectorCombine/X86/icmp-vector-reduce.ll
+672-0llvm/test/Transforms/VectorCombine/AArch64/icmp-vector-reduce.ll
+183-0llvm/lib/Transforms/Vectorize/VectorCombine.cpp
+9-0llvm/include/llvm/IR/PatternMatch.h
+1,536-04 files

LLVM/project 9eaa1ffclang/test/CodeGen builtin-rotate.c

[clang][test] Fix builtin-rotate.c test __int128 test failure on ARM32 (#177732)

- Run the INT128 prefix checks only on 64-bit targets, since __int128 is
not supported on ARM32

Fixes https://lab.llvm.org/buildbot/#/builders/154/builds/26813

DeltaFile
+4-3clang/test/CodeGen/builtin-rotate.c
+4-31 files

LLVM/project 029efa6utils/bazel/llvm-project-overlay/mlir BUILD.bazel

[bazel] Add missing dependencies for 778a2491149512109541cd5d59bad2d55024bdb7
DeltaFile
+2-0utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
+2-01 files

LLVM/project 13d82f3llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp

update comments
DeltaFile
+5-5llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+5-51 files

LLVM/project 4e1d431llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp

DAG: Use poison instead of undef in some vector combines

Use poison for the unused or out of bounds vector components.
DeltaFile
+43-43llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+43-431 files

LLVM/project a80d432llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp, llvm/test/Transforms/InstCombine simplify-demanded-fpclass.ll

InstCombine: Apply parameter nofpclass in SimplifyDemandedFPClass (#176104)

Apply the use operand's nofpclass to the demanded mask.
DeltaFile
+11-0llvm/test/Transforms/InstCombine/simplify-demanded-fpclass.ll
+5-0llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+16-02 files

LLVM/project b1b8410llvm/include/llvm/Support KnownFPClass.h, llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp

InstCombine: Handle multiple use copysign

Handle multiple use copysign in SimplifyDemandedFPClass
DeltaFile
+36-3llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+7-7llvm/test/Transforms/InstCombine/simplify-demanded-fpclass.ll
+7-0llvm/include/llvm/Support/KnownFPClass.h
+50-103 files

LLVM/project 9a0bca2llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp, llvm/test/Transforms/InstCombine simplify-demanded-fpclass.ll

InstCombine: Handle nsz in copysign SimplifyDemandedFPClass

If the only sign bit difference is for 0, fold through the source.
DeltaFile
+31-1llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+2-4llvm/test/Transforms/InstCombine/simplify-demanded-fpclass.ll
+33-52 files

LLVM/project 78ce56cllvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp

Address comments
DeltaFile
+3-3llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+3-31 files

LLVM/project 025f150llvm/test/Transforms/InstCombine simplify-demanded-fpclass.ll

InstCombine: Add baseline tests for SimplifyDemandedFPClass copysign improvements

Prepare to support more folds and multiple uses.
DeltaFile
+651-0llvm/test/Transforms/InstCombine/simplify-demanded-fpclass.ll
+651-01 files

LLVM/project 0b24eacllvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp, llvm/test/Transforms/InstCombine simplify-demanded-fpclass.ll

InstCombine: Improve single-use fneg(fabs(x)) SimplifyDemandedFPClass handling

Match the multi-use case's logic for understanding no-nan/no-inf context.
Also only apply the nsz handling in the single use case. alive2 seems to treat
nsz as nondeterministic for each use.
DeltaFile
+244-11llvm/test/Transforms/InstCombine/simplify-demanded-fpclass.ll
+74-20llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+318-312 files

LLVM/project 2370bf2llvm/include/llvm/CodeGen SDPatternMatch.h, llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp

[DAG] Extend MinMax matchers to detect flippable sign (#177504)

Fixes #174328
DeltaFile
+115-0llvm/unittests/CodeGen/SelectionDAGPatternMatchTest.cpp
+68-0llvm/test/CodeGen/AArch64/abds.ll
+22-8llvm/include/llvm/CodeGen/SDPatternMatch.h
+8-8llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+213-164 files

LLVM/project 61c1621mlir/lib/CAPI/Transforms Rewrite.cpp, mlir/lib/Dialect/ArmNeon/Transforms LowerContractToNeonPatterns.cpp

[MLIR] Fix GCC's `-Wreturn-type` warnings (#177654)

This patch fixes `-Wreturn-type` warnings that occur when MLIR is built
with GCC (11.5 was used to detect them).

Warnings found:
```
build/llvm-llvmorg-21.1.8/mlir/lib/CAPI/Transforms/Rewrite.cpp: In function ‘MlirGreedyRewriteStrictness mlirGreedyRewriteDriverConfigGetStrictness(MlirGreedyRewriteDriverConfig)’:
build/llvm-llvmorg-21.1.8/mlir/lib/CAPI/Transforms/Rewrite.cpp:399:1: warning: control reaches end of non-void function [-Wreturn-type]
  399 | }
      | ^
build/llvm-llvmorg-21.1.8/mlir/lib/CAPI/Transforms/Rewrite.cpp: In function ‘MlirGreedySimplifyRegionLevel mlirGreedyRewriteDriverConfigGetRegionSimplificationLevel(MlirGreedyRewriteDriverConfig)’:
build/llvm-llvmorg-21.1.8/mlir/lib/CAPI/Transforms/Rewrite.cpp:414:1: warning: control reaches end of non-void function [-Wreturn-type]
  414 | }
      | ^
build/llvm-llvmorg-21.1.8/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp: In member function ‘mlir::Speculation::Speculatability mlir::gpu::SubgroupBroadcastOp::getSpeculatability()’:
build/llvm-llvmorg-21.1.8/mlir/lib/Dialect/GPU/IR/GPUDialect.cpp:2522:1: warning: control reaches end of non-void function [-Wreturn-type]
 2522 | }

    [20 lines not shown]
DeltaFile
+2-0mlir/lib/CAPI/Transforms/Rewrite.cpp
+2-0mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
+1-0mlir/test/lib/Dialect/Test/TestOpDefs.cpp
+1-0mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
+1-0mlir/lib/Dialect/ArmNeon/Transforms/LowerContractToNeonPatterns.cpp
+7-05 files

LLVM/project 2297e0dllvm/lib/Transforms/Utils MoveAutoInit.cpp, llvm/test/Transforms/MoveAutoInit loop-store.ll

[MoveAutoInit] Fix for miscompilation for #150120 (#173961)

Fixes the miscompilation discussed for the PR #164882 as part of
generalizing the optimization for the issue #150120.

Without this commit, MoveAutoInit can move the store instruction to a
different branch that does not dominate the user dominator node. This
results in UB at runtime. The example in the test case is specifically
an irreducible loop, in which not every predecessor necessarily
dominates the user dominator head.

To fix this problem, we've introduced a new check that verifies that the
predecessor of the user dominator node does in fact dominate the user
dominator node before choosing it as the node to which the instruction
will be moved.
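
Why a predecessor need not dominate its successor can be seen on a tiny irreducible CFG. The following is a minimal sketch, not the MoveAutoInit code: the graph, the node names, and the helpers `dominators` and `dominates` are all hypothetical and exist only to illustrate the dominance check.

```python
def dominators(cfg, entry):
    """Map each node to the set of nodes that dominate it.

    Plain iterative dataflow: dom(n) = {n} | intersection of dom(p) over
    all predecessors p. Every node must appear as a key of `cfg`.
    """
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for src, succs in cfg.items():
        for dst in succs:
            preds[dst].add(src)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def dominates(dom, a, b):
    """True if every path from the entry to `b` passes through `a`."""
    return a in dom[b]


# Irreducible loop: the cycle a <-> b can be entered at either a or b.
IRREDUCIBLE_CFG = {
    "entry": ["a", "b"],
    "a": ["b"],
    "b": ["a", "use"],
    "use": [],
}
DOM = dominators(IRREDUCIBLE_CFG, "entry")
```

Here `a` is a predecessor of `b`, yet `dominates(DOM, "a", "b")` is false because `entry` also branches directly to `b`. Placing a store in `a` would therefore leave the path `entry -> b -> use` uninitialized, which is the kind of placement the new check rejects.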
DeltaFile
+59-0llvm/test/Transforms/MoveAutoInit/loop-store.ll
+2-1llvm/lib/Transforms/Utils/MoveAutoInit.cpp
+61-12 files

LLVM/project 2cc4d45mlir/include/mlir/Bindings/Python IRCore.h, mlir/python CMakeLists.txt

[MLIR][Python] Add a DSL for defining dialects in Python bindings (#169045)

Python bindings for the IRDL dialect were introduced in #158488. They
are currently usable for constructing IR and for dynamically loading
modules that contain `irdl.dialect` into MLIR. However, there are still
several pain points when working with them:

* The IRDL IR-building interface is not very intuitive and tends to be
quite verbose.
* We do not yet have the corresponding `OpView` classes for IRDL-defined
operations.

To address these issues, I propose creating a wrapper (effectively a
small “DSL”) on top of the existing IRDL Python bindings. This wrapper
aims to simplify IR construction and automatically generate the
corresponding `OpView` types. A simple example is shown below.

Currently, using the IRDL bindings looks like this:


    [72 lines not shown]
DeltaFile
+471-0mlir/python/mlir/dialects/ext.py
+340-0mlir/test/python/dialects/ext.py
+4-3mlir/include/mlir/Bindings/Python/IRCore.h
+1-0mlir/python/CMakeLists.txt
+816-34 files

LLVM/project db4405ellvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp, llvm/test/Transforms/InstCombine simplify-demanded-fpclass-rounding-intrinsics.ll

InstCombine: Infer nnan/ninf on rounding intrinsics (#177770)

DeltaFile
+52-42llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-rounding-intrinsics.ll
+8-0llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+60-422 files

LLVM/project b98e160llvm/lib/Transforms/Vectorize SLPVectorizer.cpp

Fix a crash in TTI

Created using spr 1.3.7
DeltaFile
+4-1llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+4-11 files

LLVM/project 2f94635llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp, llvm/test/Transforms/InstCombine simplify-demanded-fpclass-fptrunc-round.ll simplify-demanded-fpclass-fptrunc.ll

InstCombine: Infer nnan and ninf on fptrunc (#177769)

Teach SimplifyDemandedFPClass to do this, although this is
not yet applied directly to the cast.
DeltaFile
+31-21llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-fptrunc-round.ll
+19-19llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-fptrunc.ll
+11-11llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+61-513 files

LLVM/project 786a207llvm/lib/Target/AMDGPU GCNSubtarget.h AMDGPUSubtarget.h

[NFCI][AMDGPU] Use `GET_SUBTARGETINFO_MACRO` in `GCNSubtarget.h` and `R600Subtarget.h` (#177402)

We can finally get rid of the manually defined boolean variables, like
other targets. Even though most of them are now defined by macros, we
still need to add the entries.
DeltaFile
+9-295llvm/lib/Target/AMDGPU/GCNSubtarget.h
+18-44llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
+14-11llvm/lib/Target/AMDGPU/R600Subtarget.h
+7-5llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+3-3llvm/lib/Target/AMDGPU/R600Processors.td
+0-2llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
+51-3602 files not shown
+53-3628 files

LLVM/project 431dea9llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp, llvm/test/Transforms/InstCombine simplify-demanded-fpclass-fpext.ll

InstCombine: Infer nnan and ninf on fpext (#177768)

Teach SimplifyDemandedFPClass to do this, although this is
not yet applied directly to the cast.
DeltaFile
+12-12llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-fpext.ll
+4-3llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+16-152 files

LLVM/project d77ced0llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp, llvm/test/Transforms/InstCombine simplify-demanded-fpclass-canonicalize.ll

InstCombine: Infer nnan/ninf on canonicalize (#177771)

DeltaFile
+22-12llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-canonicalize.ll
+8-0llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+30-122 files

LLVM/project 60b1e95llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp, llvm/test/Transforms/InstCombine simplify-demanded-fpclass-log.ll simplify-demanded-fpclass.ll

InstCombine: Infer nnan and ninf flags on log intrinsics (#177767)

Use the new common utility function to try fold to constant
or introduce flags.
DeltaFile
+8-8llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-log.ll
+2-2llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+2-2llvm/test/Transforms/InstCombine/simplify-demanded-fpclass.ll
+12-123 files

LLVM/project e9aae6allvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp, llvm/test/Transforms/InstCombine simplify-demanded-fpclass-exp.ll simplify-demanded-fpclass.ll

InstCombine: Infer nnan and ninf flags on exp intrinsics (#177766)

Use the new common utility function to try fold to constant
or introduce flags.
DeltaFile
+11-11llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-exp.ll
+3-3llvm/test/Transforms/InstCombine/simplify-demanded-fpclass.ll
+2-2llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+16-163 files

LLVM/project 51a262fllvm/utils/gn/secondary/clang-tools-extra/clang-tidy/llvm BUILD.gn

[gn build] Port 49d464ccaf44
DeltaFile
+1-0llvm/utils/gn/secondary/clang-tools-extra/clang-tidy/llvm/BUILD.gn
+1-01 files

LLVM/project 94b90c6llvm/utils/gn/secondary/libcxx/include BUILD.gn

[gn build] Port 9311996261e1
DeltaFile
+1-0llvm/utils/gn/secondary/libcxx/include/BUILD.gn
+1-01 files