LLVM/project 2db7110llvm/lib/Target/RISCV RISCVRegisterInfo.cpp, llvm/test/CodeGen/RISCV xqcibm-regalloc-hints.ll

[RISCV] Add regalloc hints for BSETI/BEXTI (#173964)

This patch hints the register allocator to use the same source and
destination registers for the `BEXTI/BSETI` instructions when the
`Xqcibm` vendor extension is enabled. This enables the generation of the
compressed `QC_C_BEXTI/QC_C_BSETI` instructions when possible.
DeltaFile
+65-0llvm/test/CodeGen/RISCV/xqcibm-regalloc-hints.ll
+5-0llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
+70-02 files

LLVM/project a5d7611mlir/include/mlir/Interfaces ControlFlowInterfaces.td ControlFlowInterfaces.h, mlir/lib/Analysis/DataFlow SparseAnalysis.cpp

[mlir][Interfaces] Add `RegionBranchOpInterface` helper for forwarded values
DeltaFile
+21-29mlir/lib/Analysis/DataFlow/SparseAnalysis.cpp
+22-23mlir/include/mlir/Interfaces/ControlFlowInterfaces.td
+44-0mlir/lib/Interfaces/ControlFlowInterfaces.cpp
+6-31mlir/lib/Dialect/Bufferization/Transforms/BufferViewFlowAnalysis.cpp
+9-14mlir/lib/Dialect/XeGPU/Transforms/XeGPUPropagateLayout.cpp
+13-0mlir/include/mlir/Interfaces/ControlFlowInterfaces.h
+115-976 files

LLVM/project 50645a9mlir/include/mlir/Interfaces ControlFlowInterfaces.td ControlFlowInterfaces.h, mlir/lib/Analysis/DataFlow SparseAnalysis.cpp

[mlir][Interfaces] Add `RegionBranchOpInterface` helper for forwarded values
DeltaFile
+21-29mlir/lib/Analysis/DataFlow/SparseAnalysis.cpp
+22-23mlir/include/mlir/Interfaces/ControlFlowInterfaces.td
+44-0mlir/lib/Interfaces/ControlFlowInterfaces.cpp
+6-31mlir/lib/Dialect/Bufferization/Transforms/BufferViewFlowAnalysis.cpp
+13-18mlir/lib/Dialect/XeGPU/Transforms/XeGPUPropagateLayout.cpp
+13-0mlir/include/mlir/Interfaces/ControlFlowInterfaces.h
+119-1016 files

LLVM/project 60606b2mlir/include/mlir/Interfaces ControlFlowInterfaces.td, mlir/lib/Interfaces ControlFlowInterfaces.cpp

[mlir][Interfaces] Add `RegionBranchOpInterface::getSuccessorOperands` helper (#173971)

Add a helper for querying the successor operands for a region branch
`src -> dst`. Both `src` and `dst` may be the region branch op itself or
a terminator.

This helper allows users to query successor operands for the region
branch op and the terminators in a uniform way. This is similar to
`getSuccessorRegions(RegionBranchPoint)`, which works both for region
branch ops and terminators.
DeltaFile
+19-21mlir/lib/Transforms/RemoveDeadValues.cpp
+10-0mlir/lib/Interfaces/ControlFlowInterfaces.cpp
+9-0mlir/include/mlir/Interfaces/ControlFlowInterfaces.td
+38-213 files

LLVM/project d72a2f9llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp, llvm/test/Transforms/InstCombine simplify-demanded-fpclass-maximum.ll simplify-demanded-fpclass-minimum.ll

InstCombine: Introduce nsz flag on minimum/maximum in SimplifyDemandedFPClass

Alive isn't particularly happy with this in the case where
one of the inputs could be zero, but I think
Alive is wrong here: https://alive2.llvm.org/ce/z/dF7V6k

nsz shouldn't permit introducing a -0 result where
there wasn't one in the input here.
DeltaFile
+30-30llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-maximum.ll
+30-30llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-minimum.ll
+18-2llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+78-623 files

LLVM/project 86aace1llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp, llvm/test/Transforms/InstCombine simplify-demanded-fpclass-minimumnum.ll simplify-demanded-fpclass-maximumnum.ll

InstCombine: Handle minimumnum/maximumnum in SimplifyDemandedFPClass
DeltaFile
+36-59llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-minimumnum.ll
+34-55llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-maximumnum.ll
+64-12llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+134-1263 files

LLVM/project 87371adllvm/test/Transforms/InstCombine simplify-demanded-fpclass-maximumnum.ll simplify-demanded-fpclass-minimumnum.ll

InstCombine: Add baseline minimumnum/maximumnum SimplifyDemandedFPClass tests
DeltaFile
+1,625-0llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-maximumnum.ll
+1,625-0llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-minimumnum.ll
+3,250-02 files

LLVM/project 2ba4cb9llvm/include/llvm/Support KnownFPClass.h, llvm/lib/Analysis ValueTracking.cpp

InstCombine: Handle minimum/maximum in SimplifyDemandedFPClass
DeltaFile
+51-80llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-maximum.ll
+49-76llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-minimum.ll
+26-87llvm/lib/Analysis/ValueTracking.cpp
+94-1llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+92-0llvm/lib/Support/KnownFPClass.cpp
+14-0llvm/include/llvm/Support/KnownFPClass.h
+326-2446 files

LLVM/project bb4c880libcxx/include/__ranges iota_view.h, libcxx/test/libcxx/ranges/range.factories/range.iota.view nodiscard.verify.cpp

[libc++][ranges] Applied `[[nodiscard]]` to `iota_view` (#173612)

`[[nodiscard]]` should be applied to functions where discarding the
return value is most likely a correctness issue.

- https://libcxx.llvm.org/CodingGuidelines.html
- https://wg21.link/range.iota

Towards #172124
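
As a rough illustration of the guideline (not the libc++ change itself), the sketch below applies `[[nodiscard]]` to the pure query functions of a hypothetical `CountingRange` type. The type and the `countBetween` helper are assumptions made up for this example; they are not libc++'s `iota_view`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical half-open integer range [first, last); a stand-in for
// illustration only, not libc++'s iota_view.
class CountingRange {
public:
  constexpr CountingRange(int first, int last) noexcept
      : first_(first), last_(last) {}

  // Pure queries: calling one and ignoring the result is almost certainly
  // a bug, so they are marked [[nodiscard]].
  [[nodiscard]] constexpr std::size_t size() const noexcept {
    return static_cast<std::size_t>(last_ - first_);
  }
  [[nodiscard]] constexpr bool empty() const noexcept {
    return first_ == last_;
  }

private:
  int first_;
  int last_;
};

// The result of size() is used here, so no diagnostic is expected.
constexpr std::size_t countBetween(int first, int last) {
  return CountingRange(first, last).size();
}
```

Writing `CountingRange(0, 3).size();` as a statement on its own would typically draw a compiler warning about the ignored return value, which is the kind of check a `nodiscard.verify.cpp` test can pin down.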
DeltaFile
+125-0libcxx/test/libcxx/ranges/range.factories/range.iota.view/nodiscard.verify.cpp
+19-16libcxx/include/__ranges/iota_view.h
+144-162 files

LLVM/project fc30dc4libcxx/include/__ranges drop_view.h, libcxx/test/libcxx/ranges/range.adaptors/range.drop nodiscard.verify.cpp

[libc++][ranges] Applied `[[nodiscard]]` to `drop_view` (#173557)

`[[nodiscard]]` should be applied to functions where discarding the
return value is most likely a correctness issue.

- https://libcxx.llvm.org/CodingGuidelines.html
- https://wg21.link/range.drop

Towards #172124
DeltaFile
+144-0libcxx/test/libcxx/ranges/range.adaptors/range.drop/nodiscard.verify.cpp
+8-8libcxx/include/__ranges/drop_view.h
+152-82 files

LLVM/project 437ae34libcxx/include/__numeric gcd_lcm.h

[libc++][NFC] Simplify `gcd` a bit (#173570)

1. With `if constexpr` we can avoid partial specializations of
`__ct_gcd`. This patch changes it to a function template and renames it
to `__abs_in_type` to slightly improve readability.
2. `__gcd` was made non-recursive by
27a062e9ca7c92e89ed4084c3c3affb9fa39aabb, so this patch simply inlines
it into `gcd`.
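
The two points above can be sketched roughly as follows. This is a minimal sketch, not the libc++ code: `absInType` and `gcdSketch` are hypothetical names, and the plain Euclidean loop is chosen here for brevity, so the algorithm libc++ actually uses may differ.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical analogue of the helper: |value| computed in the unsigned
// type Result. One function template with `if constexpr` replaces a pair
// of partial specializations (signed vs. unsigned source type).
template <class Result, class Source>
constexpr Result absInType(Source value) noexcept {
  static_assert(std::is_unsigned_v<Result>, "Result must be unsigned");
  if constexpr (std::is_signed_v<Source>) {
    // Negate after converting to unsigned so that the most negative value
    // does not overflow.
    return value < 0
               ? static_cast<Result>(Result{0} - static_cast<Result>(value))
               : static_cast<Result>(value);
  } else {
    return static_cast<Result>(value);
  }
}

// Non-recursive gcd with the loop written inline rather than in a
// separate helper.
template <class T, class U>
constexpr std::common_type_t<T, U> gcdSketch(T m, U n) noexcept {
  using R = std::common_type_t<T, U>;
  using UR = std::make_unsigned_t<R>;
  UR a = absInType<UR>(m);
  UR b = absInType<UR>(n);
  while (b != 0) {
    UR r = a % b;
    a = b;
    b = r;
  }
  return static_cast<R>(a);
}
```

Because the signed/unsigned decision is made inside one function body, the reader sees both cases side by side instead of chasing specializations.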
DeltaFile
+24-34libcxx/include/__numeric/gcd_lcm.h
+24-341 files

LLVM/project baf029cllvm/lib/Transforms/Utils CloneFunction.cpp

remove outdated comment
DeltaFile
+0-3llvm/lib/Transforms/Utils/CloneFunction.cpp
+0-31 files

LLVM/project 8be2c19mlir/lib/Dialect/Tensor/IR TensorOps.cpp, mlir/test/Dialect/Tensor invalid.mlir

[MLIR] Fix mlir-opt crash in ReshapeOpsUtils.cpp when collapse_shape index is invalid (#173791)

This patch fixes a crash occurring in mlir-opt when running
collapse_shape with an invalid index configuration. Instead of crashing,
an error message is returned to the user.
Fixes: #173567

---------

Co-authored-by: Bazinga! <akparmar004>
DeltaFile
+9-0mlir/test/Dialect/Tensor/invalid.mlir
+5-0mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
+14-02 files

LLVM/project 343c951llvm/test/Transforms/InstCombine simplify-demanded-fpclass-maximum.ll simplify-demanded-fpclass-minimum.ll

InstCombine: Add baseline tests for minimum/maximum SimplifyDemandedFPClass handling
DeltaFile
+1,625-0llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-maximum.ll
+1,625-0llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-minimum.ll
+3,250-02 files

LLVM/project 0ef90f7llvm/lib/Transforms/Utils CloneFunction.cpp

[CloneFunction] Fix non-deterministic PHI cleanup using PHINode::removeIncomingValueIf()

Previously, we used `std::map<BasicBlock *, unsigned> PredCount` to track excess incoming blocks and then removed them one by one using `removeIncomingValue`.

Since `PredCount` uses `BasicBlock *` as its key, the iteration order depends on the memory addresses of the blocks.
With `PHINode::removeIncomingValue()` changed to use the swapping strategy, the order in which operands are removed affects the final order of the remaining operands in the PHI node. This causes non-determinism across compiles.

This patch uses `PHINode::removeIncomingValueIf()` to remove invalid incoming blocks that no longer
go to the `NewBB` block, which fixes the non-determinism.
DeltaFile
+8-12llvm/lib/Transforms/Utils/CloneFunction.cpp
+8-121 files

LLVM/project 98d8211llvm/test/CodeGen/AArch64 sme-framelower-use-bp.ll

[AArch64][SME] Vastly simplify and fix `sme-framelower-use-bp.ll` (NFC) (#172999)

This test was added in:
https://github.com/llvm/llvm-project/commit/d4c86e7f3ea298b259e673142470a7b838f5f302

However, over time this test has stopped testing that change. That
change ensures that LLVM sets up the base-pointer in functions with only
+sme (no sve) and dynamic allocas + SVE stack objects.

The original test did not intend to have dynamic allocas or SVE stack
objects though. They were introduced unintentionally by the IR-based SME
ABI pass pushing allocas outside the entry block and by SVE spills.

Both of these have been resolved, so this test was no longer testing the
original change. This patch simplifies the test and corrects it so that
it tests the intended functionality.
DeltaFile
+28-742llvm/test/CodeGen/AArch64/sme-framelower-use-bp.ll
+28-7421 files

LLVM/project 13a8974mlir/include/mlir/Interfaces ControlFlowInterfaces.td, mlir/lib/Interfaces ControlFlowInterfaces.cpp

[mlir][Interfaces] Add `RegionBranchOpInterface::getSuccessorOperands` helper
DeltaFile
+19-21mlir/lib/Transforms/RemoveDeadValues.cpp
+10-0mlir/lib/Interfaces/ControlFlowInterfaces.cpp
+9-0mlir/include/mlir/Interfaces/ControlFlowInterfaces.td
+38-213 files

LLVM/project ff1f362llvm/lib/Target/AArch64 AArch64SchedNeoverseN2.td, llvm/test/tools/llvm-mca/AArch64/Neoverse N2-basic-instructions.s

[MCA][AArch64] Model single-register EXTR as ROR on Neoverse N2 (#172831)

As per the SWOG for [Neoverse
N2](https://developer.arm.com/documentation/109914/latest/), the latency
of a one register bitfield extract should be 1 and the throughput should
be 4. This patch models the single register EXTR (alias ROR) for the
Neoverse N2 model.
DeltaFile
+7-7llvm/test/tools/llvm-mca/AArch64/Neoverse/N2-basic-instructions.s
+5-4llvm/lib/Target/AArch64/AArch64SchedNeoverseN2.td
+12-112 files

LLVM/project f78d143llvm/test/tools/llvm-mca mcpu-help.test, llvm/tools/llvm-mca llvm-mca.cpp

[MCA] Fix -mcpu=help flag (#173399)

Previously, using the `-mcpu=help` flag required an empty stdin to be
passed in order to print the CPU/Features list.

- Moves the `MemoryBuffer::getFileOrSTDIN` call below an early return.
- Adds a test, mcpu-help.test, which exercises the flag with a missing
input file. Previously, this would have resulted in an error with no
help list printed; now the help list is printed and the missing input
file is ignored.
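
The reordering described above can be sketched in isolation. This is a simplified model under stated assumptions, not the llvm-mca source: `ToolOptions`, `runTool`, and the help text are hypothetical, and a plain `std::ifstream` stands in for `MemoryBuffer::getFileOrSTDIN`.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical, pared-down view of the command-line options.
struct ToolOptions {
  std::string mcpu;
  std::string inputPath;
};

// Returns 0 on success and 1 on error; the text to print goes to `output`.
int runTool(const ToolOptions &opts, std::string &output) {
  // Handle the help request first: it needs no input at all.
  if (opts.mcpu == "help") {
    output = "Available CPUs for this target: ...\n";
    return 0;
  }

  // Only now touch the input, so a missing file cannot block the help path.
  std::ifstream in(opts.inputPath);
  if (!in) {
    output = "error: cannot open '" + opts.inputPath + "'\n";
    return 1;
  }
  std::ostringstream contents;
  contents << in.rdbuf();
  output = contents.str();
  return 0;
}
```

If the input were opened before the help check, a missing file would produce an error and the help list would never be printed, which matches the behaviour the new test guards against.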
DeltaFile
+10-7llvm/tools/llvm-mca/llvm-mca.cpp
+11-0llvm/test/tools/llvm-mca/mcpu-help.test
+21-72 files

LLVM/project 7e24e86clang/include/clang/Basic BuiltinsAMDGPU.def, clang/test/SemaHIP amdgpu-global-atomic-fmin-err.hip amdgpu-global-atomic-fmax-err.hip

[Clang] Remove 't' from __builtin_amdgcn_global_atomic_fmin/fmax_f64
DeltaFile
+2-3clang/test/SemaHIP/amdgpu-global-atomic-fmin-err.hip
+2-3clang/test/SemaHIP/amdgpu-global-atomic-fmax-err.hip
+2-2clang/include/clang/Basic/BuiltinsAMDGPU.def
+6-83 files

LLVM/project 956bda1clang/test/SemaHIP amdgpu-global-atomic-fmax-err.hip amdgpu-global-atomic-fmin-err.hip

Pre-commit tests: [Clang] Remove 't' from __builtin_amdgcn_global_atomic_fmin/fmax_f64
DeltaFile
+23-0clang/test/SemaHIP/amdgpu-global-atomic-fmax-err.hip
+23-0clang/test/SemaHIP/amdgpu-global-atomic-fmin-err.hip
+46-02 files

LLVM/project 54faa75llvm/cmake/modules AddLLVM.cmake

[LLVM][CMake][NFC] Use generator expression to separate CXXFLAGS (#173869)

This avoids looking at the individual sources for mixed C/C++ libraries.

The previous code was written ~2014. Generator expressions were added in
CMake 3.3 (2015). We currently require CMake 3.20 and therefore can rely
on more modern features.

Apart from simplifying the code, this is preliminary work to make more
use of pre-compiled headers (#173868).
DeltaFile
+14-42llvm/cmake/modules/AddLLVM.cmake
+14-421 files

LLVM/project 5f05793mlir/test/mlir-tblgen op-attribute.td, mlir/tools/mlir-tblgen OpDefinitionsGen.cpp

[mlir][ods] Fix ODS bug for usePropertiesForAttributes = 0 (#173006)

This fixes invalid C++ generated in the `verifyInvariantsImpl` method
for operations generated from ODS when `usePropertiesForAttributes = 0`
is set on the Dialect.

Fixes the bug introduced in
- https://github.com/llvm/llvm-project/pull/153603

Closes #171217
DeltaFile
+46-0mlir/test/mlir-tblgen/op-attribute.td
+1-1mlir/tools/mlir-tblgen/OpDefinitionsGen.cpp
+47-12 files

LLVM/project f0582f7mlir/lib/Dialect/Tensor/IR ValueBoundsOpInterfaceImpl.cpp, mlir/test/Dialect/Tensor value-bounds-op-interface-impl.mlir

Reland "[mlir][tensor] Add ValueBoundsOpInterface for ExpandShapeOp and CollapseShapeOp #173356" (#173857)

The original PR #173356 was reverted (commit 5d6c40b) due to an
AddressSanitizer failure
(https://lab.llvm.org/buildbot/#/builders/52/builds/13831).

The failure was caused by incorrect use of a const reference
https://github.com/llvm/llvm-project/pull/173356#discussion_r2643027667,
which bound a reference to a temporary value returned by
`getReassociationIndices()`.

This reland drops the const reference and uses a copy instead.
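
The exact expression from the reverted patch is not reproduced here, but one common shape of this bug can be sketched as follows. `makeReassociation` and `firstIndexOfGroup` are hypothetical names; the only property borrowed from the description above is that the accessor returns its result by value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Indices = std::vector<std::int64_t>;

// Hypothetical accessor that returns its result by value, so each call
// yields a fresh temporary.
std::vector<Indices> makeReassociation() {
  return {{0, 1}, {2}};
}

std::int64_t firstIndexOfGroup(std::size_t group) {
  // Dangling (do not do this): operator[] returns a reference into the
  // temporary, and the temporary dies at the end of the full expression,
  // leaving `bad` pointing into freed storage.
  //   const Indices &bad = makeReassociation()[group];

  // Safe: copy the element out before the temporary is destroyed.
  Indices copy = makeReassociation()[group];
  return copy.front();
}
```

Binding a const reference directly to a returned temporary extends its lifetime, but that extension does not apply when the reference is obtained through a member call such as `operator[]`. That is why this kind of mistake tends to surface only under AddressSanitizer.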

Signed-off-by: Yu-Zhewen <zhewenyu at amd.com>
DeltaFile
+36-0mlir/lib/Dialect/Tensor/IR/ValueBoundsOpInterfaceImpl.cpp
+32-0mlir/test/Dialect/Tensor/value-bounds-op-interface-impl.mlir
+68-02 files

LLVM/project 931c0fcmlir/include/mlir/Transforms Passes.td Passes.h, mlir/lib/Transforms RemoveDeadValues.cpp

tmp commit

simple test working

draft: do not erase IR, just replace uses
DeltaFile
+203-313mlir/lib/Transforms/RemoveDeadValues.cpp
+110-45mlir/test/Transforms/remove-dead-values.mlir
+10-0mlir/include/mlir/Transforms/Passes.td
+1-0mlir/include/mlir/Transforms/Passes.h
+324-3584 files

LLVM/project abfac95mlir/docs Canonicalization.md

[mlir][docs] Add more examples for the "canonical form" (#173667)

Mention that there is no formal definition of the canonical form. Also
add more examples for users to understand what kind of transformations
the community has agreed upon in the past.

---------

Co-authored-by: Mehdi Amini <joker.eph at gmail.com>
DeltaFile
+37-20mlir/docs/Canonicalization.md
+37-201 files

LLVM/project ca73d19mlir/lib/Transforms RemoveDeadValues.cpp

[mlir][Transforms][NFC] `remove-dead-values`: Simplify dropped value handling (#173540)

`RDVFinalCleanupList::values` is used only for function op handling. The
functionality for dropping function arg uses can be incorporated into
Step 5 (function op handling). There is no need for a separate step.
DeltaFile
+7-14mlir/lib/Transforms/RemoveDeadValues.cpp
+7-141 files

LLVM/project f04dc3bclang/include/clang/Basic BuiltinsAMDGPU.def, clang/test/CodeGenOpenCL builtins-fp-atomics-gfx90a.cl

[Clang] Remove 't' from __builtin_amdgcn_flat_atomic_fmin/fmax_f64 (#173839)

Allows for type checking depending on the built-in signature.

There is no `f32` version of either builtin.
DeltaFile
+17-0clang/test/SemaHIP/amdgpu-flat-atomic-fmax-err.hip
+17-0clang/test/SemaHIP/amdgpu-flat-atomic-fmin-err.hip
+4-4clang/test/CodeGenOpenCL/builtins-fp-atomics-gfx90a.cl
+2-2clang/include/clang/Basic/BuiltinsAMDGPU.def
+40-64 files

LLVM/project e44210bclang-tools-extra/clang-doc Serialize.cpp JSONGenerator.cpp, clang-tools-extra/clang-doc/assets class-template.mustache

[clang-doc] Add friends to class template

This patch also allows comments to be associated with friend
declarations. Currently, it seems like the comments for friend `RecordDecl`
are taken from the actual class declaration, while a friend
function's comments are taken from the actual `friend` declaration.
DeltaFile
+59-3clang-tools-extra/test/clang-doc/json/class.cpp
+35-0clang-tools-extra/clang-doc/assets/class-template.mustache
+5-2clang-tools-extra/clang-doc/Serialize.cpp
+5-2clang-tools-extra/clang-doc/JSONGenerator.cpp
+4-0clang-tools-extra/clang-doc/BitcodeReader.cpp
+2-0clang-tools-extra/clang-doc/BitcodeWriter.cpp
+110-71 files not shown
+111-77 files

LLVM/project 61aadebclang-tools-extra/clang-doc JSONGenerator.cpp Serialize.cpp, clang-tools-extra/clang-doc/assets class-template.mustache

[clang-doc] Add friends to class template

This patch also allows comments to be associated with friend
declarations. Currently, it seems like the comments for friend `RecordDecl`
are taken from the actual class declaration, while a friend
function's comments are taken from the actual `friend` declaration.
DeltaFile
+59-3clang-tools-extra/test/clang-doc/json/class.cpp
+35-0clang-tools-extra/clang-doc/assets/class-template.mustache
+5-2clang-tools-extra/clang-doc/JSONGenerator.cpp
+5-2clang-tools-extra/clang-doc/Serialize.cpp
+4-0clang-tools-extra/clang-doc/BitcodeReader.cpp
+2-0clang-tools-extra/clang-doc/BitcodeWriter.cpp
+110-71 files not shown
+111-77 files