LLVM/project 08de4fd llvm/lib/CodeGen/SelectionDAG SelectionDAGISel.cpp, llvm/test/TableGen RegClassByHwMode.td

[SelectionDAG] Move HwMode expansion from tablegen to SelectionISel. (#174471)

The way HwMode is currently implemented, tablegen duplicates each
pattern that is dependent on hardware mode. The HwMode predicate is
added as a pattern predicate on the duplicated pattern.
    
RISC-V uses HwMode on the GPR register class which means almost every
isel pattern is affected by HwMode. This results in the isel table
being nearly twice the size it would be if we only had a single GPR
size.

This patch proposes to do the expansion at instruction selection time
instead. To accomplish this new opcodes like OPC_CheckTypeByHwMode
are added to the isel table. The unique combinations of types and HwMode
are converted to an index that is the payload for the new opcodes.
TableGen emits a new virtual function getValueTypeByHwMode that uses
this index and the current HwMode to look up the type.
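
The index-plus-HwMode lookup described above can be pictured with a small
sketch. This is a Python illustration only, not the TableGen-generated C++:
the table contents, the mode names "RV32"/"RV64", and both function names
are assumptions made for the example.

```python
# Hypothetical table: each row is one unique combination of types across
# hardware modes. The row index is what would be stored in the matcher
# table as the payload of an opcode such as OPC_CheckTypeByHwMode.
VALUE_TYPES_BY_HWMODE = [
    {"RV32": "i32", "RV64": "i64"},  # row 0: a type that depends on HwMode
    {"RV32": "f32", "RV64": "f32"},  # row 1: made-up second row
]


def get_value_type_by_hwmode(index, hw_mode):
    """Resolve a table index to a concrete type for the current HwMode."""
    return VALUE_TYPES_BY_HWMODE[index][hw_mode]


def check_type_by_hwmode(node_type, index, hw_mode):
    """Sketch of a type check that is resolved at selection time."""
    return node_type == get_value_type_by_hwmode(index, hw_mode)
```

With one table entry per pattern instead of one duplicated pattern per mode,
the same entry can match `i32` when the current mode is "RV32" and `i64`
when it is "RV64".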

This reduces the size of the isel table on RISC-V from ~2.38 million

    [13 lines not shown]
Delta  File
+191 -62  llvm/utils/TableGen/DAGISelMatcherEmitter.cpp
+108 -28  llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+31 -27  llvm/utils/TableGen/DAGISelMatcher.h
+6 -48  llvm/test/TableGen/RegClassByHwMode.td
+19 -14  llvm/utils/TableGen/DAGISelMatcher.cpp
+24 -7  llvm/utils/TableGen/DAGISelEmitter.cpp
+379 -186  5 files not shown
+433 -209  11 files

LLVM/project 6bb2043 mlir/include/mlir/Dialect/Tosa/Transforms Passes.td Passes.h, mlir/lib/Dialect/Tosa/Transforms TosaInputShape.cpp CMakeLists.txt

[mlir][tosa] Add pass to assign static input shape to TOSA functions (#171156)

This commit introduces the `--tosa-experimental-input-shape` pass, which
allows a user to convert dynamically shaped input arguments of TOSA
functions to a user defined static shape. Here is a simple example:
```bash
func.func @test(%arg0: tensor<2x?xi32>, %arg1: tensor<?x256xf32>, %arg2: tensor<?x9xf32>) -> (tensor<2x?xi32>, tensor<?x256xf32>, tensor<?x9xf32>) {
    %0 = tosa.add %arg0, %arg0 : (tensor<2x?xi32>, tensor<2x?xi32>) -> tensor<2x?xi32>
    %1 = tosa.reciprocal %arg1 : (tensor<?x256xf32>) -> tensor<?x256xf32>
    %2 = tosa.sub %arg2, %arg2 : (tensor<?x9xf32>, tensor<?x9xf32>) -> tensor<?x9xf32>
    return %0, %1, %2 : tensor<2x?xi32>, tensor<?x256xf32>, tensor<?x9xf32>
}

$ mlir-opt --tosa-experimental-input-shape="args=arg0:2x16,arg2:64x9" test.mlir
func.func @test(%arg0: tensor<2x16xi32>, %arg1: tensor<?x256xf32>, %arg2: tensor<64x9xf32>) -> (tensor<2x?xi32>, tensor<?x256xf32>, tensor<?x9xf32>) {
    %0 = tosa.add %arg0, %arg0 : (tensor<2x16xi32>, tensor<2x16xi32>) -> tensor<2x?xi32>
    %1 = tosa.reciprocal %arg1 : (tensor<?x256xf32>) -> tensor<?x256xf32>
    %2 = tosa.sub %arg2, %arg2 : (tensor<64x9xf32>, tensor<64x9xf32>) -> tensor<?x9xf32>
    return %0, %1, %2 : tensor<2x?xi32>, tensor<?x256xf32>, tensor<?x9xf32>

    [21 lines not shown]
Delta  File
+182 -0  mlir/lib/Dialect/Tosa/Transforms/TosaInputShape.cpp
+72 -0  mlir/test/Dialect/Tosa/tosa-input-shape.mlir
+19 -0  mlir/include/mlir/Dialect/Tosa/Transforms/Passes.td
+2 -0  mlir/include/mlir/Dialect/Tosa/Transforms/Passes.h
+1 -0  mlir/lib/Dialect/Tosa/Transforms/CMakeLists.txt
+276 -0  5 files

LLVM/project d5a5678 llvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 combine-pclmul.ll

[X86] SimplifyDemandedVectorEltsForTargetNode - reduce instruction size if upper half of X86ISD::PCLMULQDQ isn't demanded (#176199)

If the upper subvector half of a 256/512-bit X86ISD::PCLMULQDQ node
isn't demanded, then split the operands and perform the operation using
a smaller instruction.
Delta  File
+2 -4  llvm/test/CodeGen/X86/combine-pclmul.ll
+1 -0  llvm/lib/Target/X86/X86ISelLowering.cpp
+3 -4  2 files

LLVM/project 6299598 llvm/test/Transforms/InstCombine simplify-demanded-fpclass-fadd.ll

Baseline test
Delta  File
+21 -0  llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-fadd.ll
+21 -0  1 files

LLVM/project a51a9ed llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp, llvm/test/Transforms/InstCombine simplify-demanded-fpclass-fadd.ll

InstCombine: Fix SimplifyDemandedFPClass for fadd with known-inf source

Ensure the result cannot be nan.
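
As background on why nan has to be ruled out: in IEEE-754 arithmetic,
adding infinities of opposite sign produces nan, so knowing that one fadd
source is infinite does not by itself say the result is infinite. The
snippet below only demonstrates that arithmetic in Python; it is not the
InstCombine logic, and `fadd_is_nan` is a name made up for the example.

```python
import math


def fadd_is_nan(lhs, rhs):
    """Return True if the floating-point sum lhs + rhs is nan."""
    return math.isnan(lhs + rhs)


# An infinite source plus a finite value stays infinite.
print(fadd_is_nan(math.inf, 1.0))        # False
# An infinite source plus the opposite infinity is nan.
print(fadd_is_nan(math.inf, -math.inf))  # True
```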

Split out from https://github.com/llvm/llvm-project/pull/175852
Delta  File
+4 -2  llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-fadd.ll
+2 -2  llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+6 -4  2 files

LLVM/project 1d4f9ac flang/include/flang/Optimizer/Dialect/MIF MIFOps.td, flang/lib/Lower Bridge.cpp MultiImageFortran.cpp

[flang] Fix crash with coarray teams #171048 (#172259)

This PR updates the `CHANGE TEAM` construct to fix the bug reported in
issue #171048.
When a construct such as `IfConstruct` was present in the `CHANGE TEAM`
region, several basic blocks were created, but outside the region.
Delta  File
+40 -20  flang/lib/Lower/Bridge.cpp
+29 -0  flang/test/Lower/MIF/change_team2.f90
+6 -11  flang/lib/Lower/MultiImageFortran.cpp
+6 -7  flang/lib/Optimizer/Dialect/MIF/MIFOps.cpp
+6 -6  flang/include/flang/Optimizer/Dialect/MIF/MIFOps.td
+5 -3  flang/lib/Optimizer/Builder/IntrinsicCall.cpp
+92 -47  1 files not shown
+95 -49  7 files

LLVM/project ec6b7a3 llvm/lib/CodeGen CFIInstrInserter.cpp

[CFIInstrInserter][NFC] Move `class CSRSavedLocation` definition. (#176053)

This is needed to minimize the diff for a future commit where we plan to
use `CSRSavedLocation` in `struct MBBCFAInfo`.
Delta  File
+20 -20  llvm/lib/CodeGen/CFIInstrInserter.cpp
+20 -20  1 files

LLVM/project fcc0ae1 llvm/lib/Transforms/IPO MemProfContextDisambiguation.cpp, llvm/test/ThinLTO/X86 memprof-weak-alias.ll

[MemProf] Handle weak alias and aliasee prevailing in different modules (#176083)

For ThinLTO we only have the cloning information in the FunctionSummary,
so for aliases we create as many clones as there are aliasee clones in
the LTO backend. However, that information is only in the prevailing
symbol's summary, as we don't keep the memprof summary information for
other copies (to reduce memory and compile time).

In the case of weak aliases, it is possible that the prevailing copy
of the alias may be in a different module than the prevailing copy of
the aliasee (e.g. when a module with a weak_odr aliasee definition does
not have a def of the weak_odr alias and is listed first on the link
line). In that case, we were not creating the expected clones of the
alias.

Rather than a more complex solution that adds additional summary
information, detect this case and simply don't add the callsites in the
aliasee function to the callsite context graph. This will result in
conservativeness (because we can't clone through that function), but
this should be a corner case.
Delta  File
+188 -0  llvm/test/ThinLTO/X86/memprof-weak-alias.ll
+47 -0  llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
+235 -0  2 files

LLVM/project e3560f2 llvm/lib/Target/X86 X86SpeculativeExecutionSideEffectSuppression.cpp X86.h

[NewPM] Port x86-seses to new pass manager (#176096)

Delta  File
+28 -13  llvm/lib/Target/X86/X86SpeculativeExecutionSideEffectSuppression.cpp
+11 -2  llvm/lib/Target/X86/X86.h
+2 -2  llvm/lib/Target/X86/X86TargetMachine.cpp
+1 -1  llvm/lib/Target/X86/X86PassRegistry.def
+42 -18  4 files

LLVM/project b998e58 llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp, llvm/test/Transforms/InstCombine simplify-demanded-fpclass.ll

Check IsCanonicalizing
Delta  File
+85 -1  llvm/test/Transforms/InstCombine/simplify-demanded-fpclass.ll
+2 -1  llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+87 -2  2 files

LLVM/project 509fe22 llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp

Move isAggregateType, although this can't break for any existing case
Delta  File
+4 -4  llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+4 -4  1 files

LLVM/project 76a5fef llvm/test/Transforms/InstCombine simplify-demanded-fpclass-sqrt.ll simplify-demanded-fpclass-insertelement.ll

InstCombine: Fold known-qnan results to a literal nan

Previously we only considered fcNan to fold to qnan for canonicalizing
results, ignoring the simpler case where we know the nan is already
quiet.
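
The distinction between the two cases can be sketched with class bitmasks.
This is a simplified Python model rather than the InstCombine code: the
function name and the `is_canonicalizing` parameter are assumptions for
the example, and an empty class set is simply treated as "no fold" here.

```python
FC_SNAN = 0x1               # signaling nan
FC_QNAN = 0x2               # quiet nan
FC_NAN = FC_SNAN | FC_QNAN  # either kind of nan


def folds_to_literal_nan(known_classes, is_canonicalizing):
    """Decide whether a value with these possible classes folds to a nan literal."""
    if known_classes == 0:
        return False  # simplification: no possible class, not handled here
    if known_classes & ~FC_QNAN == 0:
        return True   # already known quiet: no canonicalization needed
    if known_classes & ~FC_NAN == 0:
        return is_canonicalizing  # may be signaling: only if the op quiets it
    return False
```

For instance, `folds_to_literal_nan(FC_QNAN, False)` is true under this
model, while `folds_to_literal_nan(FC_NAN, False)` is not.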
Delta  File
+3 -9  llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-sqrt.ll
+3 -4  llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-insertelement.ll
+3 -4  llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-frexp.ll
+3 -4  llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-shufflevector.ll
+2 -4  llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-fmul.ll
+1 -2  llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-fpext.ll
+15 -27  12 files not shown
+28 -47  18 files

LLVM/project 22bf72d llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp

Use m_Extractvalue
Delta  File
+5 -7  llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+5 -7  1 files

LLVM/project 9fc8654 llvm/include/llvm/Support KnownFPClass.h, llvm/lib/Analysis ValueTracking.cpp

InstCombine: Implement SimplifyDemandedFPClass for frexp
Delta  File
+16 -38  llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-frexp.ll
+49 -0  llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+27 -0  llvm/lib/Support/KnownFPClass.cpp
+3 -21  llvm/lib/Analysis/ValueTracking.cpp
+4 -0  llvm/include/llvm/Support/KnownFPClass.h
+99 -59  5 files

LLVM/project 9f763f4 llvm/test/Transforms/InstCombine simplify-demanded-fpclass-frexp.ll

InstCombine: Add baseline frexp test for SimplifyDemandedFPClass
Delta  File
+612 -0  llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-frexp.ll
+612 -0  1 files

LLVM/project 0cdaa8f llvm/include/llvm/IR IntrinsicsNVVM.td

[NVPTX] Update various intrinsic attributes, nfc cleanup (#175660)

This patch migrates the intrinsic properties back to "PureIntrinsic"
from "NVVMPureIntrinsic" (after PR #166450).

While we are there:
* Refactor a few mbarrier intrinsics definitions (NFC)
* Update mbarrier.pending_count properties. (trivial)
* Formatting changes over a few fence intrinsics (NFC)
Delta  File
+121 -138  llvm/include/llvm/IR/IntrinsicsNVVM.td
+121 -138  1 files

LLVM/project 5acb608 .ci utils.sh

[CI] Make premerge jobs support GHA postcommit (#176180)

This was causing failures in the release branch as the premerge jobs
there are also run postcommit through GHA. We were expecting a PR number
to always be present when it was not.
Delta  File
+9 -5  .ci/utils.sh
+9 -5  1 files

LLVM/project 3150b73 mlir/lib/Dialect/XeGPU/Transforms XeGPUPropagateLayout.cpp

[MLIR][XeGPU] Clean up helpers in XeGPUPropagateLayout (#175857)

In XeGPUPropagateLayout.cpp, the helper getDefaultSIMTLayoutInfo is
implemented via multiple overloads that differ significantly in
semantics, not just parameter types.
Reusing the same function name for these semantically different
behaviors makes call sites harder to read and reason about and increases
the maintenance burden. This PR improves readability and maintainability
of layout propagation logic.
Delta  File
+35 -48  mlir/lib/Dialect/XeGPU/Transforms/XeGPUPropagateLayout.cpp
+35 -48  1 files

LLVM/project 1727337 llvm/test lit.cfg.py

[profcheck] Reorder the FileCheck substitution. (#176098)

In the profcheck build, FileCheck commands are substituted with
`cat > /dev/null` to disable output verification. In
test/Transforms/SampleProfile/remarks-hotness.ll we have both "FileCheck"
and "not FileCheck" statements. Replacing the positive one first results
in "not cat".
Run the "not" substitution first to fix this.
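
The ordering problem can be reproduced with a small sketch. It uses plain
string replacement in Python as a simplification. The helper name, the
sample RUN command, and the exact substitution pairs are assumptions, not
the contents of lit.cfg.py.

```python
def apply_substitutions(command, substitutions):
    """Apply (pattern, replacement) pairs to a command, in list order."""
    for pattern, replacement in substitutions:
        command = command.replace(pattern, replacement)
    return command


DISABLED = "cat > /dev/null"

# Positive form first: "not FileCheck" is rewritten to "not cat ...".
positive_first = [("FileCheck", DISABLED), ("not FileCheck", DISABLED)]
# Negated form first: the whole "not FileCheck" is replaced as a unit.
negated_first = [("not FileCheck", DISABLED), ("FileCheck", DISABLED)]

cmd = "opt < %s | not FileCheck %s"
print(apply_substitutions(cmd, positive_first))  # opt < %s | not cat > /dev/null %s
print(apply_substitutions(cmd, negated_first))   # opt < %s | cat > /dev/null %s
```

With the positive pattern first, the second pair never matches because the
text "not FileCheck" no longer exists by the time it is tried.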
Delta  File
+1 -1  llvm/test/lit.cfg.py
+1 -1  1 files

LLVM/project 2aec54e clang/include/clang/Analysis/Analyses/LifetimeSafety LifetimeAnnotations.h, clang/lib/Analysis/LifetimeSafety LifetimeAnnotations.cpp

Merge lifetimebound attribute on implicit 'this' across method redeclarations
Delta  File
+138 -0  clang/test/Sema/warn-lifetime-analysis-nocfg.cpp
+20 -12  clang/lib/Analysis/LifetimeSafety/LifetimeAnnotations.cpp
+22 -0  clang/test/Sema/warn-lifetime-safety.cpp
+21 -0  clang/test/SemaCXX/attr-lifetimebound.cpp
+7 -0  clang/include/clang/Analysis/Analyses/LifetimeSafety/LifetimeAnnotations.h
+208 -12  5 files

LLVM/project f4d4caa llvm/include/llvm/Target CGPassBuilderOption.h, llvm/lib/CodeGen TargetPassConfig.cpp

[LLVM][CodeGen] Rename `gc-empty-basic-blocks` to `enable-gc-empty-basic-blocks` (#176018)

Rename the `gc-empty-basic-blocks` command line option to
`enable-gc-empty-basic-blocks` in preparation for adding calls to
initialize the pass in `initializeCodeGen`, and to make the flag more
consistent with other existing flags that enable or disable passes.

Keep `gc-empty-basic-blocks` as an alias to allow all users to migrate
to the new option.
Delta  File
+12 -5  llvm/lib/CodeGen/TargetPassConfig.cpp
+4 -4  llvm/test/CodeGen/X86/gc-empty-basic-blocks.ll
+2 -2  llvm/test/CodeGen/X86/basic-block-address-map-empty-block.ll
+1 -1  llvm/include/llvm/Target/CGPassBuilderOption.h
+19 -12  4 files

LLVM/project 8e493b8 llvm/include/llvm/Support Compiler.h

[Support] Suppress old MSVC warning for [[msvc::no_unique_address]] (#176130)

MSVC versions prior to 19.43 (Visual Studio 2022 version 17.13) emit a
warning when using the [[msvc::no_unique_address]] attribute prior to
C++20.

This is now considered a bug and is fixed in later releases of MSVC.
Suppress the warning for older MSVC versions by disabling the warning
around the attribute usage. This allows for warning-free builds when
targeting older MSVC versions.

More details and discussion about the warning can be found here:
https://developercommunity.visualstudio.com/t/msvc::no_unique_address-Should-Not-W/10118435
Delta  File
+16 -1  llvm/include/llvm/Support/Compiler.h
+16 -1  1 files

LLVM/project 6309cd8 llvm/docs MIRLangRef.rst, llvm/include/llvm/CodeGen MachineInstrBuilder.h

Revert "[NFC][MI] Tidy Up RegState enum use (1/2)" (#176190)

Reverts llvm/llvm-project#176091

Reverting because some compilers were erroring on the call to
`Reg.isReg()` (which is not `constexpr`) in a `constexpr` function.
Delta  File
+51 -66  llvm/include/llvm/CodeGen/MachineInstrBuilder.h
+17 -17  llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+15 -15  llvm/docs/MIRLangRef.rst
+10 -14  llvm/lib/CodeGen/MIRParser/MIParser.cpp
+9 -8  llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+8 -8  llvm/lib/Target/ARM/ARMExpandPseudoInsts.cpp
+110 -128  17 files not shown
+142 -156  23 files

LLVM/project e39d44c llvm/test/CodeGen/X86 combine-pclmul.ll

[X86] Add tests showing failure to split/concat X86ISD::PCLMULQDQ nodes (#176179)

Delta  File
+96 -0  llvm/test/CodeGen/X86/combine-pclmul.ll
+96 -0  1 files

LLVM/project 1d616cd llvm/docs MIRLangRef.rst, llvm/include/llvm/CodeGen MachineInstrBuilder.h

[NFC][MI] Tidy Up RegState enum use (1/2) (#176091)

This change prepares for making RegState an enum class. It:
- Updates documentation to match the order in the code.
- Brings the `get<>RegState` functions together and makes them
`constexpr`.
- Adopts the `get<>RegState` where RegStates were being chosen with
ternary operators in backend code.
- Introduces `hasRegState` to make querying RegState easier once it is
an enum class.
- Adopts `hasRegState` where the equivalent was done with bitwise
arithmetic.
- Introduces `RegState::NoFlags`, which will be used for the lack of
flags.
- Documents that `0x1` is a reserved flag value used to detect if
someone is passing `true` instead of flags (due to implicit bool to
unsigned conversions).
- Updates two calls to `MachineInstrBuilder::addReg` which were passing
`false` to the flags operand, to no longer pass a value.
- Documents that `getRegState` seems to have forgotten a call to
`getEarlyClobberRegState`.
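
The reserved `0x1` value can be illustrated with a small analogue. This is
a Python sketch, not the `MachineInstrBuilder` API: the flag values other
than the reserved bit, and the names `add_reg` and `has_reg_state`, are
hypothetical. Python, like C++, lets a boolean stand in for an integer
(`True` equals 1), so leaving bit `0x1` unused makes an accidental `True`
detectable.

```python
NO_FLAGS = 0x0  # stand-in for RegState::NoFlags
RESERVED = 0x1  # never a valid flag: it is what True converts to
DEFINE = 0x2    # illustrative flag values, not copied from LLVM
KILL = 0x8


def has_reg_state(flags, state):
    """Return True if any bit of `state` is set in `flags`."""
    return (flags & state) != 0


def add_reg(reg, flags=NO_FLAGS):
    """Record a register operand, rejecting a bool passed as flags."""
    if flags & RESERVED:
        raise ValueError("reserved bit 0x1 set; was True passed as flags?")
    return (reg, int(flags))
```

Note that `False` converts to 0 and so cannot be told apart from "no
flags", which is consistent with the commit changing callers that passed
`false` to pass nothing instead.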
Delta  File
+66 -51  llvm/include/llvm/CodeGen/MachineInstrBuilder.h
+17 -17  llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+15 -15  llvm/docs/MIRLangRef.rst
+14 -10  llvm/lib/CodeGen/MIRParser/MIParser.cpp
+8 -9  llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+8 -8  llvm/lib/Target/ARM/ARMExpandPseudoInsts.cpp
+128 -110  17 files not shown
+156 -142  23 files

LLVM/project 1ea3bd4 clang/include/clang/Analysis/Analyses/LifetimeSafety LifetimeAnnotations.h, clang/lib/Analysis/LifetimeSafety LifetimeAnnotations.cpp

Merge lifetimebound attribute on implicit 'this' across method redeclarations
Delta  File
+138 -0  clang/test/Sema/warn-lifetime-analysis-nocfg.cpp
+18 -12  clang/lib/Analysis/LifetimeSafety/LifetimeAnnotations.cpp
+22 -0  clang/test/Sema/warn-lifetime-safety.cpp
+21 -0  clang/test/SemaCXX/attr-lifetimebound.cpp
+7 -0  clang/include/clang/Analysis/Analyses/LifetimeSafety/LifetimeAnnotations.h
+206 -12  5 files

LLVM/project 444adbe llvm/lib/Target/RISCV RISCVRegisterInfo.td

[RISCV] Change FPR256 to use the same allocation order as FPR16/32/64/128. (#176097)

The previous order was the LLVM 11 order for FPR16/32/64/128.
Delta  File
+1 -7  llvm/lib/Target/RISCV/RISCVRegisterInfo.td
+1 -7  1 files

LLVM/project 282a065 llvm/include/llvm/Support KnownFPClass.h, llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp

InstCombine: Handle multiple uses fabs in SimplifyDemandedFPClass (#176035)

Delta  File
+85 -0  llvm/test/Transforms/InstCombine/simplify-demanded-fpclass.ll
+28 -0  llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+6 -0  llvm/include/llvm/Support/KnownFPClass.h
+119 -0  3 files

LLVM/project a866030 clang/test/Sema warn-lifetime-analysis-nocfg.cpp, clang/test/Sema/Inputs lifetime-analysis.h

[LifetimeSafety] Test lifetime safety on stmt-local analysis test suite (#175906)

Add CFG-based lifetime analysis tests for dangling pointer detection
alongside the existing AST-based analysis.

This change helps validate that the new CFG-based lifetime analysis
correctly detects the same dangling pointer issues as the existing
AST-based analysis. It also documents current limitations of the
CFG-based approach with FIXME comments, providing a roadmap for future
improvements. The test ensures that both analysis methods can work
side-by-side, with the CFG-based analysis eventually intended to replace
the AST-based approach.
Delta  File
+219 -56  clang/test/Sema/warn-lifetime-analysis-nocfg.cpp
+10 -1  clang/test/Sema/Inputs/lifetime-analysis.h
+229 -57  2 files

LLVM/project 513062d clang-tools-extra/clang-tidy/performance MoveConstArgCheck.cpp, clang-tools-extra/docs ReleaseNotes.rst

[clang-tidy] Fix performance-move-const-arg for trivially copyable types with private copy constructor (#175449)

Closes [#174826](https://github.com/llvm/llvm-project/issues/174826)
Delta  File
+34 -0  clang-tools-extra/test/clang-tidy/checkers/performance/move-const-arg.cpp
+4 -0  clang-tools-extra/docs/ReleaseNotes.rst
+2 -1  clang-tools-extra/clang-tidy/performance/MoveConstArgCheck.cpp
+40 -1  3 files