LLVM/project c4b49c7 clang/lib/CIR/CodeGen CIRGenBuiltinX86.cpp, clang/test/CIR/CodeGenBuiltins/X86 keylocker.c

[CIR][X86] Add support for `aes` and `aeswide` builtins (#175892)

- Support CIR codegen for the following builtins: `aesenc`, `aesdec`,
`aesencwide` and `aesdecwide`.
- Part of https://github.com/llvm/llvm-project/issues/167752
Delta File
+1,176 -0 clang/test/CIR/CodeGenBuiltins/X86/keylocker.c
+163 -3 clang/lib/CIR/CodeGen/CIRGenBuiltinX86.cpp
+1,339 -3, 2 files

LLVM/project 7364ff5 flang/lib/Lower OpenACC.cpp, flang/test/Lower/OpenACC acc-declare-common-present-in-function.f90

[acc] `acc declare` + `present` clause for COMMON blocks (#175588)

Fix: `!$acc declare present(/COMMON/)` no longer adds the
`acc.declare(dataClause=acc_present)` attribute to the fir.global
common.

Lowering change: COMMON+present is lowered through the structured
declare path (fir.address_of + acc.present operand) to preserve scope.
Delta File
+74 -55 flang/lib/Lower/OpenACC.cpp
+27 -0 flang/test/Lower/OpenACC/acc-declare-common-present-in-function.f90
+101 -55, 2 files

LLVM/project a72958a llvm/lib/Target/AArch64 AArch64FrameLowering.cpp AArch64PrologueEpilogue.cpp, llvm/test/CodeGen/AArch64 stack-probing.ll stack-probing-64k.ll

[AArch64] Use a load instead of a store for inline stack probes (#170855)

Frequently, when big buffers are put on the stack, we end up
with multiple virtual pages copy-on-write mapped to a single physical zero page.
Stack probes would unnecessarily trigger a copy-on-write on such pages. Avoid this
by using loads into XZR.
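
The effect can be illustrated with a toy model of copy-on-write pages. This is only a sketch, not the AArch64 lowering itself: `ZeroPageStack` and `probe` are hypothetical names, and the model assumes that a read is served from the shared zero page while a write forces a private copy. Either kind of access would still fault on an unmapped guard page, which is what a probe is for.

```python
class ZeroPageStack:
    """Toy model: every stack page starts out backed by one shared zero page."""

    def __init__(self, num_pages):
        self.num_pages = num_pages
        self.private = {}  # page index -> private contents after a write

    def load(self, page):
        # A read is served from the private copy if one exists,
        # otherwise from the shared zero page. The mapping is unchanged.
        return self.private.get(page, 0)

    def store(self, page, value):
        # A write triggers copy-on-write: the page gets its own physical copy.
        self.private[page] = value

    def private_page_count(self):
        return len(self.private)


def probe(stack, use_store):
    """Touch every page once, the way an inline stack probe loop would."""
    for page in range(stack.num_pages):
        if use_store:
            stack.store(page, 0)
        else:
            stack.load(page)
    return stack.private_page_count()


store_probed = probe(ZeroPageStack(16), use_store=True)
load_probed = probe(ZeroPageStack(16), use_store=False)
```

In this model the store-based probe leaves all 16 pages with private copies, while the load-based probe leaves none.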
Delta File
+21 -21 llvm/test/CodeGen/AArch64/stack-probing.ll
+19 -19 llvm/test/CodeGen/AArch64/stack-probing-64k.ll
+21 -9 llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
+12 -12 llvm/test/CodeGen/AArch64/stack-probing-dynamic.ll
+10 -10 llvm/test/CodeGen/AArch64/stack-probing-sve.ll
+14 -6 llvm/lib/Target/AArch64/AArch64PrologueEpilogue.cpp
+97 -77, 8 files not shown
+123 -97, 14 files

LLVM/project b29ee6e llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp, llvm/test/Transforms/InstCombine simplify-demanded-fpclass-fadd.ll

InstCombine: Fix SimplifyDemandedFPClass for fadd with known-inf source
 (#176204)

Ensure the result cannot be nan.

Split out from https://github.com/llvm/llvm-project/pull/175852
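
The commit message does not spell out the failing case. One likely reason a known-infinite operand alone does not rule out a NaN result is that, under IEEE 754, adding infinities of opposite sign produces NaN. A minimal sketch using Python floats (variable names are illustrative only):

```python
import math

inf = math.inf

same_sign = inf + inf          # stays +inf
with_finite = inf + 1.0        # stays +inf
opposite_signs = inf + (-inf)  # invalid operation, yields NaN

result_is_nan = math.isnan(opposite_signs)
```

So a simplification that assumes "one operand is inf, therefore the sum is inf" also has to establish that the other operand cannot be the opposite infinity or a NaN.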
Delta File
+23 -0 llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-fadd.ll
+2 -2 llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+25 -2, 2 files

LLVM/project 5dccab0 llvm/lib/Transforms/Utils CodeMoverUtils.cpp

[LoopFusion] Removing dead code leftover after PR #171889 (NFC) (#176020)

Removed unused functions in order to fix 'unused function' warnings, as
mentioned in PR 171889. This involved the two original functions
`ControlConditions::isEquivalent(const ControlConditions &Other) const`
and `ControlConditions::collectControlConditions(const
llvm::BasicBlock&, const llvm::BasicBlock&, const llvm::DominatorTree&,
const llvm::PostDominatorTree&, unsigned int)`, plus all the functions
that became unused as a result of deleting the two original ones.

Co-authored-by: Szymon Sobieszek <szymon.sobieszek1 at huawei.com>
Delta File
+0 -186 llvm/lib/Transforms/Utils/CodeMoverUtils.cpp
+0 -186, 1 file

LLVM/project 1eb5279 llvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 combine-pclmul.ll

[X86] combineConcatVectorOps - add X86ISD::PCLMULQDQ handling for VPCLMULQDQ targets (#176203)

Delta File
+3 -23 llvm/test/CodeGen/X86/combine-pclmul.ll
+15 -0 llvm/lib/Target/X86/X86ISelLowering.cpp
+18 -23, 2 files

LLVM/project eb82ddc clang/lib/Analysis UnsafeBufferUsage.cpp, clang/test/SemaCXX warn-unsafe-buffer-usage.cpp

[clang][-Wunsafe-buffer-usage] Ignore consteval functions (#171503)

We don't need to visit or warn on consteval functions as they can't have
UB.

---------

Co-authored-by: mxms <mxms at google.com>
Delta File
+7 -0 clang/test/SemaCXX/warn-unsafe-buffer-usage.cpp
+4 -0 clang/lib/Analysis/UnsafeBufferUsage.cpp
+11 -0, 2 files

LLVM/project 08de4fd llvm/lib/CodeGen/SelectionDAG SelectionDAGISel.cpp, llvm/test/TableGen RegClassByHwMode.td

[SelectionDAG] Move HwMode expansion from tablegen to SelectionISel. (#174471)

The way HwMode is currently implemented, tablegen duplicates each
pattern that is dependent on hardware mode. The HwMode predicate is
added as a pattern predicate on the duplicated pattern.
    
RISC-V uses HwMode on the GPR register class which means almost every
isel pattern is affected by HwMode. This results in the isel table
being nearly twice the size it would be if we only had a single GPR
size.

This patch proposes to do the expansion at instruction selection time
instead. To accomplish this new opcodes like OPC_CheckTypeByHwMode
are added to the isel table. The unique combinations of types and HwMode
are converted to an index that is the payload for the new opcodes.
TableGen emits a new virtual function getValueTypeByHwMode that uses
this index and the current HwMode to look up the type.
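
The index-plus-mode lookup described above can be sketched as follows. This is a hypothetical model, not the TableGen-generated code: the table contents, the mode names `RV32`/`RV64`, and the Python function names are assumptions chosen for illustration.

```python
# Hypothetical table of unique (HwMode -> value type) combinations.
# The index into this list is what a matcher opcode would carry as payload.
VALUE_TYPES_BY_HWMODE = [
    {"RV32": "i32", "RV64": "i64"},  # index 0: a GPR-sized integer
    {"RV32": "f32", "RV64": "f32"},  # index 1: same type in both modes
]


def get_value_type_by_hwmode(index, hw_mode):
    """Resolve a table index to a concrete type for the current HwMode."""
    return VALUE_TYPES_BY_HWMODE[index][hw_mode]


def check_type_by_hwmode(node_type, index, hw_mode):
    """Model of a type check that is resolved at instruction selection time."""
    return node_type == get_value_type_by_hwmode(index, hw_mode)


# One pattern with payload index 0 serves both modes, instead of one
# duplicated pattern per mode guarded by a HwMode predicate.
matches_rv32 = check_type_by_hwmode("i32", 0, "RV32")
matches_rv64 = check_type_by_hwmode("i32", 0, "RV64")
```

With this shape the table stores each pattern once, and the mode-dependent part is confined to the small lookup table.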

This reduces the size of the isel table on RISC-V from ~2.38 million

    [13 lines not shown]
Delta File
+191 -62 llvm/utils/TableGen/DAGISelMatcherEmitter.cpp
+108 -28 llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+31 -27 llvm/utils/TableGen/DAGISelMatcher.h
+6 -48 llvm/test/TableGen/RegClassByHwMode.td
+19 -14 llvm/utils/TableGen/DAGISelMatcher.cpp
+24 -7 llvm/utils/TableGen/DAGISelEmitter.cpp
+379 -186, 5 files not shown
+433 -209, 11 files

LLVM/project 6bb2043 mlir/include/mlir/Dialect/Tosa/Transforms Passes.td Passes.h, mlir/lib/Dialect/Tosa/Transforms TosaInputShape.cpp CMakeLists.txt

[mlir][tosa] Add pass to assign static input shape to TOSA functions (#171156)

This commit introduces the `--tosa-experimental-input-shape` pass, which
allows a user to convert dynamically shaped input arguments of TOSA
functions to a user defined static shape. Here is a simple example:
```bash
func.func @test(%arg0: tensor<2x?xi32>, %arg1: tensor<?x256xf32>, %arg2: tensor<?x9xf32>) -> (tensor<2x?xi32>, tensor<?x256xf32>, tensor<?x9xf32>) {
    %0 = tosa.add %arg0, %arg0 : (tensor<2x?xi32>, tensor<2x?xi32>) -> tensor<2x?xi32>
    %1 = tosa.reciprocal %arg1 : (tensor<?x256xf32>) -> tensor<?x256xf32>
    %2 = tosa.sub %arg2, %arg2 : (tensor<?x9xf32>, tensor<?x9xf32>) -> tensor<?x9xf32>
    return %0, %1, %2 : tensor<2x?xi32>, tensor<?x256xf32>, tensor<?x9xf32>
}

$ mlir-opt --tosa-experimental-input-shape="args=arg0:2x16,arg2:64x9" test.mlir
func.func @test(%arg0: tensor<2x16xi32>, %arg1: tensor<?x256xf32>, %arg2: tensor<64x9xf32>) -> (tensor<2x?xi32>, tensor<?x256xf32>, tensor<?x9xf32>) {
    %0 = tosa.add %arg0, %arg0 : (tensor<2x16xi32>, tensor<2x16xi32>) -> tensor<2x?xi32>
    %1 = tosa.reciprocal %arg1 : (tensor<?x256xf32>) -> tensor<?x256xf32>
    %2 = tosa.sub %arg2, %arg2 : (tensor<64x9xf32>, tensor<64x9xf32>) -> tensor<?x9xf32>
    return %0, %1, %2 : tensor<2x?xi32>, tensor<?x256xf32>, tensor<?x9xf32>

    [21 lines not shown]
Delta File
+182 -0 mlir/lib/Dialect/Tosa/Transforms/TosaInputShape.cpp
+72 -0 mlir/test/Dialect/Tosa/tosa-input-shape.mlir
+19 -0 mlir/include/mlir/Dialect/Tosa/Transforms/Passes.td
+2 -0 mlir/include/mlir/Dialect/Tosa/Transforms/Passes.h
+1 -0 mlir/lib/Dialect/Tosa/Transforms/CMakeLists.txt
+276 -0, 5 files

LLVM/project d5a5678 llvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 combine-pclmul.ll

[X86] SimplifyDemandedVectorEltsForTargetNode - reduce instruction size if upper half of X86ISD::PCLMULQDQ isn't demanded (#176199)

If the upper subvector half of a 256/512-bit X86ISD::PCLMULQDQ node
isn't demanded, then split the operands and perform the operation using
a smaller instruction.
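
Narrowing is valid because each 128-bit lane of the wide operation is computed independently. The sketch below models that with Python integers; it is an illustration, not the DAG combine. The helper names are hypothetical, and the model assumes an immediate of 0, so the low 64 bits of each lane of both operands are multiplied.

```python
MASK64 = (1 << 64) - 1


def clmul64(a, b):
    """Carry-less (polynomial over GF(2)) multiply of two 64-bit values."""
    result = 0
    for i in range(64):
        if (b >> i) & 1:
            result ^= a << i  # XOR instead of add: no carries propagate
    return result


def pclmul_lanes(xs, ys):
    """Model a vector op as a list of independent 128-bit lanes (imm = 0)."""
    return [clmul64(x & MASK64, y & MASK64) for x, y in zip(xs, ys)]


x_lanes = [0x1234_5678_9ABC_DEF0, 0x0F0F_0F0F_0F0F_0F0F]
y_lanes = [0x0000_0000_0000_0087, 0xFFFF_0000_FFFF_0000]

wide = pclmul_lanes(x_lanes, y_lanes)            # 256-bit style: two lanes
narrow = pclmul_lanes(x_lanes[:1], y_lanes[:1])  # 128-bit style: low lane only

low_half_matches = wide[:1] == narrow
```

If only the low lane of `wide` is ever used, computing `narrow` instead gives the same value with a smaller operation.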
Delta File
+2 -4 llvm/test/CodeGen/X86/combine-pclmul.ll
+1 -0 llvm/lib/Target/X86/X86ISelLowering.cpp
+3 -4, 2 files

LLVM/project 6299598 llvm/test/Transforms/InstCombine simplify-demanded-fpclass-fadd.ll

Baseline test
Delta File
+21 -0 llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-fadd.ll
+21 -0, 1 file

LLVM/project a51a9ed llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp, llvm/test/Transforms/InstCombine simplify-demanded-fpclass-fadd.ll

InstCombine: Fix SimplifyDemandedFPClass for fadd with known-inf source

Ensure the result cannot be nan.

Split out from https://github.com/llvm/llvm-project/pull/175852
Delta File
+4 -2 llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-fadd.ll
+2 -2 llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+6 -4, 2 files

LLVM/project 1d4f9ac flang/include/flang/Optimizer/Dialect/MIF MIFOps.td, flang/lib/Lower Bridge.cpp MultiImageFortran.cpp

[flang] Fix crash with coarray teams #171048 (#172259)

This PR updates the `CHANGE TEAM` construct to fix the bug reported in
issue #171048.
When a construct such as `IfConstruct` was present in the `CHANGE TEAM`
region, several basic blocks were created outside the region.
Delta File
+40 -20 flang/lib/Lower/Bridge.cpp
+29 -0 flang/test/Lower/MIF/change_team2.f90
+6 -11 flang/lib/Lower/MultiImageFortran.cpp
+6 -7 flang/lib/Optimizer/Dialect/MIF/MIFOps.cpp
+6 -6 flang/include/flang/Optimizer/Dialect/MIF/MIFOps.td
+5 -3 flang/lib/Optimizer/Builder/IntrinsicCall.cpp
+92 -47, 1 file not shown
+95 -49, 7 files

LLVM/project ec6b7a3 llvm/lib/CodeGen CFIInstrInserter.cpp

[CFIInstrInserter][NFC] Move `class CSRSavedLocation` definition. (#176053)

This is needed to minimize the diff for a future commit where we plan to
use `CSRSavedLocation` in `struct MBBCFAInfo`.
Delta File
+20 -20 llvm/lib/CodeGen/CFIInstrInserter.cpp
+20 -20, 1 file

LLVM/project fcc0ae1 llvm/lib/Transforms/IPO MemProfContextDisambiguation.cpp, llvm/test/ThinLTO/X86 memprof-weak-alias.ll

[MemProf] Handle weak alias and aliasee prevailing in different modules (#176083)

For ThinLTO we only have the cloning information in the FunctionSummary,
so for aliases we create as many clones as there are aliasee clones in
the LTO backend. However, that information is only in the prevailing
symbol's summary, as we don't keep the memprof summary information for
other copies (to reduce memory and compile time).

In the case of weak aliases, it is possible that the prevailing copy
of the alias may be in a different module than the prevailing copy of
the aliasee (e.g. when a module with a weak_odr aliasee definition does
not have a def of the weak_odr alias and is listed first on the link
line). In that case, we were not creating the expected clones of the
alias.

Rather than a more complex solution that adds additional summary
information, detect this case and simply don't add the callsites in the
aliasee function to the callsite context graph. This will result in
conservativeness (because we can't clone through that function), but
this should be a corner case.
Delta File
+188 -0 llvm/test/ThinLTO/X86/memprof-weak-alias.ll
+47 -0 llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
+235 -0, 2 files

LLVM/project e3560f2 llvm/lib/Target/X86 X86SpeculativeExecutionSideEffectSuppression.cpp X86.h

[NewPM] Port x86-seses to new pass manager (#176096)

Delta File
+28 -13 llvm/lib/Target/X86/X86SpeculativeExecutionSideEffectSuppression.cpp
+11 -2 llvm/lib/Target/X86/X86.h
+2 -2 llvm/lib/Target/X86/X86TargetMachine.cpp
+1 -1 llvm/lib/Target/X86/X86PassRegistry.def
+42 -18, 4 files

LLVM/project b998e58 llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp, llvm/test/Transforms/InstCombine simplify-demanded-fpclass.ll

Check IsCanonicalizing
Delta File
+85 -1 llvm/test/Transforms/InstCombine/simplify-demanded-fpclass.ll
+2 -1 llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+87 -2, 2 files

LLVM/project 509fe22 llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp

Move isAggregateType, although this can't break for any existing case
Delta File
+4 -4 llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+4 -4, 1 file

LLVM/project 76a5fef llvm/test/Transforms/InstCombine simplify-demanded-fpclass-sqrt.ll simplify-demanded-fpclass-insertelement.ll

InstCombine: Fold known-qnan results to a literal nan

Previously we only considered fcNan to fold to qnan for canonicalizing
results, ignoring the simpler case where we know the nan is already
quiet.
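
One reading of this change, sketched as a predicate over class bitmasks. This is an interpretation of the commit message, not the InstCombine code: the function name and its parameters are hypothetical, and the two mask values are assumed to mirror the signaling-NaN and quiet-NaN bits of LLVM's `FPClassTest`.

```python
FC_SNAN = 0x1  # assumed to mirror fcSNan
FC_QNAN = 0x2  # assumed to mirror fcQNan
FC_NAN = FC_SNAN | FC_QNAN


def folds_to_literal_nan(known_classes, is_canonicalizing):
    """Decide whether a value with these possible classes can become a literal NaN."""
    if known_classes == 0:
        return False  # no possible class at all; not handled by this sketch
    if known_classes & ~FC_NAN:
        return False  # the value might not be a NaN
    if known_classes == FC_QNAN:
        return True   # already known quiet: no canonicalization needed
    # May be signaling: only safe if the operation quiets its result.
    return is_canonicalizing


qnan_only = folds_to_literal_nan(FC_QNAN, is_canonicalizing=False)
maybe_snan = folds_to_literal_nan(FC_NAN, is_canonicalizing=False)
canonicalized = folds_to_literal_nan(FC_NAN, is_canonicalizing=True)
```

Under this reading, the previously missed case is `qnan_only`: the result is known to be a quiet NaN even though the operation itself does not canonicalize.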
Delta File
+3 -9 llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-sqrt.ll
+3 -4 llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-insertelement.ll
+3 -4 llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-frexp.ll
+3 -4 llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-shufflevector.ll
+2 -4 llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-fmul.ll
+1 -2 llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-fpext.ll
+15 -27, 12 files not shown
+28 -47, 18 files

LLVM/project 22bf72d llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp

Use m_Extractvalue
Delta File
+5 -7 llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+5 -7, 1 file

LLVM/project 9fc8654 llvm/include/llvm/Support KnownFPClass.h, llvm/lib/Analysis ValueTracking.cpp

InstCombine: Implement SimplifyDemandedFPClass for frexp
Delta File
+16 -38 llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-frexp.ll
+49 -0 llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+27 -0 llvm/lib/Support/KnownFPClass.cpp
+3 -21 llvm/lib/Analysis/ValueTracking.cpp
+4 -0 llvm/include/llvm/Support/KnownFPClass.h
+99 -59, 5 files

LLVM/project 9f763f4 llvm/test/Transforms/InstCombine simplify-demanded-fpclass-frexp.ll

InstCombine: Add baseline frexp test for SimplifyDemandedFPClass
Delta File
+612 -0 llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-frexp.ll
+612 -0, 1 file

LLVM/project 0cdaa8f llvm/include/llvm/IR IntrinsicsNVVM.td

[NVPTX] Update various intrinsic attributes, nfc cleanup (#175660)

This patch migrates the intrinsic properties back to "PureIntrinsic"
from "NVVMPureIntrinsic" (after PR #166450).

While we are there:
* Refactor a few mbarrier intrinsics definitions (NFC)
* Update mbarrier.pending_count properties. (trivial)
* Formatting changes over a few fence intrinsics (NFC)
Delta File
+121 -138 llvm/include/llvm/IR/IntrinsicsNVVM.td
+121 -138, 1 file

LLVM/project 5acb608 .ci utils.sh

[CI] Make premerge jobs support GHA postcommit (#176180)

This was causing failures in the release branch as the premerge jobs
there are also run postcommit through GHA. We were expecting a PR number
to always be present when it was not.
Delta File
+9 -5 .ci/utils.sh
+9 -5, 1 file

LLVM/project 3150b73 mlir/lib/Dialect/XeGPU/Transforms XeGPUPropagateLayout.cpp

[MLIR][XeGPU] Clean up helpers in XeGPUPropagateLayout (#175857)

In XeGPUPropagateLayout.cpp, the helper getDefaultSIMTLayoutInfo is
implemented via multiple overloads that differ significantly in
semantics, not just parameter types.
Reusing the same function name for these semantically different
behaviors makes call sites harder to read and reason about and increases
the maintenance burden. This PR improves readability and maintainability
of layout propagation logic.
Delta File
+35 -48 mlir/lib/Dialect/XeGPU/Transforms/XeGPUPropagateLayout.cpp
+35 -48, 1 file

LLVM/project 1727337 llvm/test lit.cfg.py

[profcheck] Reorder the FileCheck substitution. (#176098)

In the profcheck build, FileCheck commands are substituted with
`cat > /dev/null` to disable output verification. In
test/Transforms/SampleProfile/remarks-hotness.ll we have both "FileCheck"
and "not FileCheck" statements. Replacing the positive one first results
in "not cat". Run the "not" substitution first to fix this.
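
The ordering problem can be reproduced with plain string replacement. This is a simplified sketch, not the code in lit.cfg.py: `apply_substitutions` is a hypothetical helper, the RUN line is made up, and the replacement text used for "not FileCheck" is an assumption.

```python
SINK = "cat > /dev/null"


def apply_substitutions(command, substitutions):
    """Apply (pattern, replacement) pairs in order, as plain string replacement."""
    for pattern, replacement in substitutions:
        command = command.replace(pattern, replacement)
    return command


positive = ("FileCheck", SINK)
negative = ("not FileCheck", SINK)  # assumed replacement for the negated form

run_line = "opt -S in.ll | not FileCheck in.ll"

# Positive first: "FileCheck" is consumed, so "not FileCheck" never matches
# and the leftover "not" now negates cat.
positive_first = apply_substitutions(run_line, [positive, negative])

# Negative first: the longer pattern is handled before the shorter one.
negative_first = apply_substitutions(run_line, [negative, positive])
```

Here `positive_first` ends up containing "not cat", while `negative_first` does not. Applying the more specific pattern before the one it contains is the general fix.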
Delta File
+1 -1 llvm/test/lit.cfg.py
+1 -1, 1 file

LLVM/project 2aec54e clang/include/clang/Analysis/Analyses/LifetimeSafety LifetimeAnnotations.h, clang/lib/Analysis/LifetimeSafety LifetimeAnnotations.cpp

Merge lifetimebound attribute on implicit 'this' across method redeclarations
Delta File
+138 -0 clang/test/Sema/warn-lifetime-analysis-nocfg.cpp
+20 -12 clang/lib/Analysis/LifetimeSafety/LifetimeAnnotations.cpp
+22 -0 clang/test/Sema/warn-lifetime-safety.cpp
+21 -0 clang/test/SemaCXX/attr-lifetimebound.cpp
+7 -0 clang/include/clang/Analysis/Analyses/LifetimeSafety/LifetimeAnnotations.h
+208 -12, 5 files

LLVM/project f4d4caa llvm/include/llvm/Target CGPassBuilderOption.h, llvm/lib/CodeGen TargetPassConfig.cpp

[LLVM][CodeGen] Rename `gc-empty-basic-blocks` to `enable-gc-empty-basic-blocks` (#176018)

Rename the `gc-empty-basic-blocks` command line option to
`enable-gc-empty-basic-blocks` in preparation for adding calls that
initialize the pass in `initializeCodeGen`, and to make the flag more
consistent with other existing flags that enable or disable passes.

Keep `gc-empty-basic-blocks` as an alias to allow all users to migrate
to the new option.
Delta File
+12 -5 llvm/lib/CodeGen/TargetPassConfig.cpp
+4 -4 llvm/test/CodeGen/X86/gc-empty-basic-blocks.ll
+2 -2 llvm/test/CodeGen/X86/basic-block-address-map-empty-block.ll
+1 -1 llvm/include/llvm/Target/CGPassBuilderOption.h
+19 -12, 4 files

LLVM/project 8e493b8 llvm/include/llvm/Support Compiler.h

[Support] Suppress old MSVC warning for [[msvc::no_unique_address]] (#176130)

MSVC versions prior to 19.43 (Visual Studio 2022 version 17.13) emit a
warning when using the [[msvc::no_unique_address]] attribute prior to
C++20.

This is now considered a bug and fixed in later releases of MSVC.
Suppress the warning for older MSVC versions by disabling the warning
around the attribute usage. This allows for warning-free builds when
targeting older MSVC versions.

More details and discussion about the warning can be found here:
https://developercommunity.visualstudio.com/t/msvc::no_unique_address-Should-Not-W/10118435
Delta File
+16 -1 llvm/include/llvm/Support/Compiler.h
+16 -1, 1 file

LLVM/project 6309cd8 llvm/docs MIRLangRef.rst, llvm/include/llvm/CodeGen MachineInstrBuilder.h

Revert "[NFC][MI] Tidy Up RegState enum use (1/2)" (#176190)

Reverts llvm/llvm-project#176091

Reverting because some compilers were erroring on the call to
`Reg.isReg()` (which is not `constexpr`) in a `constexpr` function.
Delta File
+51 -66 llvm/include/llvm/CodeGen/MachineInstrBuilder.h
+17 -17 llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+15 -15 llvm/docs/MIRLangRef.rst
+10 -14 llvm/lib/CodeGen/MIRParser/MIParser.cpp
+9 -8 llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+8 -8 llvm/lib/Target/ARM/ARMExpandPseudoInsts.cpp
+110 -128, 17 files not shown
+142 -156, 23 files