LLVM/project 2734c46llvm/include/llvm/CodeGen SelectionDAG.h, llvm/lib/CodeGen/SelectionDAG SelectionDAGDumper.cpp

[DAG] Add back SelectionDAG::dump() without parameter (#187001)

`dump()` methods usually take no parameters, so the common practice when
debugging is to call `XXX::dump()`.

After #161097, however, that call fails with an error like the following:

```
error: <user expression 128>:1:10: too few arguments to function call,
expected 1, have 0
    1 | DAG.dump()
      | ~~~~~~~~ ^
```

To avoid surprising users, I added back the parameterless
`SelectionDAG::dump()`.
DeltaFile
+6-1llvm/include/llvm/CodeGen/SelectionDAG.h
+2-0llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
+8-12 files

LLVM/project f9d2d8bclang/test/CXX/drs cwg14xx.cpp cwg787.cpp

[clang] Enable '-verify-directives' mode in C++ DR tests (#187219)

This patch enables recently implemented `-verify-directives` mode
(#179835) in C++ DR tests to automate some of the work I've been doing
manually while reviewing PRs touching those tests. As highlighted in
that PR, all the errors this mode found were addressed in #179813 and
#179674, so this PR just flips the switch.
DeltaFile
+14-14clang/test/CXX/drs/cwg14xx.cpp
+8-7clang/test/CXX/drs/cwg787.cpp
+7-7clang/test/CXX/drs/cwg12xx.cpp
+7-7clang/test/CXX/drs/cwg13xx.cpp
+7-7clang/test/CXX/drs/cwg15xx.cpp
+7-7clang/test/CXX/drs/cwg16xx.cpp
+50-4931 files not shown
+266-26537 files

LLVM/project fef74e1mlir/include/mlir/Dialect/SPIRV/IR SPIRVStructureOps.td, mlir/lib/Dialect/SPIRV/IR SPIRVOps.cpp

[mlir][spirv] add ExecutionModeIdOp (#186241)

Adds OpExecutionModeId from SPIR-V 1.2.

---------

Co-authored-by: Jakub Kuderski <kubakuderski at gmail.com>
DeltaFile
+70-0mlir/lib/Dialect/SPIRV/IR/SPIRVOps.cpp
+68-0mlir/test/Dialect/SPIRV/IR/structure-ops.mlir
+65-0mlir/include/mlir/Dialect/SPIRV/IR/SPIRVStructureOps.td
+37-0mlir/lib/Target/SPIRV/Deserialization/DeserializeOps.cpp
+28-0mlir/lib/Target/SPIRV/Serialization/SerializeOps.cpp
+18-0mlir/test/Target/SPIRV/execution-mode-id.mlir
+286-01 files not shown
+288-07 files

LLVM/project 8c4f4e8clang/include/clang/Analysis/Analyses/LifetimeSafety FactsGenerator.h, clang/lib/Analysis/LifetimeSafety FactsGenerator.cpp

[LifetimeSafety] Track origins through array subscript and array-to-pointer decay (#186902)

Array element accesses and array-to-pointer decay were not tracked
because `CK_ArrayToPointerDecay` dropped origins and
`ArraySubscriptExpr` had no visitor. This patch adds both to propagate
origins through array operations.

Fixes #186075
DeltaFile
+122-0clang/test/Sema/warn-lifetime-safety.cpp
+15-1clang/lib/Analysis/LifetimeSafety/FactsGenerator.cpp
+16-0clang/test/Sema/warn-lifetime-safety-suggestions.cpp
+3-2clang/test/Sema/warn-lifetime-analysis-nocfg.cpp
+1-0clang/include/clang/Analysis/Analyses/LifetimeSafety/FactsGenerator.h
+157-35 files

LLVM/project 4f63790mlir/include/mlir/Conversion/TosaToLinalg TosaToLinalg.h, mlir/lib/Conversion/TosaToLinalg TosaToLinalgPass.cpp

[mlir][tosa][tosa-to-linalg] Fix rescale with double rounding failing validation (#184787)

The validation pass added attribute checks on rescale rounding mode, but
the tosa-to-linalg-pipeline did not specify support for the doubleround
extension, causing rescale with doubleround to be rejected by the
validation in the tosa-to-linalg-pipeline.

One method of fixing this would be to only enable the attribute checks
when the "strictOpSpecAlignment" validation option is enabled. However,
I feel this is the wrong direction of travel. Long-term it would be nice
if the tosa-to-linalg-pipeline specified all the extensions it supports,
gracefully rejecting operations that require unsupported extensions.

Therefore, this change declares support for the doubleround extension to
fix the legalization failure with the ambition of adding more extensions
in the future.
DeltaFile
+26-1mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-pipeline.mlir
+11-1mlir/lib/Conversion/TosaToLinalg/TosaToLinalgPass.cpp
+2-1mlir/include/mlir/Conversion/TosaToLinalg/TosaToLinalg.h
+39-33 files

LLVM/project 23a0c9flldb/test/API/symstore TestSymStoreLocal.py

[lldb] Skip file cleanup to avoid permission issue in API test (#187227)

Deleting anything in the build directory of a test-case causes an
issue on one of the Windows bots. Since the previous attempts in
ca15db1cd509c236cd8138bcd098117d0106db56 and
fdd2437af3cdc6d5fe199fcc9d991ccf503b55bd didn't help, we now skip the
file cleanup altogether.
DeltaFile
+5-8lldb/test/API/symstore/TestSymStoreLocal.py
+5-81 files

LLVM/project f776357llvm/lib/Transforms/Vectorize VPlanPatternMatch.h VPlanRecipes.cpp

[VPlan] Improve code in VPlanRecipes using VPlanPatternMatch (NFC) (#187130)
DeltaFile
+6-6llvm/lib/Transforms/Vectorize/VPlanPatternMatch.h
+4-6llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+10-122 files

LLVM/project 30c962cllvm/lib/Transforms/Instrumentation NumericalStabilitySanitizer.cpp, llvm/test/Instrumentation/NumericalStabilitySanitizer intrinsics.ll libfuncs.ll

[Instrumentation][nsan] Add maximumnum to NSAN (#186345)

Add support for the min/maximumnum intrinsics and the corresponding
libfuncs to the NumericalStabilitySanitizer.
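
For readers unfamiliar with these operations, here is a minimal Python
sketch of the value semantics as I understand them from IEEE 754-2019
minimumNumber/maximumNumber: a NaN operand is ignored when the other
operand is a number, and -0.0 orders below +0.0. The function names are
illustrative only; this models the arithmetic, not the NSan
instrumentation added by the patch.

```python
import math


def maximumnum(a: float, b: float) -> float:
    """Model of maximumNumber: prefer a number over NaN, +0.0 over -0.0."""
    if math.isnan(a):
        return b if not math.isnan(b) else math.nan
    if math.isnan(b):
        return a
    if a == b:
        # Equal values may still differ in sign (+0.0 vs -0.0).
        return a if math.copysign(1.0, a) >= math.copysign(1.0, b) else b
    return a if a > b else b


def minimumnum(a: float, b: float) -> float:
    """Model of minimumNumber: prefer a number over NaN, -0.0 over +0.0."""
    if math.isnan(a):
        return b if not math.isnan(b) else math.nan
    if math.isnan(b):
        return a
    if a == b:
        return a if math.copysign(1.0, a) <= math.copysign(1.0, b) else b
    return a if a < b else b
```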
DeltaFile
+150-0llvm/test/Instrumentation/NumericalStabilitySanitizer/intrinsics.ll
+150-0llvm/test/Instrumentation/NumericalStabilitySanitizer/libfuncs.ll
+12-0llvm/lib/Transforms/Instrumentation/NumericalStabilitySanitizer.cpp
+312-03 files

LLVM/project 25abe22llvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 shift-i512.ll

[X86] Improve handling of i512 SRA(MSB,Amt) "highbits" mask creation (#187141)

This can be folded from ((1 << 511) >>s Amt) -> (-1 << (511-Amt)) to make use of the existing optimal codegen

Alive2: https://alive2.llvm.org/ce/z/9UMQkm

Last i512 pattern described in #132601
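
The fold can also be sanity-checked outside the compiler. The Python
sketch below models i512 values as integers masked to 512 bits and
compares both sides of the fold for every in-range shift amount. The
helper names are illustrative and not part of LLVM; this is a
brute-force check, not a substitute for the Alive2 proof linked above.

```python
BITS = 512
MASK = (1 << BITS) - 1


def to_signed(x: int) -> int:
    """Reinterpret a 512-bit pattern as a two's-complement signed value."""
    x &= MASK
    return x - (1 << BITS) if x >> (BITS - 1) else x


def sra_msb(amt: int) -> int:
    """Left-hand side: (1 << 511) >>s amt, as a 512-bit pattern."""
    # Python's >> on negative ints is an arithmetic (sign-filling) shift.
    return (to_signed(1 << (BITS - 1)) >> amt) & MASK


def shl_allones(amt: int) -> int:
    """Right-hand side: -1 << (511 - amt), as a 512-bit pattern."""
    return (-1 << (BITS - 1 - amt)) & MASK


# Collect every shift amount in [0, 511] where the two sides disagree.
mismatches = [amt for amt in range(BITS) if sra_msb(amt) != shl_allones(amt)]
```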
DeltaFile
+38-51llvm/test/CodeGen/X86/shift-i512.ll
+10-0llvm/lib/Target/X86/X86ISelLowering.cpp
+48-512 files

LLVM/project 386d70fllvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DependenceAnalysis strong-siv-addrec-wrap.ll exact-siv-addrec-wrap.ll

[DA] Remove calls to the GCD MIV test from `testSIV`
DeltaFile
+9-19llvm/test/Analysis/DependenceAnalysis/strong-siv-addrec-wrap.ll
+9-19llvm/test/Analysis/DependenceAnalysis/exact-siv-addrec-wrap.ll
+9-16llvm/test/Analysis/DependenceAnalysis/infer_affine_domain_ovlf.ll
+12-12llvm/test/Analysis/DependenceAnalysis/run-specific-dependence-test.ll
+4-8llvm/lib/Analysis/DependenceAnalysis.cpp
+2-2llvm/test/Analysis/DependenceAnalysis/exact-siv-overflow.ll
+45-762 files not shown
+47-788 files

LLVM/project 1407fc6llvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DependenceAnalysis rdiv-large-btc.ll

[DA] Add precondition `0 <=s UB` to function `inferAffineDomain`
DeltaFile
+23-12llvm/lib/Analysis/DependenceAnalysis.cpp
+2-2llvm/test/Analysis/DependenceAnalysis/rdiv-large-btc.ll
+25-142 files

LLVM/project 1a61739llvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DependenceAnalysis exact-siv-addrec-wrap.ll non-monotonic.ll

[DA] Check nsw flags for addrecs in the Exact SIV test
DeltaFile
+4-0llvm/lib/Analysis/DependenceAnalysis.cpp
+1-1llvm/test/Analysis/DependenceAnalysis/exact-siv-addrec-wrap.ll
+1-1llvm/test/Analysis/DependenceAnalysis/non-monotonic.ll
+1-1llvm/test/Analysis/DependenceAnalysis/symbolic-rdiv-addrec-wrap.ll
+7-34 files

LLVM/project 9cb9081mlir/test/Integration/Dialect/Vector/CPU gather.mlir

[mlir][vector] Extend vector.gather e2e test (#187071)

Extend the vector.gather e2e test to cover both available lowering
paths:

* Direct lowering to LLVM (via -test-lower-to-llvm)
* Lowering via vector.load (via -test-vector-gather-lowering)

This is a follow-up to https://github.com/llvm/llvm-project/pull/184706,
which updated a pattern used by -test-vector-gather-lowering.

The test is extended to operate on 2D memrefs so that the changes
in https://github.com/llvm/llvm-project/pull/184706 are meaningfully
exercised.
DeltaFile
+49-32mlir/test/Integration/Dialect/Vector/CPU/gather.mlir
+49-321 files

LLVM/project 570c388llvm/utils git-llvm-push

[llvm][utils] Give git-llvm-push u+x permissions (#187211)

There's a hashbang at the top of the script so I presume the intention
is that it can be executed directly, but it seems to be lacking
executable permissions. This sets the user executable bit so running
./llvm/utils/git-llvm-push works.
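
Git records only whether a file is executable (mode 100644 versus
100755), which is why the change shows up as a +0-0 diff. As a rough
illustration of what setting the user-executable bit means at the
filesystem level, here is a small Python sketch for POSIX systems;
`add_user_exec` is a hypothetical helper, not something in the LLVM
tree.

```python
import os
import stat
import tempfile


def add_user_exec(path: str) -> int:
    """Set the user-executable bit (u+x) and return the new permission bits."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode | stat.S_IXUSR)
    return stat.S_IMODE(os.stat(path).st_mode)
```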
DeltaFile
+0-0llvm/utils/git-llvm-push
+0-01 files

LLVM/project a8ff7e1clang/lib/Serialization ASTReaderInternals.h

[NFCI] [Serialization] Deduplicate DeclID properly (#187212)

In the original code, the loop over Found is meaningless because it is
guarded by Found.empty(), so it has been a no-op for 10 years.

Judging from the context, I believe the intention is to keep duplicated
DeclIDs out of Data, so this changes the loop to iterate over Data
instead of Found.

I found this just by looking around the code.

This is not strictly NFC, but it is intended to be NFC. I will be
surprised if this breaks anything.
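
The pattern can be illustrated with a toy Python model. This is a
sketch of the bug shape described above, not the actual code in
ASTReaderInternals.h; the function names and the use of a set and a
list for Found and Data are assumptions made for illustration.

```python
def insert_scanning_found(found: set, data: list, decl_id: int) -> None:
    """Old shape: the scan runs only when 'found' is empty, so it never matches."""
    if not found:
        for existing in found:  # 'found' is empty here: zero iterations
            if existing == decl_id:
                return
        data.append(decl_id)


def insert_scanning_data(found: set, data: list, decl_id: int) -> None:
    """Fixed shape: scan 'data', which actually holds the IDs seen so far."""
    if not found:
        for existing in data:
            if existing == decl_id:
                return  # duplicate, do not append again
        data.append(decl_id)


old_data, new_data = [], []
for decl_id in (7, 7, 9):
    insert_scanning_found(set(), old_data, decl_id)
    insert_scanning_data(set(), new_data, decl_id)
```

In this model the old shape leaves the duplicate 7 in the list, while
the fixed shape keeps each ID once.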
DeltaFile
+2-2clang/lib/Serialization/ASTReaderInternals.h
+2-21 files

LLVM/project e762078llvm/lib/Transforms/Vectorize VPlanPatternMatch.h

[VPlan] Use auto return in VPlanPatternMatch (NFC) (#187210)
DeltaFile
+10-33llvm/lib/Transforms/Vectorize/VPlanPatternMatch.h
+10-331 files

LLVM/project b83fd4dllvm/lib/Target/AArch64/GISel AArch64RegisterBankInfo.cpp, llvm/test/CodeGen/AArch64 arm64-int-neon.ll

[AArch64][GlobalISel] Fix uqadd/sub with scalar operands (#186999)

Previously, neon uqadd/uqsub would not lower when given s32/s64
operands, as GlobalISel would wrongly try to put the operands on
general-purpose register banks. Changing this in RegBankSelection allows
the intrinsics to lower just like their signed versions.
DeltaFile
+0-4llvm/test/CodeGen/AArch64/arm64-int-neon.ll
+2-0llvm/lib/Target/AArch64/GISel/AArch64RegisterBankInfo.cpp
+2-42 files

LLVM/project 054e92bllvm/lib/Target/AMDGPU SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU llvm.amdgcn.reduce.fmax.ll llvm.amdgcn.reduce.fmin.ll

[AMDGPU] DPP implementations for Wave Reduction

Add support for DPP wave reduction for floating
point numbers.
DeltaFile
+693-234llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fmax.ll
+693-234llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fmin.ll
+600-130llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fsub.ll
+589-130llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fadd.ll
+50-25llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+2,625-7535 files

LLVM/project 9b2fe0cllvm/lib/Target/X86 X86ISelLowering.cpp

[X86] Remove extraneous I in comment. NFC (#187209)

Seems to have slipped in in c63d2953a08b9
DeltaFile
+0-1llvm/lib/Target/X86/X86ISelLowering.cpp
+0-11 files

LLVM/project 7e4fdfcllvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DependenceAnalysis exact-siv-addrec-wrap.ll strong-siv-addrec-wrap.ll

[DA] Remove calls to the GCD MIV test from `testSIV`
DeltaFile
+9-19llvm/test/Analysis/DependenceAnalysis/exact-siv-addrec-wrap.ll
+9-19llvm/test/Analysis/DependenceAnalysis/strong-siv-addrec-wrap.ll
+9-16llvm/test/Analysis/DependenceAnalysis/infer_affine_domain_ovlf.ll
+12-12llvm/test/Analysis/DependenceAnalysis/run-specific-dependence-test.ll
+4-8llvm/lib/Analysis/DependenceAnalysis.cpp
+2-2llvm/test/Analysis/DependenceAnalysis/exact-siv-overflow.ll
+45-762 files not shown
+47-788 files

LLVM/project 1dfabe1llvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DependenceAnalysis rdiv-large-btc.ll

[DA] Add precondition `0 <=s UB` to function `inferAffineDomain`
DeltaFile
+23-12llvm/lib/Analysis/DependenceAnalysis.cpp
+2-2llvm/test/Analysis/DependenceAnalysis/rdiv-large-btc.ll
+25-142 files

LLVM/project a684566llvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DependenceAnalysis exact-siv-addrec-wrap.ll non-monotonic.ll

[DA] Check nsw flags for addrecs in the Exact SIV test
DeltaFile
+4-0llvm/lib/Analysis/DependenceAnalysis.cpp
+1-1llvm/test/Analysis/DependenceAnalysis/exact-siv-addrec-wrap.ll
+1-1llvm/test/Analysis/DependenceAnalysis/non-monotonic.ll
+1-1llvm/test/Analysis/DependenceAnalysis/symbolic-rdiv-addrec-wrap.ll
+7-34 files

LLVM/project a56e4f1llvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DependenceAnalysis exact-siv-addrec-wrap.ll non-monotonic.ll

[DA] Check nsw flags for addrecs in the Exact SIV test
DeltaFile
+4-0llvm/lib/Analysis/DependenceAnalysis.cpp
+1-1llvm/test/Analysis/DependenceAnalysis/exact-siv-addrec-wrap.ll
+1-1llvm/test/Analysis/DependenceAnalysis/non-monotonic.ll
+1-1llvm/test/Analysis/DependenceAnalysis/symbolic-rdiv-addrec-wrap.ll
+7-34 files

LLVM/project ea8fb06llvm/include/llvm/IR Instructions.h, llvm/lib/AsmParser LLParser.cpp

[atomicrmw] fminimumnum/fmaximumnum support (#187030)

Adds support for `atomicrmw` `fminimumnum`/`fmaximumnum` operations.
These were added to C++ in P3008, and are exposed in libc++ in #186716.
Adding LLVM IR support for these unblocks work both in backends with
hardware support and in frontends.
DeltaFile
+210-0llvm/test/Transforms/AtomicExpand/AArch64/atomicrmw-fp.ll
+66-66llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-cxx.td
+18-0llvm/test/Assembler/atomic.ll
+12-0llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp
+10-0llvm/include/llvm/IR/Instructions.h
+8-0llvm/lib/AsmParser/LLParser.cpp
+324-6622 files not shown
+412-6928 files

LLVM/project fdd2437lldb/test/API/symstore TestSymStoreLocal.py

[lldb] Avoid permission issue in API test with SHARED_BUILD_TESTCASE (#187072)

Deleting the inferior binary after an API test-case causes issues on one
of the Windows bots. The previous fix attempt in ca15db1cd509c236
didn't succeed. We have to use isolated subfolders for each test-case.
This is achieved easily by disabling SHARED_BUILD_TESTCASE.
DeltaFile
+2-5lldb/test/API/symstore/TestSymStoreLocal.py
+2-51 files

LLVM/project ec1c08allvm/test/Analysis/DependenceAnalysis symbolic-rdiv-overflow.ll

[DA] Regenerate assertions for the tests (NFC) (#187207)

Delete the trailing space introduced in #185805 that is noisy when using
UTC.
DeltaFile
+1-1llvm/test/Analysis/DependenceAnalysis/symbolic-rdiv-overflow.ll
+1-11 files

LLVM/project b3fdcacllvm/lib/Target/AArch64 AArch64ISelLowering.cpp AArch64Combine.td, llvm/lib/Target/AArch64/GISel AArch64PostLegalizerLowering.cpp

[AArch64] Remove vector REV16, use BSWAP instead (#186414)

This removes the generation of vector REV16 nodes, generating a bswap
instead. This allows us to remove most uses of AArch64ISD::REV16 and all
uses of G_REV16.
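
The replacement relies on the two operations agreeing byte for byte: a
vector REV16 swaps the two bytes inside every 16-bit halfword, which is
what a byte swap of each i16 lane does. A small Python sketch of that
equivalence follows; the helper names are illustrative, and the lanes
are assumed to be little-endian 16-bit elements.

```python
import struct


def rev16_bytes(data: bytes) -> bytes:
    """Swap the two bytes within every 16-bit halfword of the vector."""
    out = bytearray(data)
    out[0::2], out[1::2] = data[1::2], data[0::2]
    return bytes(out)


def bswap_i16_lanes(data: bytes) -> bytes:
    """Byte-swap each 16-bit lane, treating the vector as N x i16."""
    n = len(data) // 2
    lanes = struct.unpack(f"<{n}H", data)
    swapped = [((x & 0xFF) << 8) | (x >> 8) for x in lanes]
    return struct.pack(f"<{n}H", *swapped)


vec = bytes(range(16))  # one 128-bit vector, viewed as 16 bytes
same = rev16_bytes(vec) == bswap_i16_lanes(vec)
```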
DeltaFile
+18-6llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+16-3llvm/lib/Target/AArch64/GISel/AArch64PostLegalizerLowering.cpp
+10-6llvm/test/CodeGen/AArch64/GlobalISel/select-rev.mir
+5-5llvm/lib/Target/AArch64/AArch64Combine.td
+0-9llvm/lib/Target/AArch64/AArch64InstrGISel.td
+1-1llvm/lib/Target/AArch64/AArch64InstrInfo.td
+50-306 files

LLVM/project 77ad2c2llvm/test/Analysis/DependenceAnalysis exact-siv-large-btc.ll

[DA] Add test that represents an edge case for the Exact SIV test (NFC) (#186389)

To prevent a regression that could be caused by #186388.
DeltaFile
+60-0llvm/test/Analysis/DependenceAnalysis/exact-siv-large-btc.ll
+60-01 files

LLVM/project 0f622c5orc-rt/include CMakeLists.txt, orc-rt/include/orc-rt TaskGroup.h

[orc-rt] Add TaskGroup for tracking completion of a set of tasks. (#187205)

TaskGroup provides a mechanism for tracking execution of multiple
concurrent tasks and receiving notification when all tasks have
completed. This is useful for coordinating asynchronous operations in
the ORC runtime.

TaskGroup::Token is an RAII handle representing participation in a
group. The group cannot complete while any valid (non-default) Token
exists.

TaskGroup::addOnComplete registers callbacks to run when the group
closes and all tokens are released. (Callbacks registered after
completion run immediately).

TaskGroup::close seals the group: no new tokens can be acquired after
close is called.

All methods may be called concurrently from multiple threads.
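
The semantics described above can be modelled in a few lines of Python.
This is a toy model of the behaviour as stated in this message, not the
orc-rt C++ API: the class layout, the method names, and the choice to
return None when a token is requested after close are all assumptions
made for illustration.

```python
import threading


class TaskGroup:
    """Toy model: completes once closed and all tokens are released."""

    def __init__(self):
        self._lock = threading.Lock()
        self._tokens = 0
        self._closed = False
        self._callbacks = []

    def _take_callbacks_if_complete(self):
        # Caller must hold the lock.
        if not (self._closed and self._tokens == 0):
            return []
        callbacks, self._callbacks = self._callbacks, []
        return callbacks

    def acquire_token(self):
        with self._lock:
            if self._closed:
                return None  # sealed: no new participants
            self._tokens += 1
        return Token(self)

    def _release(self):
        with self._lock:
            self._tokens -= 1
            to_run = self._take_callbacks_if_complete()
        for callback in to_run:  # run outside the lock
            callback()

    def add_on_complete(self, callback):
        with self._lock:
            if not (self._closed and self._tokens == 0):
                self._callbacks.append(callback)
                return
        callback()  # already complete: run immediately

    def close(self):
        with self._lock:
            self._closed = True
            to_run = self._take_callbacks_if_complete()
        for callback in to_run:
            callback()


class Token:
    """Handle representing participation; release it exactly once."""

    def __init__(self, group):
        self._group = group

    def release(self):
        group, self._group = self._group, None
        if group is not None:
            group._release()

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.release()
```

Callbacks are run after the lock is dropped so that a callback can call
back into the group without deadlocking.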
DeltaFile
+378-0orc-rt/unittests/TaskGroupTest.cpp
+203-0orc-rt/include/orc-rt/TaskGroup.h
+1-0orc-rt/include/CMakeLists.txt
+1-0orc-rt/unittests/CMakeLists.txt
+583-04 files

LLVM/project 76d5704llvm/test/CodeGen/PowerPC bswap64.ll

[NFC][PowerPC] Update check lines to include power 9 label (#187193)

The current check lines do not provide a clear distinction between
`power 9` and `power 8`, as the power 8 label was introduced recently
through #181776. This adds a `power-9` label to the RUN lines to make
them more readable and understandable.

Co-authored-by: himadhith <himadhith.v at ibm.com>
DeltaFile
+26-26llvm/test/CodeGen/PowerPC/bswap64.ll
+26-261 files