LLVM/project cda2ea3llvm/test/CodeGen/AMDGPU maximumnum.bf16.ll minimumnum.bf16.ll, llvm/test/CodeGen/X86 wide-scalar-shift-by-byte-multiple-legalization.ll shift-i512.ll

Merge branch 'main' into users/rovka/relax-callers-for-chain-funcs
DeltaFile
+17,522-20,773llvm/test/CodeGen/X86/wide-scalar-shift-by-byte-multiple-legalization.ll
+8,857-10,952llvm/test/CodeGen/AMDGPU/maximumnum.bf16.ll
+8,840-10,957llvm/test/CodeGen/AMDGPU/minimumnum.bf16.ll
+4,725-0llvm/test/tools/llvm-mca/RISCV/SpacemitX60/vlseg-vsseg.s
+4,091-0llvm/test/CodeGen/AMDGPU/atomicrmw_usub_sat.ll
+2,027-185llvm/test/CodeGen/X86/shift-i512.ll
+46,062-42,8673,206 files not shown
+192,075-102,9463,212 files

LLVM/project bab4d1ellvm/test/CodeGen/X86 shift-i512.ll

[X86] shift-i512.ll - extend test coverage (#171125)

Remove v8i64 dependency from original shift-by-1 tests - this was added for #132601 but is unlikely to be necessary

Add tests for general shifts as well as shift-by-constant and shift-of-constant examples
DeltaFile
+2,027-185llvm/test/CodeGen/X86/shift-i512.ll
+2,027-1851 files

LLVM/project 11866c4llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp, llvm/test/CodeGen/RISCV mul.ll

[DAGCombiner] Don't peek through bitcast when checking isMulAddWithConstProfitable (#171056)

Fixes https://github.com/llvm/llvm-project/issues/171035
Peeking through bitcast may cause type mismatch between `AddNode` and
`ConstNode` in `isMulAddWithConstProfitable`.
DeltaFile
+28-0llvm/test/CodeGen/RISCV/mul.ll
+2-2llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+30-22 files

LLVM/project f1af9b0.github/workflows libcxx-build-and-test.yaml hlsl-test-all.yaml

[Github] Update GHA Dependencies (#171064)

This PR contains the following updates:

| Package | Type | Update | Change | Pending |
|---|---|---|---|---|
| [actions/checkout](https://redirect.github.com/actions/checkout) | action | patch | `v6.0.0` -> `v6.0.1` | |
| [actions/setup-node](https://redirect.github.com/actions/setup-node) | action | minor | `v6.0.0` -> `v6.1.0` | |
| [github/codeql-action](https://redirect.github.com/github/codeql-action) | action | patch | `v4.31.5` -> `v4.31.6` | `v4.31.7` |
DeltaFile
+5-5.github/workflows/libcxx-build-and-test.yaml
+4-4.github/workflows/hlsl-test-all.yaml
+3-3.github/workflows/premerge.yaml
+3-3.github/workflows/gha-codeql.yml
+3-3.github/workflows/release-binaries.yml
+2-2.github/workflows/release-tasks.yml
+20-2039 files not shown
+67-6745 files

LLVM/project f29f01dcompiler-rt/test/sanitizer_common/TestCases/Linux soft_rss_limit_mb_test.cpp

[Sanitizer] Bump soft_rss_limit_mb in test (#170911)

This test is failing on some buildbots now that the internal shell has
been turned on, and it previously failed on some ppc bots when the
internal shell was turned on a while back (before that change got reverted).

At least one X86 bot is barely hitting the limit
(https://lab.llvm.org/buildbot/#/builders/174/builds/28487 224MB-235MB).

This likely needs to be bumped due to changes in the process tree (now
that we invoke things through python rather than a bash shell) with the
enablement of the internal shell.
DeltaFile
+3-3compiler-rt/test/sanitizer_common/TestCases/Linux/soft_rss_limit_mb_test.cpp
+3-31 files

LLVM/project 7fbd443lldb/source/Commands CommandObjectBreakpoint.cpp

[lldb] Remove printf in breakpoint add command

Added in 2110db0f49593 / #156067.
DeltaFile
+0-1lldb/source/Commands/CommandObjectBreakpoint.cpp
+0-11 files

LLVM/project c1d030emlir/lib/ExecutionEngine ExecutionEngine.cpp

[MLIR][ExecutionEngine] Don't create a `_mlir_` wrapper function for internal linkage (#171115)

This is somewhat NFC: we were creating wrappers for internal functions,
which are de facto not callable.
DeltaFile
+2-4mlir/lib/ExecutionEngine/ExecutionEngine.cpp
+2-41 files

LLVM/project 6bc0d37clang/lib/Headers __clang_hip_libdevice_declares.h, clang/test/Headers __clang_hip_math_deprecated.hip

clang/HIP: Remove deprecated rcp pseudo-intrinsics

These shouldn't have been used by external users in the first place,
but have also been marked as deprecated for a number of releases.
DeltaFile
+0-29clang/test/Headers/__clang_hip_math_deprecated.hip
+0-22clang/lib/Headers/__clang_hip_libdevice_declares.h
+0-512 files

LLVM/project 07bafabllvm/lib/Target/AMDGPU VOP2Instructions.td AMDGPU.td, llvm/test/CodeGen/AMDGPU llvm.amdgcn.fma.legacy.ll

[AMDGPU] Do not generate V_FMAC_DX9_ZERO_F32 on GFX12 (#171116)

GFX12 does not have the FMAC form of this instruction, only the FMA
form.

Fixes: #170437
DeltaFile
+82-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.fma.legacy.ll
+2-2llvm/lib/Target/AMDGPU/VOP2Instructions.td
+3-0llvm/lib/Target/AMDGPU/AMDGPU.td
+1-1llvm/lib/Target/AMDGPU/SIInstructions.td
+88-34 files

LLVM/project 33d779dopenmp/runtime/src kmp_ftn_entry.h

[OpenMP] Fix undefined symbol for Darwin builds (#170999)

cf.
https://github.com/llvm/llvm-project/pull/168554#issuecomment-3617253169
DeltaFile
+1-1openmp/runtime/src/kmp_ftn_entry.h
+1-11 files

LLVM/project 7c832fclldb/source/Commands CommandObjectTarget.cpp

[lldb] Fix command line of `target frame-provider register` (#167803)

So far, the syntax was `target frame-provider register <cmd-options>
[<run-args>]`. Note the optional `run-args` at the end. They are
completely ignored by the actual command, but the command line parser
still accepts them.

This commit removes them.

This was probably a copy-paste error from `CommandObjectProcessLaunch`,
which was likely used as a blueprint for `target frame-provider
register`.
DeltaFile
+1-3lldb/source/Commands/CommandObjectTarget.cpp
+1-31 files

LLVM/project 8cbe846clang/include/clang/CIR/Dialect/IR CIROps.td, clang/lib/CIR/CodeGen CIRGenStmt.cpp

Add switch case covered cirops.td
DeltaFile
+2-1clang/include/clang/CIR/Dialect/IR/CIROps.td
+2-0clang/lib/CIR/CodeGen/CIRGenStmt.cpp
+4-12 files

LLVM/project 7c35ff5llvm/unittests/ADT BitVectorTest.cpp

Add unit tests
DeltaFile
+92-0llvm/unittests/ADT/BitVectorTest.cpp
+92-01 files

LLVM/project 24957f7bolt/include/bolt/Passes LivenessAnalysis.h, bolt/lib/Passes ShrinkWrapping.cpp RegReAssign.cpp

[ADT] Make use of subsetOf and anyCommon methods of BitVector (NFC)

Replace the code along these lines

    BitVector Tmp = LHS;
    Tmp &= RHS;
    return Tmp.any();

and

    BitVector Tmp = LHS;
    Tmp.reset(RHS);
    return Tmp.none();

with `LHS.anyCommon(RHS)` and `LHS.subsetOf(RHS)`, correspondingly, which
do not require creating temporary BitVector and can return early.
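
A minimal sketch of why the replacement is equivalent and can return early. It does not use `llvm::BitVector`; the `Words` alias and the free functions `anyCommon` and `subsetOf` below are hypothetical stand-ins operating on plain 64-bit words, written only to illustrate the idea.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a bit vector: one bit per position, 64 per word.
using Words = std::vector<uint64_t>;

// Same answer as "Tmp = LHS; Tmp &= RHS; return Tmp.any();",
// but without a temporary and stopping at the first shared bit.
bool anyCommon(const Words &lhs, const Words &rhs) {
  size_t n = std::min(lhs.size(), rhs.size());
  for (size_t i = 0; i < n; ++i)
    if (lhs[i] & rhs[i])
      return true;
  return false;
}

// Same answer as "Tmp = LHS; Tmp.reset(RHS); return Tmp.none();":
// every bit set in lhs must also be set in rhs.
bool subsetOf(const Words &lhs, const Words &rhs) {
  size_t n = std::min(lhs.size(), rhs.size());
  for (size_t i = 0; i < n; ++i)
    if (lhs[i] & ~rhs[i])
      return false; // a bit of lhs is missing from rhs
  for (size_t i = n; i < lhs.size(); ++i)
    if (lhs[i])
      return false; // lhs has bits beyond the end of rhs
  return true;
}
```

Both loops exit on the first word that decides the answer, which is where the saving over the copy-then-test pattern comes from.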
DeltaFile
+4-6bolt/lib/Passes/ShrinkWrapping.cpp
+2-6bolt/lib/Passes/RegReAssign.cpp
+4-4bolt/lib/Passes/TailDuplication.cpp
+2-4llvm/lib/CodeGen/RDFRegisters.cpp
+2-4llvm/tools/llvm-exegesis/lib/SnippetGenerator.cpp
+2-3bolt/include/bolt/Passes/LivenessAnalysis.h
+16-272 files not shown
+18-318 files

LLVM/project a5e8e77bolt/lib/Passes PointerAuthCFIAnalyzer.cpp, bolt/test/AArch64 pacret-cfi-incorrect.s

[BOLT][PAC] Warn about synchronous unwind tables (#165227)

BOLT currently ignores functions with synchronous PAuth DWARF info.
If more than 10% of functions get ignored for inconsistencies, we
should emit a warning to only use asynchronous unwind tables.

See related issue: #165215
DeltaFile
+39-11bolt/lib/Passes/PointerAuthCFIAnalyzer.cpp
+33-0bolt/test/runtime/AArch64/pacret-synchronous-unwind.cpp
+1-1bolt/test/AArch64/pacret-cfi-incorrect.s
+73-123 files

LLVM/project a1b78a4llvm/lib/Target/AArch64 AArch64CollectCPSpillInfo.cpp AArch64TargetMachine.cpp, llvm/test/CodeGen/AArch64 fptosi-sat-vector.ll fptoui-sat-vector.ll

Constant pool spilling
DeltaFile
+503-525llvm/test/CodeGen/AArch64/fptosi-sat-vector.ll
+931-0llvm/lib/Target/AArch64/AArch64CollectCPSpillInfo.cpp
+177-177llvm/test/CodeGen/AArch64/fptoui-sat-vector.ll
+19-44llvm/test/CodeGen/AArch64/arm64-fp128.ll
+11-0llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
+2-7llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-stores.ll
+1,643-7534 files not shown
+1,653-75510 files

LLVM/project 52dbe76bolt/include/bolt/Passes LivenessAnalysis.h, bolt/lib/Passes ShrinkWrapping.cpp TailDuplication.cpp

[ADT] Make use of subsetOf and anyCommon methods of BitVector (NFC)

Replace the code along these lines

    BitVector Tmp = LHS;
    Tmp &= RHS;
    return Tmp.any();

and

    BitVector Tmp = LHS;
    Tmp.reset(RHS);
    return Tmp.none();

with `LHS.anyCommon(RHS)` and `LHS.subsetOf(RHS)`, correspondingly, which
do not require creating temporary BitVector and can return early.
DeltaFile
+4-6bolt/lib/Passes/ShrinkWrapping.cpp
+4-4bolt/lib/Passes/TailDuplication.cpp
+2-6bolt/lib/Passes/RegReAssign.cpp
+2-4llvm/tools/llvm-exegesis/lib/SnippetGenerator.cpp
+2-4llvm/lib/CodeGen/RDFRegisters.cpp
+2-3bolt/include/bolt/Passes/LivenessAnalysis.h
+16-272 files not shown
+18-318 files

LLVM/project dcefe31llvm/unittests/ADT BitVectorTest.cpp

Add unit tests
DeltaFile
+92-0llvm/unittests/ADT/BitVectorTest.cpp
+92-01 files

LLVM/project 6049289mlir/lib/Dialect/Shard/IR ShardOps.cpp

[MLIR] Apply clang-tidy fixes for readability-identifier-naming in ShardOps.cpp (NFC)
DeltaFile
+39-41mlir/lib/Dialect/Shard/IR/ShardOps.cpp
+39-411 files

LLVM/project 1bbff72mlir/lib/ExecutionEngine VulkanRuntimeWrappers.cpp

[MLIR] Apply clang-tidy fixes for llvm-qualified-auto in VulkanRuntimeWrappers.cpp (NFC)
DeltaFile
+2-2mlir/lib/ExecutionEngine/VulkanRuntimeWrappers.cpp
+2-21 files

LLVM/project e678fc5bolt/lib/Passes PointerAuthCFIAnalyzer.cpp

Review nits
DeltaFile
+6-5bolt/lib/Passes/PointerAuthCFIAnalyzer.cpp
+6-51 files

LLVM/project d94958bllvm/lib/Transforms/InstCombine InstCombineCompares.cpp, llvm/test/Transforms/InstCombine icmp-add.ll

[InstCombine] Fold `icmp samesign u{gt/lt} (X +nsw C2), C` -> `icmp s{gt/lt} X, (C - C2)` (#169960)

Fixes #166973

Partially addresses #134028

Alive2 proof: https://alive2.llvm.org/ce/z/BqHQNN
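
As a sanity check alongside the Alive2 proof, the fold can be brute-forced at 8 bits. This is only a sketch under stated assumptions: it skips inputs where `nsw` or `samesign` would be violated (the result is poison there), and it additionally assumes `C - C2` does not overflow, which may or may not match the exact precondition used in the patch. The function name `countCounterexamples` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Counts 8-bit (X, C2, C) triples where the original compare and the
// folded compare disagree, for both the ugt->sgt and ult->slt forms.
int countCounterexamples() {
  int bad = 0;
  for (int x = -128; x <= 127; ++x)
    for (int c2 = -128; c2 <= 127; ++c2)
      for (int c = -128; c <= 127; ++c) {
        int sum = x + c2;
        if (sum < -128 || sum > 127)
          continue; // nsw violated: the add is poison
        if ((sum < 0) != (c < 0))
          continue; // samesign violated: the icmp is poison
        int diff = c - c2;
        if (diff < -128 || diff > 127)
          continue; // assumption: C - C2 must be representable
        uint8_t usum = static_cast<uint8_t>(sum);
        uint8_t uc = static_cast<uint8_t>(c);
        if ((usum > uc) != (x > diff))
          ++bad; // icmp samesign ugt vs. icmp sgt
        if ((usum < uc) != (x < diff))
          ++bad; // icmp samesign ult vs. icmp slt
      }
  return bad;
}
```

With operands of the same sign, unsigned and signed comparisons agree, and with no signed wrap the constant can be moved across the inequality, so the count is expected to be zero.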
DeltaFile
+76-0llvm/test/Transforms/InstCombine/icmp-add.ll
+17-10llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+93-102 files

LLVM/project 3a6781ellvm/test/CodeGen/X86 vector-shuffle-combining-avx512f.ll

[X86] vector-shuffle-combining-avx512f.ll - add tests showing failure to simplify expand/compress nodes (#171113)

DeltaFile
+69-0llvm/test/CodeGen/X86/vector-shuffle-combining-avx512f.ll
+69-01 files

LLVM/project 32ff710llvm/lib/Target/AArch64 AArch64ISelLowering.cpp, llvm/test/CodeGen/AArch64 bf16-v8-instructions.ll fixed-length-bf16-arith.ll

[AArch64] Lower v8bf16 FMUL to BFMLAL top/bottom with +sve (#169655)

Assuming the predicate is hoisted, this should have a slightly better
throughput: https://godbolt.org/z/jb7aP7Efc

Note: SVE must be used to convert back to bf16 as the bfmlalb/t
instructions operate on even/odd lanes, but the NEON bfcvtn/bfcvtn2
instructions process the top/bottom halves of vectors.
DeltaFile
+40-17llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+25-12llvm/test/CodeGen/AArch64/bf16-v8-instructions.ll
+8-8llvm/test/CodeGen/AArch64/fixed-length-bf16-arith.ll
+73-373 files

LLVM/project 5e3ffd6mlir/lib/ExecutionEngine ArmRunnerUtils.cpp

[MLIR] Apply clang-tidy fixes for readability-identifier-naming in ArmRunnerUtils.cpp (NFC)
DeltaFile
+2-2mlir/lib/ExecutionEngine/ArmRunnerUtils.cpp
+2-21 files

LLVM/project f41edb3llvm/test/CodeGen/AMDGPU llvm.amdgcn.fma.legacy.ll

[AMDGPU] Add test cases for v_fmac_dx9_zero_f32 aka v_fmac_legacy_f32 (#171108)

DeltaFile
+36-4llvm/test/CodeGen/AMDGPU/llvm.amdgcn.fma.legacy.ll
+36-41 files

LLVM/project 56beac9llvm/lib/Target/SPIRV SPIRVEmitIntrinsics.cpp, llvm/test/CodeGen/SPIRV const-array-gep.ll

[SPIRV] Fix assertion violation caused by unexpected ConstantExpr. (#170524)

`SPIRVEmitIntrinsics::simplifyZeroLengthArrayGepInst` asserted that it
always expected a `GetElementPtrInst` from `IRBuilder::CreateGEP` (which
returns a `Value`). `IRBuilder` can fold and return a `ConstantExpr`
instead, thus violating the assertion. The patch fixes this by using
`GetElementPtrInst::Create` to always return a `GetElementPtrInst`.

This LLVM defect was identified via the AMD Fuzzing project.
DeltaFile
+19-0llvm/test/CodeGen/SPIRV/const-array-gep.ll
+3-5llvm/lib/Target/SPIRV/SPIRVEmitIntrinsics.cpp
+22-52 files

LLVM/project e52cddc.github/workflows release-binaries.yml

workflows/release-binaries: Use upload-release-artifact action for uploading (#170528)

DeltaFile
+29-44.github/workflows/release-binaries.yml
+29-441 files

LLVM/project 405403cmlir/lib/Dialect/Transform/TuneExtension TuneExtensionOps.cpp

[mlir] Fix GCC compilation warning in TuneExtensionOps.cpp (#168850)

Building with GCC produces:
```
<...>/TuneExtensionOps.cpp:180:26: warning: comparison of unsigned expression in ‘< 0’ is always false [-Wtype-limits]
  180 |   if (*selectedRegionIdx < 0 || *selectedRegionIdx >= getNumRegions())
      |       ~~~~~~~~~~~~~~~~~~~^~~
<...>/TuneExtensionOps.cpp: In member function ‘llvm::LogicalResult mlir::transform::tune::AlternativesOp::verify()’:
/home/david.spickett/llvm-project/mlir/lib/Dialect/Transform/TuneExtension/TuneExtensionOps.cpp:236:19: warning: comparison of unsigned expression in ‘< 0’ is always false [-Wtype-limits]
  236 |     if (regionIdx < 0 || regionIdx >= getNumRegions())
      |         ~~~~~~~~~~^~~
```

As we are sign extending these variables, use int64_t instead of size_t
for their type.
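
A minimal sketch of the pattern the warning points at, assuming a bounds check similar to the ones quoted above. The helper name `isValidRegionIndex` is hypothetical and not taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// With a size_t index, "idx < 0" can never be true, which is what
// -Wtype-limits reports. With a signed int64_t index the check is
// meaningful and negative values are rejected.
bool isValidRegionIndex(int64_t regionIdx, size_t numRegions) {
  if (regionIdx < 0)
    return false;
  // Safe to convert now that the value is known to be non-negative.
  return static_cast<uint64_t>(regionIdx) < numRegions;
}
```

Converting only after the sign check avoids a signed/unsigned comparison between the index and the region count.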
DeltaFile
+2-2mlir/lib/Dialect/Transform/TuneExtension/TuneExtensionOps.cpp
+2-21 files

LLVM/project f9e0fa8clang/lib/StaticAnalyzer/Checkers MoveChecker.cpp, clang/test/Analysis use-after-move-invalidation.cpp

[analyzer] MoveChecker: correct invalidation of this-regions (#169626)

By completely omitting invalidation in the case of InstanceCall, we do
not clear the moved state of the fields of the this object after an
opaque call to a member function of the object itself.
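
A small illustration of the kind of code this appears to concern, assuming the description means that a member call on the object should clear the moved-from state of its fields. `Session`, `refresh`, and `useAfterRefresh` are hypothetical names; in the scenario described, the body of the member function would not be visible to the analyzer, whereas here it is defined inline so the example runs.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct Session {
  std::string token;
  // Hypothetical member function; imagine its body lives in another
  // translation unit, so the call below is opaque to the analyzer.
  void refresh() { token = "fresh"; }
};

std::string useAfterRefresh() {
  Session s;
  s.token = "old";
  std::string taken = std::move(s.token); // s.token is now moved-from
  s.refresh();    // member call on s may reinitialize s.token
  return s.token; // reading it here is not a use-after-move
}
```

If the call did not invalidate the fields of `s`, the final read would still be treated as a use of a moved-from object.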
DeltaFile
+51-0clang/test/Analysis/use-after-move-invalidation.cpp
+8-9clang/lib/StaticAnalyzer/Checkers/MoveChecker.cpp
+59-92 files