LLVM/project f8f799c llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp, llvm/test/CodeGen/NVPTX i1-ext-load.ll

[DAG] Fold (X +/- Y) & Y --> ~X & Y when Y is a power of 2 (or zero). (#181677)

Same as InstCombinerImpl::visitAnd

To prevent RISCV falling back to a mul call in known-never-zero.ll I've
had to tweak the (sub X, (vscale * C)) to (add X, (vscale * -C)) fold to
not occur if C is a power of 2 and the target has poor mul support.

Alive2: https://alive2.llvm.org/ce/z/Khvs5H
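
As a quick sanity check alongside the Alive2 proof, the identity can be
brute-forced over 8-bit values. The sketch below is illustrative only and is
not taken from the patch; the helper names are hypothetical. The intuition is
that adding or subtracting a single set bit Y flips exactly that bit of X, so
masking the result with Y gives ~X & Y.

```cpp
#include <cassert>
#include <cstdint>

// True when Y is zero or has exactly one bit set.
constexpr bool isPow2OrZero(unsigned Y) { return (Y & (Y - 1)) == 0; }

// Brute-force check (hypothetical helper, not from the patch):
// for every 8-bit X and every 8-bit Y that is a power of 2 or zero,
// (X + Y) & Y and (X - Y) & Y must both equal ~X & Y.
bool foldHoldsForAllI8() {
  for (unsigned Y = 0; Y < 256; ++Y) {
    if (!isPow2OrZero(Y))
      continue;
    for (unsigned X = 0; X < 256; ++X) {
      uint8_t Add = static_cast<uint8_t>((X + Y) & Y);
      uint8_t Sub = static_cast<uint8_t>((X - Y) & Y);
      uint8_t Folded = static_cast<uint8_t>(~X & Y);
      if (Add != Folded || Sub != Folded)
        return false;
    }
  }
  return true;
}
```

The power-of-2 restriction matters: with Y = 3 and X = 1, (1 + 3) & 3 is 0
while ~1 & 3 is 2, so the fold would be wrong.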
Delta  File
+19 -21  llvm/test/CodeGen/RISCV/rvv/known-never-zero.ll
+11 -11  llvm/test/CodeGen/RISCV/idiv_large.ll
+8 -12  llvm/test/CodeGen/X86/known-pow2.ll
+11 -1  llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+3 -4  llvm/test/CodeGen/NVPTX/i1-ext-load.ll
+52 -49  5 files

LLVM/project 9a4a38f llvm/test/CodeGen/AMDGPU llvm.amdgcn.image.sample.g16.a16.dim.ll, llvm/test/CodeGen/AMDGPU/GlobalISel llvm.amdgcn.image.atomic.dim.a16.ll llvm.amdgcn.image.load.3d.a16.ll

AMDGPU/GlobalISel: Regbanklegalize rules for INTRIN_IMAGE

Regbanklegalize rules for INTRIN_IMAGE loads and stores.
Because of the very large number of different type signatures, the rule
specifies only the lowering function (waterfall lowering of the RsrcIdx
operand if needed), and this function also applies register banks.
Delta  File
+268 -52  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.atomic.dim.a16.ll
+128 -112  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.load.3d.a16.ll
+114 -50  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.gather4.a16.dim.ll
+78 -84  llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.image.sample.1d.ll
+58 -70  llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.image.load.1d.ll
+86 -36  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.g16.a16.dim.ll
+732 -404  29 files not shown
+1,082 -532  35 files

LLVM/project 17b4a72 llvm/test/tools/llvm-reduce unconditional-br-phi.ll unconditional-br.ll, llvm/tools/llvm-reduce DeltaPasses.def

[llvm-reduce] Add a pass to replace unconditional branches with returns (#180993)

Unconditional branches could end up in infinite loops in the reduced
code, while the code could have been reduced further.

This patch implements a simple pass that replaces unconditional branches
with returns.
Delta  File
+89 -0  llvm/test/tools/llvm-reduce/unconditional-br-phi.ll
+40 -0  llvm/test/tools/llvm-reduce/unconditional-br.ll
+37 -0  llvm/tools/llvm-reduce/deltas/ReduceUsingSimplifyCFG.cpp
+2 -2  llvm/test/tools/llvm-reduce/reduce-invoke.ll
+2 -0  llvm/tools/llvm-reduce/DeltaPasses.def
+1 -0  llvm/tools/llvm-reduce/deltas/ReduceUsingSimplifyCFG.h
+171 -2  6 files

LLVM/project ead7563 llvm/lib/Target/AArch64 AArch64InstrInfo.td AArch64ISelLowering.cpp, llvm/test/CodeGen/AArch64 store-float-conversion.ll tbl-loops.ll

[AArch64] Improve post-inc stores of SIMD/FP values

Add patterns to match post-increment truncating stores from lane 0 of
wide integer vectors (v4i32/v2i64) to narrower types (i8/i16/i32).
This avoids transferring the value through a GPR when storing.

Also remove the pre-legalization early-exit in combineStoreValueFPToInt
as it prevented the optimization from applying in some cases.
Delta  File
+260 -0  llvm/test/CodeGen/AArch64/store-float-conversion.ll
+7 -0  llvm/lib/Target/AArch64/AArch64InstrInfo.td
+0 -3  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+1 -2  llvm/test/CodeGen/AArch64/tbl-loops.ll
+268 -5  4 files

LLVM/project a210b35 llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeHelper.cpp, llvm/test/CodeGen/AMDGPU fptoi.i128.ll global-saddr-load.ll

AMDGPU/GlobalISel: Regbanklegalize rules for G_PHI

Move G_PHI handling to AMDGPURegBankLegalizeRules.cpp.
Support all legal types.
Delta  File
+183 -157  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.memcpy.ll
+130 -114  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.set.inactive.ll
+70 -65  llvm/test/CodeGen/AMDGPU/fptoi.i128.ll
+45 -48  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.memset.ll
+38 -50  llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+37 -43  llvm/test/CodeGen/AMDGPU/global-saddr-load.ll
+503 -477  9 files not shown
+580 -530  15 files

LLVM/project 48002eb lldb/packages/Python/lldbsuite/test/tools/lldb-dap dap_server.py lldbdap_testcase.py, lldb/tools/lldb-dap JSONUtils.h JSONUtils.cpp

[lldb-dap] Remove dead code. (#181947)

It seems we have dead code left over from the raw JSON days.
Delta  File
+0 -129  lldb/unittests/DAP/JSONUtilsTest.cpp
+0 -129  lldb/tools/lldb-dap/JSONUtils.h
+2 -105  lldb/tools/lldb-dap/JSONUtils.cpp
+0 -42  lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py
+0 -35  lldb/packages/Python/lldbsuite/test/tools/lldb-dap/lldbdap_testcase.py
+0 -25  lldb/tools/lldb-dap/LLDBUtils.cpp
+2 -465  3 files not shown
+2 -508  9 files

LLVM/project 4506982 llvm/lib/CodeGen/GlobalISel GISelValueTracking.cpp, llvm/test/CodeGen/AArch64 rem.ll arm64-neon-mul-div.ll

[GlobalISel] Add G_UDIV/G_SDIV computeKnownBits (#181307)

Code ported from `SelectionDAG::computeKnownBits`.

Related: #150515
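
To illustrate the kind of fact a known-bits analysis can derive for an
unsigned divide: if the dividend is at most MaxLHS and the divisor is at least
MinRHS (and non-zero), the quotient is at most MaxLHS / MinRHS, so it has at
least as many leading zeros as that bound. The sketch below is a simplified
illustration with hypothetical helper names; it is not the code ported in this
patch, which is not shown here.

```cpp
#include <cassert>
#include <cstdint>

// Number of leading zero bits in a 32-bit value (32 for V == 0).
unsigned countLeadingZeros32(uint32_t V) {
  unsigned N = 0;
  for (uint32_t Bit = 0x80000000u; Bit != 0 && (V & Bit) == 0; Bit >>= 1)
    ++N;
  return N;
}

// Hypothetical helper: lower bound on the leading zeros of LHS / RHS,
// given only LHS <= MaxLHS and RHS >= MinRHS. MinRHS must be non-zero.
unsigned minLeadingZerosOfUDiv(uint32_t MaxLHS, uint32_t MinRHS) {
  assert(MinRHS != 0 && "divisor bound must exclude zero");
  uint32_t MaxQuot = MaxLHS / MinRHS; // largest quotient still possible
  return countLeadingZeros32(MaxQuot);
}
```

For example, a dividend known to fit in 8 bits divided by a divisor of at
least 16 gives a quotient of at most 15, so the top 28 bits of the 32-bit
result are known to be zero.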
Delta  File
+362 -364  llvm/test/CodeGen/AArch64/rem.ll
+154 -150  llvm/test/CodeGen/AArch64/arm64-neon-mul-div.ll
+67 -0  llvm/test/CodeGen/AArch64/GlobalISel/knownbits-sdiv.mir
+55 -0  llvm/test/CodeGen/AArch64/GlobalISel/knownbits-udiv.mir
+18 -0  llvm/lib/CodeGen/GlobalISel/GISelValueTracking.cpp
+2 -2  llvm/test/CodeGen/AArch64/funnel-shift.ll
+658 -516  6 files

LLVM/project b26ee7b llvm/test/Transforms/LoopInterchange phi-ordering.ll

[LoopInterchange] Fix test phi-ordering.ll (NFC)
Delta  File
+37 -32  llvm/test/Transforms/LoopInterchange/phi-ordering.ll
+37 -32  1 file

LLVM/project 2f708a9 llvm/lib/Transforms/Scalar LoopInterchange.cpp, llvm/test/Transforms/LoopInterchange profitability-instorder.ll interchangeable-outerloop-multiple-indvars.ll

[LoopInterchange] Fix instorder profitability check
Delta  File
+50 -41  llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+40 -30  llvm/test/Transforms/LoopInterchange/profitability-instorder.ll
+1 -1  llvm/test/Transforms/LoopInterchange/interchangeable-outerloop-multiple-indvars.ll
+91 -72  3 files

LLVM/project 9d7ca79 llvm/test/Transforms/LoopInterchange profitability-instorder.ll

[LoopInterchange] Add a test for simple profitable case (NFC)
Delta  File
+180 -0  llvm/test/Transforms/LoopInterchange/profitability-instorder.ll
+180 -0  1 file

LLVM/project 10ccf11 llvm/test/TableGen regunit-intervals.td, llvm/utils/TableGen RegisterInfoEmitter.cpp

[Tablegen] Patch RegUnitIntervals Initialization (#181173)

A few places were missing the code generation needed to initialize it
properly when enabled, and the sentinel value was also missing.
Delta  File
+13 -2  llvm/utils/TableGen/RegisterInfoEmitter.cpp
+3 -0  llvm/test/TableGen/regunit-intervals.td
+16 -2  2 files

LLVM/project a13e04a llvm/test/Transforms/LoopInterchange pr57148.ll lcssa-preheader.ll

[LoopInterchange] Update UTC version (NFC) (#181988)

This is a follow-up PR to #181804. While working on the stacked PRs, I
encountered some noisy diffs in the CHECK lines that don't change the
meaning of the tests. To avoid such changes and make the review easier,
this patch updates the UTC version. It also renames some BBs to suppress
warnings emitted by UTC.
Delta  File
+99 -99  llvm/test/Transforms/LoopInterchange/pr57148.ll
+89 -85  llvm/test/Transforms/LoopInterchange/lcssa-preheader.ll
+82 -82  llvm/test/Transforms/LoopInterchange/interchangeable-outerloop-multiple-indvars.ll
+82 -82  llvm/test/Transforms/LoopInterchange/interchangeable-innerloop-multiple-indvars.ll
+69 -67  llvm/test/Transforms/LoopInterchange/update-condbranch-duplicate-successors.ll
+65 -63  llvm/test/Transforms/LoopInterchange/interchangeable.ll
+486 -478  4 files not shown
+639 -626  10 files

LLVM/project 6012aa1 llvm/lib/Target/AMDGPU AMDGPUInstructionSelector.cpp

[AMDGPU] Fix opcode comparison logic for G_INTRINSIC (#156008)

The check `(Opc < TargetOpcode::GENERIC_OP_END)` incorrectly
includes `G_INTRINSIC` (129), which is less than
`GENERIC_OP_END` (313), leading to logically dead code.

This patch reorders the conditionals to first check for `G_INTRINSIC`,
ensuring correct handling of the `amdgcn_fdot2` intrinsic.
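
The ordering problem can be shown with a small stand-alone sketch. The
constants reuse the numeric values quoted above, but the two functions and
their string results are hypothetical and only model the control flow; they
are not the code in AMDGPUInstructionSelector.cpp.

```cpp
#include <cassert>
#include <string>

// Stand-in values taken from the commit message above.
constexpr unsigned G_INTRINSIC = 129;
constexpr unsigned GENERIC_OP_END = 313;

// Before: the range check runs first, so the G_INTRINSIC branch can never
// be reached (129 < 313).
std::string classifyBefore(unsigned Opc) {
  if (Opc < GENERIC_OP_END)
    return "generic";
  if (Opc == G_INTRINSIC)
    return "intrinsic"; // logically dead
  return "target";
}

// After: the specific opcode is tested before the range check.
std::string classifyAfter(unsigned Opc) {
  if (Opc == G_INTRINSIC)
    return "intrinsic";
  if (Opc < GENERIC_OP_END)
    return "generic";
  return "target";
}
```

With the original order, `classifyBefore(G_INTRINSIC)` yields "generic";
after the reordering, `classifyAfter(G_INTRINSIC)` yields "intrinsic", while
other opcodes are classified the same way by both.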
Delta  File
+4 -4  llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
+4 -4  1 file

LLVM/project e6cff75 llvm/lib/Target/X86 X86ISelLowering.cpp

[X86] combineMOVMSK - pull out repeated SDLoc (#181986)

Delta  File
+3 -6  llvm/lib/Target/X86/X86ISelLowering.cpp
+3 -6  1 file

LLVM/project 176928c clang/cmake/caches VectorEngine.cmake, openmp CMakeLists.txt

[OpenMP] Remove standalone build mode (#149878)

Remove all the CMake code for openmp standalone builds. Standalone
builds have been superseded by the runtimes default build (also
sometimes called the standalone runtimes build). The runtimes default
build can be thought of as a standalone build with the standalone
boilerplate contained in <llvm-project>/runtimes/CMakeLists.txt. There
is no need for each runtime to contain the same boilerplate code again.

Builds still using the standalone build via
```sh
cmake -S <llvm-project>/openmp ...
```
can switch over to the runtimes default build using
```sh
cmake -S <llvm-project>/runtimes -DLLVM_ENABLE_RUNTIMES=openmp ...
```
Options that were valid for the standalone build are also valid for
the default runtimes build, unless handled only in

    [8 lines not shown]
Delta  File
+39 -134  openmp/cmake/OpenMPTesting.cmake
+46 -90  openmp/runtime/CMakeLists.txt
+43 -76  openmp/CMakeLists.txt
+1 -20  openmp/runtime/unittests/CMakeLists.txt
+3 -5  openmp/runtime/src/CMakeLists.txt
+0 -8  clang/cmake/caches/VectorEngine.cmake
+132 -333  4 files not shown
+136 -336  10 files

LLVM/project 3d3ad01 clang/lib/Driver/ToolChains Clang.cpp, clang/lib/Driver/ToolChains/Arch AArch64.cpp AArch64.h

[Clang][AArch64] set default mtune for macOS (#179136)

This patch sets a default tune-cpu on macOS targets to `apple-m5`.

The implementation adds a helper in
`clang/lib/Driver/ToolChains/Arch/AArch64.h` called by
`clang/lib/Driver/ToolChains/Clang.cpp`. It doesn't follow a "check then
get" flow because it's very concise, and returns an optional instead. It
adds a missing test file for mtune on Apple macOS targets, including the
new logic.
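
The "returns an optional" shape can be sketched as follows. Everything here
is hypothetical and for illustration only: the function names, the boolean
parameter, the rule that an explicit value wins, and the "generic" fallback
are assumptions, not the helper added by the patch. Only the `apple-m5`
default comes from the description above.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical helper: returns a default tune CPU for macOS targets,
// or std::nullopt when no default applies.
std::optional<std::string> getDefaultTuneCPU(bool IsMacOS) {
  if (IsMacOS)
    return std::string("apple-m5");
  return std::nullopt;
}

// Hypothetical caller: an explicit value wins (assumption); otherwise use
// the target default if there is one, else an assumed "generic" fallback.
std::string pickTuneCPU(const std::string &ExplicitMTune, bool IsMacOS) {
  if (!ExplicitMTune.empty())
    return ExplicitMTune;
  if (std::optional<std::string> CPU = getDefaultTuneCPU(IsMacOS))
    return *CPU;
  return "generic";
}
```

The caller needs a single `if` with an initializer instead of a separate
"has default?" query followed by a "get default" call.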
Delta  File
+43 -3  clang/lib/Driver/ToolChains/Arch/AArch64.cpp
+28 -0  clang/test/Driver/aarch64-mtune-apple-macos.c
+3 -5  clang/lib/Driver/ToolChains/Clang.cpp
+5 -0  clang/lib/Driver/ToolChains/Arch/AArch64.h
+79 -8  4 files

LLVM/project c0c2ac0 clang/lib/Headers amdhsa_abi.h CMakeLists.txt, clang/test/Headers amdhsa_abi.cl

clang: Add builtin header for amdhsa abi

This is a place to put definitions for various ABI structs.
Currently device libs is just hardcoding magic numbers and casting
and it's incomprehensible.
Delta  File
+166 -0  clang/test/Headers/amdhsa_abi.cl
+80 -0  clang/lib/Headers/amdhsa_abi.h
+1 -0  clang/lib/Headers/CMakeLists.txt
+247 -0  3 files

LLVM/project 9057af9 llvm/test/CodeGen/X86 known-pow2.ll

[X86] Add additional test coverage for #147216 (#181980)

Delta  File
+57 -0  llvm/test/CodeGen/X86/known-pow2.ll
+57 -0  1 file

LLVM/project 5a4d15e llvm/lib/Transforms/Scalar LoopInterchange.cpp, llvm/test/Transforms/LoopInterchange profitability-instorder.ll interchangeable-outerloop-multiple-indvars.ll

[LoopInterchange] Fix instorder profitability check
Delta  File
+50 -41  llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+40 -30  llvm/test/Transforms/LoopInterchange/profitability-instorder.ll
+1 -1  llvm/test/Transforms/LoopInterchange/interchangeable-outerloop-multiple-indvars.ll
+91 -72  3 files

LLVM/project 3478cdc llvm/test/Transforms/LoopInterchange profitability-instorder.ll

[LoopInterchange] Add a test for simple profitable case (NFC)
Delta  File
+180 -0  llvm/test/Transforms/LoopInterchange/profitability-instorder.ll
+180 -0  1 file

LLVM/project 97c34e3 llvm/test/Transforms/LoopInterchange phi-ordering.ll

[LoopInterchange] Fix test phi-ordering.ll (NFC)
Delta  File
+37 -32  llvm/test/Transforms/LoopInterchange/phi-ordering.ll
+37 -32  1 file

LLVM/project 776ae18 llvm/test/Transforms/LoopInterchange pr57148.ll lcssa-preheader.ll

[LoopInterchange] Update UTC version (NFC)
Delta  File
+99 -99  llvm/test/Transforms/LoopInterchange/pr57148.ll
+89 -85  llvm/test/Transforms/LoopInterchange/lcssa-preheader.ll
+82 -82  llvm/test/Transforms/LoopInterchange/interchangeable-innerloop-multiple-indvars.ll
+82 -82  llvm/test/Transforms/LoopInterchange/interchangeable-outerloop-multiple-indvars.ll
+69 -67  llvm/test/Transforms/LoopInterchange/update-condbranch-duplicate-successors.ll
+65 -63  llvm/test/Transforms/LoopInterchange/interchangeable.ll
+486 -478  4 files not shown
+639 -626  10 files

LLVM/project 07931d4 llvm/lib/Target/AArch64 AArch64ISelLowering.cpp

Fixups
Delta  File
+1 -1  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+1 -1  1 file

LLVM/project 051d125 llvm/lib/Target/X86 X86ISelLowering.cpp

Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFC. (#181979)

Delta  File
+1 -1  llvm/lib/Target/X86/X86ISelLowering.cpp
+1 -1  1 file

LLVM/project 27ddcfa llvm/test/CodeGen/AArch64 v2i64-min-max.ll

Update checks
Delta  File
+73 -72  llvm/test/CodeGen/AArch64/v2i64-min-max.ll
+73 -72  1 file

LLVM/project 1aa42be llvm/lib/Target/AArch64 AArch64ISelLowering.cpp, llvm/test/CodeGen/AArch64 v2i64-min-max.ll

[AArch64] Fold MIN/MAX(Vec[0], Vec[1]) to VECREDUCE_MIN/MAX(Vec)

If we have a lowering for `VECREDUCE_MIN/MAX` this is generally more
efficient than the scalar expansion.
Delta  File
+99 -0  llvm/test/CodeGen/AArch64/v2i64-min-max.ll
+48 -10  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+147 -10  2 files

LLVM/project 9afeb19 llvm/lib/Target/AArch64 AArch64ISelLowering.cpp, llvm/test/CodeGen/AArch64 v2i64-min-max.ll

Fixups
Delta  File
+39 -25  llvm/test/CodeGen/AArch64/v2i64-min-max.ll
+1 -2  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+40 -27  2 files

LLVM/project 3d369cc llvm/lib/Target/AArch64 AArch64ISelLowering.cpp AArch64ISelLowering.h, llvm/test/CodeGen/AArch64 aarch64-minmaxv.ll sve-fixed-length-int-reduce.ll

[AArch64] Prefer SVE2 for fixed-length i64 [S|U][MIN|MAX] reductions (#181161)

With SVE2/SME we can lower the v2i64 min/max reductions to an SVE2
pairwise instruction. The throughput is about the same, but the SVE code
is smaller than the NEON expansion.
Delta  File
+254 -210  llvm/test/CodeGen/AArch64/aarch64-minmaxv.ll
+128 -70  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+65 -31  llvm/test/CodeGen/AArch64/sve-fixed-length-int-reduce.ll
+13 -13  llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-reduce.ll
+8 -2  llvm/lib/Target/AArch64/AArch64ISelLowering.h
+468 -326  5 files

LLVM/project 03393aa clang/lib/Parse ParseTentative.cpp, clang/test/Interpreter disambiguate-decl-stmt.cpp

REAPPLY [clang-repl] Ensure clang-repl accepts all C keywords supported in all language models (#181335)

https://github.com/llvm/llvm-project/pull/142749 was reverted because
`_Float16` is only supported on the targets listed at
https://clang.llvm.org/docs/LanguageExtensions.html#half-precision-floating-point
and the previous PR did not guard the test to expect a failure on some
targets.

Hence the CI failed with errors like:
```
/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/build/bin/clang -cc1 -internal-isystem /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/build/lib/clang/21/include -nostdsysteminc -fsyntax-only -verify -fincremental-extensions -std=c++20 /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang/test/Interpreter/disambiguate-decl-stmt.cpp # RUN: at line 1

/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/build/bin/clang -cc1 -internal-isystem /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/build/lib/clang/21/include -nostdsysteminc -fsyntax-only -verify -fincremental-extensions -std=c++20 /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang/test/Interpreter/disambiguate-decl-stmt.cpp
error: 'expected-error' diagnostics seen but not expected:
File /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang/test/Interpreter/disambiguate-decl-stmt.cpp Line 113: _Float16 is not supported on this target
1 error generated.
```

This should now be fixed, as we now expect either an error or no error,
depending on the target, through the `expected-error 0-1` directive.
Delta  File
+13 -0  clang/test/Interpreter/disambiguate-decl-stmt.cpp
+2 -0  clang/lib/Parse/ParseTentative.cpp
+15 -0  2 files

LLVM/project 3d8fffe libclc/clc/lib/r600/math clc_sw_fma.cl, libclc/opencl/lib/r600/image get_image_attributes_impl.ll write_image_impl.ll

libclc: Remove r600 support (#181976)

Delta  File
+0 -174  libclc/clc/lib/r600/math/clc_sw_fma.cl
+0 -95  libclc/opencl/lib/r600/image/get_image_attributes_impl.ll
+0 -60  libclc/opencl/lib/r600/image/write_image_impl.ll
+0 -54  libclc/opencl/lib/r600/image/read_image_impl.ll
+0 -31  libclc/opencl/lib/r600/image/read_imagei.cl
+0 -31  libclc/opencl/lib/r600/image/read_imageui.cl
+0 -445  29 files not shown
+5 -917  35 files