LLVM/project 617f3c2 mlir/python CMakeLists.txt replace_text.cmake, mlir/python/mlir/_mlir_libs _mlirExecutionEngine.pyi

[mlir][Python] generate type stubs for dialect extensions
Delta     File
+0 -142   mlir/python/mlir/_mlir_libs/_mlir/dialects/quant.pyi
+61 -33   mlir/python/CMakeLists.txt
+0 -63    mlir/python/mlir/_mlir_libs/_mlir/dialects/pdl.pyi
+0 -25    mlir/python/mlir/_mlir_libs/_mlir/dialects/transform/__init__.pyi
+0 -24    mlir/python/mlir/_mlir_libs/_mlirExecutionEngine.pyi
+9 -0     mlir/python/replace_text.cmake
+70 -287  6 files

LLVM/project cd2caf6 llvm/lib/Transforms/Vectorize VPlan.h VPlanTransforms.cpp, llvm/test/Transforms/LoopVectorize pr43166-fold-tail-by-masking.ll

[LV] Simplify extract-lane with scalar operand to the scalar value itself. (#174534)

This patch simplifies extract-lane(%lane_num, %X) to %X when %X is a
scalar value. Extracting from a scalar is redundant since there is only
one value to extract.
Delta    File
+2 -12   llvm/test/Transforms/LoopVectorize/RISCV/uniform-load-store.ll
+5 -7    llvm/lib/Transforms/Vectorize/VPlan.h
+3 -8    llvm/test/Transforms/LoopVectorize/pr43166-fold-tail-by-masking.ll
+8 -0    llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+1 -6    llvm/test/Transforms/LoopVectorize/RISCV/scalable-tailfold.ll
+2 -1    llvm/lib/Transforms/Vectorize/VPlanPatternMatch.h
+21 -34  2 files not shown
+22 -37  8 files
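
The fold described in this commit can be illustrated with a toy model. The sketch below is not VPlan code: `Value`, `ExtractLane`, and `simplify` are hypothetical names invented for illustration, and the actual patch works on VPlan recipes. It only shows the idea that extracting a lane from a value that is already scalar yields that value unchanged.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Value:
    """A toy IR value; is_scalar marks values that hold a single element."""
    name: str
    is_scalar: bool


@dataclass(frozen=True)
class ExtractLane:
    """Toy stand-in for extract-lane(%lane_num, %X)."""
    lane: Value
    operand: Value


Node = Union[Value, ExtractLane]


def simplify(node: Node) -> Node:
    """Fold extract-lane(lane, X) to X when X is scalar; otherwise keep the node."""
    if isinstance(node, ExtractLane) and node.operand.is_scalar:
        return node.operand
    return node


lane = Value("%lane_num", is_scalar=True)
scalar_x = Value("%x", is_scalar=True)
vector_y = Value("%y", is_scalar=False)

folded = simplify(ExtractLane(lane, scalar_x))  # becomes %x itself
kept = simplify(ExtractLane(lane, vector_y))    # left as an extract-lane node
```

The vector case is left untouched because the lane index still selects one of several elements there.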

LLVM/project ee8a4bc libclc/clc/lib/amdgcn/math clc_ldexp_override.cl, libclc/clc/lib/amdgpu/math clc_sqrt.cl clc_sqrt_fp64.cl

[libclc] Remove llvm-link --override flag and make implementation self-contained (#175134)

Revert --override flag added in 28d9255aa7c0 and avoid defining the same
symbol across multiple files of a target, simplifying the build and
easing the transition to CMake add_library for libclc.

amdgcn ldexp now uses __builtin_elementwise_ldexp.

No functional changes to clc_sqrt or clc_rsqrt.
Delta      File
+67 -0     libclc/clc/lib/amdgpu/math/clc_sqrt.cl
+0 -50     libclc/clc/lib/amdgpu/math/clc_sqrt_fp64.cl
+0 -35     libclc/clc/lib/r600/math/clc_rsqrt_override.cl
+0 -33     libclc/clc/lib/amdgcn/math/clc_ldexp_override.cl
+27 -0     libclc/clc/lib/r600/math/clc_rsqrt.cl
+2 -18     libclc/cmake/modules/AddLibclc.cmake
+96 -136   4 files not shown
+114 -139  10 files

LLVM/project 81d5b36 libc/shared/math log.h, libc/src/__support/math CMakeLists.txt log.h

[libc][math] Fix GPU build fails (#175474)

Delta   File
+3 -3   libc/src/__support/math/CMakeLists.txt
+2 -3   libc/src/__support/math/log.h
+1 -2   libc/src/math/generic/log10.cpp
+1 -2   libc/src/math/generic/log2.cpp
+1 -1   libc/shared/math/log.h
+8 -11  5 files

LLVM/project f6c743a mlir/python CMakeLists.txt, mlir/python/mlir/_mlir_libs _mlirExecutionEngine.pyi

[mlir][Python] generate type stubs for dialect extensions
Delta     File
+0 -142   mlir/python/mlir/_mlir_libs/_mlir/dialects/quant.pyi
+49 -33   mlir/python/CMakeLists.txt
+0 -63    mlir/python/mlir/_mlir_libs/_mlir/dialects/pdl.pyi
+0 -25    mlir/python/mlir/_mlir_libs/_mlir/dialects/transform/__init__.pyi
+0 -24    mlir/python/mlir/_mlir_libs/_mlirExecutionEngine.pyi
+49 -287  5 files

LLVM/project a70e6bd libc/shared/math log.h

add new line
Delta  File
+1 -1  libc/shared/math/log.h
+1 -1  1 file

LLVM/project dcf8ae8 llvm/include/llvm/ExecutionEngine/Orc BacktraceTools.h, llvm/lib/ExecutionEngine/Orc BacktraceTools.cpp CMakeLists.txt

Reapply "[ORC] Add utilities for limited symbolication of JIT backtraces" (#175469)

This reapplies 906b48616c03948a4df62a5a144f7108f3c455e8, which was
reverted in c11df52f9b847170b766fb71defd2a9222d95a8d due to bot
failures.

The testcase has been dropped from this recommit as it failed on several
bots (possibly due to differing backtrace formats or failure modes). I'll
re-introduce the testcase in a follow-up commit so that it can be
iterated on (and re-reverted if necessary) without affecting the options
introduced by this commit. (Since these options are best-effort
debugging tools it's ok if they live in-tree without a test for now).
Delta    File
+150 -0  llvm/lib/ExecutionEngine/Orc/BacktraceTools.cpp
+99 -0   llvm/include/llvm/ExecutionEngine/Orc/BacktraceTools.h
+57 -0   llvm/tools/llvm-jitlink/llvm-jitlink.cpp
+1 -0    llvm/lib/ExecutionEngine/Orc/CMakeLists.txt
+307 -0  4 files

LLVM/project 0372504 libc/src/__support/math CMakeLists.txt log.h, libc/src/math/generic log10.cpp log2.cpp

[libc][math] Fix GPU build fails
Delta   File
+3 -3   libc/src/__support/math/CMakeLists.txt
+2 -3   libc/src/__support/math/log.h
+1 -2   libc/src/math/generic/log10.cpp
+1 -2   libc/src/math/generic/log2.cpp
+7 -10  4 files

LLVM/project 87e6dfc mlir/python CMakeLists.txt

[mlir][Python] generate type stubs for dialect extensions
Delta   File
+21 -3  mlir/python/CMakeLists.txt
+21 -3  1 file

LLVM/project 282f8f7 llvm/lib/Target/RISCV RISCVMergeBaseOffset.cpp, llvm/test/CodeGen/RISCV hoist-global-addr-base.ll fold-addi-loadstore.ll

[RISCV] Add support for QC.E.LI in RISCVMergeBaseOffset (#175310)

When we have `Xqcili` enabled and it is the `small code model`, we use
the `QC.E.LI` instruction to materialize addresses. Add support for
`QC.E.LI` in the `RISCVMergeBaseOffset` pass to merge the offset of the
address calculation into the offset field in a global address lowering
sequence.
Delta      File
+469 -140  llvm/test/CodeGen/RISCV/hoist-global-addr-base.ll
+53 -38    llvm/lib/Target/RISCV/RISCVMergeBaseOffset.cpp
+6 -8      llvm/test/CodeGen/RISCV/fold-addi-loadstore.ll
+528 -186  3 files

LLVM/project f091be6 llvm/lib/ExecutionEngine/Orc/Shared CMakeLists.txt, llvm/lib/ExecutionEngine/Orc/TargetProcess CMakeLists.txt

[ORC] Fixed incorrect additional header dirs (#175193)

The CMake ADDITIONAL_HEADER_DIRS directive for two Orc libraries,
Shared and TargetProcess, used incorrect values that pointed to the
parent library's include directory instead of each library's own. This
is now fixed.
Delta  File
+2 -1  llvm/lib/ExecutionEngine/Orc/Shared/CMakeLists.txt
+1 -1  llvm/lib/ExecutionEngine/Orc/TargetProcess/CMakeLists.txt
+3 -2  2 files

LLVM/project bc51c9d libc/src/__support/math log.h log_range_reduction.h, libc/src/math/generic log.cpp log_range_reduction.h

[libc][math] Refactor log to header-only shared math (#175395)

Refactors log to a header-only shared math implementation.

Fixes #175369
Delta        File
+861 -0      libc/src/__support/math/log.h
+2 -836      libc/src/math/generic/log.cpp
+98 -0       libc/src/__support/math/log_range_reduction.h
+0 -94       libc/src/math/generic/log_range_reduction.h
+22 -14      utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+27 -0       libc/src/__support/math/CMakeLists.txt
+1,010 -944  7 files not shown
+1,044 -968  13 files

LLVM/project c726fff llvm/lib/Transforms/InstCombine InstCombineAndOrXor.cpp, llvm/test/Transforms/InstCombine and.ll binop-cast.ll

[InstCombine][profcheck] Add unknown branch weights to selects created in InstCombineAndOrXor.cpp (#175269)

These select instructions were created from combinations of bitwise
operators which have no branch weight information.

Tracking issue: #147390
Delta     File
+30 -19   llvm/test/Transforms/InstCombine/and.ll
+27 -20   llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
+19 -11   llvm/test/Transforms/InstCombine/binop-cast.ll
+11 -3    llvm/test/Transforms/InstCombine/sub-ashr-or-to-icmp-select.ll
+9 -3     llvm/test/Transforms/InstCombine/conditional-negation.ll
+9 -3     llvm/test/Transforms/InstCombine/xor-ashr.ll
+105 -59  1 file not shown
+105 -70  7 files

LLVM/project 2f7e218 llvm/lib/Transforms/Vectorize VPlanUtils.cpp, llvm/test/Transforms/LoopVectorize/AArch64 induction-costs.ll

[VPlan] Add missing sext(sub) SCEV fold to getSCEVExprForVPValue.

SCEV has a manual fold when doing SCEV construction from IR, that is not
integrated in the regular SCEV construction functions. Mirror the
behavior in getSCEVExprForVPValue, to match results when constructing
SCEVs from IR.

Fixes https://github.com/llvm/llvm-project/issues/174622.
Delta    File
+255 -0  llvm/test/Transforms/LoopVectorize/AArch64/induction-costs.ll
+69 -0   llvm/lib/Transforms/Vectorize/VPlanUtils.cpp
+324 -0  2 files

LLVM/project fbad7d8 llvm/lib/Analysis ValueTracking.cpp, llvm/test/Transforms/Attributor nofpclass.ll

ValueTracking: Fix handling of fadd with mixed denormal modes

Fix case where the input mode is IEEE, the output flushes, and the
input could be subnormal. Also improves accuracy with positive zero
case.
Delta   File
+45 -1  llvm/test/Transforms/Attributor/nofpclass.ll
+4 -1   llvm/lib/Analysis/ValueTracking.cpp
+49 -2  2 files

LLVM/project c6db8f4 llvm/lib/Target/AArch64 AArch64InstrFormats.td

[AArch64] Remove dead tuimm5sN tablegen Operands. NFC (#174735)

I believe these were last used in https://reviews.llvm.org/D71773.
Delta   File
+0 -24  llvm/lib/Target/AArch64/AArch64InstrFormats.td
+0 -24  1 file

LLVM/project eba79bc llvm/lib/Target/X86 X86FixupBWInsts.cpp X86.h, llvm/test/CodeGen/X86 fixup-bw-inst.mir

[X86][NewPM] Port x86-fixup-bw-insts to NPM (#175399)

Similar to other pass portings. Refactor into an implementation class,
rename the old pass, and add a wrapper around the implementation for the
new pass manager. Handle PSI/MBFI similar to other backend passes.
Delta    File
+84 -45  llvm/lib/Target/X86/X86FixupBWInsts.cpp
+8 -2    llvm/lib/Target/X86/X86.h
+2 -2    llvm/lib/Target/X86/X86TargetMachine.cpp
+1 -1    llvm/lib/Target/X86/X86PassRegistry.def
+1 -0    llvm/test/DebugInfo/MIR/InstrRef/x86-fixup-bw-inst-subreb.mir
+1 -0    llvm/test/CodeGen/X86/fixup-bw-inst.mir
+97 -50  1 file not shown
+98 -50  7 files

LLVM/project 3ad0281 llvm/lib/Support KnownFPClass.cpp, llvm/test/Transforms/Attributor nofpclass-sqrt.ll

ValueTracking: sqrt never returns subnormal (#174846)

Delta    File
+52 -52  llvm/test/Transforms/Attributor/nofpclass-sqrt.ll
+14 -14  llvm/unittests/Analysis/ValueTrackingTest.cpp
+2 -5    llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-sqrt.ll
+1 -0    llvm/lib/Support/KnownFPClass.cpp
+69 -71  4 files
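
The claim in this commit's title can be checked numerically for IEEE binary64. The sketch below is an illustration using Python floats, not LLVM code, and the helper `is_subnormal` is a name introduced here. The reasoning it exercises: the smallest positive double is 2^-1074, whose square root is 2^-537, far above the smallest normal value 2^-1022, so a square root of a nonzero input cannot land in the subnormal range.

```python
import math
import sys

MIN_NORMAL = sys.float_info.min          # smallest positive normal double, 2**-1022
MIN_SUBNORMAL = math.ldexp(1.0, -1074)   # smallest positive subnormal double


def is_subnormal(x: float) -> bool:
    """True for nonzero finite doubles below the normal range."""
    return x != 0.0 and abs(x) < MIN_NORMAL


# Sample subnormal inputs from the smallest up to the largest.
subnormal_inputs = [math.ldexp(1.0, e) for e in range(-1074, -1022)]
subnormal_inputs.append(MIN_NORMAL - MIN_SUBNORMAL)  # largest subnormal

# Take square roots of the subnormal samples plus two normal inputs.
roots = [math.sqrt(x) for x in subnormal_inputs + [MIN_NORMAL, 1.0]]
any_subnormal_root = any(is_subnormal(r) for r in roots)
```

Under these assumptions `any_subnormal_root` stays false: every root is either normal or, for a zero input, zero.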

LLVM/project f987bbd llvm/test/Transforms/InstCombine simplify-demanded-fpclass-fptrunc.ll simplify-demanded-fpclass-fptrunc-round.ll

InstCombine: Add fptrunc SimplifyDemandedFPClass baseline tests (#175420)

Also llvm.fptrunc.round, which should be the same.
Delta      File
+578 -0    llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-fptrunc.ll
+578 -0    llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-fptrunc-round.ll
+1,156 -0  2 files

LLVM/project 5a2677e llvm/test/MC/AMDGPU gfx8_asm_vop3.s gfx7_asm_vop3.s, llvm/test/MC/Disassembler/AMDGPU gfx9_vop3.txt

Rebase

Created using spr 1.3.7
Delta                  File
+42,349 -42,348        llvm/test/MC/AMDGPU/gfx8_asm_vop3.s
+41,419 -41,418        llvm/test/MC/AMDGPU/gfx7_asm_vop3.s
+36,428 -36,427        llvm/test/MC/AMDGPU/gfx9_asm_vop3.s
+28,175 -28,174        llvm/test/MC/AMDGPU/gfx9_asm_vopc.s
+22,708 -22,884        llvm/test/MC/Disassembler/AMDGPU/gfx9_vop3.txt
+22,276 -22,275        llvm/test/MC/AMDGPU/gfx8_asm_vopc.s
+193,355 -193,526      11,476 files not shown
+1,808,891 -1,338,637  11,482 files

LLVM/project 64f4a16 llvm/test/Transforms/InstCombine simplify-demanded-fpclass-maximum.ll simplify-demanded-fpclass-minimum.ll

InstCombine: Add more tests for min/max SimplifyDemandedFPClass

Test some more refined cases, such as ordering with 0s and within
known positive and known negative cases.
Delta      File
+394 -0    llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-maximum.ll
+393 -0    llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-minimum.ll
+392 -0    llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-minimumnum.ll
+392 -0    llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-maximumnum.ll
+1,571 -0  4 files

LLVM/project 036c6c2 llvm/include/llvm/ADT FloatingPointMode.h, llvm/lib/Support FloatingPointMode.cpp

ADT: Add utility functions for comparing FPClassTest

Add utility functions for checking if less and greater queries
are known to not evaluate to true. This will permit more precise
folding of min/max intrinsics. The test is kind of a mess.
Delta    File
+560 -0  llvm/unittests/ADT/FloatingPointMode.cpp
+55 -0   llvm/lib/Support/FloatingPointMode.cpp
+34 -0   llvm/include/llvm/ADT/FloatingPointMode.h
+649 -0  3 files

LLVM/project 7049481 llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp, llvm/test/Transforms/InstCombine simplify-demanded-fpclass-minimum.ll simplify-demanded-fpclass-minimumnum.ll

InstCombine: Improve SimplifyDemandedFPClass min/max handling

Refine handling of minimum/maximum and minimumnum/maximumnum. The
previous folds to an input were based on sign bit checks, which was too
conservative with 0s. The fold can now treat -0 as less than or equal
to +0 where appropriate and account for nsz. It can additionally handle
cases such as one side being known positive normal and the other subnormal.
Delta      File
+32 -61    llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+23 -58    llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-minimum.ll
+22 -52    llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-minimumnum.ll
+23 -46    llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-maximumnum.ll
+22 -44    llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-maximum.ll
+122 -261  5 files
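
The zero ordering this commit relies on can be sketched outside LLVM. The functions below are a Python model of the documented semantics of `llvm.minimum` (NaN-propagating) and `llvm.minimumnum` (a single NaN operand is ignored), both of which order -0.0 below +0.0. The names `ieee_minimum`, `ieee_minimumnum`, and `same_value` are invented here, and the final check only illustrates one kind of fold to an input; it is not the logic in InstCombineSimplifyDemanded.cpp.

```python
import math


def ieee_minimum(a: float, b: float) -> float:
    """Model of llvm.minimum: NaN-propagating, with -0.0 ordered below +0.0."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    if a == 0.0 and b == 0.0:
        # Equal as numbers; prefer the operand with the sign bit set, if any.
        return a if math.copysign(1.0, a) < 0 else b
    return a if a < b else b


def ieee_minimumnum(a: float, b: float) -> float:
    """Model of llvm.minimumnum: a single NaN operand is ignored."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return ieee_minimum(a, b)


def same_value(x: float, y: float) -> bool:
    """Equality that also distinguishes -0.0 from +0.0."""
    return x == y and math.copysign(1.0, x) == math.copysign(1.0, y)


# If the left operand is known to be negative or -0 and the right operand is
# known to be +0 or positive, the minimum is always the left operand.
non_positive = [-math.inf, -1.0, -5e-324, -0.0]
non_negative = [0.0, 5e-324, 1.0, math.inf]
always_lhs = all(
    same_value(ieee_minimum(x, y), x)
    for x in non_positive
    for y in non_negative
)
```

The pair (-0.0, +0.0) is the interesting case: the two compare equal as numbers, yet the result is still determined because the intrinsic orders the zeros by sign.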

LLVM/project 28f35dc llvm/include/llvm/Support KnownFPClass.h, llvm/lib/Analysis ValueTracking.cpp

InstCombine: Handle fptrunc in SimplifyDemandedFPClass

Also handle llvm.fptrunc.round since it's the same.
Delta    File
+15 -30  llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-fptrunc.ll
+15 -28  llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-fptrunc-round.ll
+41 -0   llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+14 -0   llvm/lib/Support/KnownFPClass.cpp
+1 -9    llvm/lib/Analysis/ValueTracking.cpp
+3 -0    llvm/include/llvm/Support/KnownFPClass.h
+89 -67  6 files

LLVM/project 53b795e llvm/lib/Analysis ValueTracking.cpp, llvm/test/Transforms/Attributor nofpclass-implied-by-fcmp.ll nofpclass-select.ll

ValueTracking: Account for undef in adjustKnownFPClassForSelectArm

This needs to consider undef like the KnownBits case does.
Delta      File
+600 -600  llvm/test/Transforms/Attributor/nofpclass-implied-by-fcmp.ll
+25 -25    llvm/test/Transforms/Attributor/nofpclass-select.ll
+19 -6     llvm/test/Transforms/InstCombine/simplify-demanded-fpclass.ll
+6 -6      llvm/test/Transforms/Attributor/nofpclass.ll
+8 -3      llvm/lib/Analysis/ValueTracking.cpp
+1 -1      llvm/test/Transforms/InstCombine/minmax-fp.ll
+659 -641  6 files

LLVM/project 634203f llvm/lib/Support KnownFPClass.cpp, llvm/test/Transforms/Attributor nofpclass-sqrt.ll

ValueTracking: sqrt never returns subnormal
Delta    File
+52 -52  llvm/test/Transforms/Attributor/nofpclass-sqrt.ll
+14 -14  llvm/unittests/Analysis/ValueTrackingTest.cpp
+2 -5    llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-sqrt.ll
+1 -0    llvm/lib/Support/KnownFPClass.cpp
+69 -71  4 files

LLVM/project de48ee6 llvm/test/Transforms/InstCombine simplify-demanded-fpclass-fptrunc.ll simplify-demanded-fpclass-fptrunc-round.ll

InstCombine: Add fptrunc SimplifyDemandedFPClass baseline tests

Also llvm.fptrunc.round, which should be the same.
Delta      File
+578 -0    llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-fptrunc.ll
+578 -0    llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-fptrunc-round.ll
+1,156 -0  2 files

LLVM/project 1c0c9ae llvm/include/llvm/CAS OnDiskGraphDB.h

[llvm][CAS] Fixed build with -D_LIBCPP_REMOVE_TRANSITIVE_INCLUDES (#173797)

Delta  File
+1 -0  llvm/include/llvm/CAS/OnDiskGraphDB.h
+1 -0  1 file

LLVM/project e7f23b4 llvm/lib/Target/SystemZ SystemZISelLowering.h, llvm/test/CodeGen/SystemZ fmuladd-soft-float.ll

[SystemZ] Remove the `softPromoteHalfType` override (#175410)

`softPromoteHalfType` is being phased out because it is prone to
miscompilations (further context at [1]). SystemZ is one of the few
remaining platforms to override the default, so remove it here.

This only affects SystemZ when the `soft-float` option is used.

[1]: https://github.com/llvm/llvm-project/pull/175149
Delta    File
+13 -9   llvm/test/CodeGen/SystemZ/fmuladd-soft-float.ll
+0 -1    llvm/lib/Target/SystemZ/SystemZISelLowering.h
+13 -10  2 files

LLVM/project 8877491 llvm/include/llvm/Support KnownBits.h, llvm/lib/Analysis ValueTracking.cpp

[ValueTracking] Support horizontal vector add in computeKnownBits (#174410)

Alive2 proofs:
* Leading zeros - [4vi32](https://alive2.llvm.org/ce/z/w--S2D),
[16vi8](https://alive2.llvm.org/ce/z/hEdVks)
* Leading ones - [4vi16](https://alive2.llvm.org/ce/z/RyPdBS),
[16vi8](https://alive2.llvm.org/ce/z/UTFFt9)
Delta     File
+45 -0    llvm/test/Transforms/InstCombine/vector-reduce-add-known-bits.ll
+40 -0    llvm/lib/Support/KnownBits.cpp
+34 -0    llvm/unittests/Support/KnownBitsTest.cpp
+10 -10   llvm/test/Transforms/PhaseOrdering/AArch64/udotabd.ll
+8 -0     llvm/lib/Analysis/ValueTracking.cpp
+5 -0     llvm/include/llvm/Support/KnownBits.h
+142 -10  6 files
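
The leading-zeros case named in the Alive2 proofs can be illustrated with a small brute-force check. This is a sketch of the general idea, not the algorithm added to KnownBits.cpp: the function `reduce_add_min_leading_zeros` and the parameters below are assumptions chosen for illustration. The reasoning is that if each of N lanes has at least k leading zero bits, the sum can carry into at most ceil(log2(N)) additional bits, so at least k - ceil(log2(N)) leading zeros remain.

```python
import itertools


def leading_zeros(value: int, width: int) -> int:
    """Number of leading zero bits of value viewed as a width-bit integer."""
    return width - value.bit_length()


def reduce_add_min_leading_zeros(lane_leading_zeros: int, num_lanes: int) -> int:
    """Lower bound on leading zeros of a horizontal add over num_lanes lanes,
    each known to have at least lane_leading_zeros leading zeros."""
    carry_bits = (num_lanes - 1).bit_length()  # equals ceil(log2(num_lanes))
    return max(0, lane_leading_zeros - carry_bits)


# Hypothetical example: 4 lanes of 8 bits, each with at least 4 leading zeros.
WIDTH, LANES, LANE_LZ = 8, 4, 4
bound = reduce_add_min_leading_zeros(LANE_LZ, LANES)

# Exhaustively add every combination of lane values allowed by the assumption
# and record the fewest leading zeros seen in any wrapped sum.
lane_values = range(1 << (WIDTH - LANE_LZ))
worst = min(
    leading_zeros(sum(lanes) % (1 << WIDTH), WIDTH)
    for lanes in itertools.product(lane_values, repeat=LANES)
)
```

For these parameters the bound is tight: the largest possible sum is 4 * 15 = 60, which needs 6 bits and so leaves exactly 2 leading zeros in an 8-bit result.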