LLVM/project f827c20 llvm/tools/llvm-jitlink llvm-jitlink.cpp

[llvm-jitlink] Remove redundant ExecutorAddr constructor calls. NFCI. (#175488)

These ExecutorAddr calls were legacy from pre-ExecutorSymbolDef code.
The getAddress method already returns an ExecutorAddr, so there's no
need for them anymore.
Delta  File
+2 -4  llvm/tools/llvm-jitlink/llvm-jitlink.cpp
+2 -4  1 file
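
To illustrate why such constructor calls are redundant, here is a minimal sketch. `Addr` and `SymbolDef` are hypothetical stand-ins written for this example, not the real ORC classes; the only point is that wrapping an accessor result that already has the wrapper type is a no-op copy.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an address wrapper type (not the real ORC class).
class Addr {
public:
  constexpr Addr() = default;
  constexpr explicit Addr(uint64_t V) : Value(V) {}
  constexpr uint64_t getValue() const { return Value; }

private:
  uint64_t Value = 0;
};

// Hypothetical stand-in for a symbol definition whose accessor already
// returns the wrapper type.
class SymbolDef {
public:
  constexpr explicit SymbolDef(Addr A) : Address(A) {}
  constexpr Addr getAddress() const { return Address; }

private:
  Addr Address;
};

// Before: re-wraps a value that is already an Addr (just a copy).
inline Addr addressBefore(const SymbolDef &S) { return Addr(S.getAddress()); }

// After: uses the accessor result directly.
inline Addr addressAfter(const SymbolDef &S) { return S.getAddress(); }
```

Both helpers return the same address, which is why dropping the extra wrapper is expected to have no functional effect.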

LLVM/project ba6a59c llvm/lib/ExecutionEngine/JITLink ELF_x86_64.cpp

[JITLink] Set correct triple instead of hard-code the value to linux (#175404)

Delta  File
+5 -4  llvm/lib/ExecutionEngine/JITLink/ELF_x86_64.cpp
+5 -4  1 file

LLVM/project f114d95 llvm/lib/ExecutionEngine/Orc/TargetProcess JITLoaderPerf.cpp

[ORC] Simplify zero initializer. NFCI. (#175482)

Based on suggestion from @macdice on
https://github.com/llvm/llvm-project/pull/175204. Thanks @macdice!
Delta  File
+1 -1  llvm/lib/ExecutionEngine/Orc/TargetProcess/JITLoaderPerf.cpp
+1 -1  1 file

LLVM/project 205f342 llvm/lib/Target/LoongArch LoongArchISelLowering.cpp, llvm/test/CodeGen/LoongArch/lsx issue174606.ll

[LoongArch] Disable strict node mutation to fix strict FP lowering crash

The patch disables strict node mutation for LoongArch by setting
IsStrictFPEnabled to true.

This change fixes the current strict FP lowering crash only.
ISD::STRICT_FSETCC and ISD::STRICT_FSETCCS can be further improved.
Delta  File
+32 -0  llvm/test/CodeGen/LoongArch/lsx/issue174606.ll
+3 -0  llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+35 -0  2 files

LLVM/project 79a1b80 llvm/lib/Target/RISCV RISCVMachineScheduler.cpp RISCVMachineScheduler.h, llvm/test/CodeGen/RISCV features-info.ll

[RISCV] Schedule RVV instructions with compatible vtype/vl first

This can reduce some vsetvli toggles.

This can be done in pre-ra scheduling as we have moved insertion of
vsetvli after the first RA.

Currently, we override `tryCandidate` and add a new heuristic based
on comparison of `vtype`/`vl`.

Reviewers: asb, preames, topperc, lukel97, mshockwave, BeMg

Reviewed By: mshockwave, lukel97

Pull Request: https://github.com/llvm/llvm-project/pull/95924
Delta  File
+170 -0  llvm/test/CodeGen/RISCV/rvv/rvv-vtype-based-scheduler.ll
+98 -4  llvm/lib/Target/RISCV/RISCVMachineScheduler.cpp
+18 -1  llvm/lib/Target/RISCV/RISCVMachineScheduler.h
+4 -0  llvm/lib/Target/RISCV/RISCVFeatures.td
+1 -0  llvm/test/CodeGen/RISCV/features-info.ll
+291 -5  5 files
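
As a rough illustration of the idea behind this heuristic, the sketch below models a vector configuration and a tie-breaking comparison. Everything here is hypothetical: `VConfig`, `preferCandidate`, and `countToggles` are names invented for this example, and plain equality stands in for whatever "compatible" means in the actual scheduler.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the state a vsetvli establishes (assumption: real
// compatibility checks are more permissive than exact equality).
struct VConfig {
  unsigned SEW;
  unsigned LMUL;
  unsigned VL;
};

inline bool sameConfig(const VConfig &A, const VConfig &B) {
  return A.SEW == B.SEW && A.LMUL == B.LMUL && A.VL == B.VL;
}

// Returns true when TryCand should be preferred over Cand because it matches
// the configuration of the last scheduled vector instruction and Cand does not.
inline bool preferCandidate(const VConfig &Last, const VConfig &Cand,
                            const VConfig &TryCand) {
  return sameConfig(TryCand, Last) && !sameConfig(Cand, Last);
}

// Counts how many times the configuration must be (re)established for a
// given instruction order, including the initial setup.
inline unsigned countToggles(const std::vector<VConfig> &Order) {
  unsigned Toggles = 0;
  for (std::size_t I = 0; I < Order.size(); ++I)
    if (I == 0 || !sameConfig(Order[I], Order[I - 1]))
      ++Toggles;
  return Toggles;
}
```

With two configurations A and B, the interleaved order A, B, A, B needs four configuration changes under this model, while the grouped order A, A, B, B needs two.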

LLVM/project 67601a4 llvm/lib/Target/RISCV RISCVInsertVSETVLI.cpp RISCVVSETVLIInfoAnalysis.h

[RISCV][NFC] Add RISCVVSETVLIInfoAnalysis

This can be reused by #95924.

Reviewers: BeMg, topperc, lukel97, preames, mshockwave

Reviewed By: mshockwave, topperc

Pull Request: https://github.com/llvm/llvm-project/pull/172615
Delta  File
+14 -1,030  llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
+589 -0  llvm/lib/Target/RISCV/RISCVVSETVLIInfoAnalysis.h
+501 -0  llvm/lib/Target/RISCV/RISCVVSETVLIInfoAnalysis.cpp
+8 -0  llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+3 -0  llvm/lib/Target/RISCV/RISCVInstrInfo.h
+1 -0  llvm/lib/Target/RISCV/CMakeLists.txt
+1,116 -1,030  6 files

LLVM/project 564f2be llvm/lib/Target/RISCV RISCVMachineScheduler.cpp RISCVMachineScheduler.h

[RISCV] Add a custom pre-ra scheduler

Currently we do nothing RISC-V specific in this scheduler.

This is a part of vtype-based scheduling.

Reviewers: BeMg, mshockwave, lukel97, preames, topperc

Pull Request: https://github.com/llvm/llvm-project/pull/172613
Delta  File
+122 -0  llvm/lib/Target/RISCV/RISCVMachineScheduler.cpp
+33 -0  llvm/lib/Target/RISCV/RISCVMachineScheduler.h
+2 -1  llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+1 -0  llvm/lib/Target/RISCV/CMakeLists.txt
+158 -1  4 files

LLVM/project a3ca7ca libclc/cmake/modules AddLibclc.cmake

[libclc][NFC] Remove unused builtins_opt_lib_tgt (#175479)

It was left behind after f07988ff3ec8.
Delta  File
+0 -2  libclc/cmake/modules/AddLibclc.cmake
+0 -2  1 file

LLVM/project e0cf581 clang/lib/CodeGen/TargetBuiltins X86.cpp, clang/test/CodeGen/X86 keylocker.c

[Clang][X86] Remove useless `extractvalue` on aesencwide/aesdecwide builtin CodeGen (#175113)

This is a pre-commit for the CIR codegen of the `aesencwide/aesdecwide` builtins:
it removes the useless `extractvalue` from Clang CodeGen for these builtins.
Delta  File
+140 -204  clang/test/CodeGen/X86/keylocker.c
+3 -4  clang/lib/CodeGen/TargetBuiltins/X86.cpp
+143 -208  2 files

LLVM/project bdc6a67 llvm/lib/CodeGen/AsmPrinter PseudoProbePrinter.cpp, llvm/test/CodeGen/X86 pseudo-probe-desc-check.ll

[PseudoProbe] Add switch to control illegal guid warnings (#174927)

Do not verify GUID existence in the pseudo probe desc by default, since it
generates false-positive warnings with ThinLTO.
Users can pass -pseudo-probe-verify-guid-existence-in-desc to verify it
explicitly.
Delta  File
+15 -3  llvm/lib/CodeGen/AsmPrinter/PseudoProbePrinter.cpp
+2 -2  llvm/test/CodeGen/X86/pseudo-probe-desc-check.ll
+17 -5  2 files

LLVM/project 187ca86 llvm/test/ExecutionEngine/JITLink/AArch64 backtrace-symbolication.s

Reapply "[llvm-jitlink] Replace IR backtrace symbolication test..." (#175242) (#175476)

This reapplies 451ca458cf51d553f5c49e67d841280e8166f933, which was
reverted in 25976e83606f1a7615e3725e6038bb53ee96c3d5 due to bot
failures.

The REQUIRES line has been further constrained to try to address the
failures.
Delta  File
+42 -0  llvm/test/ExecutionEngine/JITLink/AArch64/backtrace-symbolication.s
+42 -0  1 file

LLVM/project 91dafc7 llvm/lib/ExecutionEngine/Orc/TargetProcess JITLoaderPerf.cpp

[ORC][JITLink] Fix uninitialised JIT dump header (#175204)

When running perf inject on a JIT dump generated through the perf plugin,
perf fails with the following error:
```
jitdump file contains invalid or unsupported flags 0xf5880666c26c
0x2b750 [0xa8]: failed to process type: 10 [Operation not permitted]
```
It turns out that Header's Flags field was never initialized, so the
value could be random.
This patch fixes the issue by initialising all Header's fields.

Co-authored-by: Lang Hames <lhames at gmail.com>
Delta  File
+1 -1  llvm/lib/ExecutionEngine/Orc/TargetProcess/JITLoaderPerf.cpp
+1 -1  1 file
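
The class of bug described above can be sketched as follows. `DumpHeader` and `makeHeader` are hypothetical names with an invented field layout, not the actual struct in JITLoaderPerf.cpp; the sketch only shows how value-initialization gives every field a defined value even when a later assignment is forgotten.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical header layout, loosely modelled on a dump-file header.
struct DumpHeader {
  uint32_t Magic;
  uint32_t Version;
  uint32_t TotalSize;
  uint64_t Timestamp;
  uint64_t Flags;
};

// Value-initialization ({}) zeroes every member, so fields that are never
// assigned afterwards (Timestamp and Flags here) are still well-defined.
// A plain "DumpHeader H;" would leave them indeterminate.
inline DumpHeader makeHeader(uint32_t Magic, uint32_t Version) {
  DumpHeader H = {};
  H.Magic = Magic;
  H.Version = Version;
  H.TotalSize = sizeof(DumpHeader);
  return H;
}
```

A consumer that validates the flags field would then see zero rather than whatever happened to be in memory.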

LLVM/project 79be97d libc/shared/math ilogbf16.h, libc/src/__support/math ilogbf16.h CMakeLists.txt

[libc][math] Refactor ilogbf16 implementation to header-only in src/__support/math folder. (#175450)

Closes [#175346](https://github.com/llvm/llvm-project/issues/175346),
Part of #175344
Delta  File
+34 -0  libc/src/__support/math/ilogbf16.h
+28 -0  libc/shared/math/ilogbf16.h
+17 -1  utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+11 -0  libc/src/__support/math/CMakeLists.txt
+2 -6  libc/src/math/generic/ilogbf16.cpp
+1 -2  libc/src/math/generic/CMakeLists.txt
+93 -9  3 files not shown
+97 -9  9 files

LLVM/project 4a589cd mlir/python CMakeLists.txt replace_text.cmake, mlir/python/mlir/_mlir_libs _mlirExecutionEngine.pyi

[mlir][Python] generate type stubs for dialect extensions
Delta  File
+0 -142  mlir/python/mlir/_mlir_libs/_mlir/dialects/quant.pyi
+63 -33  mlir/python/CMakeLists.txt
+0 -63  mlir/python/mlir/_mlir_libs/_mlir/dialects/pdl.pyi
+0 -25  mlir/python/mlir/_mlir_libs/_mlir/dialects/transform/__init__.pyi
+0 -24  mlir/python/mlir/_mlir_libs/_mlirExecutionEngine.pyi
+9 -0  mlir/python/replace_text.cmake
+72 -287  6 files

LLVM/project 458a983 llvm/lib/ExecutionEngine/RuntimeDyld RuntimeDyld.cpp

[RuntimeDyld][MIPS] Use AT for stub function instead of T9 (#174354)

The stub function is generated for the R_MIPS_26 relocation, which can be
used for local jumps inside a function; such jumps do not expect any
temporary register to be clobbered.

Use AT instead of T9 for the stub function; otherwise functions that use T9
will be broken.

Signed-off-by: Icenowy Zheng <uwu at icenowy.me>
Delta  File
+27 -27  llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp
+27 -27  1 file
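
To make the register swap concrete, here is a sketch that encodes a far-jump stub parameterised on the scratch register. The helper names are invented for this example, and the exact sequence (lui, addiu, jr, nop with pre-R6 encodings) is an assumption about the stub's shape rather than a copy of the RuntimeDyld code. The register numbers ($at = 1, $t9 = 25) and the instruction encodings follow the MIPS32 ISA.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr uint32_t RegAT = 1;  // $at, the assembler temporary
constexpr uint32_t RegT9 = 25; // $t9

// lui rt, imm16 (I-type, opcode 0x0F)
constexpr uint32_t encodeLui(uint32_t Rt, uint16_t Imm) {
  return (0x0Fu << 26) | (Rt << 16) | Imm;
}

// addiu rt, rs, imm16 (I-type, opcode 0x09)
constexpr uint32_t encodeAddiu(uint32_t Rt, uint32_t Rs, uint16_t Imm) {
  return (0x09u << 26) | (Rs << 21) | (Rt << 16) | Imm;
}

// jr rs (SPECIAL opcode 0, funct 0x08, pre-R6 encoding)
constexpr uint32_t encodeJr(uint32_t Rs) { return (Rs << 21) | 0x08u; }

// Hypothetical stub: load Target into Reg, jump through it, fill the delay
// slot with a nop. The high half is biased by 0x8000 because addiu
// sign-extends its 16-bit immediate.
inline std::array<uint32_t, 4> buildStub(uint32_t Target, uint32_t Reg) {
  const uint16_t Lo = static_cast<uint16_t>(Target & 0xFFFFu);
  const uint16_t Hi = static_cast<uint16_t>((Target + 0x8000u) >> 16);
  return {encodeLui(Reg, Hi), encodeAddiu(Reg, Reg, Lo), encodeJr(Reg), 0u};
}
```

Building the stub with `RegAT` leaves `$t9` untouched, which is the property the commit message asks for.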

LLVM/project 617f3c2 mlir/python CMakeLists.txt replace_text.cmake, mlir/python/mlir/_mlir_libs _mlirExecutionEngine.pyi

[mlir][Python] generate type stubs for dialect extensions
Delta  File
+0 -142  mlir/python/mlir/_mlir_libs/_mlir/dialects/quant.pyi
+61 -33  mlir/python/CMakeLists.txt
+0 -63  mlir/python/mlir/_mlir_libs/_mlir/dialects/pdl.pyi
+0 -25  mlir/python/mlir/_mlir_libs/_mlir/dialects/transform/__init__.pyi
+0 -24  mlir/python/mlir/_mlir_libs/_mlirExecutionEngine.pyi
+9 -0  mlir/python/replace_text.cmake
+70 -287  6 files

LLVM/project cd2caf6 llvm/lib/Transforms/Vectorize VPlan.h VPlanTransforms.cpp, llvm/test/Transforms/LoopVectorize pr43166-fold-tail-by-masking.ll

[LV] Simplify extract-lane with scalar operand to the scalar value itself. (#174534)

This patch simplifies extract-lane(%lane_num, %X) to %X when %X is a
scalar value. Extracting from a scalar is redundant since there is only
one value to extract.
Delta  File
+2 -12  llvm/test/Transforms/LoopVectorize/RISCV/uniform-load-store.ll
+5 -7  llvm/lib/Transforms/Vectorize/VPlan.h
+3 -8  llvm/test/Transforms/LoopVectorize/pr43166-fold-tail-by-masking.ll
+8 -0  llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+1 -6  llvm/test/Transforms/LoopVectorize/RISCV/scalable-tailfold.ll
+2 -1  llvm/lib/Transforms/Vectorize/VPlanPatternMatch.h
+21 -34  2 files not shown
+22 -37  8 files
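
The simplification rule can be sketched on a toy IR. `Value`, `ExtractLane`, and `simplifyExtractLane` are hypothetical types and names made up for this example; they do not mirror the VPlan classes, only the rule that extracting any lane from a scalar yields that scalar.

```cpp
#include <cassert>

// Toy value: either a scalar or a vector (hypothetical, not a VPlan type).
struct Value {
  bool IsScalar;
  int Id;
};

// Toy extract-lane operation: picks lane number Lane out of Operand.
struct ExtractLane {
  unsigned Lane;
  const Value *Operand;
};

// Returns the replacement value when the extract is redundant, or nullptr
// when no simplification applies. A scalar has only one value, so the lane
// number is irrelevant.
inline const Value *simplifyExtractLane(const ExtractLane &E) {
  return E.Operand->IsScalar ? E.Operand : nullptr;
}
```

A caller would replace all uses of the extract with the returned value whenever it is non-null.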

LLVM/project ee8a4bc libclc/clc/lib/amdgcn/math clc_ldexp_override.cl, libclc/clc/lib/amdgpu/math clc_sqrt.cl clc_sqrt_fp64.cl

[libclc] Remove llvm-link --override flag and make implementation self-contained (#175134)

Revert --override flag added in 28d9255aa7c0 and avoid defining the same
symbol across multiple files of a target, simplifying the build and
easing the transition to CMake add_library for libclc.

amdgcn ldexp now uses __builtin_elementwise_ldexp.

No functional changes to clc_sqrt or clc_rsqrt.
Delta  File
+67 -0  libclc/clc/lib/amdgpu/math/clc_sqrt.cl
+0 -50  libclc/clc/lib/amdgpu/math/clc_sqrt_fp64.cl
+0 -35  libclc/clc/lib/r600/math/clc_rsqrt_override.cl
+0 -33  libclc/clc/lib/amdgcn/math/clc_ldexp_override.cl
+27 -0  libclc/clc/lib/r600/math/clc_rsqrt.cl
+2 -18  libclc/cmake/modules/AddLibclc.cmake
+96 -136  4 files not shown
+114 -139  10 files

LLVM/project 81d5b36 libc/shared/math log.h, libc/src/__support/math CMakeLists.txt log.h

[libc][math] Fix GPU build fails (#175474)

Delta  File
+3 -3  libc/src/__support/math/CMakeLists.txt
+2 -3  libc/src/__support/math/log.h
+1 -2  libc/src/math/generic/log10.cpp
+1 -2  libc/src/math/generic/log2.cpp
+1 -1  libc/shared/math/log.h
+8 -11  5 files

LLVM/project f6c743a mlir/python CMakeLists.txt, mlir/python/mlir/_mlir_libs _mlirExecutionEngine.pyi

[mlir][Python] generate type stubs for dialect extensions
Delta  File
+0 -142  mlir/python/mlir/_mlir_libs/_mlir/dialects/quant.pyi
+49 -33  mlir/python/CMakeLists.txt
+0 -63  mlir/python/mlir/_mlir_libs/_mlir/dialects/pdl.pyi
+0 -25  mlir/python/mlir/_mlir_libs/_mlir/dialects/transform/__init__.pyi
+0 -24  mlir/python/mlir/_mlir_libs/_mlirExecutionEngine.pyi
+49 -287  5 files

LLVM/project a70e6bd libc/shared/math log.h

add new line
Delta  File
+1 -1  libc/shared/math/log.h
+1 -1  1 file

LLVM/project dcf8ae8 llvm/include/llvm/ExecutionEngine/Orc BacktraceTools.h, llvm/lib/ExecutionEngine/Orc BacktraceTools.cpp CMakeLists.txt

Reapply "[ORC] Add utilities for limited symbolication of JIT backtraces" (#175469)

This reapplies 906b48616c03948a4df62a5a144f7108f3c455e8, which was
reverted in c11df52f9b847170b766fb71defd2a9222d95a8d due to bot
failures.

The testcase has been dropped from this recommit as it failed on several
bots (possibly due to differing backtrace formats or failure modes). I'll
re-introduce the testcase in a follow-up commit so that it can be
iterated on (and re-reverted if necessary) without affecting the options
introduced by this commit. (Since these options are best-effort
debugging tools it's ok if they live in-tree without a test for now).
Delta  File
+150 -0  llvm/lib/ExecutionEngine/Orc/BacktraceTools.cpp
+99 -0  llvm/include/llvm/ExecutionEngine/Orc/BacktraceTools.h
+57 -0  llvm/tools/llvm-jitlink/llvm-jitlink.cpp
+1 -0  llvm/lib/ExecutionEngine/Orc/CMakeLists.txt
+307 -0  4 files

LLVM/project 0372504 libc/src/__support/math CMakeLists.txt log.h, libc/src/math/generic log10.cpp log2.cpp

[libc][math] Fix GPU build fails
Delta  File
+3 -3  libc/src/__support/math/CMakeLists.txt
+2 -3  libc/src/__support/math/log.h
+1 -2  libc/src/math/generic/log10.cpp
+1 -2  libc/src/math/generic/log2.cpp
+7 -10  4 files

LLVM/project 87e6dfc mlir/python CMakeLists.txt

[mlir][Python] generate type stubs for dialect extensions
Delta  File
+21 -3  mlir/python/CMakeLists.txt
+21 -3  1 file

LLVM/project 282f8f7 llvm/lib/Target/RISCV RISCVMergeBaseOffset.cpp, llvm/test/CodeGen/RISCV hoist-global-addr-base.ll fold-addi-loadstore.ll

[RISCV] Add support for QC.E.LI in RISCVMergeBaseOffset (#175310)

When we have `Xqcili` enabled and it is the `small code model`, we use
the `QC.E.LI` instruction to materialize addresses. Add support for
`QC.E.LI` in the `RISCVMergeBaseOffset` pass to merge the offset of the
address calculation into the offset field in a global address lowering
sequence.
Delta  File
+469 -140  llvm/test/CodeGen/RISCV/hoist-global-addr-base.ll
+53 -38  llvm/lib/Target/RISCV/RISCVMergeBaseOffset.cpp
+6 -8  llvm/test/CodeGen/RISCV/fold-addi-loadstore.ll
+528 -186  3 files

LLVM/project f091be6 llvm/lib/ExecutionEngine/Orc/Shared CMakeLists.txt, llvm/lib/ExecutionEngine/Orc/TargetProcess CMakeLists.txt

[ORC] Fixed incorrect additional header dirs (#175193)

The CMake ADDITIONAL_HEADER_DIRS directive for two Orc libraries,
specifically Shared and TargetProcess, used incorrect values that
pointed to the parent library's include directory instead of each
library's own. This is now fixed.
Delta  File
+2 -1  llvm/lib/ExecutionEngine/Orc/Shared/CMakeLists.txt
+1 -1  llvm/lib/ExecutionEngine/Orc/TargetProcess/CMakeLists.txt
+3 -2  2 files

LLVM/project bc51c9d libc/src/__support/math log.h log_range_reduction.h, libc/src/math/generic log.cpp log_range_reduction.h

[libc][math] Refactor log to header-only shared math (#175395)

Refactors log to a header-only shared math implementation.

Fixes #175369
Delta  File
+861 -0  libc/src/__support/math/log.h
+2 -836  libc/src/math/generic/log.cpp
+98 -0  libc/src/__support/math/log_range_reduction.h
+0 -94  libc/src/math/generic/log_range_reduction.h
+22 -14  utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+27 -0  libc/src/__support/math/CMakeLists.txt
+1,010 -944  7 files not shown
+1,044 -968  13 files

LLVM/project c726fff llvm/lib/Transforms/InstCombine InstCombineAndOrXor.cpp, llvm/test/Transforms/InstCombine and.ll binop-cast.ll

[InstCombine][profcheck] Add unknown branch weights to selects created in InstCombineAndOrXor.cpp (#175269)

These select instructions were created from combinations of bitwise
operators which have no branch weight information.

Tracking issue: #147390
Delta  File
+30 -19  llvm/test/Transforms/InstCombine/and.ll
+27 -20  llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
+19 -11  llvm/test/Transforms/InstCombine/binop-cast.ll
+11 -3  llvm/test/Transforms/InstCombine/sub-ashr-or-to-icmp-select.ll
+9 -3  llvm/test/Transforms/InstCombine/conditional-negation.ll
+9 -3  llvm/test/Transforms/InstCombine/xor-ashr.ll
+105 -59  1 file not shown
+105 -70  7 files

LLVM/project 2f7e218 llvm/lib/Transforms/Vectorize VPlanUtils.cpp, llvm/test/Transforms/LoopVectorize/AArch64 induction-costs.ll

[VPlan] Add missing sext(sub) SCEV fold to getSCEVExprForVPValue.

SCEV has a manual fold when doing SCEV construction from IR, that is not
integrated in the regular SCEV construction functions. Mirror the
behavior in getSCEVExprForVPValue, to match results when constructing
SCEVs from IR.

Fixes https://github.com/llvm/llvm-project/issues/174622.
Delta  File
+255 -0  llvm/test/Transforms/LoopVectorize/AArch64/induction-costs.ll
+69 -0  llvm/lib/Transforms/Vectorize/VPlanUtils.cpp
+324 -0  2 files

LLVM/project fbad7d8 llvm/lib/Analysis ValueTracking.cpp, llvm/test/Transforms/Attributor nofpclass.ll

ValueTracking: Fix handling of fadd with mixed denormal modes

Fix case where the input mode is IEEE, the output flushes, and the
input could be subnormal. Also improves accuracy with positive zero
case.
Delta  File
+45 -1  llvm/test/Transforms/Attributor/nofpclass.ll
+4 -1  llvm/lib/Analysis/ValueTracking.cpp
+49 -2  2 files