LLVM/project 3823341llvm/lib/Transforms/Vectorize VPlanTransforms.cpp, llvm/test/Transforms/LoopVectorize reduction-inloop-uf4.ll consecutive-ptr-uniforms.ll

Reapply "[VPlan] Run removeDeadRecipes early." (#195325) (#195445)

This reverts commit 2a9699ccd128d7f94372d18c97229e1934b8506e.

Recommit contains a small fix for skipping dead recipes when finding
induction casts.

Original message:
The initial simplifyRecipes run can leave dead recipes behind, which
removeDeadRecipes can clean up; the same applies to dead instructions
in the input.

PR: https://github.com/llvm/llvm-project/pull/190191
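
As a rough illustration of the cleanup described above, the following Python sketch models dead-recipe removal on a toy recipe list. It is not VPlan code: the `Recipe` class, the `has_side_effects` flag, and `remove_dead_recipes` are hypothetical names used only for this example, and the sketch assumes definitions always precede their uses.

```python
from dataclasses import dataclass, field


@dataclass
class Recipe:
    # Hypothetical stand-in for a recipe; not the real VPlan class.
    name: str
    operands: list = field(default_factory=list)  # names of recipes this one uses
    has_side_effects: bool = False  # e.g. a store; kept even when unused


def remove_dead_recipes(recipes):
    """Drop recipes that have no users and no side effects.

    The list is walked in reverse so that removing a user can expose its
    operands as dead within the same pass.
    """
    use_count = {r.name: 0 for r in recipes}
    for r in recipes:
        for op in r.operands:
            use_count[op] += 1

    kept = []
    for r in reversed(recipes):
        if use_count[r.name] == 0 and not r.has_side_effects:
            # Removing a dead recipe releases the uses of its operands.
            for op in r.operands:
                use_count[op] -= 1
            continue
        kept.append(r)
    kept.reverse()
    return kept


plan = [
    Recipe("iv"),
    Recipe("cast", ["iv"]),         # dead: nothing uses it
    Recipe("load", ["iv"]),
    Recipe("add", ["load", "iv"]),
    Recipe("unused_mul", ["add"]),  # dead: nothing uses it
    Recipe("store", ["add", "iv"], has_side_effects=True),
]
live = [r.name for r in remove_dead_recipes(plan)]
print(live)
```

In this toy plan the unused cast and multiply are dropped, while the store and everything it depends on are kept.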
DeltaFile
+8-25llvm/test/Transforms/LoopVectorize/VPlan/predicator.ll
+15-10llvm/test/Transforms/LoopVectorize/reduction-inloop-uf4.ll
+10-6llvm/test/Transforms/LoopVectorize/AArch64/transform-narrow-interleave-to-widen-memory-with-wide-ops-chained.ll
+10-5llvm/test/Transforms/LoopVectorize/consecutive-ptr-uniforms.ll
+13-1llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+6-6llvm/test/Transforms/LoopVectorize/tail-folding-optimize-vector-induction-width.ll
+62-5330 files not shown
+132-13136 files

LLVM/project 82f9618libc/src/__support/math CMakeLists.txt llrintf16.h, libc/src/math/generic CMakeLists.txt

[libc][math] Refactor lrint_lround family to header-only (#195441)

Refactors the lrint_lround math family to be header-only.

part of: #147386

Target Functions:
  - llrint
  - llrintbf16
  - llrintf
  - llrintf128
  - llrintf16
  - llrintl
  - llround
  - llroundbf16
  - llroundf
  - llroundf128
  - llroundf16
  - llroundl

    [11 lines not shown]
DeltaFile
+352-12utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+228-0libc/src/__support/math/CMakeLists.txt
+24-42libc/src/math/generic/CMakeLists.txt
+48-0libc/test/shared/CMakeLists.txt
+33-0libc/src/__support/math/llrintf16.h
+33-0libc/src/__support/math/llrintf128.h
+718-5473 files not shown
+2,045-19979 files

LLVM/project 516a9d4mlir/docs/Traits _index.md, mlir/include/mlir/IR OpDefinition.h OpBase.td

[MLIR] Add HasAncestor op trait

Add HasAncestor/AncestorOneOf traits that verify an operation has a
specific ancestor anywhere in the parent chain, unlike HasParent which
only checks the immediate parent. This enables declarative verification
for ops that can be nested arbitrarily deep inside a required ancestor.
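
The difference between the two checks can be sketched in a few lines of Python. This is only a model of the idea, not the MLIR implementation: the `Op` class and the helper names `has_parent` and `has_ancestor` are assumptions made for this example, and `test.inner` is a made-up operation name.

```python
class Op:
    """Toy operation with a name and an optional parent operation."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def has_parent(op, allowed):
    # Only the immediate parent is inspected.
    return op.parent is not None and op.parent.name in allowed


def has_ancestor(op, allowed):
    # Walk the whole parent chain until a match is found or the chain ends.
    node = op.parent
    while node is not None:
        if node.name in allowed:
            return True
        node = node.parent
    return False


module = Op("builtin.module")
func = Op("func.func", module)
loop = Op("scf.for", func)
inner = Op("test.inner", loop)

print(has_parent(inner, {"func.func"}))    # immediate parent is scf.for
print(has_ancestor(inner, {"func.func"}))  # func.func is two levels up
```

Here the parent check fails because the immediate parent is the loop, while the ancestor check succeeds because the function encloses the loop.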
DeltaFile
+79-0mlir/test/IR/traits.mlir
+26-0mlir/include/mlir/IR/OpDefinition.h
+10-0mlir/docs/Traits/_index.md
+9-0mlir/test/lib/Dialect/Test/TestOps.td
+8-0mlir/include/mlir/IR/OpBase.td
+132-05 files

LLVM/project 1ed1761llvm/lib/Transforms/Utils CloneFunction.cpp

[NFC][LLVM] Simplify `PruningFunctionCloner::cloneInstruction` (#195389)

Add early returns and decrease indentation of the code that
implements calls to constrained intrinsics.
DeltaFile
+57-62llvm/lib/Transforms/Utils/CloneFunction.cpp
+57-621 files

LLVM/project 7b026acllvm/lib/Transforms/Vectorize VPlanTransforms.cpp, llvm/test/Transforms/LoopVectorize/X86 replicating-load-store-costs.ll

[VPlan] Simplify extract-lane of all single-scalars (#194838)

Checking against vputils::isSingleScalar is sufficient for both
correctness and profitability.
DeltaFile
+3-5llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+0-1llvm/test/Transforms/LoopVectorize/X86/replicating-load-store-costs.ll
+3-62 files

LLVM/project 482d0f0llvm/test/Transforms/LoopVectorize cast-induction.ll

[LV] Add test for crash with VPlan-based DCE (NFC) (#195438)

Add a test with a dead cast, which caused the revert in 2a9699ccd12; see
https://lab.llvm.org/buildbot/#/builders/67/builds/3821.
DeltaFile
+152-0llvm/test/Transforms/LoopVectorize/cast-induction.ll
+152-01 files

LLVM/project 2447939clang-tools-extra/clang-tidy/misc UnusedParametersCheck.cpp UnusedParametersCheck.h, clang-tools-extra/docs ReleaseNotes.rst

[Clang-Tidy] Skip `misc-unused-parameters` in macro. (#194999)

The new parameter allows skipping the check in cases where a macro must
be used but there is no immediate way to fix the macro itself. This
recently came up during a gradual adoption of clang-tidy, where not all
parts of the code could be fixed at once.

Simply enabling the check via `clang-tidy-diff` is not enough:
`misc-unused-parameters` would report false positives, because the given
diff did not introduce the unused parameters and they might not be easy
to change.

The given parameters allow for better incremental adoption.

---------

Co-authored-by: Dmitrii Kuragin <dkuragin at adobe.com>
Co-authored-by: EugeneZelenko <eugene.zelenko at gmail.com>
DeltaFile
+21-0clang-tools-extra/test/clang-tidy/checkers/misc/unused-parameters-macro.cpp
+8-0clang-tools-extra/docs/clang-tidy/checks/misc/unused-parameters.rst
+5-1clang-tools-extra/clang-tidy/misc/UnusedParametersCheck.cpp
+5-0clang-tools-extra/docs/ReleaseNotes.rst
+1-0clang-tools-extra/test/clang-tidy/infrastructure/dump-config-filtering.cpp
+1-0clang-tools-extra/clang-tidy/misc/UnusedParametersCheck.h
+41-16 files

LLVM/project 1b1e14flibc/src/__support/math f16addl.h f16addf128.h, libc/test/shared shared_math_constexpr_test.cpp CMakeLists.txt

[libc][math] Qualify f16add functions to constexpr (#195429)

Signed-off-by: udaykiriti <udaykiriti624 at gmail.com>
DeltaFile
+11-0libc/test/shared/shared_math_constexpr_test.cpp
+4-0libc/test/shared/CMakeLists.txt
+1-1libc/src/__support/math/f16addl.h
+1-1libc/src/__support/math/f16addf128.h
+1-1libc/src/__support/math/f16addf.h
+1-1libc/src/__support/math/f16add.h
+19-46 files

LLVM/project 34a80c9llvm/lib/Transforms/InstCombine InstructionCombining.cpp, llvm/test/Transforms/InstCombine allocsite-removable-few-users.ll

[InstCombine] Add user-count bailout to isAllocSiteRemovable (#190347)

isAllocSiteRemovable() walks all transitive users of an alloc site, but
sites with many users are almost never removable. Profiling on
real-world codegen workloads (73,943 alloc sites) showed:

- 89 removable sites, max 1,392 users walked
- 73,854 non-removable sites, avg 31,305 users walked
- 2.31B total wasted user visits (~400s wall-clock on a 35-min build)

Skip the removability analysis when direct user count exceeds a
configurable threshold (default 2048, tunable via hidden cl::opt
-instcombine-max-allocsite-removable-users).

Also defer WeakTrackingVH conversion: collect into Instruction* first
and convert only when the site is actually removable.
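
A minimal sketch of the bailout idea, assuming a toy use graph rather than LLVM IR: the function below refuses to start the transitive walk when the direct user count exceeds a threshold. All names (`is_alloc_site_removable`, `users_of`, `is_removable_user`) are hypothetical; only the default of 2048 and the profiling figures come from the commit message.

```python
MAX_USERS = 2048  # default threshold stated in the commit message


def is_alloc_site_removable(alloc, users_of, is_removable_user,
                            max_users=MAX_USERS):
    """Return (removable, users_visited) for a toy use graph.

    users_of(v) returns the direct users of v; is_removable_user(u) says
    whether a single user is compatible with removing the allocation.
    """
    direct = users_of(alloc)
    if len(direct) > max_users:
        # Bail out before walking anything: sites with this many users
        # are assumed to be almost never removable.
        return False, 0

    visited = 0
    seen = set()
    worklist = list(direct)
    while worklist:
        user = worklist.pop()
        if user in seen:
            continue
        seen.add(user)
        visited += 1
        if not is_removable_user(user):
            return False, visited
        worklist.extend(users_of(user))
    return True, visited


small = {"a": ["gep", "free"], "gep": ["store"]}
big = {"a": ["u%d" % i for i in range(3000)]}

small_result = is_alloc_site_removable("a", lambda v: small.get(v, []),
                                       lambda u: True)
big_result = is_alloc_site_removable("a", lambda v: big.get(v, []),
                                     lambda u: True)

# Sanity check on the reported figures: non-removable sites times the
# average users walked per site, in billions of visits.
wasted_visits_billions = 73_854 * 31_305 / 1e9
print(small_result, big_result, round(wasted_visits_billions, 2))
```

The product of the two reported figures is consistent with the stated total of roughly 2.31B wasted visits.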
DeltaFile
+31-0llvm/test/Transforms/InstCombine/allocsite-removable-few-users.ll
+15-3llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+46-32 files

LLVM/project 0ec4b22libc/src/__support/math fmul.h

[libc][math][NFC] fix fmul build (#195437)
DeltaFile
+1-1libc/src/__support/math/fmul.h
+1-11 files

LLVM/project 11dd2c1libc/src/__support/math fmul.h

[libc][math][NFC] fix fmul build
DeltaFile
+1-1libc/src/__support/math/fmul.h
+1-11 files

LLVM/project a7f6a6flibc/src/__support/math fmul.h CMakeLists.txt, libc/src/math/generic fmul.cpp

[libc][math] Refactor fmul-fsub-frexp family to header-only (#195431)

Refactors the fmul-fsub-frexp math family to be header-only.

part of: #147386

Target Functions:
  - fmul
  - fmulf128
  - fmull
  - fsub
  - fsubf128
  - fsubl
  - frexp
  - frexpbf16
  - frexpl

Co-authored-by: Muhammad Bassiouni <60100307+bassiounix at users.noreply.github.com>
DeltaFile
+131-8utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+126-0libc/src/__support/math/fmul.h
+4-105libc/src/math/generic/fmul.cpp
+87-0libc/src/__support/math/CMakeLists.txt
+31-0libc/src/__support/math/fsubf128.h
+31-0libc/src/__support/math/fmulf128.h
+410-11329 files not shown
+873-16535 files

LLVM/project d22b41dllvm/lib/Transforms/Utils AssumeBundleBuilder.cpp, llvm/test/Transforms/Util assume-builder-atomics.ll

[llvm] Add support for atomicrmw and cmpxchg in AssumeBundleBuilder (#194630)

The assume builder currently only preserves dereferenceable, nonnull,
and alignment knowledge for regular load/store instructions and calls.
Atomic memory accessing instructions (atomicrmw and cmpxchg) also
dereference their pointer operands, but were previously skipped, causing
useful knowledge to be lost across these operations.

Add handling for AtomicRMWInst and AtomicCmpXchgInst in
AssumeBuilderState::addInstruction(), using the same addAccessedPtr()
path as loads and stores. The accessed type is taken from the value
operand (atomicrmw) or compare operand (cmpxchg), which corresponds to
the in-memory element type, and the alignment is taken from the
instruction's explicit alignment.

Add a test to verify that assume bundles are correctly generated before
atomicrmw and cmpxchg instructions.
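
The operand selection described above can be modeled as follows. This is a simplified sketch over dictionaries, not the AssumeBundleBuilder code: the instruction encoding, `accessed_pointer_info`, `assume_bundles`, and the `type_sizes` table are all assumptions for illustration, and the real pass applies further conditions before emitting each piece of knowledge.

```python
def accessed_pointer_info(inst):
    """Return (pointer, accessed_type, align) for a memory access, else None."""
    kind = inst["kind"]
    if kind == "load":
        return inst["ptr"], inst["result_type"], inst["align"]
    if kind in ("store", "atomicrmw"):
        # For atomicrmw the accessed type is taken from the value operand.
        return inst["ptr"], inst["value_type"], inst["align"]
    if kind == "cmpxchg":
        # For cmpxchg the accessed type is taken from the compare operand.
        return inst["ptr"], inst["compare_type"], inst["align"]
    return None


def assume_bundles(inst, type_sizes):
    """List the knowledge a toy assume builder would record for inst."""
    info = accessed_pointer_info(inst)
    if info is None:
        return []
    ptr, ty, align = info
    return [
        ("dereferenceable", ptr, type_sizes[ty]),
        ("nonnull", ptr),
        ("align", ptr, align),
    ]


sizes = {"i32": 4, "i64": 8}  # byte sizes for the toy types used here
rmw = {"kind": "atomicrmw", "ptr": "%p", "value_type": "i32", "align": 4}
cas = {"kind": "cmpxchg", "ptr": "%q", "compare_type": "i64", "align": 8}

print(assume_bundles(rmw, sizes))
print(assume_bundles(cas, sizes))
```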

---------

Co-authored-by: Nikita Popov <github at npopov.com>
DeltaFile
+17-0llvm/test/Transforms/Util/assume-builder-atomics.ll
+7-1llvm/lib/Transforms/Utils/AssumeBundleBuilder.cpp
+24-12 files

LLVM/project 6ef6713llvm/lib/Target/RISCV RISCVISelLowering.cpp RISCVInstrInfoZvzip.td, llvm/test/CodeGen/RISCV/rvv vector-interleave.ll fixed-vectors-shuffle-int-interleave.ll

[RISCV][CodeGen] Add initial vzip codegen support (#194548)

Add initial support for the vzip instruction, which is part of the zvzip
extension. It is used to lower VECTOR_SHUFFLE with an interleave pattern
and VECTOR_INTERLEAVE.
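
For readers unfamiliar with the interleave pattern, the Python sketch below shows what interleaving two vectors means and what a matching shuffle mask looks like. It models only the element ordering, not the RISC-V lowering or the vzip instruction itself; the helper names are hypothetical, and the mask helper covers just the simple case of interleaving the low halves of two equal-length inputs.

```python
def interleave(a, b):
    """Interleave two equal-length vectors: a0, b0, a1, b1, ..."""
    assert len(a) == len(b)
    out = []
    for x, y in zip(a, b):
        out.extend([x, y])
    return out


def shuffle(a, b, mask):
    """Two-input shuffle: each mask index selects from concat(a, b)."""
    both = list(a) + list(b)
    return [both[i] for i in mask]


def interleave_low_mask(n):
    """Mask that interleaves the low halves of two n-element vectors."""
    assert n % 2 == 0
    mask = []
    for i in range(n // 2):
        mask.extend([i, n + i])
    return mask


def is_interleave_low_mask(mask):
    n = len(mask)
    return n % 2 == 0 and list(mask) == interleave_low_mask(n)


a, b = [1, 2, 3, 4], [5, 6, 7, 8]
print(interleave(a, b))
print(shuffle(a, b, interleave_low_mask(4)))
```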
DeltaFile
+61-143llvm/test/CodeGen/RISCV/rvv/vector-interleave.ll
+23-63llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-int-interleave.ll
+55-2llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+11-25llvm/test/CodeGen/RISCV/rvv/vector-interleave-fixed.ll
+25-0llvm/lib/Target/RISCV/RISCVInstrInfoZvzip.td
+4-8llvm/test/CodeGen/RISCV/rvv/vector-deinterleave.ll
+179-2416 files

LLVM/project ef4a720llvm/lib/Transforms/IPO Instrumentor.cpp

Fix format
DeltaFile
+2-1llvm/lib/Transforms/IPO/Instrumentor.cpp
+2-11 files

LLVM/project 329853dclang/lib/CIR/CodeGen CIRGenExpr.cpp CIRGenValue.h, clang/test/CIR/CodeGen vector-ext-element.cpp

[CIR] Implement ExtVectorElementExpr with non simple base (#195165)

Implement support for ExtVectorElementExpr with a non-simple base.

Issue #192311
DeltaFile
+24-0clang/test/CIR/CodeGen/vector-ext-element.cpp
+13-3clang/lib/CIR/CodeGen/CIRGenExpr.cpp
+3-1clang/lib/CIR/CodeGen/CIRGenValue.h
+40-43 files

LLVM/project ce49712llvm/lib/Transforms/IPO Instrumentor.cpp

Indicate when constant int should be signed
DeltaFile
+3-3llvm/lib/Transforms/IPO/Instrumentor.cpp
+3-31 files

LLVM/project 797cad2llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll, llvm/test/CodeGen/AMDGPU/NextUseAnalysis spill-vreg-many-lanes.mir acyclic-770bb.mir

Merge branch 'main' into users/kevinsala/instrumentor-base-pr
DeltaFile
+158,755-173,230llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+275,101-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/spill-vreg-many-lanes.mir
+144,679-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/acyclic-770bb.mir
+50,477-50,088llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+92,827-0llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+42,349-42,348llvm/test/MC/AMDGPU/gfx8_asm_vop3.s
+764,188-265,66644,684 files not shown
+6,747,763-3,384,96444,690 files

LLVM/project d4ae620clang/lib/CIR/Dialect/IR CIRDialect.cpp, clang/unittests/CIR ControlFlowTest.cpp CMakeLists.txt

[CIR] Add RegionBranchOpInterface unit tests and fix control flow bugs

Add unit tests for RegionBranchOpInterface implementations across CIR
control flow operations: IfOp, ScopeOp, TernaryOp, SwitchOp, WhileOp,
ForOp, DoWhileOp, and TryOp. The tests verify successor regions,
terminator successors, loop detection, repetitive region marking, and
op/terminator successor consistency.

Fix a missing return in ConditionOp::getSuccessorRegions that caused
fallthrough from the loop case to an unconditional cast<AwaitOp>,
crashing when the parent is a loop operation.

Fix IfOp::getSuccessorRegions to report parent exit as a successor
when the else region is absent, correctly modeling the case where the
condition is false.
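
The IfOp fix can be pictured with a small model of successor computation. This is an assumed simplification in Python, not the CIR code: `if_successor_regions` and the string region labels are hypothetical, and the real interface works with region objects rather than names.

```python
PARENT = "parent"  # stands for control returning to the enclosing op


def if_successor_regions(from_region, has_else):
    """Possible successors of a toy if operation.

    from_region is None when control enters the op from its parent,
    otherwise it names the region that is being left.
    """
    if from_region is not None:
        # Leaving either region hands control back to the parent.
        return [PARENT]

    successors = ["then"]
    if has_else:
        successors.append("else")
    else:
        # Without an else region, a false condition skips the op entirely,
        # so the parent exit must be reported as a successor too.
        successors.append(PARENT)
    return successors


print(if_successor_regions(None, has_else=False))
print(if_successor_regions(None, has_else=True))
```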
DeltaFile
+458-0clang/unittests/CIR/ControlFlowTest.cpp
+3-3clang/lib/CIR/Dialect/IR/CIRDialect.cpp
+2-0clang/unittests/CIR/CMakeLists.txt
+463-33 files

LLVM/project 0d27ddalibc/src/__support/math CMakeLists.txt scalbnf16.h, libc/src/math/generic CMakeLists.txt

[libc][math] Refactor scalbln-scalbn-ldexp family to header-only (#195423)

Refactors the scalbln-scalbn-ldexp math family to be header-only.

part of: #147386

Target Functions:
  - ldexp
  - ldexpbf16
  - ldexpl
  - scalbln
  - scalblnbf16
  - scalblnf
  - scalblnf128
  - scalblnf16
  - scalblnl
  - scalbn
  - scalbnbf16
  - scalbnf

    [2 lines not shown]
DeltaFile
+209-12utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+156-2libc/src/__support/math/CMakeLists.txt
+15-43libc/src/math/generic/CMakeLists.txt
+36-0libc/src/__support/math/scalbnf16.h
+36-0libc/src/__support/math/scalbnf128.h
+36-0libc/src/__support/math/scalblnf16.h
+488-5746 files not shown
+1,326-18252 files

LLVM/project 39d1203libc/src/__support/math CMakeLists.txt remainderf16.h, libc/test/shared shared_math_constexpr_test.cpp shared_math_test.cpp

[libc][math] Refactor remainder-remquo family to header-only (#195421)

Refactors the remainder-remquo math family to be header-only.

part of: #147386

Target Functions:
  - remainder
  - remainderbf16
  - remainderf
  - remainderf128
  - remainderf16
  - remainderl
  - remquo
  - remquobf16
  - remquof
  - remquof128
  - remquof16
  - remquol
DeltaFile
+176-6utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+114-0libc/src/__support/math/CMakeLists.txt
+34-0libc/test/shared/shared_math_constexpr_test.cpp
+33-0libc/test/shared/shared_math_test.cpp
+32-0libc/src/__support/math/remainderf16.h
+32-0libc/src/__support/math/remainderf128.h
+421-639 files not shown
+1,078-9645 files

LLVM/project 31068f3llvm/lib/DWARFLinker/Parallel DWARFLinkerCompileUnit.h

[llvm][doc][nfc] Fix typo (#195418)

asinchronously -> asynchronously
DeltaFile
+1-1llvm/lib/DWARFLinker/Parallel/DWARFLinkerCompileUnit.h
+1-11 files

LLVM/project f5c0d91llvm/lib/Target/X86 X86InstrAVX10.td X86IntrinsicsInfo.h, llvm/test/CodeGen/X86 avx10_2bf16-select-minmax.ll

[X86][AVX10.2] Use SDNode patterns based lowering for VMINBF16/VMAXBF16 (#194987)

This PR adds direct SDNode-based selection for AVX10.2 BF16 vmin/vmax.
This unblocks the select-minmax DAG combine which would earlier hit a
selection failure.
DeltaFile
+92-0llvm/test/CodeGen/X86/avx10_2bf16-select-minmax.ll
+2-24llvm/lib/Target/X86/X86InstrAVX10.td
+6-0llvm/lib/Target/X86/X86IntrinsicsInfo.h
+100-243 files

LLVM/project ce7fbcblibc/src/__support/math faddf128.h faddl.h, libc/test/shared shared_math_constexpr_test.cpp CMakeLists.txt

[libc][math] Qualify fadd functions to constexpr (#194322)

Signed-off-by: udaykiriti <udaykiriti624 at gmail.com>
DeltaFile
+4-0libc/test/shared/shared_math_constexpr_test.cpp
+3-0libc/test/shared/CMakeLists.txt
+1-1libc/src/__support/math/faddf128.h
+1-1libc/src/__support/math/faddl.h
+1-1libc/src/__support/math/fadd.h
+10-35 files

LLVM/project 57c6854libcxx/lib/abi i686-linux-android23.libcxxabi.v1.stable.exceptions.nonew.abilist i686-linux-android21.libcxxabi.v1.stable.exceptions.nonew.abilist, libcxx/utils adb_run.py

Update Android CI and Emulator image to API 23 (#194936)

As seen in https://github.com/android/ndk/issues/2188, the NDK will raise
its minimum supported version to API 23 (Android 6.0) in r31. We need to
bump the API level of the x86 emulator image so we can use it for the
CI. This required generating a new ABI list for API 23, removing the
old API 21 list, and changing adb_run.py to filter out warnings
and get permissions for the adb run folder.
DeltaFile
+2,338-0libcxx/lib/abi/i686-linux-android23.libcxxabi.v1.stable.exceptions.nonew.abilist
+0-2,337libcxx/lib/abi/i686-linux-android21.libcxxabi.v1.stable.exceptions.nonew.abilist
+2,332-0libcxx/lib/abi/x86_64-linux-android23.libcxxabi.v1.stable.exceptions.nonew.abilist
+0-2,331libcxx/lib/abi/x86_64-linux-android21.libcxxabi.v1.stable.exceptions.nonew.abilist
+28-5libcxx/utils/adb_run.py
+4-4runtimes/cmake/android/Arch-x86_64.cmake
+4,702-4,6775 files not shown
+4,717-4,69211 files

LLVM/project edfb913libc/src/__support/math CMakeLists.txt fminimum_mag_numf16.h, utils/bazel/llvm-project-overlay/libc BUILD.bazel

[libc][math] Refactor fmaximum-mag-num-fminimum-mag-num family to header-only (#195415)

Refactors the fmaximum-mag-num-fminimum-mag-num math family to be
header-only.

part of: #147386

Target Functions:
  - fmaximum_mag_numf128
  - fmaximum_mag_numf16
  - fmaximum_mag_numl
  - fminimum_mag_num
  - fminimum_mag_numbf16
  - fminimum_mag_numf
  - fminimum_mag_numf128
  - fminimum_mag_numf16
  - fminimum_mag_numl
DeltaFile
+129-4utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+88-0libc/src/__support/math/CMakeLists.txt
+31-0libc/src/__support/math/fminimum_mag_numf16.h
+31-0libc/src/__support/math/fmaximum_mag_numf128.h
+31-0libc/src/__support/math/fmaximum_mag_numf16.h
+31-0libc/src/__support/math/fminimum_mag_numf128.h
+341-429 files not shown
+786-6035 files

LLVM/project a5c9be0libc/src/__support/math CMakeLists.txt fromfpxf128.h, libc/src/math/generic CMakeLists.txt

[libc][math] Refactor fromfp family to header-only (#195413)

Refactors the fromfp math family to be header-only.

part of: #147386

Target Functions:
  - fromfp
  - fromfpbf16
  - fromfpf
  - fromfpf128
  - fromfpf16
  - fromfpl
  - fromfpx
  - fromfpxbf16
  - fromfpxf
  - fromfpxf128
  - fromfpxf16
  - fromfpxl
DeltaFile
+176-6utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+114-0libc/src/__support/math/CMakeLists.txt
+12-24libc/src/math/generic/CMakeLists.txt
+32-0libc/src/__support/math/fromfpxf128.h
+32-0libc/src/__support/math/fromfpf16.h
+32-0libc/src/__support/math/fromfpf128.h
+398-3037 files not shown
+1,029-8043 files

LLVM/project 61a401elibc/src/__support/math CMakeLists.txt fmodf16.h, libc/src/math/generic CMakeLists.txt

[libc][math] Refactor fmod-modf family to header-only (#195406)

Refactors the fmod-modf math family to be header-only.

part of: #147386

Target Functions:
  - fmod
  - fmodbf16
  - fmodf
  - fmodf128
  - fmodf16
  - fmodl
  - modf
  - modfbf16
  - modff
  - modff128
  - modff16
  - modfl
DeltaFile
+157-10utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+80-49libc/test/shared/shared_math_constexpr_test.cpp
+113-0libc/src/__support/math/CMakeLists.txt
+12-24libc/src/math/generic/CMakeLists.txt
+31-0libc/src/__support/math/fmodf16.h
+31-0libc/src/__support/math/modff16.h
+424-8339 files not shown
+1,081-13945 files

LLVM/project b2f9210llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/X86 sibling-loops-mismatched-tripcount.ll

[SLP] Keep loops BTCs across CurrentLoopNest truncations

Record SCEV BTCs in a per-depth vector so a later loop nest reaching a
previously merged depth via the empty, divergence, or extend branch in
buildTreeRec is re-validated.

Pull Request: https://github.com/llvm/llvm-project/pull/195411
DeltaFile
+39-4llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+24-6llvm/test/Transforms/SLPVectorizer/X86/sibling-loops-mismatched-tripcount.ll
+63-102 files

LLVM/project 6d46b0bclang/test/Instrumentor UnreachableRT.cpp InstrumentorUnreachable.cpp, llvm/include/llvm/Transforms/IPO Instrumentor.h

[Instrumentor] Add unreachable support; unreachable stack trace printing

Allow instrumenting unreachable and provide a use case: stack trace
printing.
DeltaFile
+22-0llvm/include/llvm/Transforms/IPO/Instrumentor.h
+20-0clang/test/Instrumentor/UnreachableRT.cpp
+20-0clang/test/Instrumentor/InstrumentorUnreachable.cpp
+15-0clang/test/Instrumentor/UnreachableRT.json
+13-1llvm/lib/Transforms/IPO/Instrumentor.cpp
+5-1clang/test/Instrumentor/lit.local.cfg
+95-26 files