LLVM/project 09d7916llvm/test/CodeGen/X86 bitcnt-big-integer.ll

[X86] bitcnt-big-integer.ll - add non-VLX avx512vpopcntdq test coverage (#182676)

Pulled out of #182547
DeltaFile
+1,693-852llvm/test/CodeGen/X86/bitcnt-big-integer.ll
+1,693-8521 files

LLVM/project 4ec0c73llvm/lib/Target/Hexagon HexagonFrameLowering.cpp, llvm/test/CodeGen/Hexagon bfloat_vec.ll frame-pointer-attr.ll

[Hexagon] Fix hasFP to respect frame-pointer attribute unconditionally (#181524)

HexagonFrameLowering::hasFPImpl() incorrectly gated the
DisableFramePointerElim check behind MFI.getStackSize() > 0. This meant
leaf functions with no stack allocation would not get a frame pointer
even when "frame-pointer"="all" (-fno-omit-frame-pointer) was set,
violating the user/ABI request. Every other LLVM target checks
DisableFramePointerElim unconditionally.

Move the DisableFramePointerElim and EliminateFramePointer checks
outside the getStackSize() > 0 guard so they are always evaluated.
Update affected tests whose CHECK patterns change due to the now-
correct allocframe emission.
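
The control-flow change described above can be sketched in isolation. This is a minimal model under stated assumptions, not the code from HexagonFrameLowering.cpp: `FrameState`, `hasFPBefore`, `hasFPAfter`, and `leafExample` are hypothetical names, and only the DisableFramePointerElim condition is modeled (the EliminateFramePointer check is left out).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the state the commit message refers to.
struct FrameState {
  uint64_t StackSize;           // models MFI.getStackSize()
  bool DisableFramePointerElim; // models "frame-pointer"="all"
};

// Before: the attribute is only consulted when the function has a stack.
bool hasFPBefore(const FrameState &F) {
  if (F.StackSize > 0) {
    if (F.DisableFramePointerElim)
      return true;
  }
  return false;
}

// After: the attribute is checked unconditionally.
bool hasFPAfter(const FrameState &F) {
  if (F.DisableFramePointerElim)
    return true;
  if (F.StackSize > 0) {
    // Stack-dependent conditions would remain here (omitted in this sketch).
  }
  return false;
}

// A leaf function with no stack allocation but "frame-pointer"="all".
void leafExample() {
  FrameState Leaf = {0, true};
  assert(!hasFPBefore(Leaf)); // old logic: no frame pointer
  assert(hasFPAfter(Leaf));   // new logic: frame pointer is kept
}
```

In this model the two versions only disagree when the stack size is zero and the attribute is set, which is the leaf-function case the message describes.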
DeltaFile
+63-44llvm/test/CodeGen/Hexagon/bfloat_vec.ll
+28-0llvm/test/CodeGen/Hexagon/frame-pointer-attr.ll
+8-4llvm/lib/Target/Hexagon/HexagonFrameLowering.cpp
+3-3llvm/test/CodeGen/Hexagon/constp-extract.ll
+1-1llvm/test/CodeGen/Hexagon/hasfp-crash1.ll
+103-525 files

LLVM/project 8e22227llvm/cmake/modules TableGen.cmake

Revert "[CMake][TableGen] Fix Ninja depslog error with implicit outputs on Ninja <1.10" (#182695)

Reverts llvm/llvm-project#179842

This seems to break some dependency tracking, as I no longer see .inc
files being regenerated when I update a TableGen .cpp file. Reverting
for now per the discussion on the PR.
DeltaFile
+3-43llvm/cmake/modules/TableGen.cmake
+3-431 files

LLVM/project 8f25c6bllvm/lib/Target/AMDGPU AMDGPUISelLowering.cpp

Capitalize
DeltaFile
+41-41llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+41-411 files

LLVM/project cd8c5c2llvm/lib/Target/AMDGPU AMDGPULegalizerInfo.cpp AMDGPUISelLowering.cpp, llvm/test/CodeGen/AMDGPU llvm.exp10.f64.ll llvm.exp.f64.ll

AMDGPU: Implement expansion for f64 exp

I asked AI to port the device libs reference implementation.
It mostly worked, though it got the compares wrong and also
missed a fold that happened in the compiler. With that fixed I get
identical DAG output, and almost the same GlobalISel output (differing
by an inverted compare and select). Also adjusted some stylistic choices.
DeltaFile
+11,178-0llvm/test/CodeGen/AMDGPU/llvm.exp10.f64.ll
+10,242-0llvm/test/CodeGen/AMDGPU/llvm.exp.f64.ll
+9,987-0llvm/test/CodeGen/AMDGPU/llvm.exp2.f64.ll
+117-9llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+116-1llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+31-7llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
+31,671-176 files not shown
+31,729-6512 files

LLVM/project cfa483bllvm/cmake/modules TableGen.cmake

Revert "[CMake][TableGen] Fix Ninja depslog error with implicit outputs on Ni…"

This reverts commit 2a0b93546138c7250b9c674647150cbe4298e8e9.
DeltaFile
+3-43llvm/cmake/modules/TableGen.cmake
+3-431 files

LLVM/project b785d4allvm/lib/Transforms/InstCombine InstCombineSelect.cpp, llvm/test/Transforms/InstCombine nanless-canonicalize-combine.ll

InstCombine: Fold out nanless canonicalize pattern

Pattern match a wrapper around llvm.canonicalize which
weakens the semantics to not require quieting signaling
nans. Depending on the denormal mode and FP type, we can
either drop the pattern entirely or reduce it only to
a canonicalize call. I'm inventing this pattern to deal
with LLVM's lax canonicalization model in math library
code.

The math library code currently has explicit checks for
the denormal mode, and conditionally canonicalizes the
result if there is flushing. Semantically, this could be
directly replaced with a simple call to llvm.canonicalize,
but doing so would incur an additional cost when using
standard IEEE behavior. If we do not care about quieting
a signaling nan, this should be a no-op unless the denormal
mode may flush. This will allow replacement of the
conditional code with a zero cost abstraction utility

    [17 lines not shown]
DeltaFile
+51-155llvm/test/Transforms/InstCombine/nanless-canonicalize-combine.ll
+103-0llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
+154-1552 files

LLVM/project a6416a8llvm/test/TableGen RegisterInfoEmitter-regcost-tuple.td

[NFC] Simplify a RegisterInfoEmitter lit test (#182672)

Eliminate SubRegIndex defs that are not used/required for the test.
DeltaFile
+0-19llvm/test/TableGen/RegisterInfoEmitter-regcost-tuple.td
+0-191 files

LLVM/project 2f43a1ellvm/test/Transforms/InstCombine nanless-canonicalize-combine.ll

InstCombine: Add baseline test for nanless canonicalize combine
DeltaFile
+832-0llvm/test/Transforms/InstCombine/nanless-canonicalize-combine.ll
+832-01 files

LLVM/project 4846e3allvm/lib/Transforms/InstCombine InstCombineSelect.cpp, llvm/test/Transforms/InstCombine select-fcmp-fmul-zero-absorbing-value.ll

InstCombine: Fold absorbing fmul of compared 0 into select (#172381)

This is similar to the select-bin-op identity case, except
in this case we are looking for the absorbing value for the
binary operator.

If the compared value is a floating-point 0, and the fmul is
implied to return a +0, put the 0 directly into the select
operand. This pattern appears in scale-if-denormal sequences
after optimizations assume denormals are treated as 0.

Fold:

```
%fabs.x = call float @llvm.fabs.f32(float %x)
%mul.fabs.x = fmul float %fabs.x, known_positive
%x.is.zero = fcmp oeq float %x, 0.0
%select = select i1 %x.is.zero, float %mul.fabs.x, float %fabs.x
```

    [12 lines not shown]
DeltaFile
+39-19llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
+7-15llvm/test/Transforms/InstCombine/select-fcmp-fmul-zero-absorbing-value.ll
+46-342 files

LLVM/project 18131dcllvm/test lit.cfg.py, llvm/utils profcheck-xfail.txt

[ProfCheck] Exclude bitcode tests

These tests fail due to inserted function entry count annotations. Just
exclude them for now given they aren't actually running any passes.
DeltaFile
+2-2llvm/test/lit.cfg.py
+0-2llvm/utils/profcheck-xfail.txt
+2-42 files

LLVM/project 20bce6dclang-tools-extra/clang-tidy/performance FasterStringFindCheck.cpp, clang-tools-extra/docs ReleaseNotes.rst

[clang-tidy] Teach `performance-faster-string-find` about `starts_with`, `ends_with`, and `contains` (#182633)

These aren't "find" functions per se, so they don't totally match the
check name, but the same optimization is applicable to them (for
example, see
https://en.cppreference.com/w/cpp/string/basic_string_view/starts_with.html).
This optimization could be expanded to `operator+=` as well, but that's
a bit more involved, so I'm not doing it in this PR.
DeltaFile
+15-0clang-tools-extra/test/clang-tidy/checkers/performance/faster-string-find.cpp
+6-8clang-tools-extra/clang-tidy/performance/FasterStringFindCheck.cpp
+4-3clang-tools-extra/docs/clang-tidy/checks/performance/faster-string-find.rst
+5-0clang-tools-extra/docs/ReleaseNotes.rst
+30-114 files

LLVM/project d01b078llvm/lib/Transforms/InstCombine InstCombineSelect.cpp, llvm/test/Transforms/InstCombine select-fcmp-fmul-zero-absorbing-value.ll

InstCombine: Fold absorbing fmul of compared 0 into select

This is similar to the select-bin-op identity case, except
in this case we are looking for the absorbing value for the
binary operator.

If the compared value is a floating-point 0, and the fmul is
implied to return a +0, put the 0 directly into the select
operand. This pattern appears in scale-if-denormal sequences
after optimizations assume denormals are treated as 0.

Fold:
  %fabs.x = call float @llvm.fabs.f32(float %x)
  %mul.fabs.x = fmul float %fabs.x, known_positive
  %x.is.zero = fcmp oeq float %x, 0.0
  %select = select i1 %x.is.zero, float %mul.fabs.x, float %fabs.x

To:
  %fabs.x = call float @llvm.fabs.f32(float %x)

    [5 lines not shown]
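
The numerical claim behind the fold can be checked with a small standalone sketch. The rest of the "To:" sequence is elided in this listing, so the code below does not reproduce it. It only checks that, when x compares equal to 0, `fabs(x) * k` is +0 for a positive, finite `k`, so a literal +0.0 in that select arm gives the same result. All function names are hypothetical, and "positive and finite" is an assumption about what `known_positive` implies.

```cpp
#include <cassert>
#include <cmath>

// The pattern before the fold: the multiply sits in the "x is zero" arm.
float selectBefore(float X, float K) {
  float FabsX = std::fabs(X);
  return X == 0.0f ? FabsX * K : FabsX;
}

// The same select with a literal +0.0 in place of the multiply.
float selectWithZero(float X) {
  float FabsX = std::fabs(X);
  return X == 0.0f ? 0.0f : FabsX;
}

// Equal value and equal sign bit, so +0.0 and -0.0 are told apart.
bool sameFloat(float A, float B) {
  return A == B && std::signbit(A) == std::signbit(B);
}

void foldExample() {
  const float K = 8.0f; // assumed positive and finite
  const float Inputs[] = {0.0f, -0.0f, 1.5f, -2.0f};
  for (float X : Inputs)
    assert(sameFloat(selectBefore(X, K), selectWithZero(X)));
}
```

Both +0.0 and -0.0 take the "is zero" arm, and `fabs` clears the sign first, so the product is +0.0 in either case.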
DeltaFile
+39-19llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
+7-15llvm/test/Transforms/InstCombine/select-fcmp-fmul-zero-absorbing-value.ll
+46-342 files

LLVM/project 6e0054allvm/lib/Transforms/Scalar Scalarizer.cpp, llvm/test/Transforms/Scalarizer constant-extractelement.ll

[Scalarizer] Fix out-of-bounds crash (#180359)

When processing an extractelement instruction with an index that exceeds
the vector size (e.g., extracting index 2147483647 from a 4-element
vector), the scalarizer would calculate an out-of-bounds Fragment index
and crash with an assertion failure in `SmallVector::operator[]`.

This PR adds a bounds check in
`ScalarizerVisitor::visitExtractElementInst` to prevent a crash when the
extractelement index is out of bounds.

Fixes #179880
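
The kind of guard described above can be sketched outside of LLVM. This is an illustration only: `tryExtractFragment` and `extractExample` are hypothetical names, a `std::vector<int>` stands in for the scalarized fragments, and what the real pass does with the instruction after the check fails is not shown here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of a vector already split into scalar fragments.
// Returns false (and leaves Out untouched) when Idx is out of range,
// instead of indexing past the end of the container.
bool tryExtractFragment(const std::vector<int> &Fragments, uint64_t Idx,
                        int &Out) {
  if (Idx >= Fragments.size())
    return false; // out-of-bounds constant index: bail out
  Out = Fragments[Idx];
  return true;
}

void extractExample() {
  std::vector<int> Fragments = {10, 20, 30, 40}; // a 4-element vector
  int Out = -1;
  assert(tryExtractFragment(Fragments, 2, Out) && Out == 30);
  assert(!tryExtractFragment(Fragments, 2147483647ULL, Out));
  assert(Out == 30); // unchanged by the failed lookup
}
```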
DeltaFile
+10-0llvm/test/Transforms/Scalarizer/constant-extractelement.ll
+2-0llvm/lib/Transforms/Scalar/Scalarizer.cpp
+12-02 files

LLVM/project a67bf7d.github/workflows lldb-pylint-action.yml

Remove whitespace on blank lines (#182574)

I removed some whitespace on a workflow job, which only had spaces.
I did not remove the newline completely, only the whitespace junk,
which I found by git diffing the head.
DeltaFile
+2-2.github/workflows/lldb-pylint-action.yml
+2-21 files

LLVM/project 7a1c498llvm/test/Transforms/InstCombine select-fcmp-fmul-zero-absorbing-value.ll

[InstCombine] Update test

This was breaking buildbots due to a mid-air collision where some change
caused test differences between when the test was put up/passed CI and
when it landed.
DeltaFile
+1-1llvm/test/Transforms/InstCombine/select-fcmp-fmul-zero-absorbing-value.ll
+1-11 files

LLVM/project 7ed0aa2offload/plugins-nextgen/level_zero/include L0Plugin.h, offload/plugins-nextgen/level_zero/src L0Program.cpp L0Kernel.cpp

[OFFLOAD][L0] Remove leftover global constructor (#182611) (#182665)

Fixes #182611
DeltaFile
+5-2offload/plugins-nextgen/level_zero/src/L0Program.cpp
+3-3offload/plugins-nextgen/level_zero/include/L0Plugin.h
+3-3offload/plugins-nextgen/level_zero/src/L0Kernel.cpp
+0-4offload/plugins-nextgen/level_zero/src/L0Plugin.cpp
+11-124 files

LLVM/project 15430ballvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp

[DAGCombiner] Use APInt::isPowerOf2() instead of popcount() == 1. NFC (#182600)
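
The NFC label rests on the two predicates agreeing for every value. A minimal sketch of that equivalence on `uint64_t` rather than `APInt` follows; `popCount`, `isPowerOfTwo`, and `checkEquivalence` are hypothetical helpers written for this illustration, not LLVM functions.

```cpp
#include <cassert>
#include <cstdint>

// Counts set bits by clearing the lowest one on each iteration.
unsigned popCount(uint64_t V) {
  unsigned N = 0;
  for (; V != 0; V &= V - 1)
    ++N;
  return N;
}

// Power-of-two test: non-zero, and clearing the lowest set bit leaves zero.
bool isPowerOfTwo(uint64_t V) { return V != 0 && (V & (V - 1)) == 0; }

void checkEquivalence() {
  for (uint64_t V = 0; V < 4096; ++V)
    assert((popCount(V) == 1) == isPowerOfTwo(V));
  assert(isPowerOfTwo(uint64_t(1) << 63));
  assert(!isPowerOfTwo(0));
}
```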

DeltaFile
+1-1llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+1-11 files

LLVM/project fe5096fllvm/include/llvm/IR PatternMatch.h

[PatternMatch] Use APInt::tryZExtValue. NFC (#182618)

DeltaFile
+3-2llvm/include/llvm/IR/PatternMatch.h
+3-21 files

LLVM/project 8d3e6e7llvm/lib/Transforms/InstCombine InstCombineCalls.cpp, llvm/test/Transforms/InstCombine vector-reductions.ll

[InstCombine] Transform splat before n x i1 for vec.reduce.add (#182213)

```llvm
define i1 @src(i1 %0) {
  %2 = insertelement <8 x i1> poison, i1 %0, i32 0
  %3 = shufflevector <8 x i1> %2, <8 x i1> poison, <8 x i32> zeroinitializer
  %4 = tail call i1 @llvm.vector.reduce.add.v8i1(<8 x i1> %3)
  ret i1 %4
}

define i1 @tgt(i1 %0) {
  ret i1 0
}
```

alive2: https://alive2.llvm.org/ce/z/vejxot

`vector_reduce_add(<n x i1>)` to `Trunc(ctpop(bitcast <n x i1> to in))`
interferes with the `vector_reduce_add(<splat>)` to `mul`, so I

    [2 lines not shown]
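
Why `@src` reduces to `ret i1 0` can be seen from a small model. Addition on `i1` wraps modulo 2, so summing n copies of the same bit gives that bit when n is odd and 0 when n is even. The sketch below is an illustration with hypothetical helper names; it does not reproduce the InstCombine code.

```cpp
#include <cassert>
#include <vector>

// Addition on i1 wraps modulo 2, so a reduce.add over i1 lanes is the
// parity (XOR) of the lanes.
bool reduceAddI1(const std::vector<bool> &Lanes) {
  bool Acc = false;
  for (bool B : Lanes)
    Acc = (Acc != B);
  return Acc;
}

// Closed form for a splat of B across N lanes: N * B modulo 2.
bool reduceAddSplatI1(unsigned N, bool B) { return (N % 2 == 1) && B; }

void splatExample() {
  for (unsigned N = 1; N <= 16; ++N)
    for (int B = 0; B <= 1; ++B)
      assert(reduceAddI1(std::vector<bool>(N, B != 0)) ==
             reduceAddSplatI1(N, B != 0));
  // Eight lanes: always false, matching `ret i1 0` in @tgt.
  assert(!reduceAddSplatI1(8, true));
}
```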
DeltaFile
+13-13llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+1-6llvm/test/Transforms/InstCombine/vector-reductions.ll
+14-192 files

LLVM/project c936398llvm/test/Transforms/InstCombine select-fcmp-fmul-zero-absorbing-value.ll

InstCombine: Add baseline test for fcmp-0-select combine (#172380)

DeltaFile
+601-0llvm/test/Transforms/InstCombine/select-fcmp-fmul-zero-absorbing-value.ll
+601-01 files

LLVM/project f1bfed1llvm/docs ReleaseNotes.md, llvm/lib/Target/ARM ARMISelLowering.cpp

[ARM] support `r14` as an alias for `lr` in inline assembly (#179740)

In rustc (and I suspect Clang and Zig) there is some special logic to
rewrite `r14` into `lr` when used in inline assembly. LLVM should
probably support `r14` directly.


https://developer.arm.com/documentation/ddi0211/i/programmer-s-model/registers/the-arm-state-register-set

> You can treat r14 as a general-purpose register at all other times.

This heavily suggests that we should be able to use it as a clobber and
read its value.

This is the ARM analogue to
https://github.com/llvm/llvm-project/pull/167783.
DeltaFile
+25-0llvm/test/CodeGen/ARM/inline-asm-clobber.ll
+4-0llvm/lib/Target/ARM/ARMISelLowering.cpp
+4-0llvm/docs/ReleaseNotes.md
+33-03 files

LLVM/project c3e318dlibcxx/test/std/atomics/atomics.types.operations/atomics.types.operations.req atomic_fetch_max.pass.cpp atomic_fetch_max_explicit.pass.cpp

header
DeltaFile
+1-0libcxx/test/std/atomics/atomics.types.operations/atomics.types.operations.req/atomic_fetch_max.pass.cpp
+1-0libcxx/test/std/atomics/atomics.types.operations/atomics.types.operations.req/atomic_fetch_max_explicit.pass.cpp
+1-0libcxx/test/std/atomics/atomics.types.operations/atomics.types.operations.req/atomic_fetch_min.pass.cpp
+1-0libcxx/test/std/atomics/atomics.types.operations/atomics.types.operations.req/atomic_fetch_min_explicit.pass.cpp
+4-04 files

LLVM/project a85a1dflibcxx/include/__atomic atomic.h atomic_ref.h, libcxx/include/__atomic/support gcc.h c11.h

address review comments
DeltaFile
+48-21libcxx/test/std/atomics/atomics.ref/fetch_max.pass.cpp
+48-21libcxx/test/std/atomics/atomics.ref/fetch_min.pass.cpp
+31-8libcxx/include/__atomic/atomic.h
+28-0libcxx/include/__atomic/atomic_ref.h
+12-14libcxx/include/__atomic/support/gcc.h
+8-8libcxx/include/__atomic/support/c11.h
+175-725 files not shown
+187-7611 files

LLVM/project 334502dllvm/docs/TableGen ProgRef.rst, llvm/lib/TableGen TGParser.cpp

[TableGen] Add let append/prepend syntax for field concatenation
DeltaFile
+110-0llvm/test/TableGen/let-append.td
+98-0mlir/test/mlir-tblgen/op-decl-and-defs.td
+82-0mlir/test/mlir-tblgen/typedefs.td
+82-0mlir/test/mlir-tblgen/attrdefs.td
+68-7llvm/lib/TableGen/TGParser.cpp
+41-2llvm/docs/TableGen/ProgRef.rst
+481-98 files not shown
+568-1614 files

LLVM/project b397c9dllvm/lib/Transforms/IPO FunctionAttrs.cpp, llvm/test/Transforms/FunctionAttrs nofpclass.ll

FunctionAttrs: Basic propagation of nofpclass (#182444)

DeltaFile
+317-0llvm/test/Transforms/FunctionAttrs/nofpclass.ll
+58-4llvm/lib/Transforms/IPO/FunctionAttrs.cpp
+375-42 files

LLVM/project aa13cd6llvm/lib/Transforms/IPO FunctionAttrs.cpp, llvm/test/Transforms/FunctionAttrs nofpclass.ll

Address comments
DeltaFile
+13-4llvm/lib/Transforms/IPO/FunctionAttrs.cpp
+1-1llvm/test/Transforms/FunctionAttrs/nofpclass.ll
+14-52 files

LLVM/project 5ecc64allvm/lib/Transforms/IPO FunctionAttrs.cpp, llvm/test/Transforms/FunctionAttrs nofpclass.ll

FunctionAttrs: Basic propagation of nofpclass

Perform caller->callee propagation of nofpclass on callsites. As
far as I can tell the only prior callsite to callee propagation here
was for norecurse. This doesn't handle transitive callers.

I was hoping to avoid doing this, and instead get attributor/attributor-light
enabled in the default pass pipeline. nofpclass propagation enabled by
default is the main blocker for eliminating the finite_only_opt global
check in device-libs, but this single level of propagation is most likely
sufficient for that use. Implementing this here is probably the most expedient
path to removing the control library.
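
One way to picture a single level of caller-to-callee propagation is as an intersection over call sites. The sketch below is an assumption about the general idea, not the FunctionAttrs implementation: the class bits, `propagateToCallee`, and `propagationExample` are hypothetical, and any conditions the real pass places on the callee (for example, that all of its callers are known) are not modeled.

```cpp
#include <cassert>
#include <vector>

// Hypothetical class bits; LLVM's real set of FP classes is larger.
const unsigned FPNan = 1u << 0;
const unsigned FPInf = 1u << 1;
const unsigned FPZero = 1u << 2;

// Each entry is the set of classes one call site rules out for an argument.
// The callee's parameter can only rule out what every call site rules out.
unsigned propagateToCallee(const std::vector<unsigned> &CallSiteMasks) {
  if (CallSiteMasks.empty())
    return 0; // no known call sites: nothing to infer
  unsigned Mask = ~0u;
  for (unsigned M : CallSiteMasks)
    Mask &= M;
  return Mask;
}

void propagationExample() {
  // One site excludes nan and inf, the other only nan.
  std::vector<unsigned> Sites = {FPNan | FPInf, FPNan};
  assert(propagateToCallee(Sites) == FPNan);
  // A site with no information removes every guarantee.
  Sites.push_back(0);
  assert(propagateToCallee(Sites) == 0);
}
```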
DeltaFile
+317-0llvm/test/Transforms/FunctionAttrs/nofpclass.ll
+49-4llvm/lib/Transforms/IPO/FunctionAttrs.cpp
+366-42 files

LLVM/project f875f8fllvm/lib/Transforms/IPO Attributor.cpp

Attributor: Avoid double map lookup in updateAttrMap

This will leave behind the map entry in the unchanged case,
but this seems to not matter. Could erase the newly inserted
entry if that happens, but that also doesn't seem to make a
difference.
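
The trade-off described above can be illustrated with a standard container. This sketch uses `std::map<int, int>` and hypothetical function names; it is not the Attributor code, and a default value of 0 stands in for "no attribute recorded".

```cpp
#include <cassert>
#include <map>
#include <utility>

typedef std::map<int, int> AttrMap;

// Two lookups: find() probes the map, then operator[] probes it again.
bool updateTwoLookups(AttrMap &M, int Key, int NewVal) {
  AttrMap::iterator It = M.find(Key);
  int Old = (It == M.end()) ? 0 : It->second;
  if (Old == NewVal)
    return false; // unchanged: the map is not touched
  M[Key] = NewVal; // second lookup
  return true;
}

// One lookup: insert() yields an iterator to the existing or new entry.
bool updateOneLookup(AttrMap &M, int Key, int NewVal) {
  std::pair<AttrMap::iterator, bool> Res = M.insert(std::make_pair(Key, 0));
  int &Slot = Res.first->second;
  if (Slot == NewVal)
    return false; // unchanged, but a default entry may be left behind
  Slot = NewVal;
  return true;
}

void lookupExample() {
  AttrMap A, B;
  assert(!updateTwoLookups(A, 7, 0) && A.empty());
  assert(!updateOneLookup(B, 7, 0) && B.size() == 1); // leftover entry
  assert(updateOneLookup(B, 7, 5) && B[7] == 5);
}
```

The leftover default entry in the unchanged case is the behavior the message says does not seem to matter.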
DeltaFile
+7-8llvm/lib/Transforms/IPO/Attributor.cpp
+7-81 files

LLVM/project 4a49122llvm/lib/Analysis ValueTracking.cpp, llvm/lib/Support KnownFPClass.cpp

ValueTracking: Handle tracking nan through powi (#179311)

DeltaFile
+161-1llvm/test/Transforms/Attributor/nofpclass-powi.ll
+15-0llvm/lib/Support/KnownFPClass.cpp
+1-1llvm/lib/Analysis/ValueTracking.cpp
+177-23 files