LLVM/project 20bce6d clang-tools-extra/clang-tidy/performance FasterStringFindCheck.cpp, clang-tools-extra/docs ReleaseNotes.rst

[clang-tidy] Teach `performance-faster-string-find` about `starts_with`, `ends_with`, and `contains` (#182633)

These aren't "find" functions per se, so they don't totally match the
check name, but the same optimization is applicable to them (for
example, see
https://en.cppreference.com/w/cpp/string/basic_string_view/starts_with.html).
This optimization could be expanded to `operator+=` as well, but that's
a bit more involved, so I'm not doing it in this PR.
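
As a rough illustration of the optimization the check suggests, the sketch below contrasts a one-character string literal with a character literal. The helper names are hypothetical, and the `starts_with` variants are compiled only when the standard library advertises them through the `__cpp_lib_starts_ends_with` feature-test macro.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helpers, for illustration only.

// Before: a one-character string literal selects the overload that has to
// handle a needle of arbitrary length.
inline bool hasSlashLiteral(std::string_view S) {
  return S.find("/") != std::string_view::npos;
}

// After: a character literal selects the single-character overload.
inline bool hasSlashChar(std::string_view S) {
  return S.find('/') != std::string_view::npos;
}

#if defined(__cpp_lib_starts_ends_with)
// The same rewrite applied to starts_with (C++20 and later).
inline bool startsWithSlashLiteral(std::string_view S) {
  return S.starts_with("/");
}
inline bool startsWithSlashChar(std::string_view S) {
  return S.starts_with('/');
}
#endif
```

Both forms of each pair return the same result; only the overload that gets selected differs.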
Delta  File
+15-0  clang-tools-extra/test/clang-tidy/checkers/performance/faster-string-find.cpp
+6-8  clang-tools-extra/clang-tidy/performance/FasterStringFindCheck.cpp
+4-3  clang-tools-extra/docs/clang-tidy/checks/performance/faster-string-find.rst
+5-0  clang-tools-extra/docs/ReleaseNotes.rst
+30-11  4 files

LLVM/project d01b078 llvm/lib/Transforms/InstCombine InstCombineSelect.cpp, llvm/test/Transforms/InstCombine select-fcmp-fmul-zero-absorbing-value.ll

InstCombine: Fold absorbing fmul of compared 0 into select

This is similar to the select-bin-op identity case, except
in this case we are looking for the absorbing value for the
binary operator.

If the compared value is a floating-point 0, and the fmul is
implied to return a +0, put the 0 directly into the select
operand. This pattern appears in scale-if-denormal sequences
after optimizations assume denormals are treated as 0.

Fold:
  %fabs.x = call float @llvm.fabs.f32(float %x)
  %mul.fabs.x = fmul float %fabs.x, known_positive
  %x.is.zero = fcmp oeq float %x, 0.0
  %select = select i1 %x.is.zero, float %mul.fabs.x, float %fabs.x

To:
  %fabs.x = call float @llvm.fabs.f32(float %x)

    [5 lines not shown]
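
The arithmetic behind "the fmul is implied to return a +0" can be checked with a small C++ model of the select in the "Fold:" input. This is only a sketch: `K` stands in for `known_positive` and is additionally assumed to be finite, the function names are hypothetical, and the elided output IR is not reproduced here.

```cpp
#include <cassert>
#include <cmath>

// Models the select from the "Fold:" input. X == 0.0f behaves like
// fcmp oeq: it is false for NaN and true for both +0 and -0.
inline float selectBefore(float X, float K) {
  float FabsX = std::fabs(X);
  float MulFabsX = FabsX * K; // K assumed positive and finite
  return X == 0.0f ? MulFabsX : FabsX;
}

// Same select with the true arm replaced by a literal +0, which is what
// |+-0| * K evaluates to under the assumptions above.
inline float selectWithZero(float X) {
  return X == 0.0f ? 0.0f : std::fabs(X);
}
```

For zero and non-zero inputs the two functions agree, including on the sign of a zero result.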
Delta  File
+39-19  llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
+7-15  llvm/test/Transforms/InstCombine/select-fcmp-fmul-zero-absorbing-value.ll
+46-34  2 files

LLVM/project 6e0054a llvm/lib/Transforms/Scalar Scalarizer.cpp, llvm/test/Transforms/Scalarizer constant-extractelement.ll

[Scalarizer] Fix out-of-bounds crash (#180359)

When processing an extractelement instruction with an index that exceeds
the vector size (e.g., extracting index 2147483647 from a 4-element
vector), the scalarizer would calculate an out-of-bounds Fragment index
and crash with an assertion failure in `SmallVector::operator[]`.

This PR adds a bounds check in
`ScalarizerVisitor::visitExtractElementInst` to prevent a crash when the
extractelement index is out of bounds.
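
The general shape of such a guard can be sketched in plain C++. This is not the Scalarizer code: `extractLane` and its types are hypothetical stand-ins that only show the index being checked before it is used.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for reading one scalarized lane.
inline std::optional<int> extractLane(const std::vector<int> &Lanes,
                                      std::uint64_t Idx) {
  if (Idx >= Lanes.size())
    return std::nullopt; // out of range: bail out instead of indexing
  return Lanes[Idx];
}
```

With a 4-element vector, index 2 yields a value while index 2147483647 yields an empty result rather than an out-of-range access.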

Fixes #179880
Delta  File
+10-0  llvm/test/Transforms/Scalarizer/constant-extractelement.ll
+2-0  llvm/lib/Transforms/Scalar/Scalarizer.cpp
+12-0  2 files

LLVM/project a67bf7d .github/workflows lldb-pylint-action.yml

Remove whitespace on blank lines (#182574)

I removed whitespace from lines in a workflow job that contained only
spaces. I did not remove the blank lines themselves, only the stray
whitespace, which I found by diffing against HEAD.
Delta  File
+2-2  .github/workflows/lldb-pylint-action.yml
+2-2  1 files

LLVM/project 7a1c498 llvm/test/Transforms/InstCombine select-fcmp-fmul-zero-absorbing-value.ll

[InstCombine] Update test

This was breaking buildbots due to a mid-air collision where some change
caused test differences between when the test was put up/passed CI and
when it landed.
Delta  File
+1-1  llvm/test/Transforms/InstCombine/select-fcmp-fmul-zero-absorbing-value.ll
+1-1  1 files

LLVM/project 7ed0aa2 offload/plugins-nextgen/level_zero/include L0Plugin.h, offload/plugins-nextgen/level_zero/src L0Program.cpp L0Kernel.cpp

[OFFLOAD][L0] Remove leftover global constructor (#182611) (#182665)

fixes #182611
Delta  File
+5-2  offload/plugins-nextgen/level_zero/src/L0Program.cpp
+3-3  offload/plugins-nextgen/level_zero/include/L0Plugin.h
+3-3  offload/plugins-nextgen/level_zero/src/L0Kernel.cpp
+0-4  offload/plugins-nextgen/level_zero/src/L0Plugin.cpp
+11-12  4 files

LLVM/project 15430ba llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp

[DAGCombiner] Use APInt::isPower2() instead of popcount() == 1. NFC (#182600)

Delta  File
+1-1  llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+1-1  1 files

LLVM/project fe5096f llvm/include/llvm/IR PatternMatch.h

[PatternMatch] Use APInt::tryZExtValue. NFC (#182618)

Delta  File
+3-2  llvm/include/llvm/IR/PatternMatch.h
+3-2  1 files

LLVM/project 8d3e6e7 llvm/lib/Transforms/InstCombine InstCombineCalls.cpp, llvm/test/Transforms/InstCombine vector-reductions.ll

[InstCombine] Transform splat before n x i1 for vec.reduce.add (#182213)

```llvm
define i1 @src(i1 %0) {
  %2 = insertelement <8 x i1> poison, i1 %0, i32 0
  %3 = shufflevector <8 x i1> %2, <8 x i1> poison, <8 x i32> zeroinitializer
  %4 = tail call i1 @llvm.vector.reduce.add.v8i1(<8 x i1> %3)
  ret i1 %4
}

define i1 @tgt(i1 %0) {
  ret i1 0
}
```

alive2: https://alive2.llvm.org/ce/z/vejxot

`vector_reduce_add(<n x i1>)` to `Trunc(ctpop(bitcast <n x i1> to in))`
interferes with the `vector_reduce_add(<splat>)` to `mul`, so I

    [2 lines not shown]
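
Why `@src` above reduces to `ret i1 0` can be seen by modelling the one-bit addition directly. The sketch below is an illustration with hypothetical function names, not the InstCombine code: adding N copies of a one-bit value wraps modulo 2, so an even lane count always gives 0.

```cpp
#include <cassert>

// Adds N copies of a one-bit value with wrap-around, i.e. arithmetic
// modulo 2, which is how an add reduction over i1 lanes behaves.
inline bool reduceAddSplatI1(bool X, unsigned N) {
  bool Sum = false;
  for (unsigned I = 0; I < N; ++I)
    Sum = Sum != X; // one-bit addition is XOR
  return Sum;
}

// Closed form of the same reduction: N * X modulo 2.
inline bool splatAsMul(bool X, unsigned N) { return X && (N % 2 != 0); }
```

For the 8-lane case in the example both functions return false for either input bit.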
Delta  File
+13-13  llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+1-6  llvm/test/Transforms/InstCombine/vector-reductions.ll
+14-19  2 files

LLVM/project c936398 llvm/test/Transforms/InstCombine select-fcmp-fmul-zero-absorbing-value.ll

InstCombine: Add baseline test for fcmp-0-select combine (#172380)

Delta  File
+601-0  llvm/test/Transforms/InstCombine/select-fcmp-fmul-zero-absorbing-value.ll
+601-0  1 files

LLVM/project f1bfed1 llvm/docs ReleaseNotes.md, llvm/lib/Target/ARM ARMISelLowering.cpp

[ARM] support `r14` as an alias for `lr` in inline assembly (#179740)

In rustc (and I suspect Clang and Zig) there is some special logic to
rewrite `r14` into `lr` when used in inline assembly. LLVM should
probably support `r14` directly.


https://developer.arm.com/documentation/ddi0211/i/programmer-s-model/registers/the-arm-state-register-set

> You can treat r14 as a general-purpose register at all other times.

This heavily suggests that we should be able to use it as a clobber and
read its value.

This is the ARM analogue to
https://github.com/llvm/llvm-project/pull/167783.
Delta  File
+25-0  llvm/test/CodeGen/ARM/inline-asm-clobber.ll
+4-0  llvm/lib/Target/ARM/ARMISelLowering.cpp
+4-0  llvm/docs/ReleaseNotes.md
+33-0  3 files

LLVM/project c3e318d libcxx/test/std/atomics/atomics.types.operations/atomics.types.operations.req atomic_fetch_max.pass.cpp atomic_fetch_max_explicit.pass.cpp

header
Delta  File
+1-0  libcxx/test/std/atomics/atomics.types.operations/atomics.types.operations.req/atomic_fetch_max.pass.cpp
+1-0  libcxx/test/std/atomics/atomics.types.operations/atomics.types.operations.req/atomic_fetch_max_explicit.pass.cpp
+1-0  libcxx/test/std/atomics/atomics.types.operations/atomics.types.operations.req/atomic_fetch_min.pass.cpp
+1-0  libcxx/test/std/atomics/atomics.types.operations/atomics.types.operations.req/atomic_fetch_min_explicit.pass.cpp
+4-0  4 files

LLVM/project a85a1df libcxx/include/__atomic atomic.h atomic_ref.h, libcxx/include/__atomic/support gcc.h c11.h

address review comments
Delta  File
+48-21  libcxx/test/std/atomics/atomics.ref/fetch_max.pass.cpp
+48-21  libcxx/test/std/atomics/atomics.ref/fetch_min.pass.cpp
+31-8  libcxx/include/__atomic/atomic.h
+28-0  libcxx/include/__atomic/atomic_ref.h
+12-14  libcxx/include/__atomic/support/gcc.h
+8-8  libcxx/include/__atomic/support/c11.h
+175-72  5 files not shown
+187-76  11 files

LLVM/project 334502d llvm/docs/TableGen ProgRef.rst, llvm/lib/TableGen TGParser.cpp

[TableGen] Add let append/prepend syntax for field concatenation
Delta  File
+110-0  llvm/test/TableGen/let-append.td
+98-0  mlir/test/mlir-tblgen/op-decl-and-defs.td
+82-0  mlir/test/mlir-tblgen/typedefs.td
+82-0  mlir/test/mlir-tblgen/attrdefs.td
+68-7  llvm/lib/TableGen/TGParser.cpp
+41-2  llvm/docs/TableGen/ProgRef.rst
+481-9  8 files not shown
+568-16  14 files

LLVM/project b397c9d llvm/lib/Transforms/IPO FunctionAttrs.cpp, llvm/test/Transforms/FunctionAttrs nofpclass.ll

FunctionAttrs: Basic propagation of nofpclass (#182444)

Delta  File
+317-0  llvm/test/Transforms/FunctionAttrs/nofpclass.ll
+58-4  llvm/lib/Transforms/IPO/FunctionAttrs.cpp
+375-4  2 files

LLVM/project aa13cd6 llvm/lib/Transforms/IPO FunctionAttrs.cpp, llvm/test/Transforms/FunctionAttrs nofpclass.ll

Address comments
Delta  File
+13-4  llvm/lib/Transforms/IPO/FunctionAttrs.cpp
+1-1  llvm/test/Transforms/FunctionAttrs/nofpclass.ll
+14-5  2 files

LLVM/project 5ecc64a llvm/lib/Transforms/IPO FunctionAttrs.cpp, llvm/test/Transforms/FunctionAttrs nofpclass.ll

FunctionAttrs: Basic propagation of nofpclass

Perform caller->callee propagation of nofpclass on callsites. As
far as I can tell the only prior callsite to callee propagation here
was for norecurse. This doesn't handle transitive callers.

I was hoping to avoid doing this, and instead get attributor/attributor-light
enabled in the default pass pipeline. nofpclass propagation enabled by
default is the main blocker for eliminating the finite_only_opt global
check in device-libs, but this single level of propagation is most likely
sufficient for that use. Implementing this here is probably the most expedient
path to removing the control library.
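
One way to picture a single level of caller-to-callee propagation is as an intersection over call sites. The sketch below is an assumption about the general idea, not the FunctionAttrs implementation: the class bits and `propagateToCallee` are hypothetical, and it presumes that every call site of the callee is known.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical class bits; LLVM's real FP class test has more categories.
enum : std::uint32_t { NoNan = 1u << 0, NoInf = 1u << 1, NoZero = 1u << 2 };

// A class may be excluded for the callee's parameter only if every known
// call site excludes it for the corresponding argument.
inline std::uint32_t
propagateToCallee(const std::vector<std::uint32_t> &CallSiteMasks) {
  if (CallSiteMasks.empty())
    return 0; // no known callers: assume nothing
  std::uint32_t Common = ~0u;
  for (std::uint32_t Mask : CallSiteMasks)
    Common &= Mask;
  return Common;
}
```

If one call site promises no NaN and no infinity and another promises only no NaN, the parameter can carry no more than the no-NaN guarantee.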
Delta  File
+317-0  llvm/test/Transforms/FunctionAttrs/nofpclass.ll
+49-4  llvm/lib/Transforms/IPO/FunctionAttrs.cpp
+366-4  2 files

LLVM/project f875f8f llvm/lib/Transforms/IPO Attributor.cpp

Attributor: Avoid double map lookup in updateAttrMap

This will leave behind the map entry in the unchanged case,
but this seems to not matter. Could erase the newly inserted
entry if that happens, but that also doesn't seem to make a
difference.
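
The trade-off described here can be illustrated with a standard container. This is a sketch using `std::unordered_map` and hypothetical names rather than the Attributor's own map type: the single-lookup version inserts a default entry on a miss and leaves it in place when nothing changes.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical key/value types; 0 plays the role of "no attributes".
using AttrMap = std::unordered_map<std::string, int>;

// Two lookups: find() first, then operator[] when the value changes.
inline bool updateTwoLookups(AttrMap &M, const std::string &K, int V) {
  auto It = M.find(K);
  int Current = It == M.end() ? 0 : It->second;
  if (Current == V)
    return false; // unchanged, and no entry is created
  M[K] = V;
  return true;
}

// One lookup: try_emplace value-initializes the slot on a miss.
inline bool updateOneLookup(AttrMap &M, const std::string &K, int V) {
  auto [It, Inserted] = M.try_emplace(K);
  (void)Inserted;
  if (It->second == V)
    return false; // unchanged, but a default entry may be left behind
  It->second = V;
  return true;
}
```

Both versions report the same changed/unchanged result; they differ only in whether an unchanged miss leaves an entry in the map.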
Delta  File
+7-8  llvm/lib/Transforms/IPO/Attributor.cpp
+7-8  1 files

LLVM/project 4a49122 llvm/lib/Analysis ValueTracking.cpp, llvm/lib/Support KnownFPClass.cpp

ValueTracking: Handle tracking nan through powi (#179311)

Delta  File
+161-1  llvm/test/Transforms/Attributor/nofpclass-powi.ll
+15-0  llvm/lib/Support/KnownFPClass.cpp
+1-1  llvm/lib/Analysis/ValueTracking.cpp
+177-2  3 files

LLVM/project ce5c193 llvm/lib/Target/RISCV RISCVMergeBaseOffset.cpp, llvm/test/CodeGen/RISCV xqcisls-merge-base-offset-shladd.ll

[RISCV] Fold shladd into Xqcisls scaled load/store in RISCVMergeBaseOffset (#182221)

We can fold `shxadd`/`qc.shladd` into base+offset load/store
instructions by transforming the load/store into `Xqcisls` scaled
load/store instructions.

For example:

```
qc.e.li vreg1, s 
shxadd vreg2, vreg3, vreg1
lx vreg4, imm(vreg2)

can be transformed to

qc.e.li vreg1, s+imm
qc.lrx vreg4, vreg1, vreg3, (1-7)
```


    [5 lines not shown]
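
The address arithmetic behind this fold can be checked with a small model. The sketch assumes 32-bit wrap-around addresses and that the scaled load computes base plus the shifted index; the function names are hypothetical and the instructions named in the comments are taken from the example above.

```cpp
#include <cassert>
#include <cstdint>

// Address computed by the original three-instruction sequence.
inline std::uint32_t addrBefore(std::uint32_t S, std::uint32_t Index,
                                unsigned Shamt, std::uint32_t Imm) {
  std::uint32_t Base = S;                         // qc.e.li vreg1, s
  std::uint32_t Scaled = (Index << Shamt) + Base; // shxadd vreg2, vreg3, vreg1
  return Scaled + Imm;                            // lx vreg4, imm(vreg2)
}

// Address computed after folding the immediate into the base.
inline std::uint32_t addrAfter(std::uint32_t S, std::uint32_t Index,
                               unsigned Shamt, std::uint32_t Imm) {
  std::uint32_t Base = S + Imm;   // qc.e.li vreg1, s+imm
  return Base + (Index << Shamt); // qc.lrx vreg4, vreg1, vreg3, shamt
}
```

Because unsigned addition is associative and commutative, the two forms give the same address for any inputs, including a negative immediate represented in two's complement.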
Delta  File
+192-0  llvm/test/CodeGen/RISCV/xqcisls-merge-base-offset-shladd.ll
+118-0  llvm/lib/Target/RISCV/RISCVMergeBaseOffset.cpp
+310-0  2 files

LLVM/project 1373aa0 llvm/test/CodeGen/AArch64 clmul-fixed.ll

[AArch64] Add clmulh/r v16i8/v8i16/v4i32/v2i64 test coverage (#182305)

Some of the v16i8/v2i64 tests are currently disabled due to #182270 and
#182039
Delta  File
+1,262-0  llvm/test/CodeGen/AArch64/clmul-fixed.ll
+1,262-0  1 files

LLVM/project f08cb41 llvm/lib/Transforms/IPO Attributor.cpp

Attributor: Avoid calling identifyDefaultAbstractAttributes on declarations

Previously it would be called and inserted into a visited map,
but would never be used. This could possibly go one step further
and never add declarations to the SetVector of Functions. If I try
that, only one call graph printing test fails.
Delta  File
+8-2  llvm/lib/Transforms/IPO/Attributor.cpp
+8-2  1 files

LLVM/project 24b9655 llvm/include/llvm/MC MCStreamer.h MCLFIRewriter.h, llvm/include/llvm/MC/MCParser MCAsmParserExtension.h

[NFC][LFI] Reduce includes due to c-t impact (#182617)

Removes header includes that don't need to be made at the top-level by
moving transitive dependencies directly into source files and using
forward declarations. Biggest impact is that we no longer include
`MCLFIRewriter.h` in `MCStreamer.h` and `MCAsmParserExtension.h`.
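
The technique of replacing an include with a forward declaration can be sketched in a single file. `Rewriter` and `Streamer` are hypothetical stand-ins, not the real MC classes: the "header" part only names the type, and the full definition is needed only where members are actually used.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// "Header" part: a forward declaration is enough, because the class below
// only stores a pointer to the type.
class Rewriter;

class Streamer {
public:
  Streamer();
  ~Streamer(); // defined where Rewriter is complete
  void setRewriter(std::unique_ptr<Rewriter> R);
  bool hasRewriter() const;

private:
  std::unique_ptr<Rewriter> TheRewriter;
};

// "Source file" part: the full definition is required only from here on.
class Rewriter {};

Streamer::Streamer() = default;
Streamer::~Streamer() = default;
void Streamer::setRewriter(std::unique_ptr<Rewriter> R) {
  TheRewriter = std::move(R);
}
bool Streamer::hasRewriter() const { return TheRewriter != nullptr; }
```

Code that includes only the "header" part no longer pays for parsing the definition of the forward-declared type.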
Delta  File
+2-4  llvm/include/llvm/MC/MCStreamer.h
+5-0  llvm/lib/MC/MCStreamer.cpp
+3-1  llvm/lib/MC/MCLFI.cpp
+1-1  llvm/include/llvm/MC/MCParser/MCAsmParserExtension.h
+1-1  llvm/include/llvm/MC/MCLFIRewriter.h
+1-1  llvm/lib/MC/MCLFIRewriter.cpp
+13-8  2 files not shown
+13-10  8 files

LLVM/project 7c61596 clang-tools-extra/clang-tidy/misc ConstCorrectnessCheck.cpp, clang-tools-extra/docs ReleaseNotes.rst

[clang-tidy] Correctly handle array of pointers in misc-const-correctness (#179059)

In arrays of pointers, `misc-const-correctness` check wrongly inspects
whether the array element type was const-qualified, rather than the type
it points to, leading to redundant `const` suggestions. This patch fixes
the problem.
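
The difference between the two questions, "is the element const?" and "is the pointee const?", can be shown with type traits. This is an illustration only, with hypothetical alias and variable names; it is not how the check itself inspects types.

```cpp
#include <cassert>
#include <type_traits>

using ArrayOfPtr = int *[2];              // mutable pointers to mutable ints
using ArrayOfPtrToConst = const int *[2]; // mutable pointers to const ints
using ArrayOfConstPtr = int *const[2];    // const pointers to mutable ints

// Is the array element (the pointer itself) const-qualified?
template <typename T>
constexpr bool ElementIsConst = std::is_const_v<std::remove_extent_t<T>>;

// Is the type the element points to const-qualified?
template <typename T>
constexpr bool PointeeIsConst =
    std::is_const_v<std::remove_pointer_t<std::remove_extent_t<T>>>;
```

For `ArrayOfPtrToConst` the element is not const while the pointee is, which is the kind of case where answering the wrong question leads to a different conclusion.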

Closes [#178880](https://github.com/llvm/llvm-project/issues/178880)
Delta  File
+27-0  clang-tools-extra/test/clang-tidy/checkers/misc/const-correctness-pointer-as-pointers.cpp
+2-3  clang-tools-extra/clang-tidy/misc/ConstCorrectnessCheck.cpp
+3-0  clang-tools-extra/docs/ReleaseNotes.rst
+32-3  3 files

LLVM/project af9ca0e flang/lib/Optimizer/Builder CUDAIntrinsicCall.cpp, flang/module cuda_runtime_api.f90

Revert "[flang][cuda] Add entry points for cudastreamsynchronize (#181932)" (#182657)

This is causing a testing issue. Reverting for now.
Delta  File
+0-11  flang/test/Lower/CUDA/cuda-default-stream.cuf
+0-10  flang/lib/Optimizer/Builder/CUDAIntrinsicCall.cpp
+0-10  flang/module/cuda_runtime_api.f90
+0-31  3 files

LLVM/project 0ef4b21 clang/docs CMakeLists.txt index.rst, clang/include/clang/Basic BuiltinsAMDGPUDocs.td BuiltinsAMDGPU.td

[Clang][AMDGPU][Docs] Add builtin documentation for AMDGPU builtins

Use the documentation generation infrastructure to document the AMDGPU builtins.
This PR starts with the ABI / Special Register builtins. Documentation for the
remaining builtin categories will be added incrementally in follow-up patches.
Delta  File
+268-0  clang/include/clang/Basic/BuiltinsAMDGPUDocs.td
+100-27  clang/include/clang/Basic/BuiltinsAMDGPU.td
+1-0  clang/docs/CMakeLists.txt
+1-0  clang/docs/index.rst
+370-27  4 files

LLVM/project 1aaa338 clang/include/clang/Basic BuiltinsBase.td, clang/test/TableGen builtin-docs.td

[Clang][TableGen] Add documentation generation infrastructure for builtins (#181573)

Add a `-gen-builtin-docs` TableGen backend that generates RST
documentation from builtin definitions, modeled after the existing
attribute documentation system (`-gen-attr-docs`).

The emitter generates per-builtin RST sections grouped by category,
including prototype rendering with optional named parameters (via
`ArgNames`), target feature annotations, and documentation content. A
mismatch between `ArgNames` count and prototype parameter count is a
fatal error.
Delta  File
+265-0  clang/test/TableGen/builtin-docs.td
+183-0  clang/utils/TableGen/ClangBuiltinsEmitter.cpp
+50-0  clang/include/clang/Basic/BuiltinsBase.td
+6-0  clang/utils/TableGen/TableGen.cpp
+2-0  clang/utils/TableGen/TableGenBackends.h
+506-0  5 files

LLVM/project 17ad555 clang/docs ReleaseNotes.rst, clang/include/clang/Sema Sema.h

[Clang] Added clang diagnostic when snprintf/vsnprintf uses sizeof(dest) for the len parameter

Closes: [#162366](https://github.com/llvm/llvm-project/issues/162366)

---------

Co-authored-by: Bogdan Zunic <bzunic at cisco.com>
Delta  File
+71-56  clang/lib/Sema/SemaChecking.cpp
+116-0  clang/test/SemaCXX/warn-memset-bad-sizeof.cpp
+3-0  clang/docs/ReleaseNotes.rst
+2-0  clang/include/clang/Sema/Sema.h
+192-56  4 files

LLVM/project d710b1c clang/docs CMakeLists.txt index.rst, clang/include/clang/Basic BuiltinsAMDGPUDocs.td BuiltinsAMDGPU.td

[Clang][AMDGPU][Docs] Add builtin documentation for AMDGPU builtins

Use the documentation generation infrastructure to document the AMDGPU builtins.
This PR starts with the ABI / Special Register builtins. Documentation for the
remaining builtin categories will be added incrementally in follow-up patches.
Delta  File
+291-0  clang/include/clang/Basic/BuiltinsAMDGPUDocs.td
+114-30  clang/include/clang/Basic/BuiltinsAMDGPU.td
+1-0  clang/docs/CMakeLists.txt
+1-0  clang/docs/index.rst
+407-30  4 files

LLVM/project 802e1af clang/utils/TableGen ClangBuiltinsEmitter.cpp

review comments
Delta  File
+18-22  clang/utils/TableGen/ClangBuiltinsEmitter.cpp
+18-22  1 files