LLVM/project 9642bc5.github/workflows prune-unused-branches.py

feedback

Created using spr 1.3.7
DeltaFile
+0-4.github/workflows/prune-unused-branches.py
+0-41 files

LLVM/project fb0d7dfllvm/unittests/IR MetadataTest.cpp

Drop the summation unittest since it's already covered by the gvn lit tests
DeltaFile
+0-26llvm/unittests/IR/MetadataTest.cpp
+0-261 files

LLVM/project fccfd89llvm/lib/IR Metadata.cpp

Move the check after merging for calls to simplify the condition
DeltaFile
+3-6llvm/lib/IR/Metadata.cpp
+3-61 files

LLVM/project 1c88701llvm/lib/IR Metadata.cpp, llvm/unittests/IR MetadataTest.cpp

[Metadata][profcheck] Handle identical MDNodes in getMergedProfMetadata

This fixes a bug where !prof metadata was dropped from SelectInsts when GVN simplified/merged them.
Guarded by -profcheck-disable-metadata-fixes. Exposed by the tests in
Transforms/SampleProfile.
DeltaFile
+50-0llvm/unittests/IR/MetadataTest.cpp
+12-0llvm/lib/IR/Metadata.cpp
+62-02 files
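
The identical-node case described in this commit can be sketched with a toy type. This is a minimal illustration, not the actual `getMergedProfMetadata` code: `ProfNode`, `mergeProfSketch`, and `checkMergeProfSketch` are hypothetical names, and the sketch assumes identical metadata is represented by the same object, so pointer equality is enough to detect it.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for a !prof metadata node (hypothetical, not llvm::MDNode).
struct ProfNode {
  std::vector<uint32_t> weights;
};

// Sketch of the merge rule from the commit message.
// Assumption: identical metadata is the same object, so comparing pointers
// detects it. Every other case is out of scope here and is modeled as
// "drop the metadata" (nullptr).
const ProfNode *mergeProfSketch(const ProfNode *a, const ProfNode *b) {
  if (a == nullptr || b == nullptr)
    return nullptr; // one side has no profile: nothing to keep
  if (a == b)
    return a; // identical nodes: keep the profile instead of dropping it
  return nullptr; // differing nodes: not handled by this sketch
}

// Usage example: call this from your own main().
void checkMergeProfSketch() {
  ProfNode n{{1, 3}};
  ProfNode m{{2, 2}};
  assert(mergeProfSketch(&n, &n) == &n);      // identical: preserved
  assert(mergeProfSketch(&n, &m) == nullptr); // different: dropped here
}
```

The point of the sketch is only the `a == b` branch: without it, two selects carrying the same profile would lose it when merged.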

LLVM/project e4b8d8alibcxx/include istream, libcxx/test/libcxx/input.output/iostream.format nodiscard.verify.cpp

[libc++][istream] Removed `[[nodiscard]]` from `peek()` (#175591)

Calling `peek()` after constructing a stream is something one can use to
make the stream ignore empty inputs:

```
#include <cstdio>
#include <sstream>

int main() {
  std::istringstream s;
  s.peek();
  while (s && !s.eof()) {
    char c;
    s >> c;
    printf("not eof; read \'%c\' (%d)\n", c, c);
  }
}
```


    [2 lines not shown]
DeltaFile
+0-3libcxx/test/libcxx/input.output/iostream.format/nodiscard.verify.cpp
+1-1libcxx/include/istream
+1-42 files

LLVM/project bd28c6allvm/include/llvm/IR DebugInfoFlags.def, llvm/test/Assembler debug-info.ll disubprogram.ll

[DebugInfo] Add a new DI flag to record if the name of a template function/type has been simplified (1/3). (#175130)

This flag is used during debug info generation in the LLVM backend to
guide the selective generation of template parameters in the skeleton
CU, as described in [this
RFC](https://discourse.llvm.org/t/rfc-debuginfo-selectively-generate-template-parameters-in-the-skeleton-cu/89395).
DeltaFile
+5-2llvm/test/Assembler/debug-info.ll
+5-2llvm/test/Assembler/disubprogram.ll
+2-1llvm/include/llvm/IR/DebugInfoFlags.def
+12-53 files

LLVM/project bb008e7llvm/utils git-llvm-push

[llvm][utils] Make git-llvm-push set the skip-precommit-approval label (#174833)

The skip-precommit-approval label is intended for simple PRs that don't
require approval. To reduce the volume of notifications, label all PRs
created using the git-llvm-push script with the skip-precommit-approval
label.

Fixes #174825
DeltaFile
+33-0llvm/utils/git-llvm-push
+33-01 files

LLVM/project 9e16060llvm/include/llvm/CodeGen TargetInstrInfo.h, llvm/lib/Target/RISCV RISCVFrameLowering.cpp

[CodeGen][InlineSpiller] Add SubReg argument to loadRegFromStackSlot for subreg-reload (#175581)

This preparatory patch introduces an additional argument to the target hook
loadRegFromStackSlot. This is essential for targets to handle subregister-specific
reloads in the future. See how this is used for the AMDGPU target in PR #175002.
DeltaFile
+9-6llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
+6-3llvm/include/llvm/CodeGen/TargetInstrInfo.h
+2-3llvm/lib/Target/X86/X86InstrInfo.h
+2-3llvm/lib/Target/XCore/XCoreInstrInfo.h
+1-2llvm/lib/Target/SystemZ/SystemZInstrInfo.h
+1-2llvm/lib/Target/VE/VEInstrInfo.h
+21-1942 files not shown
+63-5348 files

LLVM/project b21228bllvm/lib/IR Metadata.cpp, llvm/unittests/IR MetadataTest.cpp

[Metadata][profcheck] Handle identical MDNodes in getMergedProfMetadata

This fixes a bug where !prof metadata was dropped from SelectInsts when GVN simplified/merged them.
Guarded by -profcheck-disable-metadata-fixes. Exposed by the tests in
Transforms/SampleProfile.
DeltaFile
+24-0llvm/unittests/IR/MetadataTest.cpp
+8-0llvm/lib/IR/Metadata.cpp
+32-02 files

LLVM/project b3d3759llvm/lib/IR Metadata.cpp, llvm/unittests/IR MetadataTest.cpp

[Metadata][profcheck] Handle identical MDNodes in getMergedProfMetadata

This fixes a bug where !prof metadata was dropped from SelectInsts when GVN simplified/merged them.
Guarded by -profcheck-disable-metadata-fixes. Exposed by the tests in
Transforms/SampleProfile.
DeltaFile
+24-0llvm/unittests/IR/MetadataTest.cpp
+8-0llvm/lib/IR/Metadata.cpp
+32-02 files

LLVM/project ee3f4bcllvm/test/Transforms/LoopVectorize/RISCV tail-folding-complex-mask.ll

[LV][NFC] Follow-up fix for #173262 (#175513)

DeltaFile
+6-7llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-complex-mask.ll
+6-71 files

LLVM/project da94edfllvm/include/llvm/ADT GenericUniformityImpl.h, llvm/test/Analysis/UniformityAnalysis/AMDGPU/irreducible reducible-subgraph.ll

[UniformityAnalysis] Jump over reducible cycles when locating join blocks (#174938)

When locating the join blocks of a divergent block, the algorithm relies
on pseudo-edges from the header of a reducible cycle to the cycle exits.
This was missed in the actual traversal, producing unnecessary joins
inside the reducible cycle. This caused an assert in the included test,
which expected that if a join existed in a reducible cycle for a
divergent branch outside the cycle, then it must be the header.

This fixes the reverted commit from #174117
DeltaFile
+56-0llvm/test/Analysis/UniformityAnalysis/AMDGPU/irreducible/reducible-subgraph.ll
+22-24llvm/include/llvm/ADT/GenericUniformityImpl.h
+78-242 files

LLVM/project 77613aaflang/lib/Semantics resolve-names.cpp, flang/test/Lower/CUDA cuda-gpu-managed.cuf

[flang][CUDA] Apply implicit managed attribute when `-gpu=mem:managed` is used. (#175648)

When `-gpu=mem:managed` is used, allocatable arrays without explicit
CUDA data attributes are implicitly treated as managed. The
`-gpu=mem:managed` flag to enable this feature is currently only
supported in `bbc`.
DeltaFile
+166-0flang/test/Lower/CUDA/cuda-gpu-managed.cuf
+7-0flang/lib/Semantics/resolve-names.cpp
+2-2flang/tools/bbc/bbc.cpp
+175-23 files

LLVM/project e054384.github/workflows prune-unused-branches.py prune-branches.yml

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
DeltaFile
+89-0.github/workflows/prune-unused-branches.py
+30-0.github/workflows/prune-branches.yml
+119-02 files

LLVM/project 13cd700offload/plugins-nextgen/level_zero/src L0Memory.cpp

[NFC][Offload] Rename a function (#175673)

Renamed a function as suggested in #175664.
DeltaFile
+6-6offload/plugins-nextgen/level_zero/src/L0Memory.cpp
+6-61 files

LLVM/project 99d6141llvm/include/llvm/Transforms/Utils LowerMemIntrinsics.h, llvm/lib/Transforms/Utils LowerMemIntrinsics.cpp

Memset
DeltaFile
+112-37llvm/lib/Transforms/Utils/LowerMemIntrinsics.cpp
+31-18llvm/test/Transforms/PreISelIntrinsicLowering/X86/memcpy-inline-non-constant-len.ll
+20-8llvm/test/Transforms/PreISelIntrinsicLowering/X86/memset-inline-non-constant-len.ll
+4-2llvm/include/llvm/Transforms/Utils/LowerMemIntrinsics.h
+0-3llvm/utils/profcheck-xfail.txt
+167-685 files

LLVM/project adf7824llvm/lib/Transforms/Vectorize VPlan.cpp LoopVectorize.cpp, llvm/test/Transforms/LoopVectorize tripcount.ll

capture weights
DeltaFile
+23-6llvm/lib/Transforms/Vectorize/VPlan.cpp
+9-6llvm/test/Transforms/LoopVectorize/tripcount.ll
+2-0llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+34-123 files

LLVM/project 5d45dfdllvm/lib/CodeGen CodeGenPrepare.cpp

[WIP][profcheck] Codegen Prepare
DeltaFile
+15-2llvm/lib/CodeGen/CodeGenPrepare.cpp
+15-21 files

LLVM/project a204658llvm/test/Transforms/LoopVectorize tripcount.ll

[NFC] use UTC for LoopVectorize/tripcount.ll
DeltaFile
+213-29llvm/test/Transforms/LoopVectorize/tripcount.ll
+213-291 files

LLVM/project a01b7c2. .gitignore

[LLVM] Ignore two Cursor specific files. (#175683)

DeltaFile
+2-0.gitignore
+2-01 files

LLVM/project ed9f5c9llvm/lib/Target/RISCV RISCVInstrInfoVPseudos.td, llvm/test/tools/llvm-mca/RISCV/SiFiveX390 vector-fp.s

[RISCV] Add the missing SEW search table field to vector FMA instructions (#175646)

We have split the opcodes of vector floating-point FMA (pseudo) instructions
by SEW since c6b7944be4dfbb1fb35301c670812726845acaa7, but forgot to populate
their `SEW` field, which is used by various search tables. This results
in incorrect pseudo instruction opcode lookups in llvm-mca and, more
broadly, incorrect scheduling class lookups. This patch fixes the issue.
DeltaFile
+129-129llvm/test/tools/llvm-mca/RISCV/SiFiveX390/vector-fp.s
+48-48llvm/test/tools/llvm-mca/RISCV/SpacemitX60/rvv/fp.test
+32-32llvm/test/tools/llvm-mca/RISCV/SpacemitX60/rvv/fma.test
+1-1llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
+210-2104 files
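
To illustrate why an unpopulated `SEW` field breaks a lookup keyed on it, here is a small sketch. It is not the generated search-table code: `PseudoRow`, `lookupPseudo`, the row names, and the convention that `0` means "not populated" are all assumptions made for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical row of a pseudo-instruction search table.
struct PseudoRow {
  std::string pseudo; // pseudo instruction name (made up for this sketch)
  std::string baseOp; // base instruction the pseudo corresponds to
  unsigned sew;       // element width the pseudo was split for; 0 = not populated
};

// Linear lookup keyed on (baseOp, sew). Returns nullptr when no row matches.
const PseudoRow *lookupPseudo(const std::vector<PseudoRow> &table,
                              const std::string &baseOp, unsigned sew) {
  for (const PseudoRow &row : table)
    if (row.baseOp == baseOp && row.sew == sew)
      return &row;
  return nullptr;
}

// Usage example: call this from your own main().
void checkLookupPseudo() {
  // SEW left unpopulated: the rows are indistinguishable by key, and a
  // query for SEW=32 finds nothing.
  std::vector<PseudoRow> broken = {{"FMA_E16", "fma", 0}, {"FMA_E32", "fma", 0}};
  assert(lookupPseudo(broken, "fma", 32) == nullptr);

  // SEW populated: the query resolves to the matching row.
  std::vector<PseudoRow> fixed = {{"FMA_E16", "fma", 16}, {"FMA_E32", "fma", 32}};
  const PseudoRow *hit = lookupPseudo(fixed, "fma", 32);
  assert(hit != nullptr && hit->pseudo == "FMA_E32");
}
```

In this toy model a failed lookup is visible as `nullptr`; the commit message describes the real symptom as incorrect opcode and scheduling class lookups in llvm-mca.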

LLVM/project 438f887llvm CMakeLists.txt

[cmake] Make CMAKE_BUILD_TYPE=Release the default (#174520)

Currently, we report a fatal error if the user leaves CMAKE_BUILD_TYPE
blank. This was implemented in https://reviews.llvm.org/D124153 /
350bdf9227ceb , based on this RFC:

https://discourse.llvm.org/t/rfc-select-a-better-linker-by-default-or-warn-about-using-bfd/61899/1

Tom Stellard mentioned that he'd like to revisit this on Discord, and
Aiden, myself, and apparently most people on the original RFC agree, so
I'm proposing we do it. However, on the review, several folks objected
and insisted that Debug was a better default. I want to reopen the
question.

I think we've made the wrong tradeoff. I wish Debug builds worked out of
the box on most systems, but they don't, and LLVM has only gotten bigger
over the last four years, making the build scalability problems of Debug
builds worse. I think we should optimize our build configuration for new
developers, not experienced longtime contributors who are invested

    [9 lines not shown]
DeltaFile
+3-8llvm/CMakeLists.txt
+3-81 files

LLVM/project 2329d04llvm/test/CodeGen/AArch64 arm64-homogeneous-prolog-epilog-tail-call.mir

Remove cleanup of incorrect output in test dir (#171256)

This follows #171255, removing the cleanup line.
DeltaFile
+0-1llvm/test/CodeGen/AArch64/arm64-homogeneous-prolog-epilog-tail-call.mir
+0-11 files

LLVM/project a7ad427llvm/lib/Transforms/Vectorize VPlan.cpp LoopVectorize.cpp, llvm/test/Transforms/LoopVectorize tripcount.ll

capture weights
DeltaFile
+23-6llvm/lib/Transforms/Vectorize/VPlan.cpp
+9-6llvm/test/Transforms/LoopVectorize/tripcount.ll
+2-0llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+34-123 files

LLVM/project 0dcd112llvm/test/Transforms/LoopVectorize tripcount.ll

[NFC] use UTC for LoopVectorize/tripcount.ll
DeltaFile
+213-29llvm/test/Transforms/LoopVectorize/tripcount.ll
+213-291 files

LLVM/project f9c561bllvm/lib/Transforms/Utils LoopUtils.cpp, llvm/test/Transforms/LoopVectorize branch-weights.ll

[profcheck] Fix encoding of 0 loopEstimatedTrip count (#174896)

We currently encode an estimated trip count of 0 as the latch having branch weights 0-0. That's an invalid pair of weights. The probability of a branch is computed as the ratio of its weight to the sum of the weights. In fact, `BranchProbabilityInfo::calcMetadataWeights` will convert this to 1-1, meaning 50%-50%, which isn't quite what we want. To indicate the loop is never taken, we just need to initialize the exit weight to a non-zero value (hence, 1).

Related: https://reviews.llvm.org/D67905

Issue #147390
DeltaFile
+5-5llvm/test/Transforms/LoopVectorize/AArch64/check-prof-info.ll
+4-1llvm/lib/Transforms/Utils/LoopUtils.cpp
+2-2llvm/test/Transforms/LoopVectorize/branch-weights.ll
+1-1llvm/unittests/Transforms/Utils/LoopUtilsTest.cpp
+12-94 files
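
The arithmetic in this commit message can be worked through with a simplified model. This is a sketch, not LLVM's implementation: `LatchProbs`, `probsFromWeights`, and `checkProbsFromWeights` are hypothetical names, and the 0-0 to 1-1 fallback is modeled only as far as the message describes it.

```cpp
#include <cassert>
#include <cstdint>

// Probabilities of the two successors of a loop latch (simplified model).
struct LatchProbs {
  double backedge; // probability of staying in the loop
  double exit;     // probability of leaving the loop
};

// Each probability is weight / (sum of weights). A 0-0 pair has no valid
// ratio; following the commit message, it is treated as 1-1.
LatchProbs probsFromWeights(uint32_t backedgeWeight, uint32_t exitWeight) {
  uint64_t sum = uint64_t{backedgeWeight} + exitWeight;
  if (sum == 0) {
    backedgeWeight = exitWeight = 1;
    sum = 2;
  }
  return {static_cast<double>(backedgeWeight) / sum,
          static_cast<double>(exitWeight) / sum};
}

// Usage example: call this from your own main().
void checkProbsFromWeights() {
  // Old encoding of "trip count 0": 0-0 ends up as 50%/50%.
  assert(probsFromWeights(0, 0).backedge == 0.5);
  assert(probsFromWeights(0, 0).exit == 0.5);
  // New encoding: backedge weight 0, exit weight 1, so the exit is certain.
  assert(probsFromWeights(0, 1).backedge == 0.0);
  assert(probsFromWeights(0, 1).exit == 1.0);
}
```

With weights 0-1 the backedge probability is 0/1 and the exit probability is 1/1, which is the "never taken" meaning the 0-0 pair could not express.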

LLVM/project 26624d5clang/include/clang/Basic BuiltinsAMDGPU.def, clang/test/CodeGenOpenCL amdgpu-features.cl amdgpu-cluster-dims.cl

[AMDGPU]Add specific instruction feature for multicast load (#175503)

DeltaFile
+7-7clang/include/clang/Basic/BuiltinsAMDGPU.def
+11-1llvm/lib/Target/AMDGPU/AMDGPU.td
+2-2clang/test/CodeGenOpenCL/amdgpu-features.cl
+2-2clang/test/CodeGenOpenCL/amdgpu-cluster-dims.cl
+2-2llvm/lib/Target/AMDGPU/FLATInstructions.td
+3-0llvm/lib/Target/AMDGPU/GCNSubtarget.h
+27-141 files not shown
+28-147 files

LLVM/project a9037dcorc-rt Maintainers.md

[orc-rt] Add Maintainers.md. (#175691)

DeltaFile
+10-0orc-rt/Maintainers.md
+10-01 files

LLVM/project 587bac6llvm/lib/Target/RISCV RISCVMakeCompressible.cpp, llvm/test/CodeGen/RISCV make-compressible-xqci.mir

[RISCV] Adjust base cost for Xqcilo loads/stores in RISCVMakeCompressible (#175572)

We only need two uses in Xqcilo load/store instructions for the base
adjustment to be profitable as compared to three uses in the base
load/store instructions.
DeltaFile
+48-5llvm/test/CodeGen/RISCV/make-compressible-xqci.mir
+38-7llvm/lib/Target/RISCV/RISCVMakeCompressible.cpp
+86-122 files
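
One way to see where the two-use and three-use thresholds could come from is a byte-count model. This is a sketch under stated assumptions, not the cost logic in `RISCVMakeCompressible.cpp`: the byte sizes below are illustrative values chosen for the example, and `baseAdjustmentProfitable` is a hypothetical name.

```cpp
#include <cassert>

// Assumed sizes in bytes (illustrative, not taken from the patch):
// setting up the adjusted base costs one extra 4-byte instruction,
// compressing a base load/store saves 2 bytes per use, and compressing
// an Xqcilo load/store saves 4 bytes per use.
constexpr int kAdjustCostBytes = 4;
constexpr int kBaseSavingBytes = 2;
constexpr int kXqciloSavingBytes = 4;

// The adjustment pays off only if the total saving exceeds its cost.
bool baseAdjustmentProfitable(int numUses, bool isXqcilo) {
  int savingPerUse = isXqcilo ? kXqciloSavingBytes : kBaseSavingBytes;
  return numUses * savingPerUse > kAdjustCostBytes;
}

// Usage example: call this from your own main().
void checkBaseAdjustmentProfitable() {
  assert(!baseAdjustmentProfitable(2, false)); // base: two uses only break even
  assert(baseAdjustmentProfitable(3, false));  // base: three uses are profitable
  assert(!baseAdjustmentProfitable(1, true));  // Xqcilo: one use only breaks even
  assert(baseAdjustmentProfitable(2, true));   // Xqcilo: two uses are profitable
}
```

Under these assumed sizes the model reproduces the thresholds stated in the commit message: three uses for base load/store instructions and two for Xqcilo ones.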

LLVM/project c6fc6adclang/lib/CIR/CodeGen CIRGenBuiltinX86.cpp, clang/test/CIR/CodeGenBuiltins/X86 avx512vlvp2intersect-builtins.c avx512vp2intersect-builtins.c

[CIR][X86] Add support for `intersect` builtins (#172554)

Adds support for the
`__builtin_ia32_vp2intersect_d`/`__builtin_ia32_vp2intersect_q` x86
builtins.

Part of #167765

---------

Signed-off-by: vishruth-thimmaiah <vishruththimmaiah at gmail.com>
DeltaFile
+161-0clang/test/CIR/CodeGenBuiltins/X86/avx512vlvp2intersect-builtins.c
+77-0clang/test/CIR/CodeGenBuiltins/X86/avx512vp2intersect-builtins.c
+64-10clang/lib/CIR/CodeGen/CIRGenBuiltinX86.cpp
+302-103 files