LLVM/project 887d912llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/AArch64 lcssa-phi-extract-scale.ll lcssa-phi-inner-loop-scale.ll

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
DeltaFile
+66-44llvm/test/Transforms/SLPVectorizer/AArch64/lcssa-phi-extract-scale.ll
+73-2llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+22-16llvm/test/Transforms/SLPVectorizer/AArch64/lcssa-phi-inner-loop-scale.ll
+161-623 files

LLVM/project 8498ccbllvm/include module.modulemap

[Object] Add missing BBAddrMap.def to module map

Added in 532940bdee66bf5f36a70578698aab66f16919af.
DeltaFile
+2-0llvm/include/module.modulemap
+2-01 files

LLVM/project ed11d7acompiler-rt/lib/builtins CMakeLists.txt, compiler-rt/lib/builtins/arm truncdfsf2.S extendsfdf2.S

[compiler-rt][ARM] Optimized FP double <-> single conversion (#179926)

This commit provides assembly versions of the conversions both ways
between double and float.
DeltaFile
+367-0compiler-rt/test/builtins/Unit/truncdfsf2new_test.c
+198-0compiler-rt/lib/builtins/arm/truncdfsf2.S
+196-0compiler-rt/lib/builtins/arm/extendsfdf2.S
+123-0compiler-rt/test/builtins/Unit/extendsfdf2new_test.c
+2-0compiler-rt/lib/builtins/CMakeLists.txt
+886-05 files

LLVM/project aee8bafllvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/AArch64 lcssa-phi-inner-loop-scale.ll

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
DeltaFile
+22-16llvm/test/Transforms/SLPVectorizer/AArch64/lcssa-phi-inner-loop-scale.ll
+30-1llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+52-172 files

LLVM/project 31d88a5llvm/test/CodeGen/AMDGPU accvgpr-spill-scc-clobber.mir pei-build-av-spill.mir, mlir/lib/Dialect/XeGPU/Transforms XeGPUSubgroupDistribute.cpp

Merge branch 'main' into users/meinersbur/flang_builtin-mods_3
DeltaFile
+5,568-0llvm/test/CodeGen/AMDGPU/accvgpr-spill-scc-clobber.mir
+3,000-96llvm/test/CodeGen/AMDGPU/pei-build-av-spill.mir
+3,075-0llvm/test/CodeGen/AMDGPU/debug-frame.ll
+2,208-72llvm/test/CodeGen/AMDGPU/pei-build-spill.mir
+0-2,280mlir/lib/Dialect/XeGPU/Transforms/XeGPUSubgroupDistribute.cpp
+2,196-0llvm/test/CodeGen/AMDGPU/eliminate-frame-index-s-mov-b32.mir
+16,047-2,4481,327 files not shown
+55,718-19,3811,333 files

LLVM/project bdaf3cfllvm/test/Transforms/SLPVectorizer/X86 scalarize-ctlz.ll sitofp.ll

[SLP] Improve InsertElement scalarization cost modeling

When costing InsertElement tree entries, pass getScalarizationOverhead the
per-lane insert operands via AdjustedVL, set ForPoisonSrc from whether the
base vector is entirely undef, and supply a VectorInstrContext hint derived
from the demanded insert instructions. Move the scalarization cost adjustment
to after InMask is computed so ForPoisonSrc reflects the actual base vector
state.

Reviewers: bababuck, RKSimon, hiraditya

Pull Request: https://github.com/llvm/llvm-project/pull/199514
DeltaFile
+31-37llvm/test/Transforms/SLPVectorizer/X86/scalarize-ctlz.ll
+30-15llvm/test/Transforms/SLPVectorizer/X86/sitofp.ll
+27-18llvm/test/Transforms/SLPVectorizer/X86/arith-fp-inseltpoison.ll
+27-18llvm/test/Transforms/SLPVectorizer/X86/arith-fp.ll
+21-17llvm/test/Transforms/SLPVectorizer/X86/vec_list_bias-inseltpoison.ll
+21-17llvm/test/Transforms/SLPVectorizer/X86/vec_list_bias_external_insert_shuffled.ll
+157-1229 files not shown
+244-18015 files

LLVM/project 5bfcf13llvm/lib/Transforms/Vectorize LoopVectorize.cpp LoopVectorizationPlanner.h, llvm/test/Transforms/LoopVectorize/AArch64 store-costs-sve.ll

[VPlan] Construct VPlan1 once, share across buildVPlans calls. (#197276)

Extract the VF-independent VPlan1 setup pipeline (header phis,
simplification, early-exit handling, middle check, loop regions, tail
folding, mask introduction) into a new helper tryToBuildVPlan1().

Construct the initial VPlan1 once and pass it to repeated buildVPlans
calls.

Note that this means we need to move collectInLoopReductions up. We may
now construct VPlan1 on code paths where we did not before, because we
failed UserVF validation/selection, but I think that should be fine, as
this makes the overall code simpler and the UserVF code paths are for
testing.

PR: https://github.com/llvm/llvm-project/pull/197276
DeltaFile
+33-24llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+6-11llvm/test/Transforms/LoopVectorize/AArch64/store-costs-sve.ll
+8-3llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
+47-383 files

LLVM/project 3f28c15flang/test lit.cfg.py

Typo fix
DeltaFile
+2-1flang/test/lit.cfg.py
+2-11 files

LLVM/project 6134199llvm/test/Transforms/SLPVectorizer/AArch64 lcssa-phi-inner-loop-scale.ll

[SLP][NFC] Add some more tests with phi external uses, NFC

Pull Request: https://github.com/llvm/llvm-project/pull/199919
DeltaFile
+238-0llvm/test/Transforms/SLPVectorizer/AArch64/lcssa-phi-inner-loop-scale.ll
+238-01 files

LLVM/project 2245dd7clang-tools-extra/clangd HeaderSourceSwitch.cpp, clang-tools-extra/clangd/unittests HeaderSourceSwitchTests.cpp

[clangd] Prefer .hpp files over .h with header source switch (#198152)

Previously, the "Switch Between Source/Header" action picked `.h` over
`.hpp` when both files existed next to a `.cpp` file, because `.h` is
listed first in the header-extension list.

This patch reorders `HeaderExtensions` and `SourceExtensions` so the
`C++`-flavored extensions come before `.h` and `.c`. The `C++` flavor is
preferred since (at least in my opinion) more people use `clangd` for
`C++` with the `.hpp` extension than for `C`, so switching from `.cpp`
should go to `.hpp`, not `.h`.

This introduces an edge case: switching from `.c` now goes to `.hpp`
instead of `.h`. I think this situation is rarer than having a `.cpp`
alongside both `.hpp` and `.h`, since `.h` headers can be used as an
`extern "C"` wrapper of a C++ library.
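
A minimal sketch of why list order decides the outcome, assuming the lookup simply returns the first existing sibling file. `pickCounterpart` and `FileSet` are hypothetical names for illustration and are not the clangd implementation:

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the file system: the set of files that exist.
using FileSet = std::set<std::string>;

// Returns the first existing file named Stem + extension, trying the
// extensions in list order. Earlier entries win, so the order of the list
// is the preference order. Returns an empty string if nothing matches.
std::string pickCounterpart(const std::string &Stem,
                            const std::vector<std::string> &Extensions,
                            const FileSet &Existing) {
  for (const std::string &Ext : Extensions) {
    const std::string Candidate = Stem + Ext;
    if (Existing.count(Candidate) != 0)
      return Candidate;
  }
  return std::string();
}
```

With `foo.h` and `foo.hpp` both present, the list `{".h", ".hpp"}` yields `foo.h`, while `{".hpp", ".h"}` yields `foo.hpp`. Under this model the same reordering is also what produces the `.c` edge case described above.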
DeltaFile
+17-0clang-tools-extra/clangd/unittests/HeaderSourceSwitchTests.cpp
+2-2clang-tools-extra/clangd/HeaderSourceSwitch.cpp
+19-22 files

LLVM/project 3e9607dllvm/lib/Transforms/InstCombine InstCombineAndOrXor.cpp, llvm/test/Transforms/InstCombine or-bitmask.ll

[InstCombine] Fix type mismatch in `foldBitmaskMul`
DeltaFile
+42-0llvm/test/Transforms/InstCombine/or-bitmask.ll
+5-0llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
+47-02 files

LLVM/project e6fe3f4lldb/test/Shell/Commands process-attach-dummy.test command-dil-diagnostics.test, lldb/test/Shell/ObjectFile/MachO section-overflow-binary.test

[lldb][test] Require Python for a few more tests (#199913)
DeltaFile
+2-0lldb/test/Shell/Commands/process-attach-dummy.test
+2-0lldb/test/Shell/ObjectFile/MachO/section-overflow-binary.test
+1-1lldb/unittests/DAP/Handler/DisconnectTest.cpp
+1-0lldb/test/Shell/Commands/command-dil-diagnostics.test
+1-0lldb/test/Shell/Commands/command-module-hook-fire.test
+7-15 files

LLVM/project c71f9f0llvm/lib/Transforms/Vectorize VPlanVerifier.cpp, llvm/test/Transforms/LoopVectorize/AArch64 alias-mask.ll

[LV] Handle loop.dependence.mask in verifyLastActiveLaneRecipe() (#199897)

This verification can be called after the alias-mask has been expanded,
so it needs to recognize loop.dependence.mask intrinsics.
DeltaFile
+17-2llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp
+1-1llvm/test/Transforms/LoopVectorize/AArch64/alias-mask.ll
+18-32 files

LLVM/project b63787amlir/include/mlir/Dialect/AMDGPU/IR AMDGPUOps.td, mlir/include/mlir/Dialect/LLVMIR ROCDLOps.td

[MLIR][AMDGPU] Add permlane16.var and permlanex16.var intrinsic ops (#199501)

## Summary

Add ROCDL and AMDGPU dialect support for the GFX12+ variable-selector
permlane intrinsics (`v_permlane16_var_b32` / `v_permlanex16_var_b32`).

Unlike the existing fixed-selector `permlane16`/`permlanex16` ops where
source-lane indices come from SGPR immediates, the "var" variants take
per-lane source-lane indices from a VGPR, enabling arbitrary per-lane
intra-row and cross-row permutations within a wave32 subgroup.
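
A simplified host-side model of the per-lane selection may help illustrate the difference between the two variants. It assumes rows of 16 lanes within a wave32, that only the low four bits of each selector are used, and it ignores `old`, `fi`, `boundControl`, and inactive lanes. All names in it are hypothetical:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr unsigned WaveSize = 32; // wave32 subgroup
constexpr unsigned RowSize = 16;  // assumed row width

using Wave = std::array<uint32_t, WaveSize>;

// Model of the intra-row variant: each lane reads the source value from
// the lane in its own row that its selector names.
Wave permlane16Var(const Wave &Src, const Wave &Sel) {
  Wave Dst{};
  for (unsigned Lane = 0; Lane < WaveSize; ++Lane) {
    const unsigned RowBase = Lane / RowSize * RowSize;
    Dst[Lane] = Src[RowBase + (Sel[Lane] % RowSize)];
  }
  return Dst;
}

// Model of the cross-row variant: each lane reads from the other row.
Wave permlaneX16Var(const Wave &Src, const Wave &Sel) {
  Wave Dst{};
  for (unsigned Lane = 0; Lane < WaveSize; ++Lane) {
    // RowBase is 0 or 16, so XOR with RowSize flips to the other row.
    const unsigned OtherRowBase = (Lane / RowSize * RowSize) ^ RowSize;
    Dst[Lane] = Src[OtherRowBase + (Sel[Lane] % RowSize)];
  }
  return Dst;
}
```

In this model every lane supplies its own selector, which is what distinguishes the "var" forms from a selector shared by the whole row.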

### ROCDL dialect
- `ROCDL_Permlane16VarOp` → `llvm.amdgcn.permlane16.var`
- `ROCDL_PermlaneX16VarOp` → `llvm.amdgcn.permlanex16.var`
- Both take `(old, src0, src1, fi, boundControl)` with `fi` and
`boundControl` as immediate i1 attrs

### AMDGPU dialect

    [11 lines not shown]
DeltaFile
+84-0mlir/test/Conversion/AMDGPUToROCDL/permlane-var.mlir
+63-0mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
+46-1mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp
+37-0mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPUOps.td
+20-0mlir/test/Dialect/LLVMIR/rocdl.mlir
+16-0mlir/test/Target/LLVMIR/rocdl.mlir
+266-16 files

LLVM/project 2a57482llvm/lib/CodeGen/SelectionDAG SelectionDAGBuilder.cpp

[SelectionDAGBuilder] Replace asserts inside LLVM_DEBUG (#199748)

These asserts were inside an LLVM_DEBUG macro, meaning they were very
rarely, if ever, tested. The second assert, "LowerFormalArguments emitted
a value with the wrong type!", would fire in a number of tests, so it has
been removed. The other was replaced with an all_of assert.
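
As a rough illustration of the all_of pattern, here is a sketch using `std::all_of` from the standard library rather than LLVM's range helper. `LoweredValue`, `allValuesValid`, and `checkLoweredValues` are hypothetical names, not the code in SelectionDAGBuilder.cpp:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a lowered argument value; only validity matters.
struct LoweredValue {
  bool HasNode;
};

// True when every lowered value is valid (vacuously true for an empty list).
bool allValuesValid(const std::vector<LoweredValue> &Vals) {
  return std::all_of(Vals.begin(), Vals.end(),
                     [](const LoweredValue &V) { return V.HasNode; });
}

// One assert over the whole range. It is checked in every build with
// assertions enabled, not only when debug output has been requested.
void checkLoweredValues(const std::vector<LoweredValue> &Vals) {
  assert(allValuesValid(Vals) && "lowering did not emit a value!");
  (void)Vals; // avoid an unused-parameter warning when NDEBUG is defined
}
```

Folding the loop into a single predicate keeps the check out of any debug-only block, so it runs wherever assertions are compiled in.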

Noticed when looking at #198107 / #199412.
DeltaFile
+2-8llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+2-81 files

LLVM/project 3c16e92llvm/lib/Target/AArch64 AArch64TargetTransformInfo.cpp AArch64TargetTransformInfo.h, llvm/test/Transforms/EarlyCSE/AArch64 intrinsics-1xN.ll

[AArch64][TTI][EarlyCSE] Add support for ld1xN and st1xN intrinsics (#198765)

Handle ld1x2, ld1x3, ld1x4, st1x2, st1x3, st1x4 in:
- AArch64TTIImpl::getTgtMemIntrinsic
- AArch64TTIImpl::getOrCreateResultFromMemIntrinsic

This enables EarlyCSE to optimize these NEON load/store intrinsics.

To test the changes, a new testcase (intrinsics-1xN.ll) derived from
llvm/test/Transforms/EarlyCSE/AArch64/intrinsics.ll is added.
DeltaFile
+194-0llvm/test/Transforms/EarlyCSE/AArch64/intrinsics-1xN.ll
+28-3llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+0-6llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
+222-93 files

LLVM/project 3d9cc99llvm/lib/Target/AMDGPU SIMemoryLegalizer.cpp

always diagnose unknown metadata
DeltaFile
+5-3llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp
+5-31 files

LLVM/project 090df8flibsycl/test lit.cfg.py, llvm/utils/lit/lit TestingConfig.py main.py

Revert "[lit] Move maxIndividualTestTime from global to test suite config" (#199886)

Reverts llvm/llvm-project#198192

To fix https://lab.llvm.org/buildbot/#/builders/195/builds/25357
DeltaFile
+0-16llvm/utils/lit/lit/TestingConfig.py
+12-1llvm/utils/lit/lit/main.py
+4-4llvm/utils/lit/lit/TestRunner.py
+1-4llvm/utils/lit/lit/LitConfig.py
+2-2llvm/utils/lit/lit/formats/googletest.py
+1-1libsycl/test/lit.cfg.py
+20-285 files not shown
+25-3311 files

LLVM/project 2c6fcedclang/lib/CodeGen CGBuiltin.cpp, clang/test/CIR/CodeGenHIP builtins-amdgcn-image.hip

Merge branch 'main' into users/chenshanzhi/AArch64-TTI-getTgtMemIntrinsic
DeltaFile
+1,521-0clang/test/CodeGenCXX/builtin-clear-padding-codegen.cpp
+993-0clang/test/CodeGen/builtin-clear-padding-codegen.c
+886-0libcxx/test/libcxx/atomics/builtin_clear_padding.pass.cpp
+466-0clang/test/CIR/CodeGenHIP/builtins-amdgcn-image.hip
+416-0mlir/test/Conversion/TosaToSPIRVTosa/tosa-to-spirv.mlir
+344-0clang/lib/CodeGen/CGBuiltin.cpp
+4,626-090 files not shown
+6,266-98596 files

LLVM/project 0d6aac7libcxx/include/__concepts referenceable.h, libcxx/include/__type_traits is_referenceable.h add_pointer.h

[libc++] Remove workarounds for __{add,remove}_pointer on AppleClang (#199821)

We've updated the supported AppleClang version, so we can drop those
workarounds now.

This also removes `__is_referenceable_v`, since it's no longer used.
DeltaFile
+0-190libcxx/test/libcxx/utilities/meta/is_referenceable.compile.pass.cpp
+0-34libcxx/include/__type_traits/is_referenceable.h
+3-28libcxx/include/__type_traits/add_pointer.h
+30-0libcxx/include/__concepts/referenceable.h
+3-16libcxx/include/__type_traits/remove_pointer.h
+0-12libcxx/test/std/utilities/meta/meta.trans/objc_support.compile.pass.mm
+36-28010 files not shown
+45-29616 files

LLVM/project 3060f65llvm/lib/Target/RISCV RISCVISelLowering.cpp, llvm/test/CodeGen/RISCV/rvv vector-interleave-fixed.ll fixed-vectors-shuffle-int-interleave.ll

Revert "[RISCV][CodeGen] Use vzip.vv for e64 interleave shuffles with Zvzip" (#199899)

Reverts llvm/llvm-project#199512

LLVM Buildbot has detected a build error for this PR.
DeltaFile
+20-6llvm/test/CodeGen/RISCV/rvv/vector-interleave-fixed.ll
+17-8llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-int-interleave.ll
+4-9llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+41-233 files

LLVM/project 4def779libcxx/test/libcxx-03/vendor/apple disable-availability.sh.cpp, libcxx/test/selftest/modules std-and-std.compat-module.sh.cpp

[libc++][NFC] Remove lit annotations for older AppleClang versions (#199817)

We don't support anything older than apple-clang-21, so we can remove
those annotations.
DeltaFile
+0-4libcxx/test/libcxx-03/vendor/apple/disable-availability.sh.cpp
+0-3libcxx/test/std/modules/std.compat.pass.cpp
+0-3libcxx/test/std/modules/std.pass.cpp
+0-3libcxx/test/std/numerics/c.math/signbit.pass.cpp
+0-3libcxx/test/std/utilities/meta/meta.unary/meta.unary.prop/is_implicit_lifetime.verify.cpp
+0-3libcxx/test/selftest/modules/std-and-std.compat-module.sh.cpp
+0-196 files not shown
+3-3112 files

LLVM/project d1324cfclang/lib/CodeGen CGBuiltin.cpp, clang/test/CIR/CodeGenHIP builtins-amdgcn-image.hip

Merge branch 'main' into users/ssahasra/refactor-acq-rel
DeltaFile
+1,521-0clang/test/CodeGenCXX/builtin-clear-padding-codegen.cpp
+993-0clang/test/CodeGen/builtin-clear-padding-codegen.c
+886-0libcxx/test/libcxx/atomics/builtin_clear_padding.pass.cpp
+466-0clang/test/CIR/CodeGenHIP/builtins-amdgcn-image.hip
+416-0mlir/test/Conversion/TosaToSPIRVTosa/tosa-to-spirv.mlir
+344-0clang/lib/CodeGen/CGBuiltin.cpp
+4,626-055 files not shown
+6,162-68861 files

LLVM/project c63a424llvm/test/CodeGen/AMDGPU accvgpr-spill-scc-clobber.mir pei-build-av-spill.mir, mlir/lib/Dialect/XeGPU/Transforms XeGPUSubgroupDistribute.cpp

Merge branch 'main' into users/statham-arm/arm-fp-f2d2f
DeltaFile
+5,568-0llvm/test/CodeGen/AMDGPU/accvgpr-spill-scc-clobber.mir
+3,000-96llvm/test/CodeGen/AMDGPU/pei-build-av-spill.mir
+3,075-0llvm/test/CodeGen/AMDGPU/debug-frame.ll
+2,208-72llvm/test/CodeGen/AMDGPU/pei-build-spill.mir
+0-2,280mlir/lib/Dialect/XeGPU/Transforms/XeGPUSubgroupDistribute.cpp
+2,196-0llvm/test/CodeGen/AMDGPU/eliminate-frame-index-s-mov-b32.mir
+16,047-2,4481,268 files not shown
+54,473-19,0211,274 files

LLVM/project 3496778llvm/lib/Target/AMDGPU SIMemoryLegalizer.cpp, llvm/test/CodeGen/AMDGPU memory-legalizer-av-unknown.ll

diagnose unknown metadata
DeltaFile
+21-3llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp
+11-0llvm/test/CodeGen/AMDGPU/memory-legalizer-av-unknown.ll
+32-32 files

LLVM/project 09b607allvm/test/Transforms/LoopVectorize/AArch64 partial-reduce-usabs.ll

Add comment
DeltaFile
+3-0llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-usabs.ll
+3-01 files

LLVM/project 99e6632clang/lib/CIR/CodeGen CIRGenBuiltinAArch64.cpp, clang/test/CodeGen/AArch64 neon-intrinsics.c poly64.c

[CIR][AArch64] Upstream vector-shift-right-and-insert NEON builtins (#196776)

Related to https://github.com/llvm/llvm-project/issues/185382

CIR lowering for vector-shift-right-and-insert intrinsics
(https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#vector-shift-right-and-insert)

Port tests from clang/test/CodeGen/AArch64/neon_intrinsics.c and
clang/test/CodeGen/AArch64/poly64.c to
clang/test/CodeGen/AArch64/neon/intrinsics.c
DeltaFile
+315-0clang/test/CodeGen/AArch64/neon/intrinsics.c
+0-282clang/test/CodeGen/AArch64/neon-intrinsics.c
+83-9clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp
+0-28clang/test/CodeGen/AArch64/poly64.c
+398-3194 files

LLVM/project b186960llvm/lib/Target/RISCV RISCVISelLowering.cpp, llvm/test/CodeGen/RISCV/rvv vector-interleave-fixed.ll fixed-vectors-shuffle-int-interleave.ll

Revert "[RISCV][CodeGen] Use vzip.vv for e64 interleave shuffles with Zvzip (…"

This reverts commit a4b1361f33139e7a0a02edee1a1b012740951e01.
DeltaFile
+20-6llvm/test/CodeGen/RISCV/rvv/vector-interleave-fixed.ll
+17-8llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-int-interleave.ll
+4-9llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+41-233 files

LLVM/project c94e5f3llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeHelper.cpp

AMDGPU/GlobalISel: Move executeInWaterfallLoop call from lower (#199701)

WFI is an argument to applyMappingSrc and lower, so move the
executeInWaterfallLoop call to after these two return.
Also set the insert point in executeInWaterfallLoop to
avoid the need to set it before calling it.
DeltaFile
+6-5llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+6-51 files

LLVM/project 698d44bclang/lib/CodeGen CGBuiltin.cpp, clang/lib/Sema SemaChecking.cpp

[clang] Add builtin to clear padding bytes (prework for P0528R3) (#75371)

Add builtin to clear padding bytes. This is the pre-work to implement
`std::atomic::compare_exchange_[weak/strong]` that ignores padding bits.
PR draft here: https://github.com/llvm/llvm-project/pull/76180

This PR picked up this patch from 3 years ago
https://reviews.llvm.org/D87974

The above patch no longer works as things have changed quite a lot. I've made
some changes on top of the above patch:

It handles:
- struct
- builtin types with padding (like `long double` and types with
`__attribute__((ext_vector_type(N)))`)
- _Complex long double
- constant array

    [7 lines not shown]
DeltaFile
+1,521-0clang/test/CodeGenCXX/builtin-clear-padding-codegen.cpp
+993-0clang/test/CodeGen/builtin-clear-padding-codegen.c
+886-0libcxx/test/libcxx/atomics/builtin_clear_padding.pass.cpp
+344-0clang/lib/CodeGen/CGBuiltin.cpp
+98-0clang/test/SemaCXX/builtin-clear-padding.cpp
+64-0clang/lib/Sema/SemaChecking.cpp
+3,906-03 files not shown
+3,968-09 files