LLVM/project 2c02e4cllvm/lib/Analysis ConstantFolding.cpp

[NFC][LLVM][ConstantFolding] Use Type* variant of ConstantFP::get when folding scalar intrinsics. (#172709)

This gives peace of mind that the code paths will remain valid if they are enabled for vector types with -use-constant-fp-for-*-splat enabled.
DeltaFile
+44-44llvm/lib/Analysis/ConstantFolding.cpp
+44-441 files

LLVM/project 803886allvm/lib/Target/AArch64 AArch64InstrGISel.td, llvm/lib/Target/AArch64/GISel AArch64LegalizerInfo.cpp AArch64RegisterBankInfo.cpp

[AArch64][GlobalISel] Added support for sri intrinsic
DeltaFile
+8-0llvm/lib/Target/AArch64/AArch64InstrGISel.td
+7-0llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+2-0llvm/lib/Target/AArch64/GISel/AArch64RegisterBankInfo.cpp
+17-03 files

LLVM/project 7dfe40ellvm/test/CodeGen/AArch64 arm64-vshift.ll

[AArch64][GlobalISel] Added test coverage for sri intrinsic

Previously, generation of sri intrinsics was tested during the ACLE -> IR stage, but not in the IR -> MIR stage. Now, correct generation of sri intrinsics is tested in both stages.
DeltaFile
+110-6llvm/test/CodeGen/AArch64/arm64-vshift.ll
+110-61 files

LLVM/project 87f1141llvm/lib/Target/AArch64 AArch64InstrGISel.td, llvm/lib/Target/AArch64/GISel AArch64RegisterBankInfo.cpp AArch64LegalizerInfo.cpp

[AArch64][GlobalISel] Renamed G_VSLI to G_SLI

The name now reflects the machine instruction rather than the IR intrinsic.
DeltaFile
+2-2llvm/lib/Target/AArch64/GISel/AArch64RegisterBankInfo.cpp
+1-2llvm/lib/Target/AArch64/AArch64InstrGISel.td
+1-1llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+4-53 files

LLVM/project 58bfd2bllvm/lib/Target/AArch64 AArch64InstrGISel.td

[AArch64][GlobalISel] Changed G_VSLI input operand list to correctly reflect operand types
DeltaFile
+1-1llvm/lib/Target/AArch64/AArch64InstrGISel.td
+1-11 files

LLVM/project d6aa26dllvm/lib/Target/AArch64 AArch64InstrGISel.td, llvm/lib/Target/AArch64/GISel AArch64LegalizerInfo.cpp AArch64RegisterBankInfo.cpp

[GlobalISel][AArch64] Added support for sli intrinsic

sli intrinsic now lowers correctly for all vector types.
DeltaFile
+6-9llvm/test/CodeGen/AArch64/arm64-vshift.ll
+7-0llvm/lib/Target/AArch64/AArch64InstrGISel.td
+7-0llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+2-0llvm/lib/Target/AArch64/GISel/AArch64RegisterBankInfo.cpp
+22-94 files

LLVM/project 3790080llvm/test/CodeGen/X86 vselect-pcmp.ll

[X86] vselect-pcmp.ll - add test showing failure to fold icmp_eq(and(x,pow2),0) to shl(x,c) for v4f32 select masks (#173359)

Noticed while trying to tweak backend folds to work around #172888
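
One reading of the fold named in the title, sketched on a single 32-bit lane: if the select is ultimately driven by the sign bit of each mask lane (assumed here, as with the x86 variable blends), then testing one bit against zero can be replaced by shifting that bit into the sign position and swapping the select arms. The function names are hypothetical, and this is a scalar model rather than the backend's implementation.

```cpp
#include <cstdint>

// Reference form: select(icmp_eq(and(x, pow2), 0), a, b).
// `bit` is the index of the single set bit in pow2 and must be in [0, 31].
inline float select_via_compare(std::uint32_t x, unsigned bit, float a, float b) {
  const std::uint32_t pow2 = std::uint32_t{1} << bit;
  return (x & pow2) == 0 ? a : b;
}

// Shift form: move the tested bit into the sign position (bit 31).
// A sign-bit-driven select then picks `b` when the bit was set and `a`
// otherwise, which matches the compare form without an and or a compare.
inline float select_via_shift(std::uint32_t x, unsigned bit, float a, float b) {
  const std::uint32_t shifted = x << (31u - bit);
  const bool sign_set = (shifted >> 31) != 0;
  return sign_set ? b : a;
}
```

The shift amount is 31 minus the bit index, so the tested bit lands exactly on the lane's sign bit.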
DeltaFile
+48-0llvm/test/CodeGen/X86/vselect-pcmp.ll
+48-01 files

LLVM/project c4088b2llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp, llvm/test/CodeGen/AArch64 sve-masked-ldst-sext.ll sve-masked-ldst-zext.ll

[LLVM][DAGCombiner] Look through freeze when combining extensions of extending-masked-loads. (#172484)

Extensions in this context mean post-legalisation extensions (i.e. and,
sext-in-reg), because that is the point at which the freeze blocks the
existing combine.
DeltaFile
+16-19llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+27-0llvm/test/CodeGen/AArch64/sve-masked-ldst-sext.ll
+24-0llvm/test/CodeGen/AArch64/sve-masked-ldst-zext.ll
+0-1llvm/test/CodeGen/AArch64/complex-deinterleaving-reductions-predicated-scalable.ll
+67-204 files

LLVM/project c1e72dcllvm/include/llvm/Support Registry.h

[NFC] clang-format llvm/include/llvm/Support/Registry.h (#173295)

This is in preparation for #173290.
DeltaFile
+105-105llvm/include/llvm/Support/Registry.h
+105-1051 files

LLVM/project 9ca5e85llvm/test/Analysis/ScalarEvolution ptrtoaddr.ll ptrtoaddr-i32-index-width.ll

[SCEV] Avoid tests not passing the verifier (NFC)

Update these tests with the version from:
https://github.com/llvm/llvm-project/pull/158032
DeltaFile
+31-96llvm/test/Analysis/ScalarEvolution/ptrtoaddr.ll
+70-0llvm/test/Analysis/ScalarEvolution/ptrtoaddr-i32-index-width.ll
+101-962 files

LLVM/project 8c3e6aallvm/lib/Target/AMDGPU AMDGPUCodeGenPrepare.cpp, llvm/test/CodeGen/AMDGPU rsq.f32-safe.ll amdgpu-codegenprepare-fdiv.ll

AMDGPU: Stop requiring afn for f32 rsq formation

We were checking for afn or !fpmath attached to the sqrt. We
are not trying to replace a correctly rounded rsqrt; we're replacing
two correctly rounded operations with the contracted operation.
The net precision is better, so contract on both instructions should
be sufficient. Both the contracted and uncontracted sequences pass
the OpenCL conformance test, and the contracted sequence has a lower
maximum error.
DeltaFile
+504-1,529llvm/test/CodeGen/AMDGPU/rsq.f32-safe.ll
+52-45llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-fdiv.ll
+6-25llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
+562-1,5993 files

LLVM/project 5b52376llvm/lib/Target/SPIRV SPIRVPrepareGlobals.cpp, llvm/test/CodeGen/SPIRV ga-gv.ll ga-noninterp-func-noninterp.ll

[SPIRV] Add support for non-interposable function aliases (#172730)

The backend was not handling GlobalAliases, such as those used as call
targets. This patch pre-processes the aliases in the module and resolves
them to their aliasee when possible. The patch also documents those cases
that are not yet supported.
DeltaFile
+46-2llvm/lib/Target/SPIRV/SPIRVPrepareGlobals.cpp
+24-0llvm/test/CodeGen/SPIRV/ga-gv.ll
+22-0llvm/test/CodeGen/SPIRV/ga-noninterp-func-noninterp.ll
+17-0llvm/test/CodeGen/SPIRV/ga-interp-func-interp.ll
+16-0llvm/test/CodeGen/SPIRV/ga-interp-func-noninterp.ll
+16-0llvm/test/CodeGen/SPIRV/ga-noninterp-func-interp.ll
+141-21 files not shown
+152-27 files

LLVM/project a76084fllvm/lib/Target/AArch64 AArch64ISelDAGToDAG.cpp AArch64ISelLowering.cpp, llvm/lib/Target/AArch64/MCTargetDesc AArch64AddressingModes.h

[AArch64] Improve SIMD immediate generation with SVE. (#173273)

Allow using SVE DUPM instructions to materialise fixed-length vectors.

Fixes #122422.
DeltaFile
+86-33llvm/test/CodeGen/AArch64/movi64_sve.ll
+28-0llvm/lib/Target/AArch64/MCTargetDesc/AArch64AddressingModes.h
+1-24llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
+2-4llvm/test/CodeGen/AArch64/sve-fixed-length-fcopysign.ll
+2-4llvm/test/CodeGen/AArch64/sve2-fixed-length-fcopysign.ll
+3-1llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+122-666 files

LLVM/project 64c4059clang/docs ReleaseNotes.rst, clang/include/clang/AST TypeProperties.td

[Clang] Serialize expansions of PackIndexingType (#173351)

We have already serialized isFullySubstituted, which hinges on the
expansions; if they were lost, we would never expand them correctly from
an imported AST.

Sadly this bug has been around for a year, so there's a release note.

Fixes #172464
DeltaFile
+18-0clang/test/PCH/pack-indexing-2.cpp
+4-1clang/include/clang/AST/TypeProperties.td
+1-0clang/docs/ReleaseNotes.rst
+23-13 files

LLVM/project 261d2dallvm/lib/Analysis ValueTracking.cpp, llvm/test/Transforms/InstSimplify ptrtoaddr.ll

[ValueTracking] Support ptrtoaddr in isKnownNonZero() (#173275)

Add support for ptrtoaddr in isKnownNonZero(). We can directly forward
to isKnownNonZero() for the pointer here, as we define nonnull as
applying to the address bits.

Also adjust the ptrtoint implementation to match, by requiring that the
result type >= address size (rather than >= pointer size). This is just
for clarity; in practice this is a non-canonical form.
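
The width condition can be illustrated with plain integer arithmetic: a non-zero address may become zero once it is truncated below the address width, so non-zero-ness only carries over when the result is at least that wide. This is a hedged sketch with hypothetical helper names and made-up values, not code from ValueTracking.cpp.

```cpp
#include <cstdint>

// Model of converting an address to an integer of `result_bits` bits:
// only the low `result_bits` bits survive.
inline std::uint64_t truncate_address(std::uint64_t addr, unsigned result_bits) {
  if (result_bits >= 64) return addr;
  return addr & ((std::uint64_t{1} << result_bits) - 1);
}

// The condition described above: a non-zero address is guaranteed to stay
// non-zero only if the result is at least as wide as the address.
inline bool nonzero_is_preserved(unsigned result_bits, unsigned address_bits) {
  return result_bits >= address_bits;
}
```

For example, the non-zero 16-bit address 0x100 truncates to 0 in an 8-bit result, while a 16-bit result keeps it intact.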
DeltaFile
+58-0llvm/test/Transforms/InstSimplify/ptrtoaddr.ll
+8-5llvm/lib/Analysis/ValueTracking.cpp
+66-52 files

LLVM/project aef4985llvm/lib/Target/SPIRV SPIRVPrepareGlobals.cpp, llvm/test/CodeGen/SPIRV ga-gv.ll ga-noninterp-func-noninterp.ll

[SPIRV] Add support for non-interposable function aliases

This patch implements support for calling functions through
non-interposable aliases in the SPIRV backend. Global aliases
are replaced at the IR level by their aliasee object when
possible.

Interposable aliases are explicitly not supported yet and will cause
compilation to fail. This was not supported prior to this patch.

Tests added for both the supported non-interposable case and the
unsupported interposable case.
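
As a rough model of the behaviour described above (not the code in SPIRVPrepareGlobals.cpp), alias resolution can be pictured as a table lookup that follows non-interposable aliases to their aliasee and gives up on interposable ones. The `Alias`, `AliasTable` and `resolve_alias` names are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for a module's alias list.
struct Alias {
  std::string aliasee;  // name the alias points at
  bool interposable;    // true if the alias may be replaced at link time
};
using AliasTable = std::map<std::string, Alias>;

// Follows `name` through the table. On success stores the final
// non-alias name in `out` and returns true. Returns false when an
// interposable alias is reached (the unsupported case) or when the
// chain does not terminate.
inline bool resolve_alias(const AliasTable& table, std::string name,
                          std::string& out) {
  for (std::size_t steps = 0; steps <= table.size(); ++steps) {
    const auto it = table.find(name);
    if (it == table.end()) {  // not an alias: resolution is complete
      out = name;
      return true;
    }
    if (it->second.interposable) return false;
    name = it->second.aliasee;
  }
  return false;  // more hops than aliases: the chain is cyclic
}
```

A caller would replace each use of an alias with the resolved name and report an error when resolution fails.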
DeltaFile
+46-2llvm/lib/Target/SPIRV/SPIRVPrepareGlobals.cpp
+24-0llvm/test/CodeGen/SPIRV/ga-gv.ll
+22-0llvm/test/CodeGen/SPIRV/ga-noninterp-func-noninterp.ll
+17-0llvm/test/CodeGen/SPIRV/ga-interp-func-interp.ll
+16-0llvm/test/CodeGen/SPIRV/ga-interp-func-noninterp.ll
+16-0llvm/test/CodeGen/SPIRV/ga-noninterp-func-interp.ll
+141-21 files not shown
+152-27 files

LLVM/project 0f572c1llvm/lib/Target/AMDGPU AMDGPUISelLowering.cpp AMDGPULegalizerInfo.cpp, llvm/test/CodeGen/AMDGPU fsqrt.f32.ll

AMDGPU: Teach lowering that exp and log intrinsics cannot return denormals (#172296)

DeltaFile
+103-0llvm/test/CodeGen/AMDGPU/fsqrt.f32.ll
+5-0llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+3-0llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+111-03 files

LLVM/project 79a8894libcxx/docs/ReleaseNotes 22.rst, libcxx/include/__algorithm rotate.h

[libc++] Optimize rotate (#120890)

This implements a new algorithm for `rotate` with random access
iterators, which uses `swap_ranges`. This reduces cache misses and
allows for vectorization.
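
The commit text does not spell out the algorithm, so the following is only a sketch of one well-known way to build `rotate` out of `swap_ranges` for random access iterators (block swapping). It is not taken from libc++, and the names `rotate_by_swap_ranges` and `rotated_copy` are made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Rotates [first, last) so that *middle becomes the first element.
// Each step swaps the shorter block into its final position with a
// single std::swap_ranges call, then continues on the remaining range.
template <class RandomIt>
RandomIt rotate_by_swap_ranges(RandomIt first, RandomIt middle, RandomIt last) {
  // Final position of the old *first; std::rotate returns the same iterator.
  RandomIt result = first + (last - middle);
  while (first != middle && middle != last) {
    const auto left = middle - first;   // length of the left block
    const auto right = last - middle;   // length of the right block
    if (left <= right) {
      // Swap the left block with the start of the right block.
      std::swap_ranges(first, middle, middle);
      first = middle;
      middle += left;
    } else {
      // Swap the end of the left block with the whole right block.
      std::swap_ranges(middle - right, middle, middle);
      last = middle;
      middle -= right;
    }
  }
  return result;
}

// Convenience wrapper: returns a copy of `v` rotated left by `k` (k <= v.size()).
inline std::vector<int> rotated_copy(std::vector<int> v, std::size_t k) {
  rotate_by_swap_ranges(v.begin(), v.begin() + static_cast<std::ptrdiff_t>(k),
                        v.end());
  return v;
}
```

Because every step works on two contiguous, non-overlapping blocks, the inner work is a linear sweep over memory, which is the kind of access pattern the message credits with fewer cache misses and with enabling vectorization.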

Apple M4:
```
Benchmark                                                       old             new    Difference    % Difference
---------------------------------------------------  --------------  --------------  ------------  --------------
rng::rotate(deque<int>)_(1_element_backward)/1024             46.17           45.13         -1.04          -2.26%
rng::rotate(deque<int>)_(1_element_backward)/32                4.90            4.92          0.02           0.45%
rng::rotate(deque<int>)_(1_element_backward)/50                6.12            6.02         -0.10          -1.56%
rng::rotate(deque<int>)_(1_element_backward)/8192            329.97          330.49          0.52           0.16%
rng::rotate(deque<int>)_(1_element_forward)/1024              42.20           42.99          0.79           1.87%
rng::rotate(deque<int>)_(1_element_forward)/32                 4.99            5.26          0.27           5.37%
rng::rotate(deque<int>)_(1_element_forward)/50                 6.33            6.48          0.14           2.28%
rng::rotate(deque<int>)_(1_element_forward)/8192             317.40          318.68          1.28           0.40%
rng::rotate(deque<int>)_(by_1/2)/1024                        185.42          184.23         -1.18          -0.64%
```

    [154 lines not shown]
DeltaFile
+25-42libcxx/include/__algorithm/rotate.h
+2-0libcxx/docs/ReleaseNotes/22.rst
+27-422 files

LLVM/project 96ee7d2bolt/include/bolt/Passes LivenessAnalysis.h ReachingDefOrUse.h, bolt/lib/Passes ShrinkWrapping.cpp RegReAssign.cpp

[ADT] Make use of subsetOf and anyCommon methods of BitVector (NFC) (#170876)

Replace the code along these lines

    BitVector Tmp = LHS;
    Tmp &= RHS;
    return Tmp.any();

and

    BitVector Tmp = LHS;
    Tmp.reset(RHS);
    return Tmp.none();

with `LHS.anyCommon(RHS)` and `LHS.subsetOf(RHS)`, respectively,
which do not require creating a temporary BitVector and can return early.
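
To make the "no temporary, early return" point concrete, here is a minimal model of the two queries over a plain vector of 64-bit words. `Words`, `any_common` and `subset_of` are hypothetical stand-ins; the real `BitVector` methods are not reproduced here.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a bit vector's word storage.
using Words = std::vector<std::uint64_t>;

// True if some bit is set in both operands. Stops at the first
// overlapping word instead of materialising lhs & rhs.
inline bool any_common(const Words& lhs, const Words& rhs) {
  const std::size_t n = std::min(lhs.size(), rhs.size());
  for (std::size_t i = 0; i < n; ++i)
    if ((lhs[i] & rhs[i]) != 0) return true;
  return false;
}

// True if every bit set in lhs is also set in rhs. Stops at the first
// word where lhs has a bit that rhs lacks; words missing from rhs
// count as all zeros.
inline bool subset_of(const Words& lhs, const Words& rhs) {
  for (std::size_t i = 0; i < lhs.size(); ++i) {
    const std::uint64_t r = i < rhs.size() ? rhs[i] : 0;
    if ((lhs[i] & ~r) != 0) return false;
  }
  return true;
}
```

Both loops read the operands in place, so neither allocates, and each can exit on the first word that decides the answer.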
DeltaFile
+4-6bolt/lib/Passes/ShrinkWrapping.cpp
+2-6bolt/lib/Passes/RegReAssign.cpp
+4-4bolt/lib/Passes/TailDuplication.cpp
+2-4llvm/lib/CodeGen/RDFRegisters.cpp
+2-3bolt/include/bolt/Passes/LivenessAnalysis.h
+1-2bolt/include/bolt/Passes/ReachingDefOrUse.h
+15-251 files not shown
+16-277 files

LLVM/project 359abf8mlir/include/mlir/Bindings/Python IRCore.h, mlir/lib/Bindings/Python MainModule.cpp

fix after rebase
DeltaFile
+2-2mlir/lib/Bindings/Python/MainModule.cpp
+2-2mlir/include/mlir/Bindings/Python/IRCore.h
+4-42 files

LLVM/project d7577a8mlir/test/python/dialects python_test.py, mlir/test/python/lib PythonTestModuleNanobind.cpp

port mlir_attribute_subclass
DeltaFile
+21-13mlir/test/python/lib/PythonTestModuleNanobind.cpp
+3-3mlir/test/python/dialects/python_test.py
+24-162 files

LLVM/project 73aa103mlir/cmake/modules AddMLIRPython.cmake, mlir/examples/standalone CMakeLists.txt

massage cmake
DeltaFile
+127-31mlir/cmake/modules/AddMLIRPython.cmake
+10-61mlir/python/CMakeLists.txt
+3-0mlir/examples/standalone/CMakeLists.txt
+140-923 files

LLVM/project 5a09a60mlir/test/python/dialects python_test.py

format
DeltaFile
+24-6mlir/test/python/dialects/python_test.py
+24-61 files

LLVM/project 598d869mlir/include/mlir/Bindings/Python Globals.h IRCore.h, mlir/lib/Bindings/Python MainModule.cpp IRAttributes.cpp

rebase
DeltaFile
+53-0mlir/lib/Bindings/Python/MainModule.cpp
+0-44mlir/include/mlir/Bindings/Python/Globals.h
+7-7mlir/include/mlir/Bindings/Python/IRCore.h
+4-5mlir/python/CMakeLists.txt
+0-8mlir/lib/Bindings/Python/IRAttributes.cpp
+0-3mlir/lib/Bindings/Python/Globals.cpp
+64-671 files not shown
+65-677 files

LLVM/project 4381066mlir/examples/standalone/include/Standalone-c Dialects.h, mlir/examples/standalone/lib/CAPI Dialects.cpp

add standalone test/use of IRCore
DeltaFile
+25-0mlir/examples/standalone/python/StandaloneExtensionNanobind.cpp
+13-0mlir/examples/standalone/lib/CAPI/Dialects.cpp
+7-0mlir/examples/standalone/include/Standalone-c/Dialects.h
+4-0mlir/examples/standalone/test/python/smoketest.py
+0-1mlir/include/mlir/Bindings/Python/Globals.h
+49-15 files

LLVM/project 54b0f7amlir/include/mlir/Bindings/Python Globals.h, mlir/lib/Bindings/Python Globals.cpp

try fix windows badcast
DeltaFile
+9-9mlir/python/CMakeLists.txt
+3-9mlir/test/python/dialects/python_test.py
+5-0mlir/lib/Bindings/Python/Globals.cpp
+1-4mlir/include/mlir/Bindings/Python/Globals.h
+18-224 files

LLVM/project 73b0c6fmlir/python CMakeLists.txt

[mlir][Python] create MLIRPythonSupport
DeltaFile
+52-13mlir/python/CMakeLists.txt
+52-131 files

LLVM/project 9fd0469mlir/include/mlir/Bindings/Python IRCore.h, mlir/lib/Bindings/Python MainModule.cpp IRTypes.cpp

works
DeltaFile
+2-30mlir/lib/Bindings/Python/MainModule.cpp
+20-11mlir/test/python/lib/PythonTestModuleNanobind.cpp
+19-0mlir/include/mlir/Bindings/Python/IRCore.h
+3-15mlir/lib/Bindings/Python/IRTypes.cpp
+1-13mlir/lib/Bindings/Python/IRAttributes.cpp
+10-2mlir/python/CMakeLists.txt
+55-716 files not shown
+60-8012 files

LLVM/project 4de2c0bmlir/include/mlir/Bindings/Python IRCore.h NanobindUtils.h, mlir/lib/Bindings/Python IRCore.cpp MainModule.cpp

kind of working
DeltaFile
+17-3,300mlir/lib/Bindings/Python/IRCore.cpp
+2,355-0mlir/include/mlir/Bindings/Python/IRCore.h
+2,274-3mlir/lib/Bindings/Python/MainModule.cpp
+0-1,348mlir/lib/Bindings/Python/IRModule.h
+0-436mlir/lib/Bindings/Python/NanobindUtils.h
+436-0mlir/include/mlir/Bindings/Python/NanobindUtils.h
+5,082-5,08715 files not shown
+5,671-5,58321 files

LLVM/project 79796ecmlir/cmake/modules AddMLIR.cmake, mlir/include/mlir/IR EnumAttr.td

[MLIR][TblGen] add AttrOrTypeCAPIGen
DeltaFile
+367-0mlir/tools/mlir-tblgen/AttrOrTypeCAPIGen.cpp
+44-0mlir/tools/mlir-tblgen/AttrOrTypeFormatGen.h
+3-39mlir/tools/mlir-tblgen/AttrOrTypeDefGen.cpp
+3-0mlir/cmake/modules/AddMLIR.cmake
+1-0mlir/include/mlir/IR/EnumAttr.td
+1-0mlir/tools/mlir-tblgen/CMakeLists.txt
+419-396 files