LLVM/project a17a305llvm/lib/Analysis InstCount.cpp, llvm/test/Analysis/InstCount instcount.ll

[LLVM] Metric added - largest number of basic blocks in a single func… (#182970)

This metric reports the largest number of basic blocks found in any
single function.
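
As a rough illustration of what such a metric computes, here is a minimal standalone sketch. It does not use LLVM's classes: `ToyFunction` and `largestBlockCount` are hypothetical names, and a function is modelled simply as a list of per-block instruction counts.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a function: one entry per basic block, holding that
// block's instruction count. This is NOT llvm::Function.
struct ToyFunction {
  std::vector<std::size_t> BlockSizes;
};

// Returns the largest number of basic blocks found in any single function,
// or 0 for an empty module.
std::size_t largestBlockCount(const std::vector<ToyFunction> &Module) {
  std::size_t Largest = 0;
  for (const ToyFunction &F : Module) {
    for (std::size_t Size : F.BlockSizes)
      assert(Size > 0 && "a well-formed block holds at least a terminator");
    Largest = std::max(Largest, F.BlockSizes.size());
  }
  return Largest;
}
```

For a module whose functions have 2 and 3 blocks, `largestBlockCount` returns 3.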
DeltaFile
+3-0llvm/lib/Analysis/InstCount.cpp
+1-0llvm/test/Analysis/InstCount/instcount.ll
+4-02 files

LLVM/project 6b63c59llvm/lib/CodeGen/AsmPrinter AsmPrinter.cpp, llvm/lib/Target/X86 X86AsmPrinter.cpp X86AsmPrinter.h

[NewPM][X86] Port AsmPrinter to NewPM

This patch makes AsmPrinter work with the NewPM. We essentially create
three new passes that wrap different parts of AsmPrinter so that we can
separate out doInitialization/doFinalization without needing to
materialize all MachineFunctions at the same time. This has three main
drawbacks for now:

1. We do not transfer any state between the three new AsmPrinter passes.
   This means that debuginfo/CFI currently does not work. This will be
   fixed in future passes by moving this state to MachineModuleInfo.
2. We probably incur some overhead by needing to set up analysis
   callbacks for every MF rather than just per module. This should not
   be large, and can be optimized in the future on top of this if
   needed.
3. This solution is not really clean. However, a lot of cleanup is going
   to be difficult to do while supporting two pass managers. Once we
   remove LegacyPM support, we can make the code much cleaner and better
   enforce invariants like a lack of state between

    [5 lines not shown]
DeltaFile
+65-0llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+46-0llvm/lib/Target/X86/X86AsmPrinter.cpp
+38-0llvm/lib/Target/X86/X86AsmPrinter.h
+17-6llvm/lib/Target/X86/X86CodeGenPassBuilder.cpp
+16-4llvm/test/CodeGen/X86/llc-pipeline-npm.ll
+9-0llvm/test/CodeGen/X86/npm-asmprint.ll
+191-101 files not shown
+198-107 files

LLVM/project 25f69d7llvm/lib/Target/X86 X86AsmPrinter.cpp X86AsmPrinter.h

[NFCi][NewPM][x86] Use callbacks to get analyses in AsmPrinter

This allows these callbacks to be overridden when using the NewPM, which
has different methods for obtaining analysis results.

Reviewers: RKSimon, arsenm, phoebewang, mingmingl-llvm, aeubanks

Pull Request: https://github.com/llvm/llvm-project/pull/182796
DeltaFile
+15-5llvm/lib/Target/X86/X86AsmPrinter.cpp
+3-0llvm/lib/Target/X86/X86AsmPrinter.h
+18-52 files

LLVM/project abc443bllvm/include/llvm/Passes CodeGenPassBuilder.h, llvm/lib/Target/AMDGPU R600TargetMachine.cpp AMDGPUTargetMachine.cpp

[CodeGen][NewPM] Adjust pipeline for AsmPrinter

AsmPrinter needs to be split into three passes (begin, per MF, end) to
avoid the need to materialize all machine functions at the same time.
Update the CodeGenPassBuilder hooks for this.
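
The begin / per-MF / end shape can be sketched in isolation as follows. This is a toy model only: `runBeginPass`, `runFunctionPass`, `runEndPass`, and `runPipeline` are hypothetical names, the "passes" are plain functions that share nothing but the output buffer, and none of this reflects the actual CodeGenPassBuilder hooks.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Emits whatever must precede all functions (module-level prologue).
void runBeginPass(std::string &Out) { Out += "; begin module\n"; }

// Emits a single function. Only this one function needs to be live here.
void runFunctionPass(std::string &Out, const std::string &Name) {
  assert(!Name.empty() && "toy functions must be named");
  Out += Name + ":\n  ret\n";
}

// Emits whatever must follow all functions (module-level epilogue).
void runEndPass(std::string &Out) { Out += "; end module\n"; }

// Drives the three phases in order, handing functions over one at a time
// instead of collecting them all before printing.
std::string runPipeline(const std::vector<std::string> &FunctionNames) {
  std::string Out;
  runBeginPass(Out);
  for (const std::string &Name : FunctionNames)
    runFunctionPass(Out, Name);
  runEndPass(Out);
  return Out;
}
```

Because each function is printed and then dropped before the next one is visited, the driver never holds more than one function at a time, which is the property the split is after.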

Reviewers: aeubanks, paperchalice, arsenm

Pull Request: https://github.com/llvm/llvm-project/pull/182795
DeltaFile
+26-10llvm/include/llvm/Passes/CodeGenPassBuilder.h
+18-3llvm/lib/Target/AMDGPU/R600TargetMachine.cpp
+17-2llvm/lib/Target/X86/X86CodeGenPassBuilder.cpp
+14-2llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+75-174 files

LLVM/project 683d15fllvm/include/llvm/Passes CodeGenPassBuilder.h, llvm/include/llvm/Target TargetMachine.h

[CodeGen][NewPM] Plumb MCContext through buildCodeGenPipeline

Otherwise we cannot create an MCStreamer without getting MMI, which we
cannot do until we have started running AsmPrinter without also plumbing
MMI through CodeGenPassBuilder.

Reviewers: arsenm, paperchalice, aeubanks

Pull Request: https://github.com/llvm/llvm-project/pull/182794
DeltaFile
+7-6llvm/include/llvm/Passes/CodeGenPassBuilder.h
+5-6llvm/include/llvm/Target/TargetMachine.h
+3-3llvm/lib/Target/X86/X86CodeGenPassBuilder.cpp
+3-2llvm/tools/llc/NewPMDriver.cpp
+2-2llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+2-2llvm/lib/Target/AMDGPU/R600TargetMachine.cpp
+22-213 files not shown
+25-249 files

LLVM/project 757066cllvm/include/llvm/CodeGen AsmPrinter.h, llvm/lib/CodeGen/AsmPrinter AsmPrinter.cpp

[NFCi][AsmPrinter] Refactor getting analyses to callbacks

As part of making AsmPrinter work with the new pass manager, we need to
be able to override how we get analyses. This patch does that by moving
all analysis lookups and related functionality behind callbacks that are
set by default but can be overridden later (for example, by a NewPM
wrapper pass).
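
The general pattern, a callback with a default that a wrapper may replace, might look like the sketch below. `ToyPrinter`, `setAnalysisCallback`, and `describe` are hypothetical names chosen for illustration; they are not members of LLVM's AsmPrinter, and the string return type merely stands in for an analysis result.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

class ToyPrinter {
  // Default lookup, installed at construction. Stands in for the way the
  // legacy pass manager would supply an analysis.
  std::function<std::string()> GetAnalysis = [] {
    return std::string("legacy");
  };

public:
  // Lets a wrapper (for example, a new-pass-manager adaptor) swap in its
  // own way of obtaining the analysis.
  void setAnalysisCallback(std::function<std::string()> Callback) {
    assert(Callback && "callback must be callable");
    GetAnalysis = std::move(Callback);
  }

  // All users go through the callback, never through a hard-coded lookup.
  std::string describe() const { return "analysis from " + GetAnalysis(); }
};
```

A default-constructed `ToyPrinter` reports "analysis from legacy"; after `setAnalysisCallback` is called with a lambda returning "newpm", it reports "analysis from newpm" without any other code changing.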

Reviewers: aeubanks

Pull Request: https://github.com/llvm/llvm-project/pull/182793
DeltaFile
+62-43llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+13-3llvm/include/llvm/CodeGen/AsmPrinter.h
+75-462 files

LLVM/project faf0432llvm/lib/Transforms/Scalar GVN.cpp, llvm/test/Transforms/GVN/PRE protected-field-ptr.ll

[𝘀𝗽𝗿] initial version

Created using spr 1.3.6-beta.1
DeltaFile
+41-0llvm/test/Transforms/GVN/PRE/protected-field-ptr.ll
+6-0llvm/lib/Transforms/Scalar/GVN.cpp
+47-02 files

LLVM/project 19daed3compiler-rt/lib/asan asan_fuchsia.cpp CMakeLists.txt

Revert "[ASan][Fuchsia] Have Fuchsia use a dynamic shadow start" (#182972)

Reverts llvm/llvm-project#180880

This is breaking Fuchsia's CI. Something in the CMake needs to be
adjusted. Reverting at the author's request.
DeltaFile
+6-6compiler-rt/lib/asan/asan_fuchsia.cpp
+6-3compiler-rt/lib/asan/CMakeLists.txt
+1-1compiler-rt/lib/asan/asan_rtl_x86_64.S
+1-1compiler-rt/lib/asan/asan_mapping.h
+14-114 files

LLVM/project d181015compiler-rt/lib/asan asan_fuchsia.cpp CMakeLists.txt

Revert "[ASan][Fuchsia] Have Fuchsia use a dynamic shadow start (#180880)"

This reverts commit 9146da3a7bee5c62a12fed9b423c8292209da926.
DeltaFile
+6-6compiler-rt/lib/asan/asan_fuchsia.cpp
+6-3compiler-rt/lib/asan/CMakeLists.txt
+1-1compiler-rt/lib/asan/asan_rtl_x86_64.S
+1-1compiler-rt/lib/asan/asan_mapping.h
+14-114 files

LLVM/project 2330267clang/lib/Driver Driver.cpp, clang/test/Driver cuda-arch-translation.cu hip-toolchain-opt.hip

[HIP] Move HIP to the new driver by default (#123359)

Summary:
This patch matches CUDA, moving the HIP compilation jobs to the new
driver by default. The old behavior will return with
`--no-offload-new-driver`. The main difference is that objects compiled
with the old driver are no longer compatible and will need to be
recompiled or the old driver used.
DeltaFile
+18-18clang/test/Driver/cuda-arch-translation.cu
+10-23clang/test/Driver/hip-toolchain-opt.hip
+11-13clang/lib/Driver/Driver.cpp
+2-12clang/test/Driver/hip-code-object-version.hip
+2-11clang/test/Driver/hip-options.hip
+9-3clang/test/Driver/hip-toolchain-no-rdc.hip
+52-8016 files not shown
+93-12722 files

LLVM/project a8f3c97llvm/lib/Analysis InstCount.cpp, llvm/test/Analysis/InstCount instcount.ll

Implemented metric that gets biggest function's size (#182632)

This metric gets the size of the biggest function.
DeltaFile
+73-0llvm/test/Analysis/InstCount/instcount.ll
+0-69llvm/test/Other/instcount.ll
+6-1llvm/lib/Analysis/InstCount.cpp
+79-703 files

LLVM/project b204f73clang-tools-extra/clang-doc YAMLGenerator.cpp JSONGenerator.cpp

Format
DeltaFile
+2-4clang-tools-extra/clang-doc/YAMLGenerator.cpp
+3-2clang-tools-extra/clang-doc/JSONGenerator.cpp
+2-1clang-tools-extra/clang-doc/MDGenerator.cpp
+1-1clang-tools-extra/clang-doc/Representation.cpp
+8-84 files

LLVM/project f3d9f04clang-tools-extra/clang-doc MDGenerator.cpp Generators.cpp, clang-tools-extra/unittests/clang-doc GeneratorTest.cpp ClangDocTest.cpp

[clang-doc] Improve complexity of Index construction

The existing implementation ends up with an O(N^2) algorithm due to
repeated linear scans during index construction. Switching to a
StringMap allows us to reduce this to O(N), since we no longer need to
search the vector.
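
A minimal sketch of the idea, assuming `std::unordered_map` as a stand-in for `StringMap` and a hypothetical `ToyIndex` type that is unrelated to clang-doc's real Index class: the map remembers where each name lives, so an insertion no longer scans the vector.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

struct ToyIndex {
  std::vector<std::string> Names;                        // insertion order
  std::unordered_map<std::string, std::size_t> Position; // name -> slot

  // Returns the slot for Name, appending it on first sight. Each call is
  // an expected O(1) hash lookup instead of a linear scan over Names, so
  // inserting N records costs expected O(N) rather than O(N^2).
  std::size_t insert(const std::string &Name) {
    auto Result = Position.emplace(Name, Names.size());
    if (Result.second)
      Names.push_back(Name);
    assert(Names[Result.first->second] == Name);
    return Result.first->second;
  }
};
```

Inserting "a", "b", and then "a" again yields slots 0, 1, and 0, with only two names stored.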

The `BM_Index_Insertion` benchmark measures the time taken to insert N
unique records into the index.

| Scale (N Items) | Baseline (ns) | Patched (ns) | Speedup | Change |
|----------------:|--------------:|-------------:|--------:|-------:|
| 10              | 9,977         | 11,004       | 0.91x   | +10.3% |
| 64              | 69,249        | 69,166       | 1.00x   | -0.1%  |
| 512             | 1,932,714     | 525,877      | 3.68x   | -72.8% |
| 4,096           | 92,411,535    | 4,589,030    | 20.1x   | -95.0% |
| 10,000          | 577,384,945   | 12,998,039   | 44.4x   | -97.7% |

The patch delivers significant improvements to scalability. At 10,000

    [13 lines not shown]
DeltaFile
+71-17clang-tools-extra/unittests/clang-doc/GeneratorTest.cpp
+21-10clang-tools-extra/clang-doc/MDGenerator.cpp
+13-11clang-tools-extra/clang-doc/Generators.cpp
+11-5clang-tools-extra/clang-doc/JSONGenerator.cpp
+3-3clang-tools-extra/clang-doc/YAMLGenerator.cpp
+2-2clang-tools-extra/unittests/clang-doc/ClangDocTest.cpp
+121-482 files not shown
+124-518 files

LLVM/project fb89c43clang-tools-extra/clang-doc CMakeLists.txt, clang-tools-extra/clang-doc/benchmarks ClangDocBenchmark.cpp CMakeLists.txt

[clang-doc] Add basic benchmarks for library functionality

clang-doc's performance is good, but we suspect it could be better. To
track this with more fidelity, we can add a set of GoogleBenchmarks that
exercise portions of the library. To start, we track the high-level
items that we monitor via the TimeTrace functions, and give them their
own microbenchmarks. This should give us more confidence that switching
out data structures or updating algorithms will have a positive
performance impact.

Note that an LLM helped generate portions of the benchmarks and
parameterize them. Most of the internal logic was written by me, but
the LLM was used to handle boilerplate and adaptation to the harness.
DeltaFile
+220-0clang-tools-extra/clang-doc/benchmarks/ClangDocBenchmark.cpp
+20-0clang-tools-extra/clang-doc/benchmarks/CMakeLists.txt
+4-0clang-tools-extra/clang-doc/CMakeLists.txt
+244-03 files

LLVM/project 9146da3compiler-rt/lib/asan asan_fuchsia.cpp CMakeLists.txt

[ASan][Fuchsia] Have Fuchsia use a dynamic shadow start (#180880)

The dynamic shadow global is still set to zero, but this will change in the future.
DeltaFile
+6-6compiler-rt/lib/asan/asan_fuchsia.cpp
+3-6compiler-rt/lib/asan/CMakeLists.txt
+1-1compiler-rt/lib/asan/asan_mapping.h
+1-1compiler-rt/lib/asan/asan_rtl_x86_64.S
+11-144 files

LLVM/project 4f59da5clang/lib/CodeGen CGExpr.cpp, clang/test/CodeGenHLSL/BasicFeatures VectorElementwiseCast.hlsl

[HLSL][Matrix] EmitFromMemory when emitting load of vector and matrix element LValues (#178315)

Fixes #177712

The MatrixElt and VectorElt cases of `EmitLoadOfLValue` did not convert
the scalar value from its load/store type into its primary IR type, as
the other cases do. This caused issues with HLSL in particular, which
requires bools to be converted between i32 (the load/store type) and i1
(the primary IR type).

This PR fixes the issue by applying `EmitFromMemory` to the loaded
scalar.
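
The two representations can be modelled outside the compiler as below. This is only a toy: `MemBool`, `toMemory`, `fromMemory`, and `loadElement` are hypothetical names, and treating any non-zero stored value as true is an assumption of this sketch rather than a statement about the IR that Clang emits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Load/store form of a bool in this toy model: a 32-bit integer.
using MemBool = std::uint32_t;

// Primary form -> load/store form.
MemBool toMemory(bool Value) { return Value ? 1u : 0u; }

// Load/store form -> primary form (assumption: non-zero means true).
bool fromMemory(MemBool Stored) { return Stored != 0; }

// Loading a single element must also convert the loaded scalar, just as
// a whole-value load would.
bool loadElement(const std::vector<MemBool> &Vec, std::size_t Index) {
  assert(Index < Vec.size() && "element index out of range");
  return fromMemory(Vec[Index]);
}
```

Skipping the `fromMemory` step in `loadElement` would hand the caller a 32-bit value where a boolean is expected, which is the kind of mismatch described above.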
DeltaFile
+55-0clang/test/CodeGenHLSL/BasicFeatures/VectorElementwiseCast.hlsl
+12-7clang/lib/CodeGen/CGExpr.cpp
+67-72 files

LLVM/project a347e12offload/plugins-nextgen/level_zero/src L0Memory.cpp

[Offload] Enable memory usage printing with `alloc` debug type (#182938)

DeltaFile
+3-3offload/plugins-nextgen/level_zero/src/L0Memory.cpp
+3-31 files

LLVM/project 6b352aallvm/include/llvm/Transforms/Vectorize VPlanTestPass.h, llvm/lib/Transforms/Vectorize VPlanTestPass.cpp VPlanConstruction.cpp

Revert "[VPlan] Add simple driver option to run some individual transforms. (#178522)"

This reverts commit 3df1c6f88bfbbd76d9256c55358bb75e02e33779.

Causes build failures in builds without assertions:
https://lab.llvm.org/buildbot/#/builders/159/builds/41683
DeltaFile
+0-69llvm/test/Transforms/LoopVectorize/VPlan/vplan-widen-from-metadata.ll
+0-53llvm/lib/Transforms/Vectorize/VPlanTestPass.cpp
+0-46llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
+0-38llvm/test/Transforms/LoopVectorize/VPlan/vplan-merge-blocks.ll
+0-36llvm/test/Transforms/LoopVectorize/VPlan/vplan-remove-dead-recipes.ll
+0-35llvm/include/llvm/Transforms/Vectorize/VPlanTestPass.h
+0-2777 files not shown
+0-38913 files

LLVM/project 6d37110llvm/include/llvm/IR RuntimeLibcalls.td, llvm/test/Transforms/SafeStack/SPARC safestack.ll

Revert "RuntimeLibcalls: Fix adding __safestack_pointer_address by default" (#182949)

Reverts llvm/llvm-project#182936
DeltaFile
+0-38llvm/test/Transforms/SafeStack/SPARC/safestack.ll
+4-5llvm/include/llvm/IR/RuntimeLibcalls.td
+4-432 files

LLVM/project 3df1c6fllvm/include/llvm/Transforms/Vectorize VPlanTestPass.h, llvm/lib/Transforms/Vectorize VPlanTestPass.cpp VPlanConstruction.cpp

[VPlan] Add simple driver option to run some individual transforms. (#178522)

Add an alternative to test VPlan in more isolation via a new
`vplan-test-transform` option, which builds VPlan0 for each loop in the
input IR and then can invoke a set of transforms on it.

In order to allow different recipe types to be created, a new
widen-from-metadata transform is added, which transforms VPInstructions
to different recipes, based on custom !vplan.widen metadata. Currently
this supports creating widen & replicate recipes, but can easily be
extended in the future.

Currently the handling is intentionally bare-bones, to be extended
gradually as needed.

PR: https://github.com/llvm/llvm-project/pull/178522
DeltaFile
+69-0llvm/test/Transforms/LoopVectorize/VPlan/vplan-widen-from-metadata.ll
+53-0llvm/lib/Transforms/Vectorize/VPlanTestPass.cpp
+46-0llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
+38-0llvm/test/Transforms/LoopVectorize/VPlan/vplan-merge-blocks.ll
+36-0llvm/test/Transforms/LoopVectorize/VPlan/vplan-remove-dead-recipes.ll
+35-0llvm/include/llvm/Transforms/Vectorize/VPlanTestPass.h
+277-07 files not shown
+389-013 files

LLVM/project 4a78803llvm/lib/Target/AMDGPU SIISelLowering.cpp

AMDGPU: Cleanup the handling of flags in getTgtMemIntrinsic (#179469)

Some of the flag handling seems a bit inconsistent and dodgy, but this
is meant to be a pure refactoring for now.
DeltaFile
+44-48llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+44-481 files

LLVM/project c3a86ffllvm/lib/Target/Hexagon HexagonISelLoweringHVX.cpp, llvm/test/CodeGen/Hexagon extract-hvx-subvector-pred-small.ll

[Hexagon] Fix extractHvxSubvectorPred shuffle mask for small predicates (#181364)

The loop generating the shuffle mask in extractHvxSubvectorPred used
HwLen/ResLen as the iteration count, but each iteration produces 8
elements (ResLen * Rep where Rep = 8/ResLen). This means the total mask
size was (HwLen/ResLen) * 8, which only equals HwLen when ResLen == 8.
For smaller predicate subvectors (e.g., <4 x i1> or <2 x i1>), the mask
was too large, causing an assertion failure in getVectorShuffle.

Fix by using HwLen/8 as the loop bound, which correctly produces HwLen
elements regardless of ResLen.
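
The arithmetic can be checked with a small standalone sketch. The function names are hypothetical, and the value HwLen = 128 used in the note that follows is only an illustrative choice.

```cpp
#include <cassert>

// Each loop iteration emits ResLen * Rep mask elements, with Rep = 8 / ResLen.
unsigned elementsPerIteration(unsigned ResLen) {
  assert(ResLen != 0 && 8 % ResLen == 0 && "ResLen must divide 8");
  unsigned Rep = 8 / ResLen;
  return ResLen * Rep; // always 8
}

// Mask size with the old loop bound, HwLen / ResLen.
unsigned maskSizeOld(unsigned HwLen, unsigned ResLen) {
  return (HwLen / ResLen) * elementsPerIteration(ResLen);
}

// Mask size with the fixed loop bound, HwLen / 8.
unsigned maskSizeNew(unsigned HwLen, unsigned ResLen) {
  return (HwLen / 8) * elementsPerIteration(ResLen);
}
```

With HwLen = 128, the old bound gives 128 elements for ResLen = 8 but 256 for ResLen = 4 and 512 for ResLen = 2, while the new bound gives 128 in all three cases.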
DeltaFile
+28-0llvm/test/CodeGen/Hexagon/extract-hvx-subvector-pred-small.ll
+1-1llvm/lib/Target/Hexagon/HexagonISelLoweringHVX.cpp
+29-12 files

LLVM/project 476492bllvm/lib/Target/AArch64 AArch64TargetTransformInfo.cpp, llvm/test/Analysis/CostModel/AArch64 cmp.ll

[AArch64] Add basic scmp and ucmp costs. (#182180)

This adds basic llvm.scmp and llvm.ucmp costs. Scalars are costed as
cmp+cset+csinv. Neon vectors can use cmgt - cmgt as the vectors write
full vector lanes.
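
To see why a difference of two compares suffices for vectors, here is a small model in plain C++. It assumes a lane-wise greater-than that writes all-ones (-1) into true lanes and 0 into false lanes; `greaterThanMask`, `scmpVector`, and the four-lane `Lanes` type are hypothetical, and the operand order of the subtraction is this sketch's own derivation rather than something taken from the patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Lanes = std::array<std::int32_t, 4>;

// Scalar three-way compare: -1 if A < B, 0 if equal, 1 if A > B.
int scmpScalar(std::int32_t A, std::int32_t B) { return (A > B) - (A < B); }

// Lane-wise A > B, producing -1 (all bits set) for true and 0 for false.
Lanes greaterThanMask(const Lanes &A, const Lanes &B) {
  Lanes R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = A[I] > B[I] ? -1 : 0;
  return R;
}

// (B > A) - (A > B) per lane: 0 - (-1) = 1 when A > B, -1 - 0 = -1 when
// A < B, and 0 - 0 = 0 when the lanes are equal.
Lanes scmpVector(const Lanes &A, const Lanes &B) {
  Lanes BGreater = greaterThanMask(B, A);
  Lanes AGreater = greaterThanMask(A, B);
  Lanes R{};
  for (std::size_t I = 0; I < R.size(); ++I) {
    R[I] = BGreater[I] - AGreater[I];
    assert(R[I] == scmpScalar(A[I], B[I]) && "must match the scalar result");
  }
  return R;
}
```

Because a true lane is already -1 rather than 1, the subtraction yields the three-way result directly, with no separate select step.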
DeltaFile
+16-16llvm/test/Analysis/CostModel/AArch64/cmp.ll
+21-0llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+37-162 files

LLVM/project 6eb63c6llvm/include/llvm/CodeGen TargetLowering.h

[CodeGen] Remove legacy getTgtMemIntrinsic overload (#175846)

It is now fully unused.
DeltaFile
+1-19llvm/include/llvm/CodeGen/TargetLowering.h
+1-191 files

LLVM/project df128ceclang/include/clang/Analysis/Scalable/EntityLinker LUSummary.h

[clang][ssaf][NFC] Avoid incomplete EntitySummary type breakage (#182946)

When parsing LUSummary.h as a standalone header unit, EntitySummary is
an incomplete type, causing compilation to fail:

```
__memory/unique_ptr.h:72:19: error: invalid application of 'sizeof' to an incomplete type 'clang::ssaf::EntitySummary'
   72 |     static_assert(sizeof(_Tp) >= 0, "cannot delete an incomplete type");
...
clang/include/clang/Analysis/Scalable/EntityLinker/LUSummary.h:48:12: note: in instantiation of member function 'std::map<clang::ssaf::SummaryName, std::map<clang::ssaf::EntityId, std::unique_ptr<clang::ssaf::EntitySummary>>>::map' requested here
   48 |   explicit LUSummary(NestedBuildNamespace LUNamespace)
      |            ^
clang/include/clang/Analysis/Scalable/EntityLinker/LUSummary.h:27:7: note: forward declaration of 'clang::ssaf::EntitySummary'
   27 | class EntitySummary;
```

This is not a total breakage because this header file builds
successfully when used in a .cpp file that includes EntitySummary.h
prior to this.
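
One common way to keep such a header self-contained is to make the pointee type complete before the container member is declared. The sketch below shows that pattern with simplified, hypothetical types (string keys, a one-field `EntitySummary`, a flat map); it is not the actual LUSummary.h and does not claim to reproduce the change made in this commit.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// In a real header this would be an #include of the defining header rather
// than a forward declaration. It is defined inline here so the sketch
// compiles on its own.
struct EntitySummary {
  int Payload = 0;
};

class LUSummary {
  std::string Namespace;
  // unique_ptr's deleter needs a complete EntitySummary wherever the map's
  // constructor and destructor are instantiated, including the inline
  // constructor below.
  std::map<std::string, std::unique_ptr<EntitySummary>> Data;

public:
  explicit LUSummary(std::string NS) : Namespace(std::move(NS)) {}

  void add(const std::string &Id, int Payload) {
    assert(!Id.empty() && "entity id must not be empty");
    auto Summary = std::make_unique<EntitySummary>();
    Summary->Payload = Payload;
    Data[Id] = std::move(Summary);
  }

  // Returns the stored summary, or nullptr if Id is unknown.
  const EntitySummary *find(const std::string &Id) const {
    auto It = Data.find(Id);
    return It == Data.end() ? nullptr : It->second.get();
  }
};
```

With the definition visible, the header compiles regardless of which other headers a translation unit happened to include first.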

See https://llvm.org/docs/CodingStandards.html#self-contained-headers
DeltaFile
+1-2clang/include/clang/Analysis/Scalable/EntityLinker/LUSummary.h
+1-21 files

LLVM/project 1287078llvm/include/llvm/IR RuntimeLibcalls.td, llvm/test/Transforms/SafeStack/SPARC safestack.ll

Revert "RuntimeLibcalls: Fix adding __safestack_pointer_address by default (#…"

This reverts commit 8604b52e380fb37a3599539b1d87a68666ab6ed5.
DeltaFile
+0-38llvm/test/Transforms/SafeStack/SPARC/safestack.ll
+4-5llvm/include/llvm/IR/RuntimeLibcalls.td
+4-432 files

LLVM/project c2cf1f8llvm/test/CodeGen/RISCV clmul.ll clmulr.ll, llvm/test/CodeGen/X86 vector-interleaved-store-i32-stride-7.ll clmul.ll

test

Created using spr 1.3.7
DeltaFile
+24,655-20,149llvm/test/CodeGen/RISCV/clmul.ll
+12,512-13,372llvm/test/CodeGen/RISCV/clmulr.ll
+12,350-13,322llvm/test/CodeGen/RISCV/clmulh.ll
+3,298-3,437llvm/test/CodeGen/X86/vector-interleaved-store-i32-stride-7.ll
+2,888-1,812llvm/test/CodeGen/X86/clmul.ll
+1,447-1,447llvm/test/tools/llvm-mca/RISCV/Andes45/rvv-fp.s
+57,150-53,539946 files not shown
+96,102-71,738952 files

LLVM/project 1c7cb39clang/docs OpenMPSupport.rst

[Clang][Docs] Update OpenMP support status for loop transformations (#182591)

Update loop fusion transformation codegen status to done and add
additional PR links. Mark loop index set splitting parsing as in
progress.

Co-authored-by: Cursor <cursoragent at cursor.com>
DeltaFile
+4-2clang/docs/OpenMPSupport.rst
+4-21 files

LLVM/project 26d786fllvm/lib/Transforms/Vectorize VPlanTransforms.cpp

Braces for outer `if`
DeltaFile
+2-1llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+2-11 files

LLVM/project 3b83c1ellvm/lib/Transforms/Vectorize LoopVectorize.cpp VPlanTransforms.cpp

Move to VPlanTransforms, have to pass Legal explicitly
DeltaFile
+1-78llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+76-0llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+3-1llvm/lib/Transforms/Vectorize/VPlanTransforms.h
+80-793 files