[VPlan] Add simple driver option to run some individual transforms. (#178522)
Add a way to test VPlan transforms in greater isolation via a new
`vplan-test-transform` option, which builds VPlan0 for each loop in the
input IR and can then invoke a set of transforms on it.
To allow different recipe types to be created, a new
widen-from-metadata transform is added, which converts VPInstructions
to different recipes based on custom !vplan.widen metadata. For now
this supports creating widen and replicate recipes, but it can easily be
extended in the future.
The handling is intentionally bare-bones and will be extended
gradually as needed.
PR: https://github.com/llvm/llvm-project/pull/178522
AMDGPU: Cleanup the handling of flags in getTgtMemIntrinsic (#179469)
Some of the flag handling looks inconsistent and questionable, but this
is meant to be a pure refactoring for now.
[Hexagon] Fix extractHvxSubvectorPred shuffle mask for small predicates (#181364)
The loop generating the shuffle mask in extractHvxSubvectorPred used
HwLen/ResLen as the iteration count, but each iteration produces 8
elements (ResLen * Rep where Rep = 8/ResLen). This means the total mask
size was (HwLen/ResLen) * 8, which only equals HwLen when ResLen == 8.
For smaller predicate subvectors (e.g., <4 x i1> or <2 x i1>), the mask
was too large, causing an assertion failure in getVectorShuffle.
Fix by using HwLen/8 as the loop bound, which correctly produces HwLen
elements regardless of ResLen.
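The arithmetic above can be checked with a small standalone sketch. It
does not reproduce the code in extractHvxSubvectorPred; the helper names
are hypothetical and it only counts how many mask elements each loop
bound would produce, assuming ResLen divides 8.

```cpp
#include <cassert>

// Hypothetical helper: number of shuffle-mask elements produced when the
// loop runs `Iterations` times and each iteration emits ResLen * Rep
// elements, with Rep = 8 / ResLen as described in the commit message.
unsigned maskElements(unsigned Iterations, unsigned ResLen) {
  assert(ResLen != 0 && 8 % ResLen == 0 && "ResLen is assumed to divide 8");
  unsigned Rep = 8 / ResLen;          // repetitions of each result element
  return Iterations * (ResLen * Rep); // always Iterations * 8
}

// Old loop bound: HwLen / ResLen iterations.
unsigned oldMaskElements(unsigned HwLen, unsigned ResLen) {
  return maskElements(HwLen / ResLen, ResLen);
}

// New loop bound: HwLen / 8 iterations.
unsigned newMaskElements(unsigned HwLen, unsigned ResLen) {
  return maskElements(HwLen / 8, ResLen);
}
```

With an example HwLen of 128, the old bound yields 128, 256, and 512
elements for ResLen of 8, 4, and 2, while the new bound yields 128 in
all three cases.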
[AArch64] Add basic scmp and ucmp costs. (#182180)
This adds basic costs for llvm.scmp and llvm.ucmp. Scalars are costed as
cmp+cset+csinv. Neon vectors can use cmgt - cmgt, because vector
compares write full vector lanes (all ones or all zeros).
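A scalar model of one lane shows why subtracting two compares gives the
three-way result. This is only an illustration: the function names are
hypothetical, and the operand order the backend actually emits is not
stated in this message.

```cpp
#include <cstdint>

// Model of one lane of a Neon-style signed greater-than compare:
// the lane becomes all ones (-1) when a > b, and 0 otherwise.
int32_t cmgtLane(int32_t A, int32_t B) { return A > B ? -1 : 0; }

// Three-way signed compare for one lane, built from two compares.
//   A > B: 0 - (-1)  =  1
//   A < B: (-1) - 0  = -1
//   A == B: 0 - 0    =  0
int32_t scmpLane(int32_t A, int32_t B) {
  return cmgtLane(B, A) - cmgtLane(A, B);
}
```

Because each compare already fills the whole lane, the subtraction
produces -1, 0, or 1 directly, with no separate select step.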
[clang][ssaf][NFC] Avoid incomplete EntitySummary type breakage (#182946)
When parsing LUSummary.h as a standalone header unit, EntitySummary is
an incomplete type, causing compilation to fail:
```
__memory/unique_ptr.h:72:19: error: invalid application of 'sizeof' to an incomplete type 'clang::ssaf::EntitySummary'
72 | static_assert(sizeof(_Tp) >= 0, "cannot delete an incomplete type");
...
clang/include/clang/Analysis/Scalable/EntityLinker/LUSummary.h:48:12: note: in instantiation of member function 'std::map<clang::ssaf::SummaryName, std::map<clang::ssaf::EntityId, std::unique_ptr<clang::ssaf::EntitySummary>>>::map' requested here
48 | explicit LUSummary(NestedBuildNamespace LUNamespace)
| ^
clang/include/clang/Analysis/Scalable/EntityLinker/LUSummary.h:27:7: note: forward declaration of 'clang::ssaf::EntitySummary'
27 | class EntitySummary;
```
This is not a total breakage: the header builds successfully when it is
used from a .cpp file that includes EntitySummary.h before it.
See https://llvm.org/docs/CodingStandards.html#self-contained-headers
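The failure mode can be sketched in a self-contained form. The types
below are hypothetical stand-ins, not the ssaf classes, and this message
does not say which remedy the patch uses; the sketch shows the simplest
one, making the full type visible before the container is instantiated.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for EntitySummary. If only `class Summary;` were
// visible here, the inline constructor of Holder below would instantiate
// the map's members, which need the unique_ptr deleter and therefore
// sizeof(Summary), and compilation would fail.
class Summary {
public:
  explicit Summary(std::string Name) : Name(std::move(Name)) {}
  const std::string &name() const { return Name; }

private:
  std::string Name;
};

// Hypothetical stand-in for a header-defined class that owns summaries.
class Holder {
public:
  Holder() = default; // instantiates std::map's constructor inline

  void add(int Id, std::string Name) {
    Data[Id] = std::make_unique<Summary>(std::move(Name));
  }
  std::size_t size() const { return Data.size(); }

private:
  std::map<int, std::unique_ptr<Summary>> Data;
};
```

An alternative with the same effect would be to keep the forward
declaration and define the constructor and destructor out of line in a
.cpp file where the full type is visible.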
[Clang][Docs] Update OpenMP support status for loop transformations (#182591)
Update loop fusion transformation codegen status to done and add
additional PR links. Mark loop index set splitting parsing as in
progress.
Co-authored-by: Cursor <cursoragent at cursor.com>
RuntimeLibcalls: Fix adding __safestack_pointer_address by default (#182936)
This was accidentally added to the default set of libcalls, so move
it out of the giant let block over functions in the default set.
This should fix the regression on the SPARC bot.
[lld][MachO] Enable LoopVectorization and SLPVectorization for ThinLTO (#182748)
Commit 21a4710c67a97838dd75cf60ed24da11280800f8 previously enabled
LoopVectorization and SLPVectorization CodeGen options for the ELF and
COFF LTO backends. Since the Mach-O LTO port did not exist at the time,
it missed this configuration.
This patch adds these options to the Mach-O LTO setup for consistency
with the other backends. Without this, SLP and loop vectorization passes
are silently skipped during Mach-O LTO for O2 and O3 builds.
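The intended behavior can be sketched with stand-in types. The structs
and the makeConfig helper below are hypothetical and are not LLVM's or
lld's actual types; the `OptLevel > 1` threshold is an assumption
inferred from the O2 and O3 wording above.

```cpp
// Stand-in for the pipeline tuning knobs; not LLVM's actual type.
struct TuningOptions {
  bool LoopVectorization = false;
  bool SLPVectorization = false;
};

// Stand-in for an LTO configuration object; not lld's actual type.
struct LtoConfig {
  unsigned OptLevel = 0;
  TuningOptions PTO;
};

// Hypothetical helper: enable both vectorizers at O2 and above, so the
// passes are no longer skipped at those levels.
LtoConfig makeConfig(unsigned OptLevel) {
  LtoConfig C;
  C.OptLevel = OptLevel;
  C.PTO.LoopVectorization = OptLevel > 1; // assumed threshold: O2 and O3
  C.PTO.SLPVectorization = OptLevel > 1;  // assumed threshold: O2 and O3
  return C;
}
```

Leaving both fields at their defaults corresponds to the situation
described above, where the vectorizers never run.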