[clang][ssaf][NFC] Make SSAFOptions available in Builders and Extractors (#204684)
Now that we have SSAFOptions, it would make it a lot more ergonomic if
it was accessible from builders and extractors.
This PR does exactly that.
Part of rdar://179151023
Co-authored-by: Jan Korous <jkorous at apple.com>
Co-authored-by: Claude Opus 4.7 <noreply at anthropic.com>
[Clang][ABI] Validate consistency between ABI lowering implementation (#203281)
If the LLVM ABI library is used, and assertions are enabled, compute the
ABI both using Clang's implementation the the LLVM ABI library, and
verify that the results are the same.
[libc] Introduce the ioctl syscall wrapper and port all callers (#204640)
This patch adds an ioctl syscall wrapper in linux_syscalls namespace and
migrates all direct SYS_ioctl calls to use it.
To handle the polymorphic nature of ioctl arguments (where some commands
expect pointers, some expect scalar integers like queue_selector, and
some expect no argument at all), I use a helper struct IoctlArg with
implicit constructors. This avoids template bloat and overload
ambiguities (particularly around literal 0) while keeping call sites
clean.
Assisted by Gemini.
[orc-rt] Add return serialization to AllocActionFunction::handle. (#205271)
Add a Serializer template parameter to AllocActionFunction::handle and
apply it to the handler's return value before forwarding as the action
result. This lets handler authors return types other than
WrapperFunctionBuffer.
For SPS, AllocActionSPSSerializer is the default Serializer used by
SPSAllocActionFunction::handle. It accepts either:
- WrapperFunctionBuffer (identity pass-through, the existing behavior),
or
- Error (success → empty WFB; failure → out-of-band-error WFB carrying
toString(Err)).
Adds AllocActionTest coverage for both Error-return paths.
AMDGPU: Move AMDGPUTargetID to AMDGPUTargetParser
Move the AMDGPUTargetID class and TargetIDSetting enum from
AMDGPUBaseInfo to AMDGPUTargetParser, making them available in the
MC-independent TargetParser library.
Currently there is this backend implementation, and a second one in
clang. Move this here so in the future the clang copy can be deleted.
Co-Authored-By: Claude <noreply at anthropic.com>
Reland "[clang][ssaf][NFC] Move SSAF flags from FrontendOptions to a dedicated SSAFOptions" (#204798)
Second attempt of #204686
This class will help keeping SSAF options apart from generic
FrontendOptions. It is inspired by AnalyzerOptions.
This way all of these SSAF (and future) options will be at a
centralized place.
In preparation of rdar://179151023
[orc-rt] Replace AAHandlerTraits with CallableArgInfo. NFCI. (#205257)
CallableArgInfo provides a superset of AAHandlerTraits functionality, so
we don't need the latter.
[RISCV] Update Xqcilo Pseudos (#196422)
This changes the Xqcilo pseudos to instead emit a sequence of
`qc.e.li` followed by a standard load/store annotated with %qc.access.
The new sequence is easier for our linker to relax.
This Change was written with the assistance of AI.
Patch tryCanonicalizeStructToVector to handle split slice tails (#201434)
We choose a vector alloca over a struct alloca when all users of the
alloca are memory or lifetime intrinsics. But we only accounted for
slices that start in the corresponding partition. We have to also check
that all split slice tails overlapping the partition are memory or
lifetime intrinsics
I also updated the `PassRegistry.def` to include the new pass option
because we forgot to add that.
[mlir][arith] Fix APInt bitwidth mismatch crash in int-range-optimizations (#205110)
Fixes https://github.com/llvm/llvm-project/issues/204909
When an op's `areTypesCompatible()` hook accepts integers of different
widths across a region boundary, the range analysis can propagate a
constant range whose APInt bitwidth does not match the IR type of the
destination value.
This caused `IntegerAttr::get` to `assert` in
`maybeReplaceWithConstant`.
Fix by bailing out in `maybeReplaceWithConstant` when the bitwidths
mismatch, and adding the same check to the needsReplacing lambda in
matchAndRewrite.
The second guard is necessary to mirror the existing isIntOrIndex()
guard — without it the pattern claims success without changing the IR,
causing the greedy rewrite driver to loop.
[14 lines not shown]
[RISCV][P-ext] Rename pwcvt/pncvt pseudoinstructions for RV64. (#205227)
We need to add a 'w' to the suffix to indicate it operates on a word and
not a register pair like on RV32. See https://github.com/riscv/riscv-p-spec/pull/303
[SandboxVectorizer] Implement topdown vectorizer
This patch introduces the `top-down-vec` pass to the Sandbox Vectorizer,
adding the ability to traverse use-def chains top-down to discover and
collect vectorization opportunities.
Key changes include:
* TopDownVec Pass: Implemented `TopDownVec` which recursively processes
value bundles top-down, creates vectorization actions (widening, packing,
shuffles), and emits the final vector IR.
* Shared Infrastructure (VecPassBase): Extracted common IR emission logic
out of `BottomUpVec` and into a new shared base class, `VecPassBase`.
Functions for generating vector instructions, handling diamond reuse,
creating shuffles/packs, and collecting dead instructions are now shared
between the bottom-up and top-down vectorizers to prevent code
duplication.
* Pass Registration: Exposed `top-down-vec` in `PassRegistry.def` and
`SandboxVectorizerPassBuilder`, allowing it to be invoked within pass
pipelines via `opt`.
[3 lines not shown]
Reland [Allocator] Keep bump pointer at a minimum alignment (#205240)
Reland #203718 (reverted in #205091) by making computation in integer
domain to avoid UB (nullptr + non-zero offset).
Add a `MinAlign` template parameter (default 8, sizeof(size_t) on 64-bit
platforms) so that the common case `Alignment <= MinAlign` can skip
realigning `CurPtr`.
This is achieved by rounding each allocation's size up to MinAlign, so
the bump pointer stays MinAlign-aligned between allocations.
SpecificBumpPtrAllocator::DestroyAll() walks objects at a fixed
sizeof(T) stride and needs tight packing, so it uses MinAlign=1.
(alignof(T) would
pack just as tightly and reuse the default instantiation, but T may be
incomplete here, e.g. `SpecificBumpPtrAllocator<MCSectionELF>`.)
Its `Allocate` still skips the realign: the slab is max_align_t-aligned
[9 lines not shown]
[OpenMPOpt][Attributor] Selectively seed deglobalization AAs (#198710)
This addresses a compile-time issue observed on a large generated C++
translation unit compiled with `-fopenmp`.
The source code is not OpenMP-heavy. It mainly consists of generated
function-registration wrappers, template instantiations, lambdas, and
small helper functions. However, because the TU is compiled with OpenMP
enabled, `OpenMPOptCGSCCPass` runs and drives Attributor on a module
with many functions.
`OpenMPOpt::registerAAsForFunction` currently eagerly creates the
deglobalization AAs for every function in OpenMP device modules:
* `AAHeapToShared`
* `AAHeapToStack`
Most generated wrapper/helper functions in the motivating workload do
not contain `__kmpc_alloc_shared`, removable allocations, or free-like
[25 lines not shown]
[AMDGPU] Fold constant offsets into named barrier addresses
Allow isOffsetFoldingLegal to fold a constant offset into an LDS
named-barrier global, and include the node offset when materializing the
LDS address in LowerGlobalAddress. s_barrier_signal_var on a GEP'd named
barrier now selects the immediate form, matching a bare global and GlobalISel.
With object linking the offset folds into the relocation addend.
Change-Id: I639bc723eb001573585cc05d0ad19f2773054f21
Assisted-by: Cursor
[AMDGPU] Pre-commit test for constant-offset named barrier signal_var
A GEP into a named-barrier array (&bars[1]) lowers s_barrier_signal_var to
the dynamic m0 form on SelectionDAG, unlike the bare global and GlobalISel.
With object linking it emits a runtime add of the offset instead of folding
it into the relocation addend.
Change-Id: I7cea0dd64d050eb3e2143841e7136355cbb3bc50
Assisted-by: Cursor
[AMDGPU] Fold constant offsets into named barrier addresses
Allow isOffsetFoldingLegal to fold a constant offset into an LDS
named-barrier global, and include the node offset when materializing the
LDS address in LowerGlobalAddress. s_barrier_signal_var on a GEP'd named
barrier now selects the immediate form, matching a bare global and GlobalISel.
With object linking the offset folds into the relocation addend.
Change-Id: Ie05b8c8cd127604ff174c423a74340fd2de4e405
Assisted-by: Cursor
[AMDGPU] Pre-commit test for constant-offset named barrier signal_var
A GEP into a named-barrier array (&bars[1]) lowers s_barrier_signal_var to
the dynamic m0 form on SelectionDAG, unlike the bare global and GlobalISel.
With object linking it emits a runtime add of the offset instead of folding
it into the relocation addend.
Change-Id: I59f0e6fe6a72b4c96c8efb926610f7f2d3833e38
Assisted-by: Cursor
[RISCV][P-ext] packed exchanged add/sub codegen (#203473)
Wire up the already-defined exchanged add/sub instructions
pas/psa/psas/pssa/paas/pasa with llvm.riscv.* intrinsics and isel
patterns.