[X86][Inline] Make ABI compatibility check more precise (#205106)
When inlining a function that contains calls with vector arguments, we
have to be careful that inlining does not change the ABI of the call.
E.g. we generally can't inline a function without `+avx` into a
function with `+avx` if there are calls using vectors of size 256 or
larger, because they'd switch from passing in two xmm registers to
passing in a ymm register.
However, the current check is very crude and only allows inlining with
interior calls if the target features match *exactly* (via the base
areTypesABICompatible implementation). This is unnecessarily
conservative, as many target features do not affect the call ABI at all.
Make this check more precise by comparing the result of
getRegisterTypeForCallingConv for each such type between the TLI
instances of the caller and the callee.
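For illustration only, the idea can be sketched with a toy model. Everything
below (ToySubtarget, toyRegisterTypeForCall, toyCallABIMatches) is
hypothetical and is not the LLVM API; it assumes features are listed
explicitly and that only "avx" and "avx512f" change how a vector is passed.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a subtarget: just a set of feature names.
struct ToySubtarget {
  std::set<std::string> Features;
  bool has(const std::string &F) const { return Features.count(F) != 0; }
};

// Toy analogue of getRegisterTypeForCallingConv. Returns
// {register width in bits, number of registers} for one vector argument.
std::pair<unsigned, unsigned> toyRegisterTypeForCall(const ToySubtarget &ST,
                                                     unsigned VectorBits) {
  unsigned Widest = 128;               // xmm only
  if (ST.has("avx")) Widest = 256;     // ymm available
  if (ST.has("avx512f")) Widest = 512; // zmm available
  unsigned RegBits = std::min(std::max(VectorBits, 128u), Widest);
  unsigned NumRegs = (VectorBits + RegBits - 1) / RegBits;
  return {RegBits, NumRegs};
}

// The call ABI is preserved if every vector argument is passed the same
// way under both feature sets, even when the feature sets differ.
bool toyCallABIMatches(const ToySubtarget &Caller, const ToySubtarget &Callee,
                       const std::vector<unsigned> &ArgVectorBits) {
  for (unsigned Bits : ArgVectorBits)
    if (toyRegisterTypeForCall(Caller, Bits) !=
        toyRegisterTypeForCall(Callee, Bits))
      return false;
  return true;
}
```

In this model a caller with {avx, bmi2} and a callee with {avx} compare equal
for a 256-bit argument, while {avx} versus no features does not.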
[IR] Remove IRBuilder AddMetadataToInst (#202280)
This avoids an extra check for metadata on every instruction insertion,
makes constructing an IRBuilder cheaper, and therefore slightly improves
performance.
As the C API doesn't expose CollectMetadataToCopy or any other way to
add additional metadata to the IRBuilder, make LLVMAddMetadataToInst an
alias for LLVMSetInstDebugLocation and undeprecate the latter.
Revert some SSAF patches (#205279)
I've started seeing some failures on Windows permissive bots.
I'll revert my patches for now until further investigation.
errors:
https://lab.llvm.org/buildbot/#/builders/107/builds/20548
```
C:\b\slave\sanitizer-windows\llvm-project\clang\lib\Frontend\CompilerInvocation.cpp
C:\b\slave\sanitizer-windows\build\tools\clang\include\clang/Options/Options.inc(9981): error C2065: 'SSAFOpts': undeclared identifier
C:\b\slave\sanitizer-windows\build\tools\clang\include\clang/Options/Options.inc(9982): note: see reference to function template instantiation 'auto GenerateSSAFArgs::<lambda_5f504a9e8792b8b03f1d39701f31dbec>::operator ()<T>(const T &) const' being compiled
with
[
T=std::vector<std::string,std::allocator<std::string>>
]
```
Revert "Reland "[clang][ssaf][NFC] Move SSAF flags from FrontendOptions
to a dedicated SSAFOptions" (#204798)"
[4 lines not shown]
Revert "[libc] Introduce the ioctl syscall wrapper and port all callers" (#205277)
Reverts llvm/llvm-project#204640
Breaks libc-x86_64-debian-fullbuild. Reverting while I investigate.
[X86] Prevent folding of volatile scalar loads into masked loads in selects (#205103)
X86 select patterns were folding scalar FP loads into AVX-512 masked
loads. Since masked loads suppress memory access when the mask is 0,
this can incorrectly eliminate the observable access of volatile loads,
leading to miscompilation. Non-volatile loads are unaffected.
Multi-use loads already avoid folding, since folding consumes the load
into the instruction's memory operand and leaves no value for the other
users, forcing it to be materialized into a register. Single-use
volatile loads did not, and this must also be prevented, as volatile
loads are required to always perform their memory access.
Fix this by using the isSimple()-guarded simple_load pattern instead of
loadf32/loadf64, ensuring volatile loads are not folded.
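A minimal sketch of why the fold is observable, using a toy memory cell that
counts reads. ToyCell, selectAfterLoad and foldedMaskedLoad are hypothetical
names for illustration; they model the semantics only, not the actual
instruction selection patterns.

```cpp
#include <cassert>

// Toy memory location that records how often it is read.
struct ToyCell {
  float Value;
  int Reads;
  float read() {
    ++Reads;
    return Value;
  }
};

// Unfolded form: the load always happens, then the select picks a value.
float selectAfterLoad(bool Mask, ToyCell &Cell, float Passthru) {
  float Loaded = Cell.read();
  return Mask ? Loaded : Passthru;
}

// Folded form: a masked load touches memory only when the mask bit is set.
float foldedMaskedLoad(bool Mask, ToyCell &Cell, float Passthru) {
  return Mask ? Cell.read() : Passthru;
}
```

Both forms return the same value, but with a zero mask the folded form never
reads the cell, which is exactly the access a volatile load must keep.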
Found via @jlebar's X86 LLVM bug hunt / FuzzX effort:
https://github.com/SemiAnalysisAI/FuzzX/blob/master/x86/bugs/093-avx512-vmovs-x86selects-load-fold-mask-suppress
clang: Change TargetInfo::setCPU to take StringRef
The related APIs all use StringRef, so use StringRef for
consistency.
Co-Authored-By: Claude (Opus 4.8) <noreply at anthropic.com>
AMDGPU: Move AMDGPUTargetID to AMDGPUTargetParser
Move the AMDGPUTargetID class and TargetIDSetting enum from
AMDGPUBaseInfo to AMDGPUTargetParser, making them available in the
MC-independent TargetParser library.
Currently there is this backend implementation, and a second one in
clang. Move this here so in the future the clang copy can be deleted.
Co-Authored-By: Claude <noreply at anthropic.com>
AMDGPU: Use module flags to control xnack and sramecc
This ensures these ABI details are encoded in the IR module
rather than depending on external state from command-line flags.
Previously, these were encoded as function-level subtarget features.
The code object output was a single target ID directive implied
by the global subtarget. The backend would previously check if a
function's subtarget feature mismatched the global subtarget. This
is avoided by making xnack and sramecc module-level properties from
the start. This also provides proper linker compatibility
enforcement, moving the error point earlier.
The old encoding was also an abuse of the subtarget feature system.
Subtarget features are a bitvector, and later features in the string
can override earlier ones. The old handling added a special case
where explicit settings were preserved: ordinarily +feature,-feature
should result in the feature being disabled, but +xnack,-xnack would
preserve the explicit "-xnack" state, which differs from the absence
of any xnack setting.
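As a rough illustration of the mismatch, the sketch below contrasts a plain
feature bit with a tri-state setting. ToySetting, toyFeatureBit and
toyTriState are hypothetical helpers written for this example and do not
reflect the backend's actual parsing code.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical tri-state: unspecified, explicitly off, explicitly on.
enum class ToySetting { Any, Off, On };

// Plain feature bit: later entries override earlier ones, default is off.
bool toyFeatureBit(const std::string &Features, const std::string &Name) {
  bool Enabled = false;
  std::istringstream SS(Features);
  std::string Tok;
  while (std::getline(SS, Tok, ',')) {
    if (Tok == "+" + Name)
      Enabled = true;
    else if (Tok == "-" + Name)
      Enabled = false;
  }
  return Enabled;
}

// Tri-state reading: an explicit "-name" is remembered as Off, which is
// distinct from never mentioning the feature at all (Any).
ToySetting toyTriState(const std::string &Features, const std::string &Name) {
  ToySetting Setting = ToySetting::Any;
  std::istringstream SS(Features);
  std::string Tok;
  while (std::getline(SS, Tok, ',')) {
    if (Tok == "+" + Name)
      Setting = ToySetting::On;
    else if (Tok == "-" + Name)
      Setting = ToySetting::Off;
  }
  return Setting;
}
```

A single bit gives the same answer for "+xnack,-xnack" and for an empty
string, while the tri-state reading tells them apart. That extra state is
what does not fit a bitvector of features.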
[25 lines not shown]
[clang][ssaf][NFC] Make SSAFOptions available in Builders and Extractors (#204684)
Now that we have SSAFOptions, it is a lot more ergonomic to have it
accessible from builders and extractors.
This PR does exactly that.
Part of rdar://179151023
Co-authored-by: Jan Korous <jkorous at apple.com>
Co-authored-by: Claude Opus 4.7 <noreply at anthropic.com>
[Clang][ABI] Validate consistency between ABI lowering implementation (#203281)
If the LLVM ABI library is used, and assertions are enabled, compute the
ABI using both Clang's implementation and the LLVM ABI library, and
verify that the results are the same.
[libc] Introduce the ioctl syscall wrapper and port all callers (#204640)
This patch adds an ioctl syscall wrapper in linux_syscalls namespace and
migrates all direct SYS_ioctl calls to use it.
To handle the polymorphic nature of ioctl arguments (where some commands
expect pointers, some expect scalar integers like queue_selector, and
some expect no argument at all), I use a helper struct IoctlArg with
implicit constructors. This avoids template bloat and overload
ambiguities (particularly around literal 0) while keeping call sites
clean.
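A minimal sketch of the implicit-constructor approach, assuming a shape like
the one described. ToyIoctlArg and toyIoctl are hypothetical names, the set of
constructors is a guess, and the wrapper returns the raw argument word instead
of issuing a syscall.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical argument holder: each constructor is implicit, so call
// sites can pass a pointer, an integer, or nothing at all.
struct ToyIoctlArg {
  std::uintptr_t Raw;
  ToyIoctlArg() : Raw(0) {}
  ToyIoctlArg(int V) : Raw(static_cast<std::uintptr_t>(V)) {}
  ToyIoctlArg(long V) : Raw(static_cast<std::uintptr_t>(V)) {}
  ToyIoctlArg(unsigned long V) : Raw(static_cast<std::uintptr_t>(V)) {}
  ToyIoctlArg(const volatile void *P)
      : Raw(reinterpret_cast<std::uintptr_t>(P)) {}
};

// Stand-in for the wrapper: a single non-template function. It only
// reports the word that would be handed to the kernel.
std::uintptr_t toyIoctl(int Fd, unsigned long Request,
                        ToyIoctlArg Arg = ToyIoctlArg()) {
  (void)Fd;
  (void)Request;
  return Arg.Raw;
}
```

Literal 0 is an exact match for the int constructor, so it is not ambiguous
with the pointer constructor. Calls such as toyIoctl(fd, req),
toyIoctl(fd, req, 0) and toyIoctl(fd, req, &value) all resolve to the same
function.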
Assisted by Gemini.
[orc-rt] Add return serialization to AllocActionFunction::handle. (#205271)
Add a Serializer template parameter to AllocActionFunction::handle and
apply it to the handler's return value before forwarding as the action
result. This lets handler authors return types other than
WrapperFunctionBuffer.
For SPS, AllocActionSPSSerializer is the default Serializer used by
SPSAllocActionFunction::handle. It accepts either:
- WrapperFunctionBuffer (identity pass-through, the existing behavior),
or
- Error (success → empty WFB; failure → out-of-band-error WFB carrying
toString(Err)).
Adds AllocActionTest coverage for both Error-return paths.
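The serializer idea can be sketched roughly as follows. ToyBuffer, ToyError,
ToySerializer and toyHandle are hypothetical stand-ins, not the orc-rt types;
in particular, modelling success as an empty message is an assumption made
for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-ins for WrapperFunctionBuffer and Error.
struct ToyBuffer {
  std::string Bytes;
  bool IsOutOfBandError;
};
struct ToyError {
  std::string Message; // empty message models success
  bool isSuccess() const { return Message.empty(); }
};

// Serializer with one overload per accepted return type.
struct ToySerializer {
  // Identity pass-through for a buffer.
  static ToyBuffer serialize(ToyBuffer B) { return B; }
  // Success becomes an empty buffer, failure an out-of-band error buffer.
  static ToyBuffer serialize(const ToyError &E) {
    if (E.isSuccess())
      return ToyBuffer{"", false};
    return ToyBuffer{E.Message, true};
  }
};

// The serializer is applied to whatever the handler returns.
template <typename Serializer = ToySerializer, typename Handler>
ToyBuffer toyHandle(Handler &&H) {
  return Serializer::serialize(std::forward<Handler>(H)());
}

// Example handlers covering the three cases.
ToyBuffer returnsBuffer() { return ToyBuffer{"payload", false}; }
ToyError returnsSuccess() { return ToyError{""}; }
ToyError returnsFailure() { return ToyError{"boom"}; }
```

Keeping the conversion in a separate Serializer parameter means a handler can
return whichever type is most natural, and a different serializer can be
swapped in without touching the handler.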
Reland "[clang][ssaf][NFC] Move SSAF flags from FrontendOptions to a dedicated SSAFOptions" (#204798)
Second attempt of #204686
This class will help keep SSAF options apart from generic
FrontendOptions. It is inspired by AnalyzerOptions.
This way all of these SSAF (and future) options will be in one
centralized place.
In preparation of rdar://179151023
[orc-rt] Replace AAHandlerTraits with CallableArgInfo. NFCI. (#205257)
CallableArgInfo provides a superset of AAHandlerTraits functionality, so
we don't need the latter.
[RISCV] Update Xqcilo Pseudos (#196422)
This changes the Xqcilo pseudos to instead emit a sequence of
`qc.e.li` followed by a standard load/store annotated with %qc.access.
The new sequence is easier for our linker to relax.
This change was written with the assistance of AI.
Patch tryCanonicalizeStructToVector to handle split slice tails (#201434)
We choose a vector alloca over a struct alloca when all users of the
alloca are memory or lifetime intrinsics. But we only accounted for
slices that start in the corresponding partition. We also have to check
that all split slice tails overlapping the partition are memory or
lifetime intrinsics.
I also updated `PassRegistry.def` to include the new pass option,
which we had forgotten to add.