[TTI] Remove masked/gather-scatter/strided/expand-compress costing from TTIImpl (#169885)
Following #165532, this patch moves scalarization‑cost computation into
BaseT::getMemIntrinsicCost and lets backends override it via their
getMemIntrinsicCost.
It also removes the masked/gather‑scatter/strided/expand‑compress
costing interfaces from TTIImpl.
Targets may keep them locally if needed.
Stacked on #170426 and #170436.
[libclc] Fix memory fence scope mapping for OpenCL (#170542)
The function `__opencl_get_memory_scope` incorrectly assumed that the
Clang built-in `__MEMORY_SCOPE_*` macros are defined as bitmasks, while they
are actually defined as distinct integer values. This led to incorrect
mapping of OpenCL memory fence flags to LLVM memory scopes, causing
issues in generated code.
The fix involves updating the `__opencl_get_memory_scope` function to
return the correct `__MEMORY_SCOPE_*` values based on the provided
`cl_mem_fence_flags`. Additionally, the `__opencl_get_memory_semantics`
and the `__opencl_get_memory_scope` functions are marked as `static`
to avoid potential multiple definition issues during linking.
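One way this class of bug can arise is sketched below in C++. Every name and value here (`kLocalFence`, `kGlobalFence`, the `Scope` enumerators) is a hypothetical stand-in, not the real Clang macros or the libclc source, and the "widest requested scope wins" policy is an assumption made for the example.

```cpp
#include <cassert>

// Hypothetical stand-ins for OpenCL fence flags: these ARE bitmasks.
constexpr unsigned kLocalFence = 1u << 0;
constexpr unsigned kGlobalFence = 1u << 1;

// Hypothetical stand-ins for memory scopes: distinct values, NOT bitmasks.
enum Scope : int {
  kScopeSystem = 0,
  kScopeDevice = 1,
  kScopeWorkGroup = 2,
  kScopeWavefront = 3,
  kScopeSingle = 4
};

// Buggy pattern: treats scope values as bits and ORs them together.
inline int scopeBuggy(unsigned Flags) {
  int S = 0;
  if (Flags & kGlobalFence)
    S |= kScopeDevice;
  if (Flags & kLocalFence)
    S |= kScopeWorkGroup;
  return S; // 1 | 2 == 3, which is an unrelated scope value.
}

// Fixed pattern: returns exactly one scope value per flag combination.
inline int scopeFixed(unsigned Flags) {
  if (Flags & kGlobalFence)
    return kScopeDevice; // assumed policy: the widest requested scope wins
  if (Flags & kLocalFence)
    return kScopeWorkGroup;
  return kScopeSingle;
}

// Small self-check showing the difference between the two patterns.
inline void checkScopeMapping() {
  assert(scopeBuggy(kLocalFence | kGlobalFence) == kScopeWavefront);
  assert(scopeFixed(kLocalFence | kGlobalFence) == kScopeDevice);
}
```

With both flags set, the buggy version yields a value that happens to name a different scope entirely, which is why the mapping has to select a value rather than combine values.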
[llvm][DebugInfo] Allow DIDerivedType as a bound in DISubrangeType (#165880)
Consider this Ada type:
```
type Array_Type is array (Natural range <>) of Integer;
type Record_Type (L1, L2 : Natural) is record
I1 : Integer;
A1 : Array_Type (1 .. L1);
I2 : Integer;
A2 : Array_Type (1 .. L2);
I3 : Integer;
end record;
```
Here, the array fields have lengths that depend on the discriminants of
the record type. However, in this case the array lengths cannot be
expressed as DWARF location expressions, with the issue being that "A2"
has a non-constant offset, but an expression involving
[24 lines not shown]
DAG: Avoid asserting on libcall action if function is unavailable
Eventually the set of available functions will be a program
dependent property, which could diverge from the static table of
functions for the subtarget. In that case, fall back to the usual
expansion.
AMDGPU: Fix broken exp10 lowering for f16
This was calling the exp handling, so multiplying by the wrong
constant.
GlobalISel is still broken, but missing the fast exp10 path.
This is tracked in https://github.com/llvm/llvm-project/issues/170576
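The effect of the wrong constant can be seen in a small C++ sketch. It assumes the lowering is built on a base-2 exponential primitive, which is an assumption for illustration; the function names are hypothetical and do not correspond to the AMDGPU backend code.

```cpp
#include <cassert>
#include <cmath>

// exp(x)   == 2^(x * log2(e))
// exp10(x) == 2^(x * log2(10))
inline double expViaExp2(double X) {
  return std::exp2(X * std::log2(std::exp(1.0)));
}

inline double exp10ViaExp2(double X) {
  return std::exp2(X * std::log2(10.0));
}

// The bug class described above: exp10 routed through the exp path,
// so the input is scaled by log2(e) instead of log2(10).
inline double exp10WrongConstant(double X) { return expViaExp2(X); }

// Small self-check: the correct scaling gives 10^2, the wrong one gives e^2.
inline void checkExp10() {
  assert(std::fabs(exp10ViaExp2(2.0) - 100.0) < 1e-9);
  assert(std::fabs(exp10WrongConstant(2.0) - std::exp(2.0)) < 1e-9);
}
```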
[clang] Use tighter lifetime bounds for C temporary arguments
In C, consecutive statements in the same scope are under
CompoundStmt/CallExpr, while in C++ they typically fall under
CompoundStmt/ExprWithCleanup. This leads to different behavior with
respect to where pushFullExprCleanUp inserts the lifetime end markers
(e.g., at the end of scope).
For these cases, we can track and insert the lifetime end markers right
after the call completes, allowing the stack space to be reused
immediately. This partially addresses #109204 and #43598 for improving
stack usage.
[clang] warn_cstruct_memaccess and warn_cxxstruct_memaccess are too aggressive about initializers
These warnings are triggered for zeroing initializers of non-trivially
initializable and non-trivially copyable types.
This results in significant numbers of warnings in idiomatic object
initialization code, where memset and similar are used to ensure no
stale data is present in fields or padding bytes.
Addresses #156996
[ORC] Add CallViaEPC and CallSPSViaEPC utilities. (#170464)
These utilities simplify making typed async calls via
ExecutorProcessControl::callWrapperAsync.
CallViaEPC.h provides utilities for making typed async calls using a
given Serializer to serialize arguments and deserialize results.
callViaEPC takes a result handler function object (accepting
Expected<T>), an EPC reference, a Serializer, a target function address,
and arguments. The return type T is inferred from the handler's argument
type using CallableTraitsHelper.
EPCCaller wraps an ExecutorProcessControl& and Serializer, simplifying
repeated calls with the same serialization.
EPCCall wraps an EPCCaller and target function address, simplifying
repeated calls to a specific wrapper function.
[9 lines not shown]
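The idea of inferring the result type T from the handler's argument type can be sketched in standalone C++. This is not the ORC implementation: `Result`, `ErrorInfo`, `HandlerArg`, `Payload`, and `callAndHandle` are hypothetical names, the call is synchronous rather than async, and no serialization is modeled.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>
#include <variant>

// Hypothetical stand-ins for an error type and an Expected<T>-like wrapper.
struct ErrorInfo {
  std::string Msg;
};
template <typename T> struct Result {
  std::variant<T, ErrorInfo> Value;
};

// Extract the single parameter type of a callable's operator().
// Works for non-generic lambdas and simple function objects.
template <typename F>
struct HandlerArg : HandlerArg<decltype(&F::operator())> {};
template <typename C, typename R, typename A>
struct HandlerArg<R (C::*)(A) const> {
  using type = std::decay_t<A>;
};
template <typename C, typename R, typename A>
struct HandlerArg<R (C::*)(A)> {
  using type = std::decay_t<A>;
};

// Map Result<T> back to T.
template <typename R> struct Payload;
template <typename T> struct Payload<Result<T>> {
  using type = T;
};

// Call Target(Args...) and hand the outcome to the handler as Result<T>,
// where T is deduced from the handler's parameter, not from Target.
template <typename Handler, typename Fn, typename... ArgTs>
void callAndHandle(Handler &&H, Fn &&Target, ArgTs &&...Args) {
  using ResultT = typename HandlerArg<std::decay_t<Handler>>::type;
  using T = typename Payload<ResultT>::type;
  H(ResultT{static_cast<T>(Target(std::forward<ArgTs>(Args)...))});
}

// Small self-check: the handler asks for Result<long>, so T is long.
inline void checkInference() {
  long Got = 0;
  callAndHandle([&](Result<long> R) { Got = std::get<long>(R.Value); },
                [](int A, int B) { return A + B; }, 2, 3);
  assert(Got == 5);
}
```

The caller never spells out T; declaring the handler's parameter is enough to fix the result type.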
[lldb] Refactor LookupInfo object to be per-language (#168797)
Some months ago, the LookupInfo constructor logic was refactored to not
depend on language-specific logic and to use language plugins instead. In
this refactor, when the language type is unknown, a single LookupInfo
object will handle multiple languages. This doesn't work well, as
multiple languages might want to configure the LookupInfo object in
different ways. For example, different languages might want to set the
m_lookup_name differently from each other, but the previous
implementation would pick the first name a language provided, and
effectively ignore every other language. Other fields of the LookupInfo
object are also configured in incompatible ways.
This approach doesn't seem to be a problem upstream, since only the
C++/Objective-C language plugins are available, but it broke downstream
on the Swift fork, as adding Swift to the list of default languages when
the language type is unknown breaks C++ tests.
This patch makes it so instead of building a single LookupInfo object
[3 lines not shown]
[CIR][X86] Implement lowering for pmuldq / pmuludq builtins (#169853)
Part of [#167765](https://github.com/llvm/llvm-project/issues/167765)
This patch adds CIR codegen support for X86 pmuldq and pmuludq
operations, covering the signed and unsigned variants across all
supported vector widths. The builtins now lower to the expected CIR
representation matching the semantics of the corresponding LLVM
intrinsics.
[OpenMP][Clang] Parsing/Sema support for `use_device_ptr(fb_preserve/fb_nullify)`.
Depends on #169603.
This is the `use_device_ptr` counterpart of #168905.
With OpenMP 6.1, a `fallback` modifier can be specified on the
`use_device_ptr` clause to control the behavior when a pointer lookup
fails, i.e. there is no device pointer to translate into.
The default is `fb_preserve` (i.e. retain the original pointer), while
`fb_nullify` means: use `nullptr` as the translated pointer.
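The two fallback behaviors can be modeled with a small C++ sketch. This is a hypothetical model of the lookup semantics described above, not the OpenMP runtime or Clang code; `PtrMap`, `Fallback`, and `translate` are illustrative names.

```cpp
#include <cassert>
#include <unordered_map>

// Models the two fallback modifiers described above.
enum class Fallback { Preserve, Nullify };

// Hypothetical host-to-device pointer translation table.
using PtrMap = std::unordered_map<const void *, void *>;

// Translate a host pointer. On lookup failure, either keep the original
// pointer (Preserve, the default) or return nullptr (Nullify).
inline void *translate(const PtrMap &Map, void *HostPtr,
                       Fallback FB = Fallback::Preserve) {
  auto It = Map.find(HostPtr);
  if (It != Map.end())
    return It->second;
  return FB == Fallback::Preserve ? HostPtr : nullptr;
}

// Small self-check covering a hit and both kinds of miss.
inline void checkFallback() {
  int Host = 0, Dev = 0, Unmapped = 0;
  PtrMap Map;
  Map[&Host] = &Dev;
  assert(translate(Map, &Host) == &Dev);
  assert(translate(Map, &Unmapped) == &Unmapped);
  assert(translate(Map, &Unmapped, Fallback::Nullify) == nullptr);
}
```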
[clang][DependencyScanning] Separate clangDependencyScanning and DependencyScanningTool (NFC) (#169962)
This patch is the first of two in refactoring Clang's dependency
scanning tooling to remove its dependency on clangDriver.
It separates Tooling/DependencyScanningTool.cpp from the rest of
clangDependencyScanning and moves clangDependencyScanning out of
clangTooling into its own library. No functional changes are
introduced.
The follow-up patch (#169964) will restrict clangDependencyScanning to
handling only -cc1 command line inputs and will move all functionality
related to handling driver commands into clangTooling
(Tooling/DependencyScanningTool.cpp).
This is part of a broader effort to support driver-managed builds for
compilations using C++ named modules and/or Clang modules. It is
required for linking the dependency scanning tooling against the driver
without introducing cyclic dependencies, which would otherwise cause
[4 lines not shown]
[lldb-dap] Fix format string on Mac OS (#169933)
On Mac, `uint64_t` corresponds to `unsigned long long`, so `%llu` is
needed.
Always using `%llu` and casting the argument to `unsigned long long`
should work on every platform.
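The portable pattern looks roughly like the following C++ sketch. `formatU64` is a hypothetical helper for illustration, not the lldb-dap code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Format a uint64_t portably: the explicit cast guarantees that the
// argument type matches %llu regardless of how the platform defines
// uint64_t (unsigned long vs. unsigned long long).
inline std::string formatU64(std::uint64_t Value) {
  char Buf[32]; // 20 digits plus the terminator fit comfortably
  std::snprintf(Buf, sizeof(Buf), "%llu",
                static_cast<unsigned long long>(Value));
  return Buf;
}

// Small self-check with the largest 64-bit value.
inline void checkFormatU64() {
  assert(formatU64(UINT64_MAX) == "18446744073709551615");
}
```

An alternative is the `PRIu64` macro from `<cinttypes>`, which selects the matching conversion specifier without a cast.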
[PowerPC] Add AMO load signed builtins
This commit adds two Clang builtins for AMO load signed operations:
__builtin_amo_lwat_st for 32-bit signed operations
__builtin_amo_ldat_s for 64-bit signed operations