[CodeGen] Fix lpad padding at section start after empty block (#112595)
If a landing pad is at the very start of a split section, it has to be
padded by a nop instruction. Otherwise its offset is marked as zero in
the LSDA, which means no landing pad (leading it to be skipped).
LLVM already handles this. If a landing pad is the first machine block
in a section, a nop is inserted to ensure a non-zero offset. However, if
the landing pad is preceded by an empty block, the nop would be
omitted.
To fix this, this patch adds a field to machine blocks indicating
whether this block contains the first instruction in its section. This
variable is then used to determine whether to emit the padding.
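The difference between the two padding rules can be sketched with a toy layout model. This is only an illustration under stated assumptions: the `Block` class, the `layout` function, and the one-byte instruction size are hypothetical and are not LLVM's data structures. The patch itself stores the "contains the first instruction in its section" property as a field on the machine block, whereas this sketch derives it on the fly.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    # Hypothetical stand-in for a machine basic block.
    name: str
    instructions: list = field(default_factory=list)
    is_landing_pad: bool = False
    section: str = ".text"


def layout(blocks, pad_rule):
    """Return the section-relative offset of every landing pad.

    pad_rule == "first_block": pad only if the landing pad is the first
    block placed in its section (the behaviour described as broken).
    pad_rule == "first_insn": pad if the landing pad would hold the first
    instruction of its section (the behaviour described as the fix).
    Every instruction, including the nop, is assumed to be one byte.
    """
    emitted = {}          # section -> bytes emitted so far
    seen_block = set()    # sections that already received a block
    lpad_offsets = {}
    for b in blocks:
        size = emitted.get(b.section, 0)
        first_block = b.section not in seen_block
        first_insn = size == 0
        needs_pad = first_block if pad_rule == "first_block" else first_insn
        if b.is_landing_pad and needs_pad:
            size += 1     # the nop that keeps the offset non-zero
        if b.is_landing_pad:
            lpad_offsets[b.name] = size
        size += len(b.instructions)
        emitted[b.section] = size
        seen_block.add(b.section)
    return lpad_offsets


blocks = [
    Block("empty", [], section=".text.split"),
    Block("lpad", ["call"], is_landing_pad=True, section=".text.split"),
]
old_offsets = layout(blocks, "first_block")
new_offsets = layout(blocks, "first_insn")
```

With the empty block in front, the block-based rule leaves the landing pad at offset zero, while the instruction-based rule pads it to a non-zero offset.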
Co-authored-by: Jinjie Huang <huangjinjie at bytedance.com>
[AMDGPU] Re-add assertions requirement to test after #170468
This was removed in #170468 now that debug counters are enabled by
default rather than requiring asserts. This AMDGPU test exercises
functionality in SIInsertWaitcnts.cpp that is fully wrapped in NDEBUG
though, so this test still needs an assertions requirement to pass.
expandFMINIMUMNUM_FMAXIMUMNUM: Improve compare between zeros (#140193)
1. On GPR32 platforms, expandIS_FPCLASS may fail because the
ISD::BITCAST from double to int64 may fail. FP_ROUND the double to
float first. Since the result is used only if MinMax is zero, the
flushing won't break anything.
2. Only one IS_FPCLASS is needed. MinMax will always be RHS if the
operands are equal, so we can select between LHS and MinMax.
This is safe even if FP_ROUND flushes a small LHS: if LHS is not zero,
MinMax won't be zero, so we will always use MinMax.
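Point 2 can be illustrated with a scalar model of the minimum case. This is a sketch, not the SelectionDAG expansion: the function names are hypothetical, NaN handling is simplified, and the FP_ROUND step from point 1 is not modelled.

```python
import math


def is_neg_zero(x):
    # Stand-in for a single IS_FPCLASS-style test on one operand.
    return x == 0.0 and math.copysign(1.0, x) < 0.0


def fminimumnum_model(lhs, rhs):
    # Simplified NaN rule: if one operand is NaN, return the other.
    if math.isnan(lhs):
        return rhs
    if math.isnan(rhs):
        return lhs
    # Plain compare-and-select. On a tie (including +0.0 vs -0.0, which
    # compare equal) this picks RHS.
    min_max = lhs if lhs < rhs else rhs
    # Only a zero result can have the wrong sign. Because a tie already
    # yields RHS, testing LHS alone is enough: take LHS if it is -0.0,
    # otherwise keep min_max.
    if min_max == 0.0 and is_neg_zero(lhs):
        return lhs
    return min_max


result_a = fminimumnum_model(-0.0, 0.0)
result_b = fminimumnum_model(0.0, -0.0)
```

In both orderings the result is negative zero, using one class test rather than one per operand.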
---------
Co-authored-by: Nikita Popov <github at npopov.com>
Co-authored-by: Matt Arsenault <arsenm2 at gmail.com>
[llvm-dwp] Fix FoundCUUnit problem on soft-stop with DWARF5 (#169783)
Currently, when a 'soft-stop' is triggered due to debug_info overflow,
there is an additional check for Dwarf5 to verify if the dwo contains a
split_compile unit (CU). However, since split_type units (TUs) are
typically placed before CUs in debug_info for Dwarf5, if an overflow is
detected within a TU causing an early break, the logic incorrectly
assumes this DWO lacks a CU and triggers an error.
Since the overflowing DWO will be discarded anyway, this validation is
redundant. This patch tries to fix this by removing the CU check during
a soft-stop.
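The failure mode can be modelled with a small sketch. Everything here is hypothetical (the `process_dwo` function, the unit tuples, and the returned strings are not llvm-dwp's code); it only assumes, as described above, that TUs precede the CU and that an overflow ends the scan early.

```python
def process_dwo(units, offset, limit, check_cu_on_soft_stop):
    """Scan (kind, size) units of one DWO; stop early on offset overflow."""
    found_cu = False
    overflowed = False
    for kind, size in units:
        if offset + size > limit:
            overflowed = True   # soft-stop: this DWO will be discarded
            break
        offset += size
        if kind == "CU":
            found_cu = True
    if overflowed and not check_cu_on_soft_stop:
        return "discarded"
    if not found_cu:
        return "error: no compile unit found"
    return "discarded" if overflowed else "ok"


LIMIT = 2**32
units = [("TU", 64), ("CU", 128)]   # type unit first, then the compile unit
before = process_dwo(units, LIMIT - 25, LIMIT, check_cu_on_soft_stop=True)
after = process_dwo(units, LIMIT - 25, LIMIT, check_cu_on_soft_stop=False)
```

With the check enabled, the overflow inside the TU means the CU is never reached and the model reports a missing compile unit. With the check skipped, the DWO is simply discarded.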
Before this patch:
```
llvm-dwp main.dwo -continue-on-cu-index-overflow=soft-stop -o main.dwp
warning: debug_info Section Contribution Offset overflow 4G. Previous Offset 4294967271, After overflow offset 38.
error: no compile unit found in file: main.dwo
```
[4 lines not shown]
[MLIR][Presburger] optimize bound computation by pruning orthogonal constraints (#164199)
IntegerRelation uses Fourier-Motzkin elimination and Gaussian
elimination to simplify constraints. These methods may repeatedly
perform calculations and elimination on irrelevant variables.
Preemptively eliminating irrelevant variables and their associated
constraints can speed up the calculation process.
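One way to picture the pruning is as a reachability pass over shared variables. The sketch below is an assumption about the general idea, not the IntegerRelation implementation: constraints are modelled as hypothetical `(coefficients, constant)` pairs, and a constraint is kept only if it is transitively connected to the variable whose bound is wanted. Dropping the rest preserves that bound only when the dropped constraints are themselves satisfiable.

```python
def relevant_constraints(constraints, target):
    """Keep constraints transitively sharing a variable with `target`.

    Each constraint is (coeffs, const), where coeffs maps variable names
    to non-zero coefficients.
    """
    reachable = {target}
    kept = [False] * len(constraints)
    changed = True
    while changed:
        changed = False
        for i, (coeffs, _const) in enumerate(constraints):
            variables = set(coeffs)
            if not kept[i] and variables & reachable:
                kept[i] = True
                reachable |= variables
                changed = True
    return [c for c, keep in zip(constraints, kept) if keep]


# x - y >= 0, y - 2 >= 0, 5 - z - w >= 0, w >= 0
system = [
    ({"x": 1, "y": -1}, 0),
    ({"y": 1}, -2),
    ({"z": -1, "w": -1}, 5),
    ({"w": 1}, 0),
]
for_x = relevant_constraints(system, "x")
for_z = relevant_constraints(system, "z")
```

Only the first two constraints matter for a bound on `x`, so an elimination pass run on the pruned system never touches `z` or `w`.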
Utils: Inhibit load/store folding through phis for llvm.protected.field.ptr.
Protected pointer field loads/stores should be paired with the intrinsic
to avoid unnecessary address escapes.
Reviewers: nikic
Reviewed By: nikic
Pull Request: https://github.com/llvm/llvm-project/pull/151649
[TTI] Remove masked/gather-scatter/strided/expand-compress costing from TTIImpl (#169885)
Following #165532, this patch moves scalarization‑cost computation into
BaseT::getMemIntrinsicCost and lets backends override it via their
getMemIntrinsicCost.
It also removes the masked/gather‑scatter/strided/expand‑compress
costing interfaces from TTIImpl.
Targets may keep them locally if needed.
Stacked on #170426 and #170436.
[libclc] Fix memory fence scope mapping for OpenCL (#170542)
The function `__opencl_get_memory_scope` incorrectly assumed that the
Clang built-in `__MEMORY_SCOPE_*` macros are defined as bitmasks, while
they are actually defined as distinct integer values. This led to incorrect
mapping of OpenCL memory fence flags to LLVM memory scopes, causing
issues in generated code.
The fix involves updating the `__opencl_get_memory_scope` function to
return the correct `__MEMORY_SCOPE_*` values based on the provided
`cl_mem_fence_flags`. Additionally, the `__opencl_get_memory_semantics`
and the `__opencl_get_memory_scope` functions are marked as `static`
to avoid potential multiple definition issues during linking.
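The bitmask-versus-enumeration confusion can be shown with a small model. The numeric values and both mapping functions below are assumptions made for illustration; they are not copied from libclc or from Clang's headers, and the actual mapping policy in the fix may differ.

```python
# Assumed values, for illustration only.
CLK_LOCAL_MEM_FENCE = 0x1     # OpenCL fence flags are a bitmask
CLK_GLOBAL_MEM_FENCE = 0x2
MEMORY_SCOPE_DEVICE = 1       # scopes are distinct integers, not bits
MEMORY_SCOPE_WRKGRP = 2
MEMORY_SCOPE_WVFRNT = 3


def scope_as_bitmask(flags):
    # Hypothetical buggy mapping: ORs scope values together as if each
    # scope were its own bit.
    scope = 0
    if flags & CLK_LOCAL_MEM_FENCE:
        scope |= MEMORY_SCOPE_WRKGRP
    if flags & CLK_GLOBAL_MEM_FENCE:
        scope |= MEMORY_SCOPE_DEVICE
    return scope


def scope_as_enum(flags):
    # Hypothetical corrected mapping: pick exactly one scope value.
    if flags & CLK_GLOBAL_MEM_FENCE:
        return MEMORY_SCOPE_DEVICE
    return MEMORY_SCOPE_WRKGRP


both = CLK_LOCAL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE
buggy = scope_as_bitmask(both)
fixed = scope_as_enum(both)
```

Under these assumed values, combining both flags in the bitmask version produces 3, which aliases an unrelated scope rather than representing "workgroup and device".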
[llvm][DebugInfo] Allow DIDerivedType as a bound in DISubrangeType (#165880)
Consider this Ada type:
```
type Array_Type is array (Natural range <>) of Integer;
type Record_Type (L1, L2 : Natural) is record
I1 : Integer;
A1 : Array_Type (1 .. L1);
I2 : Integer;
A2 : Array_Type (1 .. L2);
I3 : Integer;
end record;
```
Here, the array fields have lengths that depend on the discriminants of
the record type. However, in this case the array lengths cannot be
expressed as DWARF location expressions, with the issue being that "A2"
has a non-constant offset, but an expression involving
[24 lines not shown]
DAG: Avoid asserting on libcall action if function is unavailable
Eventually the set of available functions will be a
program-dependent property, which could diverge from the static table of
functions for the subtarget. In that case, fall back to the usual
expansion.
AMDGPU: Fix broken exp10 lowering for f16
This was calling the exp handling, so it multiplied by the wrong
constant.
GlobalISel is still broken, but missing the fast exp10 path.
This is tracked in https://github.com/llvm/llvm-project/issues/170576
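The effect of the wrong constant can be checked numerically. The sketch assumes the lowering has the usual form "scale the input, then take exp2"; the function names are hypothetical and this is not the AMDGPU code.

```python
import math

LOG2_E = math.log2(math.e)    # scale for exp(x)   = exp2(x * log2(e))
LOG2_10 = math.log2(10.0)     # scale for exp10(x) = exp2(x * log2(10))


def exp_via_exp2(x):
    return 2.0 ** (x * LOG2_E)


def exp10_via_exp2(x):
    return 2.0 ** (x * LOG2_10)


def broken_exp10(x):
    # Models the bug: exp10 routed through the exp path, so the input is
    # scaled by log2(e) instead of log2(10).
    return exp_via_exp2(x)


correct = exp10_via_exp2(2.0)
wrong = broken_exp10(2.0)
```

For an input of 2, the correct scale gives approximately 100, while the exp scale gives approximately e squared.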
[clang] Use tighter lifetime bounds for C temporary arguments
In C, consecutive statements in the same scope are under
CompoundStmt/CallExpr, while in C++ they typically fall under
CompoundStmt/ExprWithCleanup. This leads to different behavior with
respect to where pushFullExprCleanUp inserts the lifetime end markers
(e.g., at the end of scope).
For these cases, we can track and insert the lifetime end markers right
after the call completes, allowing the stack space to be reused
immediately. This partially addresses #109204 and #43598 for improving
stack usage.
[clang] warn_cstruct_memaccess and warn_cxxstruct_memaccess are too aggressive about initializers
These warnings are triggered for zeroing initializers of non-trivially
initializable and non-trivially copyable types.
This results in significant numbers of warnings in idiomatic object
initialization code, where memset and similar are used to ensure no
stale data is present in fields or padding bytes.
Addresses #156996
[ORC] Add CallViaEPC and CallSPSViaEPC utilities. (#170464)
These utilities simplify making typed async calls via
ExecutorProcessControl::callWrapperAsync.
CallViaEPC.h provides utilities for making typed async calls using a
given Serializer to serialize arguments and deserialize results.
callViaEPC takes a result handler function object (accepting
Expected<T>), an EPC reference, a Serializer, a target function address,
and arguments. The return type T is inferred from the handler's argument
type using CallableTraitsHelper.
EPCCaller wraps an ExecutorProcessControl& and Serializer, simplifying
repeated calls with the same serialization.
EPCCall wraps an EPCCaller and target function address, simplifying
repeated calls to a specific wrapper function.
[9 lines not shown]
[lldb] Refactor LookupInfo object to be per-language (#168797)
Some months ago, the LookupInfo constructor logic was refactored to not
depend on language-specific logic, and to use language plugins instead. In
this refactor, when the language type is unknown, a single LookupInfo
object will handle multiple languages. This doesn't work well, as
multiple languages might want to configure the LookupInfo object in
different ways. For example, different languages might want to set the
m_lookup_name differently from each other, but the previous
implementation would pick the first name a language provided, and
effectively ignore every other language. Other fields of the LookupInfo
object are also configured in incompatible ways.
This approach doesn't seem to be a problem upstream, since only the
C++/Objective-C language plugins are available, but it broke downstream
on the Swift fork, as adding Swift to the list of default languages when
the language type is unknown breaks C++ tests.
This patch makes it so instead of building a single LookupInfo object
[3 lines not shown]
[CIR][X86] Implement lowering for pmuldq / pmuludq builtins (#169853)
Part of [#167765](https://github.com/llvm/llvm-project/issues/167765)
This patch adds CIR codegen support for X86 pmuldq and pmuludq
operations, covering the signed and unsigned variants across all
supported vector widths. The builtins now lower to the expected CIR
representation matching the semantics of the corresponding LLVM
intrinsics.