[clang] implement CWG2064: ignore value dependence for decltype
The 'decltype' for a value-dependent (but non-type-dependent) should be known,
so this patch makes them non-opaque instead.
This patch also implements what's neceessary to allow overloading
on pure differences in instantiation dependence, making `std::void_t`
usable for SFINAE purposes.
This also readds a few test cases from da98651, which was a previous attempt
at resolving CWG2064.
Fixes #8740
Fixes #61818
Fixes #190388
[RISCV] Don't check isApplicableToPLI for simm12 constants. (#192522)
It won't match except when the constant is -1, which we should use li
for. This avoids an unecessary call for hasAllWUsers in that case.
[OpenMP] Fix convention of SPIRV outline functions (#192450)
When creating an outline function for device code we're not setting the
right calling convention when the target is SPIRV. This results in the
calls to the function to be removed by the InstCombine pass as it thinks
is not callable.
[mlir][acc] Ensure implicit declare hoisting works for compute_region (#192501)
Any hoisting across `acc.compute_region` needs to be wired through block
arguments as the region is `IsolatedFromAbove`. Thus update
`ACCImplicitDeclare` to do so by using new API
`wireHoistedValueThroughIns` which handles the value wiring after
hoisting.
[RISCV] Use unsigned comparison for stack clash probing loop (#192485)
The stack clash probing loop generated in `emitDynamicProbedAlloc` used
a signed comparison (`RISCV::COND_BLT`) to determine when the allocation
target had been reached.
In 32-bit mode, memory addresses above `0x80000000` have the sign bit
set. If the stack pointer lands in this region, treating the addresses
as signed integers causes the comparison logic to fail.
This patch changes the condition code to `RISCV::COND_BLTU` (Branch if
Less Than Unsigned), which generates an unsigned comparison. This
ensures that addresses are treated correctly as unsigned quantities on
all targets.
On 64-bit systems, this change has no practical effect on valid
user-space addresses because they do not use the sign bit (being
restricted to the lower half of the address space). However, using
unsigned comparison is the correct behavior for pointer arithmetic and
[2 lines not shown]
[RISCV][P-ext] Use pli.b when only the lower 2 bytes are used. (#192400)
If the lower 2 bytes are the same and are the only bytes used we
can use pli.b instead of lui+addi.
Revert "[Fuchsia] Stack analysis flags for runtimes" (#192515)
Reverts llvm/llvm-project#175677
We noticed using -fexperimental-call-graph-section with Control Flow
Integrity causes link failures in certain situations. Reverting this
change that sets the call graph section flag until we investigate the
root cause of the problem and handle it in the compiler well.
[AMDGPU] Add `.amdgpu.info` section for per-function metadata
AMDGPU object linking requires the linker to propagate resource usage
(registers, stack, LDS) across translation units. To support this, the compiler
must emit per-function metadata and call graph edges in the relocatable object
so the linker can compute whole-program resource requirements.
This PR introduces a `.amdgpu.info` ELF section using a tagged, length-prefixed
binary format: each entry is encoded as:
```
[kind: u8] [len: u8] [payload: <len> bytes]
```
A function scope is opened by an `INFO_FUNC` entry (containing a symbol
reference), followed by per-function attributes (register counts, flags, private
segment size) and relational edges (direct calls, LDS uses, indirect call
signatures). String data such as function type signatures is stored in a
companion `.amdgpu.strtab` section.
[4 lines not shown]
[Clang] Add multilib support for GPU targets (#192285)
Summary:
This PR uses the new, generic multilib support added in
https://github.com/llvm/llvm-project/pull/188584
to also function for GPU targets. This will allow toolchains to easy
provide variants of these GPU libraries (for debug or asan). In
practice, this will look something like this:
```console
-DRUNTIMES_amdgcn-amd-amdhsa+debug_CMAKE_BUILD_TYPE=Debug \
-DRUNTIMES_amdgcn-amd-amdhsa+debug_LIBOMPTARGET_ENABLE_DEBUG=ON \
-DRUNTIMES_amdgcn-amd-amdhsa+debug_LLVM_ENABLE_RUNTIMES=openmp \
-DLLVM_RUNTIME_MULTILIBS=debug \
-DLLVM_RUNTIME_MULTILIB_debug_TARGETS="amdgcn-amd-amdhsa" \
```
This will then install it into the tree like this:
```
[7 lines not shown]
[flang][OpenMP] Get final label from nested constructs
Non-block DO loops can share termination statements. When parsing
a non-block DO loop, account for labels on terminating statements
from recursively parsed ExecutionPartConstructs.
Fixes https://github.com/llvm/llvm-project/issues/188892
[flang] Inline scalar-to-scalar TRANSFER for same-size trivial types (#191589)
Inline the TRANSFER intrinsic for scalar-to-scalar cases where the
result is a trivial type (integer, real, etc.) and source and result
have the same storage size. Instead of calling _FortranATransfer, the
lowering now emits a fir.convert on the source address followed by a
fir.load, effectively performing a reinterpret cast.
[MCP] Never eliminate frame-setup/destroy instructions (#186237)
Presumably targets only insert frame instructions which are significant,
and there may be effects MCP doesn't model. Similar to reserved
registers this
is probably overly conservative, but as this causes no codegen change in
any lit test I think it is benign.
The motivation is just to clean up #183149 for AMDGPU, as we can spill
to physical registers, and currently have to spill the EXEC mask purely
to enable debug-info.
Change-Id: I9ea4a09b34464c43322edd2900361bf635efd9f7
[OpenMP][Device] Fix __llvm_omp_indirect_call_lookup function pointer types (#192502)
`__llvm_omp_indirect_call_lookup` takes in and returns a function
pointer, so make sure the types are correct, which includes the correct
address space.
The FE was recently changed to generate the correct code
[here](https://github.com/llvm/llvm-project/pull/192470).
With this change, three function pointer tests start passing.
Signed-off-by: Nick Sarnie <nick.sarnie at intel.com>
[CIR] Implement EH handling for field initializers (#192360)
This implements the handling to call the dtor for any previously
initialized fields of destructed type if an exception is thrown later in
the initialization of the containing class.
The basic infrastructure to handle this was already in place. We just
needed a function to push an EH-only destroy cleanup on the EH stack and
a call to that function.