[CIR] Add EH handling for lifetime extended cleanups (#192305)
This adds code to call pushDestroyAndDeferDeactivation from the
pushLifetimeExtendedDestroy function. This was needed to generate the
correct code for lifetime extended cleanups when exceptions are enabled.
An extended version of the cleanup with automatic storage duration is
used as a test case.
To make this work correctly, I had to add a CleanupDeactivationScope to
RunCleanupsScope and force deactivation when forceCleanup is called.
This matches the corresponding code in classic codegen.
I surveyed other places where classic codegen is using
CleanupDeactivationScope and added a MissingFeatures marker in one
location where it was not previously marked. Other places where it was
missing were already marked in this way.
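For illustration only, here is a minimal source-level sketch of the kind of code this affects: a temporary whose lifetime is extended by a reference binding, followed by a call that may throw. It is not the test case from the patch; `Tracker`, `mayThrow`, `useExtendedTemporary`, and `runScenario` are hypothetical names introduced here.

```cpp
#include <stdexcept>

// Hypothetical type that counts how many times its destructor runs.
struct Tracker {
  static int destroyed;
  ~Tracker() { ++destroyed; }
};
int Tracker::destroyed = 0;

// Hypothetical helper that optionally throws to force unwinding.
void mayThrow(bool shouldThrow) {
  if (shouldThrow)
    throw std::runtime_error("unwind");
}

void useExtendedTemporary(bool shouldThrow) {
  // Binding the temporary to a const reference extends its lifetime to the
  // end of this scope, so its destructor becomes a scope-level cleanup.
  const Tracker &ref = Tracker{};
  (void)ref;
  // If this throws, the cleanup must still run while the stack unwinds.
  mayThrow(shouldThrow);
}

// Returns the number of destructor calls observed for one invocation.
int runScenario(bool shouldThrow) {
  Tracker::destroyed = 0;
  try {
    useExtendedTemporary(shouldThrow);
  } catch (const std::runtime_error &) {
    // Swallow the exception; only the destructor count matters here.
  }
  return Tracker::destroyed;
}
```

The destructor is expected to run exactly once on both the normal and the exceptional exit path.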
[flang] implements a rewrite pattern to constant fold fir::BoxEleSizeOp (#192320)
Implements a rewrite pattern to constant fold an `fir::BoxEleSizeOp`
when possible.
[flang][test] Experimental support of MemoryEffectOpInterface for fir.call. (#191580)
I would like to experiment with `fir.call` implementing
`MemoryEffectOpInterface`. So the main change is the fall-through
path in FIR AA. It should be NFC for Flang.
[CUDA] Change __CUDACC__ definition to 1 (#189457)
I recently encountered an issue where `nccl` used `#if __CUDACC__`,
assuming that `__CUDACC__` is not only defined but also has a value usable in `#if`.
https://github.com/NVIDIA/nccl/blob/v2.28.3-1/src/include/nccl_device/coop.h#L18
Looking at the nvcc invocation, I see that:
```
echo "" | nvcc -x cu -E -Xcompiler -dM - | grep __CUDACC__
#define __CUDACC__ 1
```
This changes `__CUDACC__` to 1 to match what NVIDIA downstream code
assumes.
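A minimal sketch of why the value matters, assuming the macro would otherwise be defined with no value. `DEMO_CUDACC` is a hypothetical stand-in for `__CUDACC__` so the example builds with any host compiler; the helper names are also illustrative.

```cpp
// Hypothetical stand-in for __CUDACC__, defined with the value 1.
#define DEMO_CUDACC 1

// `#if` needs an expression to evaluate. If the macro were defined with no
// value (`#define DEMO_CUDACC`), this directive would expand to a bare `#if`
// and fail to preprocess.
#if DEMO_CUDACC
constexpr bool kValueCheckPassed = true;
#else
constexpr bool kValueCheckPassed = false;
#endif

// `#ifdef` only asks whether the macro exists, so it accepts either form.
#ifdef DEMO_CUDACC
constexpr bool kDefinedCheckPassed = true;
#else
constexpr bool kDefinedCheckPassed = false;
#endif

constexpr bool valueCheckPassed() { return kValueCheckPassed; }
constexpr bool definedCheckPassed() { return kDefinedCheckPassed; }
```

Code that tests the macro with `#ifdef` or `defined(...)` behaves the same either way; only `#if MACRO` style checks depend on the macro having a value.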
[PowerPC] Rework AMO load with Compare and Swap Not Equal to use post-RA pseudo expansion (#190698)
Replace the dummy call lowering with a PPCPostRAExpPseudo that hardcodes
X8/X9/X10 post-RA to satisfy the 3 consecutive register constraint for
lwat/ldat FC=16, addressing reviewer feedback.
[MLIR][XeGPU] Refactor isEvenlyDistributable() to Layout attribute interface (#191945)
This PR refactors isEvenlyDistributable() into the layout attribute
interface method isDistributable() and uses it in all anchor operations
to check that the shape can be distributed with the anchor layout.
[OpenASIP] Update the TCE target defs for OpenASIP 2.2 (#176698)
OpenASIP (formerly TCE*) is a special target that has only a stub target
definition on the LLVM side, which has resided in LLVM for over 15 years.
I'm the original contributor of this stub.
Due to needing various other patches to LLVM that were not nicely
upstreamable, the upstream TCE target defs have not been updated in a
long time. However, with the recent changes to the vectorization types
etc., I managed to minimize the required LLVM TCE patch to this one, and with
this patch OpenASIP can be (finally!) used without a patched LLVM for
VLIW/TTA customization. RISC-V operation set customization still
requires a patch to polish and upstream (TBD).
This patch:
* Introduces a 64b variant of an OpenASIP target.
* Unifies the datalayouts of the different target variants to make them
compatible with OpenASIP v2.2 and above.
[17 lines not shown]