[AArch64] Ensure FPR128 callee-save stack offsets are aligned (#184314)
Misaligned offsets were benign for Linux targets (dividing by the scale
truncated the offset to the correct value), so they only resulted in
failures with `-DLLVM_ENABLE_ASSERTIONS=On`. On Windows, this was a
miscompile: the lack of alignment would result in the FPR128 callee-save
getting assigned to the same offset as the previous GPR.
Fixes: #183708
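The truncation described above can be illustrated with a small standalone
sketch. It does not reproduce the AArch64 frame-lowering code: `alignUp`,
`scaledIndex`, and the offsets 24 and 16 are hypothetical names and values
chosen only to show why an unaligned byte offset does not survive division
by the scale, while an aligned one does.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: round Offset up to the next multiple of Align.
// Align is assumed to be a power of two.
constexpr int64_t alignUp(int64_t Offset, int64_t Align) {
  return (Offset + Align - 1) & ~(Align - 1);
}

// Hypothetical helper: convert a byte offset into units of Scale.
// Integer division silently drops any remainder.
constexpr int64_t scaledIndex(int64_t ByteOffset, int64_t Scale) {
  return ByteOffset / Scale;
}

// An 8-byte-aligned offset divided by a 16-byte scale loses information:
// scaling back up gives 16, not the original 24.
static_assert(scaledIndex(24, 16) * 16 != 24, "unaligned offset is truncated");

// Aligning first makes the division exact, so the offset round-trips.
static_assert(scaledIndex(alignUp(24, 16), 16) * 16 == alignUp(24, 16),
              "aligned offset survives scaling");
```

Aligning before scaling means the result no longer depends on which way
the division happens to round.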
libclc: Add uintptr_t overloads of atomic_fetch_add and sub (#185263)
This is a special case because the pointee type is unsigned, but the
input value is signed. Use the OpenCL builtins directly, because they
work correctly without any ugly casting, and it's not worth putting a
wrapper in clc.
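As a rough analogy for the unsigned-pointee, signed-operand combination,
here is a C++ sketch. It is not the libclc or OpenCL code: `fetchAddSigned`
is a hypothetical wrapper, and `std::atomic<std::uintptr_t>` stands in for
the OpenCL atomic type. It only shows that adding a signed delta to an
unsigned atomic gives the expected result under modular arithmetic.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical wrapper, for illustration only: atomically add a signed
// delta to an unsigned pointer-sized value and return the previous value.
inline std::uintptr_t fetchAddSigned(std::atomic<std::uintptr_t> &Obj,
                                     std::ptrdiff_t Delta) {
  // Signed-to-unsigned conversion is defined modulo 2^N, so a negative
  // Delta becomes a large unsigned value whose addition wraps around to
  // the intended subtraction.
  return Obj.fetch_add(static_cast<std::uintptr_t>(Delta));
}
```

With a starting value of 100, `fetchAddSigned(v, -4)` returns 100 and
leaves 96 in `v`.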
[CIR] Extract base classes for complex ops
Extract CIR_ComplexPartOp for ComplexRealOp/ComplexImagOp,
CIR_ComplexPartPtrOp for ComplexRealPtrOp/ComplexImagPtrOp,
CIR_ComplexBinOp for ComplexAddOp/ComplexSubOp, and
CIR_ComplexRangeBinOp for ComplexMulOp/ComplexDivOp to
eliminate duplicated arguments, results, format, and traits.
libclc: Move subgroup functions into clc (#185220)
It turns out there was a generic implementation of the id and sizes.
The practice of splitting every single function into its own file is
kind of a pain here, so introduce a utility header for amdgpu.
[CIR] Extract CIR_VAOp base class for VAStartOp and VAEndOp
Both ops share identical arguments and assembly format. Extract a common
base class to eliminate the duplication.
[libunwind][PAC] Defang ptrauth's PC in valid CFI range abort
It turns out that making the CFI check a release-mode abort causes many
JITs, if not the majority, to fail during unwinding, because they do not
set up CFI sections for their generated code. As a result, any JIT that
nominally supports unwinding (and catching) through its JIT or assembly
frames trips this abort.
rdar://170862047
[CIR] Change CmpOp assembly format to use bare keyword style
Update the assembly format of cir.cmp from the parenthesized style:
    cir.cmp(gt, %a, %b) : !s32i, !cir.bool
to the bare keyword style used by other CIR ops like cir.cast:
    cir.cmp gt %a, %b : !s32i
The result type (!cir.bool) is now automatically inferred as it is
always cir::BoolType.