libclc: Update acosh (#188224)
This was originally ported from the ROCm device libs
in ca4d382e119e1389c83dbb07d9ca0085e88b2944. Merge in
more recent changes.
Remove unused ep_log.
[clang][SYCL] Strip references from generated kernel argument types (#186788)
Prior to this patch, kernel generation copied kernel argument types
exactly as written in the signature of the function marked with the
[[clang::sycl_kernel_entry_point]] attribute. As a result, generated
kernels took reference kernel arguments instead of byval kernel
arguments.
SYCL 2020 doesn't allow reference kernel arguments, and they don't seem
to work with the backends.
Arguments to a [[clang::sycl_kernel_entry_point]]-attributed function
can be large, so references are preserved during host code generation.
This avoids performance issues in the SYCL Runtime library
implementation, because the same function is used to set the actual
kernel arguments via the sycl_kernel_launch interface.
Note that references in the user's kernel arguments still need to be
diagnosed, since the SYCL 2020 spec explicitly disallows them; that
task is on the TODO list. This patch simply removes references from
types written in the SYCL Runtime library.
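
As an illustration only, the effect on a parameter type can be sketched
in plain C++ with type traits. This is not the Clang implementation;
`BigKernel`, `kernel_param_t` and `was_reference` are hypothetical names
made up for this sketch, and keeping cv-qualifiers is an assumption.

```cpp
#include <type_traits>

// Hypothetical stand-in for a large kernel functor; not a real SYCL type.
struct BigKernel {
  int data[1024];
};

// Maps a host-side parameter type to the by-value type a generated kernel
// would take in this sketch: lvalue and rvalue references are removed, and
// cv-qualifiers on the referred-to type are left untouched (an assumption).
template <typename T>
using kernel_param_t = std::remove_reference_t<T>;

static_assert(std::is_same_v<kernel_param_t<BigKernel &>, BigKernel>);
static_assert(std::is_same_v<kernel_param_t<BigKernel &&>, BigKernel>);
static_assert(std::is_same_v<kernel_param_t<const BigKernel &>, const BigKernel>);
static_assert(std::is_same_v<kernel_param_t<int>, int>);

// Returns true when stripping references changed the type, i.e. the
// host-side parameter was declared as a reference.
template <typename T>
constexpr bool was_reference() {
  return !std::is_same_v<T, kernel_param_t<T>>;
}
```

The host-side signature keeps `BigKernel &`, so no copy is made there;
only the type used for the generated kernel parameter is by value.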
Assisted-by: claude in test writing.
AMDGPU: Codegen for v_dual_dot2acc_f32_f16/bf16 from VOP3
Codegen for v_dual_dot2acc_f32_f16/bf16 for targets that only have the
VOP3 version of the instruction.
Since there is no VOP2 version, introduce a temporary MIR DOT2ACC pseudo
that is selected when there are no src_modifiers. This DOT2ACC pseudo
has src2 tied to dst (like the VOP2 version); PostRA pseudo expansion
will restore the pseudo to the VOP3 version of the instruction.
CreateVOPD will recognize such a VOP3 pseudo and generate v_dual_dot2acc.
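
To illustrate what "src2 tied to dst" means, here is a toy scalar model
in C++. It is a sketch under assumptions, not backend code: `Half2`,
`dot2_three_address` and `dot2acc_tied` are hypothetical names, `float`
stands in for the f16/bf16 inputs, and rounding, modifiers and clamping
are ignored.

```cpp
// Toy pair of inputs; float stands in for the packed f16/bf16 lanes.
struct Half2 {
  float x, y;
};

// Three-address form (VOP3-like in this sketch): the addend is a separate
// source and the result goes to an independent destination.
inline float dot2_three_address(Half2 a, Half2 b, float c) {
  return a.x * b.x + a.y * b.y + c;
}

// Accumulating form (DOT2ACC-like in this sketch): the accumulator is both
// the third source and the destination, i.e. src2 is tied to dst.
inline void dot2acc_tied(Half2 a, Half2 b, float &acc) {
  acc = dot2_three_address(a, b, acc);
}
```

When the addend and the destination are the same register, both forms
compute the same value, which is why the tied pseudo can be expanded
back to the three-address instruction.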
[AArch64][GlobalISel] Add test for v4i32 vector extract sqdmlal/sqdmlsl
1. The tests cover only the v4i32 versions of the intrinsic, as v2i32 currently doesn't work.
2. GlobalISel currently generates poor code in the sqdmlsl case. To fix this, the sqdmlalvi64_indexed pattern needs to be copied over for sqdmlsl.
Address review comments: mark unused param and move var decl
- Mark the unused 'clauses' parameter in TaskloopOp::build with
[[maybe_unused]]
- Move the declaration of 'wrapperClauseOps' in genStandaloneTaskloop
to immediately before its first use
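
A minimal C++ sketch of both review fixes follows. The names
(`ClauseOps`, `buildOp`, `genStandalone`, `wrapperOps`) are hypothetical
stand-ins, not the actual MLIR or Flang types and functions.

```cpp
#include <vector>

// Hypothetical stand-in for a clause-operands structure.
struct ClauseOps {
  int grainsize = 0;
};

// The parameter is kept for interface compatibility but is not read, so it
// is marked [[maybe_unused]] to silence unused-parameter warnings.
int buildOp([[maybe_unused]] const ClauseOps &clauses, int numLoops) {
  return numLoops;
}

int genStandalone(int numLoops) {
  int result = buildOp(ClauseOps{}, numLoops);

  // Declared immediately before its first use rather than at the top of
  // the function, which keeps its scope and purpose obvious.
  std::vector<int> wrapperOps;
  wrapperOps.push_back(result);
  return static_cast<int>(wrapperOps.size());
}
```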
Assisted-by: Copilot, Claude Sonnet 4.6
[MLIR][Python][Docs] Fix example of Python-defined dialects (#188174)
#186574 introduced some breaking changes for Python-defined dialects,
so the example in the docs needs to be updated.
[libunwind][PAC] Defang ptrauth's PC in valid CFI range abort
It turns out that making the CFI check a release-mode abort causes many,
if not the majority, of JITs to fail during unwinding, as they do not
set up CFI sections for their generated code. As a result, any JITs
that do nominally support unwinding (and catching) through their JIT
or assembly frames trip this abort.
rdar://170862047
[libc] cbrtf guard to sync with general format (#188207)
This PR makes the cbrtf guard consistent with the general format
(other functions consistently use `LLVM_LIBC_SHARED_MATH_<func>_H`).
This was found while refactoring cbrtbf16.
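
For illustration, assuming the guard in question is a header include
guard and that `<func>` is spelled in upper case, the general format
applied to cbrtf would look roughly like this (header contents elided):

```cpp
// Guard name follows LLVM_LIBC_SHARED_MATH_<func>_H with <func> = CBRTF
// (the upper-case spelling is an assumption of this sketch).
#ifndef LLVM_LIBC_SHARED_MATH_CBRTF_H
#define LLVM_LIBC_SHARED_MATH_CBRTF_H

// ... header contents elided ...

#endif // LLVM_LIBC_SHARED_MATH_CBRTF_H
```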