[GlobalISel] Add `widenScalarFor()` function (#187731)
The function is mentioned in `Legalizer.rst` but has been missing. This
also fixes the asymmetry between `narrowScalarXXX()`, which has both
`narrowScalarFor()` and `narrowScalarIf()`, and `widenScalarXXX()`, which
only had `widenScalarIf()`.
[AArch64] Combine cases with the same code in `expandMOVImm` (NFC) (#187843)
Combine cases for `ORRWri`, `ORRXri`, `ANDXri` and `EORXri` in
`AArch64ExpandPseudoImpl::expandMOVImm`, because these cases are handled
with exactly the same code.
[AArch64] Fix _sys implementation and MRS/MSR Sema checks (#187290)
This patch fixes the lowering of the _sys builtin, which used to lower into
an invalid MSR S1... instruction. The fix adds a new sys LLVM intrinsic and
proper lowering into the sys instruction and its aliases.
I also fixed the Sema checks for the _sys, _ReadStatusRegister and
_WriteStatusRegister builtins so they correctly catch invalid
use cases.
libclc: Implement remainder with remquo (#187999)
This fixes conformance failures for double, and for float
without -cl-denorms-are-zero. Optimizations are
able to eliminate the unused quo handling, so remainder
does not need to duplicate most of the remquo code.
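
As a rough illustration of the idea (not the libclc source), remainder can be written as a thin wrapper over remquo that discards the quotient. The helper name `remainder_via_remquo` is hypothetical, and the sketch uses the host C++ `std::remquo` rather than the OpenCL builtin:

```cpp
#include <cmath>

// Illustrative only: a hypothetical host-side helper, not the libclc code.
// std::remquo returns the same IEEE remainder as std::remainder(x, y) and
// additionally stores low-order quotient bits in *quo. Here the quotient
// is simply discarded.
double remainder_via_remquo(double x, double y) {
    int quo = 0;  // never read by the caller; dead once remquo is inlined
    return std::remquo(x, y, &quo);
}
```

Because `quo` is never read, an optimizer that inlines the remquo body can drop the quotient bookkeeping as dead code, which is the effect the commit message relies on.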
libclc: Update remquo (#187998)
This was failing in the float case without -cl-denorms-are-zero
and failing for double. This now passes in all cases.
This was originally ported from the ROCm device libs in
8db45e4cf170cc6044a0afe7a0ed8876dcd9a863. This update is mostly a port
of more recent changes, with a few modifications:
- Templatification, which almost, but not quite, enables
vectorization; the outer branch and loop still prevent it.
- Merging of the 3 types into one shared code path, instead of
duplicating per type with 3 different functions implemented together.
There are only some slight differences for the half case, which mostly
evaluates as float.
- Splitting out of the is_odd tracking, instead of deriving it from the
accumulated quotient. This costs an extra register, but saves several
[6 lines not shown]
[mlir][LLVM] Add more `llvm.intr.experimental.constrained.*` ops (#187948)
Add additional "constrained" intrinsic ops. A rounding mode can be
specified for these ops.
Assisted by: claude-4.6-opus-high
[clang][bytecode] Create fewer pointers in __builtin_nan() (#187990)
Check the elements directly for initialization state and keep track of
whether we found a NUL byte.
games/retroarch: Transfer maintainership
PR: 291648
Approved by: Daniel Menelkir <dmenelkir at gmail.com> (maintainer, timeout 3 months, inactive since ~2024.08)
libclc: Update remainder
Previously this was failing conformance without -cl-denorms-are-zero
in the float case, and always failing in the double case.
libclc: Implement remainder with remquo
This fixes conformance failures for double, and for float
without -cl-denorms-are-zero. Optimizations are
able to eliminate the unused quo handling, so remainder
does not need to duplicate most of the remquo code.