[AArch64] Add a performBICiCombine function.
This moves the code out of PerformDAGCombine into its own function and
changes the return value to SDValue(N, 0), matching other uses of
SimplifyDemandedBits.
graphics/nvidia-drm-latest-kmod: Refresh distinfo
This was missed in the update of drm-latest-kmod
Fixes: bd06ee8f20bd ("graphics/drm-latest-kmod: Fix build on -CURRENT")
Sponsored by: Beckhoff Automation GmbH & Co. KG
[RISCV] Custom legalize i32 saddo/ssubo on RV64 to return a sign extended value for the data result. (#172112)
This is consistent with how we handle regular ADD/SUB and helps with
computeNumSignBits optimizations.
Fixes #172089
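A scalar model of the idea, as a minimal sketch: it is not the LLVM legalization code, and the names `SAddOResult` and `saddo32OnRV64` are hypothetical. It shows an i32 signed add with overflow whose data result is kept sign-extended in a 64-bit register, the way ADDW would leave it.

```cpp
#include <cstdint>

// Hypothetical result type: models the two results of an i32 saddo on RV64.
struct SAddOResult {
  int64_t Value;  // i32 sum, sign-extended to 64 bits
  bool Overflow;  // true if the 32-bit signed addition overflowed
};

constexpr SAddOResult saddo32OnRV64(int32_t A, int32_t B) {
  // Adding the sign-extended operands in 64 bits cannot overflow.
  int64_t Wide = static_cast<int64_t>(A) + static_cast<int64_t>(B);
  // Keep the low 32 bits, then sign-extend them again (what ADDW produces).
  uint32_t Low = static_cast<uint32_t>(Wide);
  int64_t Narrow = (Low & 0x80000000u)
                       ? static_cast<int64_t>(Low) - 0x100000000LL
                       : static_cast<int64_t>(Low);
  // The add overflowed in 32 bits exactly when the two values differ.
  return {Narrow, Wide != Narrow};
}
```

Because `Value` is always a sign-extended 32-bit quantity, its upper 33 bits are copies of the sign bit, which is the property a sign-bit analysis can exploit.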
[orc-rt] Prevent RTTIExtends from being used for errors. (#172250)
Custom error types (ErrorInfoBase subclasses) should use ErrorExtends as
of 8f51da369e6. Adding a static_assert allows us to enforce that at
compile-time.
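A minimal sketch of this kind of compile-time guard, assuming simplified stand-in templates: the real orc-rt class templates take different parameters, and the commit message does not show how the actual check is written. `RTTIRoot`, `MyError`, and `MyPlugin` are hypothetical.

```cpp
#include <type_traits>

// Simplified stand-ins for the runtime's RTTI and error base classes.
struct RTTIRoot {};
struct ErrorInfoBase : RTTIRoot {};

// General-purpose RTTI helper: refuses to derive from an error type.
template <typename ThisT, typename ParentT>
struct RTTIExtends : ParentT {
  static_assert(!std::is_base_of<ErrorInfoBase, ParentT>::value,
                "error types must use ErrorExtends, not RTTIExtends");
};

// Error-specific helper: only accepts parents that are error types.
template <typename ThisT, typename ParentT>
struct ErrorExtends : ParentT {
  static_assert(std::is_base_of<ErrorInfoBase, ParentT>::value,
                "ErrorExtends parent must derive from ErrorInfoBase");
};

struct MyError : ErrorExtends<MyError, ErrorInfoBase> {};  // accepted
struct MyPlugin : RTTIExtends<MyPlugin, RTTIRoot> {};      // accepted
// struct BadError : RTTIExtends<BadError, ErrorInfoBase> {};  // rejected at compile time
```

Uncommenting `BadError` makes the build fail with the first message, so the misuse is caught before any code runs.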
[CIR] Rename allEnumCasesCovered to all_enum_cases_covered (#172153)
Use the conventional snake_case for MLIR assembly and align with the
operation documentation, which already mentions the snake_cased attribute.
[offload] Fix CUDA args size by subtracting tail padding (#172249)
This commit makes the cuLaunchKernel call pass the total argument size without tail padding.
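To illustrate what "without tail padding" means, here is a minimal sketch. The struct `KernelArgs` and the helper `argsSizeWithoutTailPadding` are hypothetical, and how the plugin actually computes the size and hands it to cuLaunchKernel is not shown here.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical kernel argument layout, used only for illustration.
struct KernelArgs {
  void *Ptr;      // first argument
  int32_t Count;  // last argument; the compiler may add padding after it
};

// Size of the argument buffer up to the end of the last argument,
// excluding any padding the compiler appends to the struct.
constexpr std::size_t argsSizeWithoutTailPadding(std::size_t LastOffset,
                                                 std::size_t LastSize) {
  return LastOffset + LastSize;
}

constexpr std::size_t KernelArgsSize = argsSizeWithoutTailPadding(
    offsetof(KernelArgs, Count), sizeof(int32_t));
```

On a typical 64-bit target `sizeof(KernelArgs)` is 16 while `KernelArgsSize` is 12; the 4-byte difference is the tail padding.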
fix srp_follow to close a window on use-after-free
Use srp_enter() to get a new reference to the next element while
keeping the current element alive. Afterwards the old reference can
safely be released and the hazard in the caller-provided srp_ref
struct can be updated to the hazard of the new element.
This is just in time for almost all the SRP code in the tree to go away.
from Carsten Beckmann carsten_beckmann at genua.de
ok jmatthew@
[AArch64] Support USDOT in performAddDotCombine (#171864)
This function performs the fold
// ADD(UDOT(zero, x, y), A) --> UDOT(A, x, y)
which applies equally to USDOT now that we have a node for it.
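Why the fold above is valid can be seen in a scalar model of one 32-bit lane. This is a sketch only, not the DAG combine itself; `usdotLane` and the example arrays are hypothetical, and the lane is modelled as an accumulator plus four unsigned-by-signed byte products with wraparound.

```cpp
#include <cstdint>

// Scalar model of one 32-bit lane: Acc + sum of four u8 * s8 products,
// with 32-bit wraparound.
constexpr uint32_t usdotLane(uint32_t Acc, const uint8_t (&X)[4],
                             const int8_t (&Y)[4]) {
  uint32_t Sum = Acc;
  for (int I = 0; I < 4; ++I)
    Sum += static_cast<uint32_t>(static_cast<int32_t>(X[I]) *
                                 static_cast<int32_t>(Y[I]));
  return Sum;
}

// Example inputs for the model.
constexpr uint8_t ExampleX[4] = {1, 2, 3, 255};
constexpr int8_t ExampleY[4] = {4, -5, 6, -1};
```

Since the accumulator only takes part in a wrapping addition, `usdotLane(0, X, Y) + A` and `usdotLane(A, X, Y)` always agree, which is the scalar counterpart of folding the ADD into the dot product's accumulator.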
[AArch64] use `isTRNMask` to calculate shuffle costs (#171524)
This builds on #169858 to fix the divergence in codegen
(https://godbolt.org/z/a9az3h6oq) between two very similar
functions initially observed in #137447 (represented in the diff by test
cases `@transpose_splat_constants` and `@transpose_constants_splat`):
```
int8x16_t f(int8_t x)
{
    return (int8x16_t) { x, 0, x, 1, x, 2, x, 3,
                         x, 4, x, 5, x, 6, x, 7 };
}
int8x16_t g(int8_t x)
{
    return (int8x16_t) { 0, x, 1, x, 2, x, 3, x,
                         4, x, 5, x, 6, x, 7, x };
}
```
[7 lines not shown]
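For readers unfamiliar with transpose shuffles, the sketch below checks whether a two-operand shuffle mask has the TRN1 or TRN2 shape. `looksLikeTRNMask` is a hypothetical helper written for illustration; it is not LLVM's `isTRNMask`, and it ignores undefined lanes and swapped operands.

```cpp
#include <cstddef>

// Returns true if Mask picks a0, b0, a2, b2, ... (Which == 0, TRN1) or
// a1, b1, a3, b3, ... (Which == 1, TRN2) from operands a and b, where
// indices 0..N-1 refer to a and N..2N-1 refer to b.
template <std::size_t N>
constexpr bool looksLikeTRNMask(const int (&Mask)[N], unsigned Which) {
  static_assert(N % 2 == 0, "mask must have an even number of lanes");
  for (std::size_t I = 0; I < N; I += 2) {
    if (Mask[I] != static_cast<int>(I + Which))
      return false;
    if (Mask[I + 1] != static_cast<int>(I + N + Which))
      return false;
  }
  return true;
}

// Example 8-lane masks.
constexpr int TRN1Mask8[8] = {0, 8, 2, 10, 4, 12, 6, 14};
constexpr int TRN2Mask8[8] = {1, 9, 3, 11, 5, 13, 7, 15};
```

Interleaving a splat of `x` with a vector of constants, as `f` and `g` do, yields alternating-lane patterns of this kind.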
[orc-rt] Add Error / Exception interop. (#172247)
The ORC runtime needs to work in diverse codebases, both with and
without C++ exceptions enabled (e.g. most LLVM projects compile with
exceptions turned off, but regular C++ codebases will typically have
them turned on). This introduces a tension in the ORC runtime: If a C++
exception is thrown (e.g. by a client-supplied callback) it can't be
ignored, but orc_rt::Error values will assert if not handled prior to
destruction. That makes the following pattern fundamentally unsafe in
the ORC runtime:
```
if (auto Err = orc_rt_operation(...)) {
  log("failure, bailing out"); // <- may throw if exceptions enabled
  // Exception unwinds stack before Error is handled, triggers Error-not-checked
  // assertion here.
  return Err;
}
```
[29 lines not shown]
[llvm][RISCV] Add bf16 vfabs and vfneg intrinsics for zvfbfa. (#172130)
These are pseudoinstruction aliases for vfsgnjx and vfsgnjn.
Co-authored-by: Craig Topper <craig.topper at sifive.com>
gfortran.mk: Express more confidence.
Darwin/aarch64 is a relatively new platform; it's only supported if someone
backports it. This was done in pkgsrc as far back as gcc12, and it's unlikely
someone will do it for older versions.
gcc12: update darwin/aarch64 patch for 12.5.0
This is my own work of forward porting the gcc 12.4.0 patch used by
homebrew. For future reference it seems like github.com/iains might be
the originator, but they haven't updated their gcc-12 branch yet.