rtw88: Add bus attachments to the module Makefile
In addition to PCIe we will support USB and also prepare for SDIO (still
disabled locally). The module SRCS are split up into a common part,
which we always add. All three bus parts are guarded by a local
variable in the Makefile.
In addition the PCI parts require PCI to be compiled into the kernel.
We add that check for cases such as SoCs with SDIO but no PCI: their
kernel config may lack PCI, and the module would then fail to attach.
USB has no additional check as it is fully loadable and does not have
to be in a kernel config.
SDIO depends on an MMCCAM-enabled kernel but is otherwise loadable.
While we could, we are not splitting the various bus attachments into
individual modules as we generally do not do that in FreeBSD. [1]
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
[3 lines not shown]
rtw89: harmonize all MODULE_DEPEND to rtw89
rtw89 was brought in the same way rtw88 was. Given rtw88 was once split
up, rtw89 got modelled the same way. Clean this up too.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
rtw89: cleanup static_assert() calls
These days we can use static_assert() without trouble so remove the
FreeBSD-specific rtw89_static_assert implementation. This reduces
the diff to upstream and will ease future driver updates.
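As an illustration only, a compile-time layout check written directly
with the standard C11 macro could look like the sketch below. The
struct and function names are hypothetical and not taken from the
driver, and in-kernel code gets static_assert() from a different header
than the userland <assert.h> used here.

```c
#include <assert.h>   /* C11: provides the static_assert macro in userland */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header layout, used only for illustration. */
struct example_hdr {
  uint8_t  id;
  uint8_t  len;
  uint16_t seq;
};

/* Fails the build, rather than misbehaving at runtime, if the layout changes. */
static_assert(sizeof(struct example_hdr) == 4,
    "example_hdr must be exactly 4 bytes");

/* Small helper so the checked size can also be observed at runtime. */
size_t example_hdr_size(void)
{
  return sizeof(struct example_hdr);
}
```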
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
rtw88: harmonize all MODULE_DEPEND to rtw88
Back when the driver was split up into a core part and bus attachment
sub-drivers, the various bus attachments had their own module names,
but everything is "rtw88" now.
Core functionality depends on linuxkpi, linuxkpi_wlan, and, for debug.c,
lindebugfs.
Each bus attachment then depends on its own parent layer if needed:
PCI gets pulled in through linuxkpi, USB depends on [the future]
linuxkpi_usb, and SDIO depends on [the future] linuxkpi_sdio.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D55021
[AArch64] Add support for intent to read prefetch intrinsic (#179709)
This patch adds support in Clang for the PRFM IR instruction, by adding
the following builtin:
void __pldir(void const *addr);
This builtin is described in the following ACLE proposal:
https://github.com/ARM-software/acle/pull/406
Reland "[LV] Support conditional scalar assignments of masked operations" (#180708)
This patch extends the support added in #158088 to loops where the
assignment is non-speculatable (e.g. a conditional load or divide).
For example, the following loop can now be vectorized:
```
int simple_csa_int_load(
  int* a, int* b, int default_val, int N, int threshold)
{
  int result = default_val;
  for (int i = 0; i < N; ++i)
    if (a[i] > threshold)
      result = b[i];
  return result;
}
```
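To illustrate why the conditional load matters, here is a conceptual
scalar sketch of processing the loop in fixed-size blocks. It is not the
code the loop vectorizer emits; the function name and the `VF` constant
are assumptions made for this example. The property it demonstrates is
that `b` is only read at positions where the condition holds, so the
load is never speculated.

```c
#define VF 4 /* hypothetical vectorization factor, for illustration only */

int csa_int_load_blocked(const int *a, const int *b, int default_val,
                         int n, int threshold)
{
  int result = default_val;
  int i = 0;

  /* Main loop: handle VF elements per iteration. */
  for (; i + VF <= n; i += VF) {
    int last_active = -1;

    /* Compute the "mask" and remember the last lane whose condition holds. */
    for (int lane = 0; lane < VF; ++lane)
      if (a[i + lane] > threshold)
        last_active = lane;

    /* Only touch b when at least one lane is active. */
    if (last_active >= 0)
      result = b[i + last_active];
  }

  /* Scalar remainder loop for the leftover elements. */
  for (; i < n; ++i)
    if (a[i] > threshold)
      result = b[i];

  return result;
}
```

For any input, this returns the same value as `simple_csa_int_load` above.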
[9 lines not shown]
[mlir][vector] Reuse vector TD op in vector.xfer flatten tests (#180606)
This change adds a `RUN` line in vector-transfer-flatten.mlir that uses
`vector.flatten_vector_transfer_ops`, which was introduced in #178134.
It also removes a test added in the original PR whose coverage is
already provided by pre-existing tests.
Extending UniformQuantizedType with interface-based support for new storage types in Quant dialect (#152966)
Currently, UniformQuantizedType only supports built-in MLIR storage
types such as Integer. LLM quantization research has introduced NF4 as
a low-precision datatype (see
https://arxiv.org/pdf/2305.14314). There is a growing need to make the
system extensible and maintainable as more types are added. Ensuring
that MLIR can natively support NF4 through a clean, extensible interface
is essential for both current and future quantization workflows.
**Current Approach and Its Limitations:**
- The present implementation relies on dynamic checks (e.g., type
switches or if-else chains) to determine the storage type and retrieve
type-specific information for legality checks.
- This approach works for a small, fixed set of types, but as the number
of supported types grows, the code becomes harder to read, maintain, and
extend.
[23 lines not shown]
InstCombine: Use SimplifyDemandedFPClass on fmul (#177490)
Start trying to use SimplifyDemandedFPClass on instructions, starting
with fmul. This subsumes the old transform on multiply of 0. The
main change is the introduction of nnan/ninf. I do not think any
transform was systematically trying to introduce fast math flags before,
though a few odd transforms would set them.
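As background on why floating-point class information matters for a
multiply by zero, the C sketch below shows plain IEEE-754 behaviour. It
is not InstCombine code, and the helper name is hypothetical. `x * 0.0`
is only `+0.0` when `x` is finite and has its sign bit clear; infinities
and NaNs produce NaN, and negative inputs produce `-0.0`.

```c
#include <math.h>

/*
 * Returns 1 if x * 0.0 evaluates to exactly +0.0, and 0 otherwise.
 * Assumes default IEEE-754 semantics (no -ffast-math or similar).
 */
int mul_zero_is_pos_zero(double x)
{
  double r = x * 0.0;

  /* r == 0.0 is false for NaN; signbit() tells +0.0 from -0.0. */
  return r == 0.0 && !signbit(r);
}
```

So folding the multiply to a constant zero is only valid once the
operand is known not to be NaN or infinity and the sign of the result is
accounted for.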
Previously we only called SimplifyDemandedFPClass on function returns
with nofpclass annotations. Start following the pattern of
SimplifyDemandedBits, where this will be called from relevant root
instructions.
I was wondering if this should go into InstCombineAggressive, but that
apparently does not make use of InstCombineInternal's worklist.