un-ifdef i8259
We don't need different code variants for the legacy PIC. Just keep the
default variant and remove lots of #ifdefs.
always defined:
ICU_HARDWARE_MASK
never defined:
ICU_SPECIAL_MASK_MODE
AUTO_EOI_1
AUTO_EOI_2
PIC_MASKDELAY
MASKDELAY
REORDER_IRQ
ok kettenis@ hshoexer@
bcmsdhost: Set bus clock after reset
The host reset during attach nukes SDCDIV that the bus clock setup has
initialized right before. Reorder to keep the correct value in SDCDIV.
ok kettenis@
In SEV-ES mode, guest userland is allowed to execute the vmgexit
instruction, although it has no control over the GHCB. Therefore,
it is important that the GHCB does not contain a valid request after
use.
In all "vmgexit paths" the GHCB is cleared by ghcb_sync_in() (it
calls ghcb_clear()) after returning from the hypervisor back into
the guest.
However, in _ghcb_mem_rw() I missed this when requesting MMIO writes
from the hypervisor. The diff below corrects this.
I want to keep this pattern in all vmgexit paths:
ghcb_sync_out
vmgexit
ghcb_verify_bm
ghcb_sync_in
[4 lines not shown]
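A minimal user-space sketch of the pattern above, not the kernel code.
The struct layout, the sim_ prefixed helpers and their signatures are
assumptions made for illustration; only the order of the four steps and
the clear-on-sync-in behaviour come from the message.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the shared GHCB page; the real layout differs. */
struct ghcb_sim {
	bool		valid_request;
	unsigned long	exit_code;
};

/* Simulated page shared between guest and hypervisor. */
static struct ghcb_sim shared_ghcb;

static void
sim_ghcb_clear(struct ghcb_sim *g)
{
	memset(g, 0, sizeof(*g));
}

/* Copy the request into the shared page and mark it valid. */
static void
sim_ghcb_sync_out(const struct ghcb_sim *req, struct ghcb_sim *g)
{
	*g = *req;
	g->valid_request = true;
}

/* Placeholder for the vmgexit instruction; a hypervisor would act here. */
static void
sim_vmgexit(struct ghcb_sim *g)
{
	(void)g;
}

/* Placeholder for the valid-bitmap check. */
static bool
sim_ghcb_verify_bm(const struct ghcb_sim *g)
{
	return g->valid_request;
}

/* Copy the response out, then clear the shared page. */
static void
sim_ghcb_sync_in(struct ghcb_sim *g, struct ghcb_sim *resp)
{
	*resp = *g;
	sim_ghcb_clear(g);
}

/* Same four steps on every path, so no valid request is left behind. */
int
sim_ghcb_request(unsigned long code)
{
	struct ghcb_sim req = { false, code }, resp;

	sim_ghcb_sync_out(&req, &shared_ghcb);
	sim_vmgexit(&shared_ghcb);
	if (!sim_ghcb_verify_bm(&shared_ghcb)) {
		sim_ghcb_clear(&shared_ghcb);
		return -1;
	}
	sim_ghcb_sync_in(&shared_ghcb, &resp);
	return resp.exit_code == code ? 0 : -1;
}
```

After sim_ghcb_request() returns, shared_ghcb is all zeroes, which is
the property the message asks for.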
As vmd(8) direct kernel launch now uses 32-bit legacy mode (with
paging disabled) we do not need the 64-bit #VC handling in locore0
anymore.
ok mlarkin@
pfctl(8): change default limiter action from no-match to block
pf(4) users who use limiters in current should update the rules
accordingly to reflect the change in default behavior. The existing
rule which reads as follows:
pass in from any to any state limiter test
needs to be changed to:
pass in from any to any state limiter test (no-match)
OK dlg@
vio: Support MTU feature
Add support for VIRTIO_NET_F_MTU, which allows getting the hardmtu
from the hypervisor. Also set the current mtu to the same value. The
virtio standard is not clear if that is recommended, but Linux does
this, too.
Use ETHER_MAX_HARDMTU_LEN as upper hardmtu limit instead of MAXMCLBYTES,
as this seems to be more correct.
If the hypervisor requests an MTU larger than ETHER_MAX_HARDMTU_LEN,
redo feature negotiation without VIRTIO_NET_F_MTU.
With this commit, OpenBSD finally works on Apple Virtualization.
Input and testing from @helg
ok jan@
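The renegotiation rule could be sketched roughly as follows. This is
not the vio(4) code: the function name, its parameters and the
SKETCH_MAX_HARDMTU placeholder (standing in for ETHER_MAX_HARDMTU_LEN)
are assumptions. Bit 3 for VIRTIO_NET_F_MTU is taken from the virtio
specification.

```c
#include <stdint.h>

#define VIRTIO_NET_F_MTU	(1ULL << 3)	/* feature bit 3 per virtio spec */
#define SKETCH_MAX_HARDMTU	65435u		/* placeholder upper limit */

/*
 * Return the feature set to accept and store the resulting hardmtu.
 * offered: bits offered by the hypervisor, wanted: bits the driver
 * supports, host_mtu: MTU read from the device config when the MTU
 * feature is negotiated, fallback_mtu: hardmtu used without it.
 */
uint64_t
sketch_negotiate_mtu(uint64_t offered, uint64_t wanted, uint32_t host_mtu,
    uint32_t fallback_mtu, uint32_t *hardmtu)
{
	uint64_t features = offered & wanted;

	if ((features & VIRTIO_NET_F_MTU) == 0) {
		*hardmtu = fallback_mtu;
		return features;
	}
	if (host_mtu > SKETCH_MAX_HARDMTU) {
		/* Too large: redo negotiation without the MTU feature. */
		features = offered & (wanted & ~VIRTIO_NET_F_MTU);
		*hardmtu = fallback_mtu;
		return features;
	}
	*hardmtu = host_mtu;
	return features;
}
```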
make aq_start check the link is up before putting packets on the ring.
without link the hardware seems to hold onto the packets. if you
keep pushing packets onto the interface then the driver goes oactive
and then the ifqs fill up and then the system ends up short of
mbufs.
reported by Alisdair MacLeod on misc@ and narrowed down with sthen@
ok jmatthew@
Move the function reset and qportcfg operations to prepare for host memory
allocations required to support newer hardware generations.
tested by bluhm@ and stsp@ (as part of a larger diff)
ok bluhm@
Make the output of bse(4) mp-safe. Use consumer and provider indexes
instead of sc_tx.queued to determine the number of used tx slots.
Tested on RPI4.
Feedback and OK from jmatthew@
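As an illustration of deriving the number of used tx slots from the two
indexes instead of a shared counter, here is a small sketch. The ring
size, the names and the keep-one-slot-empty convention are assumptions,
not taken from the bse(4) driver.

```c
#include <stdint.h>

#define SKETCH_TX_RING_SIZE	256u	/* assumed; must be a power of two */

/* Slots currently owned by the hardware: producer minus consumer. */
static inline uint32_t
sketch_tx_used(uint32_t prod, uint32_t cons)
{
	return (prod - cons) & (SKETCH_TX_RING_SIZE - 1);
}

/* One slot stays empty so that a full ring differs from an empty one. */
static inline uint32_t
sketch_tx_free(uint32_t prod, uint32_t cons)
{
	return SKETCH_TX_RING_SIZE - 1 - sketch_tx_used(prod, cons);
}
```

In this sketch the start routine only writes the producer index and the
completion path only writes the consumer index, so neither side has to
update a shared count.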
Emulate AMD SysCfg MSR in vmm(4).
Linux kernels like to poke this to check for memory encryption
settings. Return 0's on reads instead of injecting #GP. Writes
continue to be ignored.
This reduces some noise for Linux guests on boot.
ok hshoexer@, mlarkin@
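A rough sketch of the described behaviour, not the vmm(4) handler
itself. The function and enum names are hypothetical, and injecting #GP
for every other MSR is an assumption made to keep the example small.
0xc0010010 is the AMD SYSCFG MSR number.

```c
#include <stdint.h>

#define SKETCH_MSR_AMD_SYSCFG	0xc0010010u

enum sketch_msr_action {
	SKETCH_MSR_DONE,	/* emulated, resume the guest */
	SKETCH_MSR_INJECT_GP	/* reflect #GP into the guest */
};

/* Reads of SysCfg return 0 instead of faulting. */
enum sketch_msr_action
sketch_rdmsr(uint32_t msr, uint64_t *val)
{
	switch (msr) {
	case SKETCH_MSR_AMD_SYSCFG:
		*val = 0;
		return SKETCH_MSR_DONE;
	default:
		return SKETCH_MSR_INJECT_GP;	/* sketch assumption */
	}
}

/* Writes to SysCfg are accepted and dropped. */
enum sketch_msr_action
sketch_wrmsr(uint32_t msr, uint64_t val)
{
	(void)val;
	switch (msr) {
	case SKETCH_MSR_AMD_SYSCFG:
		return SKETCH_MSR_DONE;
	default:
		return SKETCH_MSR_INJECT_GP;	/* sketch assumption */
	}
}
```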
Increase MAXCPUs on amd64 to 255
Now that we have larger bitmask support for more than 64 CPUs, we can increase
the max to 255. 255 is the max that xapic can support; this number can be
bumped later if we want to discriminate x2apic vs xapic.
with input from and ok deraadt. also ok kettenis
Support more than 64 bits for amd64 TLB shootdown IPI masks
The TLB shootdown code used a uint64_t to track which CPUs needed to have
their TLB remotely flushed during pmap operations. This allowed for up to
64 CPUs maximum on amd64.
This diff changes the single uint64_t mask to an array of uint8_t masks,
sized based on MAXCPUS, and utilizes the bitmask macros in param.h to
manipulate these masks.
with input from and ok deraadt. also ok kettenis
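For illustration, a byte-array CPU mask might look like the sketch
below. The sketch_ macros are local stand-ins modelled on the setbit(),
clrbit() and isset() macros from param.h; the struct and function names
are hypothetical and the real pmap code is laid out differently.

```c
#include <stdint.h>

#define SKETCH_MAXCPUS	255
#define SKETCH_NBBY	8	/* bits per byte */
#define SKETCH_HOWMANY(x, y)	(((x) + ((y) - 1)) / (y))

/* Local stand-ins for the param.h bitmask macros. */
#define sketch_setbit(a, i) \
	((a)[(i) / SKETCH_NBBY] |= (uint8_t)(1u << ((i) % SKETCH_NBBY)))
#define sketch_clrbit(a, i) \
	((a)[(i) / SKETCH_NBBY] &= (uint8_t)~(1u << ((i) % SKETCH_NBBY)))
#define sketch_isset(a, i) \
	((a)[(i) / SKETCH_NBBY] & (1u << ((i) % SKETCH_NBBY)))

/* One bit per CPU, sized from the CPU limit rather than fixed at 64. */
struct sketch_cpumask {
	uint8_t	bits[SKETCH_HOWMANY(SKETCH_MAXCPUS, SKETCH_NBBY)];
};

void
sketch_mask_add(struct sketch_cpumask *m, unsigned int cpu)
{
	sketch_setbit(m->bits, cpu);
}

void
sketch_mask_del(struct sketch_cpumask *m, unsigned int cpu)
{
	sketch_clrbit(m->bits, cpu);
}

int
sketch_mask_has(const struct sketch_cpumask *m, unsigned int cpu)
{
	return sketch_isset(m->bits, cpu) != 0;
}
```

With 255 CPUs the mask occupies 32 bytes, and CPU numbers above 63 are
handled the same way as the low ones.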
pmap functions send various TLB shootdown operations by IPI to other cpus.
A lock is grabbed to serialize this. Then recipient cpus get sent an IPI
demanding this work. The lock is reused as a counter of cpus doing the work,
and each cpu's IPI handler decrements the counter.
The local cpu can do some operations in parallel, before verifying
the TLB operations have completed in pmap_tlb_shootwait() which spins
for the counter to reach 0. But the counter is also a lock, and 0
means another cpu can grab it. So if the latency for the local work
exceeds the latency on the recipient cpus, the "counter-lock" can be
grabbed by a different cpu for its own TLB shootdown operations. The
original cpu will now spin waiting for this second cpu's work to
finish, creating pmap function latency.
To fix this, I create per-cpu counters which are separate from the lock.
The IPI functions written in asm now decrement this per-cpu counter, and
when it reaches 0, the shared lock is cleared allowing another cpu to
begin shootdowns tracked by its own per-cpu counter. The waiting
function only spins on the correct per-cpu counter.
As a bonus, the lock (and new variable indicating the shooting cpu)
are now cache-aligned.
[2 lines not shown]
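The split between the shared lock and the per-cpu counters can be
sketched with C11 atomics as below. This is a simplified model, not the
pmap code: the real IPI handlers are written in asm, and every name,
the cpu count and the try-lock style entry point are assumptions.

```c
#include <stdatomic.h>

#define SKETCH_NCPUS	4	/* assumed for the sketch */

static atomic_int sketch_shoot_lock;		/* 0 = free, 1 = held */
static atomic_int sketch_shooter;		/* cpu that holds the lock */
static atomic_int sketch_pending[SKETCH_NCPUS];	/* per-cpu outstanding IPIs */

/* Try to start a shootdown from cpu 'self'; returns 1 on success. */
int
sketch_shoot_begin(int self, int ntargets)
{
	int expected = 0;

	if (!atomic_compare_exchange_strong(&sketch_shoot_lock, &expected, 1))
		return 0;
	atomic_store(&sketch_shooter, self);
	atomic_store(&sketch_pending[self], ntargets);
	if (ntargets == 0)
		atomic_store(&sketch_shoot_lock, 0);	/* nothing to wait for */
	return 1;
}

/* Run by each recipient: count down the shooter's own counter. */
void
sketch_ipi_handler(void)
{
	int shooter = atomic_load(&sketch_shooter);

	/* The last recipient releases the shared lock. */
	if (atomic_fetch_sub(&sketch_pending[shooter], 1) == 1)
		atomic_store(&sketch_shoot_lock, 0);
}

/* The waiter polls only its own counter, never the shared lock. */
int
sketch_shoot_done(int self)
{
	return atomic_load(&sketch_pending[self]) == 0;
}
```

Once cpu 0's recipients have finished, cpu 1 may take the lock for its
own shootdown, and sketch_shoot_done(0) still reports completion
because it no longer looks at the lock.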
stack.c: avoid arithmetic on pointers to void
In stack.c r1.34 I converted one 'char *' too many to 'void *', thereby
relying on a gcc/clang extension which interprets the fictional void
type as a type of size 1 (that's what the stack code wants, fortunately).
As pointed out in the link below, -Wpointer-arith would have caught this:
https://gcc.gnu.org/onlinedocs/gcc/Pointer-Arith.html
MSVC flags this as follows:
D:\a\portable\portable\crypto\stack\stack.c(211,23): error C2036: 'const void *': unknown size [D:\a\portable\portable\build\crypto\crypto_obj.vcxproj].
Pull in workaround from the portable repo which undoes the char * -> void *
conversion.
ok jsing millert
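To illustrate the portability issue in isolation (this is not the
stack.c code, and the helper name is made up): arithmetic on 'void *'
is a gcc/clang extension, while going through 'char *' is standard C
and gives the same byte-wise offsets.

```c
#include <stddef.h>

/*
 * Return a pointer to element 'idx' of an array of 'size'-byte objects.
 * Writing 'base + idx * size' directly on the void pointer would rely
 * on the extension; a char pointer has a defined element size of 1.
 */
const void *
sketch_elem_at(const void *base, size_t idx, size_t size)
{
	const char *p = base;

	return p + idx * size;
}
```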
Simplify vmd(8) structs, removing embedded vmm(4) structs.
This removes some hard dependencies from vmctl(8) on the structures
from vmm(4) and makes naming of identifiers more explicit.
On the surface, this is cosmetic, but the intention is to decouple
as much as possible from dev/vmm/vmm.h to allow for upcoming
work to change vmm(4) without causing a large blast radius.
Testing help from mlarkin@ & bluhm@.
ok mlarkin@
Use scsi_io_get rather than nvme_ccb_get for passthrough commands, so we'll
sleep if there are no ccbs available, avoiding a panic that mlarkin@ ran into.
While here, take the rwlock around passthrough commands that come in through
the scsi ioctl path for consistency with the bioctl path.
ok dlg@ krw@
Remove support for validating Geofeed data
RPKI-based Geofeed authentication (RFC 9632) perhaps was a bit of a ruse
to pass IESG review. Nobody is planning on using it. Time to take it
behind the barn.
OK tb@
vio: Add more feature bit definitions
Add all non-legacy feature bit definitions from virtio 1.3 and the
definitions from 1.4 that are not >= bit 64. Remove VIRTIO_NET_F_GSO
which never worked and has been removed in virtio 1.x. Also add config
register definitions, fix a comment.
vio: Improve feature negotiation for LRO/TSO
OpenBSD requires that LRO can be switched on and off for things like
bridged vlan(4), vxlan(4), bpe(4). We currently only support switching
LRO on/off if the VIRTIO_NET_F_CTRL_GUEST_OFFLOADS feature was
negotiated. But this means if the hypervisor only offers
VIRTIO_NET_F_GUEST_TSO4/6 but not VIRTIO_NET_F_CTRL_GUEST_OFFLOADS,
things will break. In this case we must redo feature negotiation without
the GUEST_TSO4/6 features.
Also, if the hypervisor offers GUEST_TSO4/6 but not the
VIRTIO_NET_F_MRG_RXBUF feature, we currently put rx buffers with a
single 4k mbuf into the rx queue while the standard says we SHOULD
insert buffers of at least 65562 bytes. Apple Virtualization refuses to
work with this configuration. As 65562 is larger than MAXMCLBYTES, we
would need to rework how we allocate our rx buffers to make this work.
For now, we would like to simply disable GUEST_TSO4/6 if MRG_RXBUF is
missing. Unfortunately, Apple Virtualization still refuses to work
unless HOST_TSO4/6 is also disabled. Therefore, we disable all TSO if
[5 lines not shown]
rpki-client: only accept BGPsec certs with a single AS number
We've long been pointing out that the possibility of adding multiple AS
numbers and in particular AS ranges to BGPsec Router Certificates is at
best dubious. Enforce that there is a single AS, encoded as an ASID, not
as an ASRange with a single element (cf. eid7653 to RFC 3779).
Prompted by a report by Xie Yifan
with/ok claudio job
this is errata/7.7/018_rpki.patch.sig
rpki-client: check purpose for .cer files in Manifests
Only intermediate CAs and BGPsec certificates are allowed in a Manifest
fileList. Check this is the case, otherwise stop processing the cert.
Missing check reported by Xie Yifan
ok claudio job
rpki-client: only accept BGPsec certs with a single AS number
We've long been pointing out that the possibility of adding multiple AS
numbers and in particular AS ranges to BGPsec Router Certificates is at
best dubious. Enforce that there is a single AS, encoded as an ASID, not
as an ASRange with a single element (cf. eid7653 to RFC 3779).
Prompted by a report by Xie Yifan
with/ok claudio job
this is errata/7.8/012_rpki.patch.sig
viogpu_wsmmap() returns a kva but should instead return a physical
address via bus_dmamem_mmap(9). Without this, QEMU would only show a
black screen when starting X11. On the Apple Hypervisor, the kernel
would panic.
Also add calls to bus_dmamap_sync(9) before transferring the framebuffer
to host memory. It was working for me without this, but this ensures
that the host running on another CPU will see updates to the
framebuffer.
Thanks to kettenis@ for reviewing and providing feedback.
ok sf@
rpki-client: only accept BGPsec certs with a single AS number
We've long been pointing out that the possibility of adding multiple AS
numbers and in particular AS ranges to BGPsec Router Certificates is at
best dubious. Enforce that there is a single AS, encoded as an ASID, not
as an ASRange with a single element (cf. eid7653 to RFC 3779).
Prompted by a report by Xie Yifan
with/ok claudio job
rpki-client: check purpose for .cer files in Manifests
Only intermediate CAs and BGPsec certificates are allowed in a Manifest
fileList. Check this is the case, otherwise stop processing the cert.
Missing check reported by Xie Yifan
ok claudio job
regress/xstate: Dynamic xstate buffer size
The current implementation leads to an "xstate buffer too small" error
on newer machines with an xstate area bigger than 1KiB. Allocate the
buffer dynamically from PT_GETXSTATE_INFO kernel info.
ok anton@