kqueue: compare against the size in kqueue_expand
This is a cosmetic change, rather than a functional one: comparing the
knlistsize against the fd requires a little bit of mental gymnastics to
confirm that this is fine and not doing unnecessary work in some cases.
Notably, one must consider that kq_knlistsize only grows in KQEXTENT
chunks, which means that concurrent threads trying to grow the kqueue
to consecutive fds will usually not result in the list being replaced
twice. One can also more clearly rule out classes of arithmetic
problems in the final `else` branch.
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D56209
kqueue: add some kn_knlist assertions around knlist_(add|remove)
We currently assert that kn_status is accurate, but there's more room
for error. Neither of these are very likely, but currently we'd blow up
in SLIST*() macros instead of providing more obvious diagnostics. It's
perhaps only worth testing these because knlist_remove() requires
getting logic across both f_attach() and f_detach() correct.
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D56211
kqueue: simplify knote_fdclose()
The influx logic in knote_fdclose() is a little misguided, the resulting
wakeup() call should always be redundant: knote_drop_detached() will
always issue a wakeup before it returns, so anything waiting on *that*
knote that had entered fluxwait should have been woken up then. This is
the obvious divergence from the other influx/wakeup pattern in the
implementation, which will kn_influx-- and then issue the wakeup after
it has processed all of the knotes it can make progress on.
While we're here, the kq_knlist cannot shrink, so we can avoid that
condition in the loop and avoid potentially excessive wakeups from
fluxwait on kqueues that we didn't touch.
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D56210
kqueue: avoid a possible fork-deadlock
kqueue_fork_copy() is likely to have transitioned at least one knote
through a flux state, so we should check whether we need to wake
anything up on the way out to avoid a possible deadlock.
This was a part of D56210, but we'll close the review with the next
commit.
Fixes: b11289f87123f ("kqueuex(2): add KQUEUE_CPONFORK")
Reviewed by: kib, markj
compat/linux: map TCP_USER_TIMEOUT sockopt into TCP_MAXUNACKTIME
After reading both manual pages, our TCP_MAXUNACKTIME is fairly
similar to the TCP_USER_TIMEOUT, the only considerable difference
is ours is in seconds and linux's in milliseconds.
Round up linux's in setsockopt(2) to a next whole second and
clamp ours getter to UINT_MAX ms.
Reviewed by: tuexen, glebius
Differential Revision: https://reviews.freebsd.org/D56168
MFC after: 2 weeks
Sponsored by: Sippy Software, Inc.
bhyve/virtio: Fix comparison of integer expressions of different signedness
It's a bit silly to have iov_to_buf() and buf_to_iov() return a ssize_t
to begin with, just to be able to return -1 for error. Change this to
size_t and use 0 as an error indicator, which won't require any changes
to the code using these functions.
While here, switch iov_to_buf() to use reallocf() instead of realloc().
Reviewed by: jhb
Fixes: 2a514d377b37 ("bhyve/virtio-scsi: Preallocate all I/O requests")
Differential Revision: https://reviews.freebsd.org/D55800
loader(8): embedded MD should be the most preferred currdev
A loader built with MD_IMAGE_SIZE is almost always meant for use with
its embedded image and should try that as currdev before anything else.
Recent changes (d69fc3a9dc71, 784150fd2535) seem to have relaxed the ZFS
code's search for a rootfs and exposed this problem.
Reviewed by: imp, tsoome
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D55979
(cherry picked from commit 0661997cea165e951e4e215e6aed41596d8b1d52)
cd9660: Add length checks to Rock Ridge parser
* cd9660_rrip_slink() did not check that the lengths of individual
entries do not exceed the length of the overall record.
* cd9660_rrip_altname() did not check that the length of the record
was at least 5 before subtracting 5 from it.
Note that in both cases, a better solution would be to check the length
of the data before calling the handler, or immediately upon entry of
the handler, but this would require significant refactoring.
MFC after: 1 week
Reported by: Calif.io in collaboration with Claude and Anthropic Research
Reported by: Adam Crosser, Praetorian
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D56215
linuxkpi: Handle bin attributes in sysfs attribute groups
For instance, this is used by DRM drivers to declare the EDID property
of an GPU output connector:
sysctl -b sys.device.drmn1.card0.card0-DP-1.edid | edid-decode
...
Block 0, Base EDID:
EDID Structure Version & Revision: 1.4
Vendor & Product Identification:
Manufacturer: SAM
Model: 29814
Serial Number: 810635354 (0x3051505a)
Made in: week 15 of 2025
...
Reviewed by: bz, emaste, wulf
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D55176
acpi_spmc: Fail probe if acpi_spmc device already attached
We cannot have more than one SPMC device.
Reviewed by: olce
Approved by: olce
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D56062
ip6_mroute: Fix the type name in sysctl_mfctable()
No functional change since apparently it's fine to compute the size of
a pointer type when the base type is undefined.
Fixes: 0bb9c2b665d9 ("ip6_mroute: FIBify")
vmm: Restore the ability to create VMs as root in a jail
The new PRIV_VMM_CREATE and DESTROY permissions should be allowed by
jails, so need to be added to the list in prison_priv_check(). Then,
modify vmmdev_create() to verify that the jail was created with the
allow.vmm flag. This is already verified when opening /dev/vmmctl, but
checking again doesn't hurt and ensures that one can't pass the
allow.vmm policy by passing a vmmctl fd along a unix domain socket from
outside the jail.
Rename vmm_priv_check() to vmm_jail_priv_check() to make the function's
purpose more clear.
Reported by: novel
Reviewed by: bnovkov
Fixes: d4c05edd410e ("vmm: Add privilege checks to vmmctl operations")
Differential Revision: https://reviews.freebsd.org/D56119
pmap: Do not use PMAP_LOCK_INIT with kernel_pmap
The kernel_pmap lock is a bit special: it does not need the DUPOK flag,
and it really belongs to a different lock class. If it belongs to the
same class as regular pmap locks, then witness may report warnings when
performing UMA allocations under a regular pmap lock, if the allocation
triggers a pmap_growkernel() call.
Replace instances of PMAP_LOCK_INIT(kernel_pmap) with inline mtx_init()
calls to silence some witness warnings for harmless behaviour I see with
some uncommitted test programs.
Reviewed by: alc, kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D56185
bce: Fix SYSCTL_IN error check in bce_sysctl_nvram_write()
The condition after SYSCTL_IN was inverted: success (error == 0) returned
immediately and skipped the NVRAM write path, while failure fell through.
Return only when SYSCTL_IN fails.
Signed-off-by: Weixie Cui <cuiweixie at gmail.com>
Reviewed-by: ngie
Pull-Request: https://github.com/freebsd/freebsd-src/pull/2113