kernel - Fix overflow in alist and blist code
* This code tracks swap space and large blocks of contiguous DMA memory.
* Fix overflows in array size calculations that did not take into
account terminator entries.
(a) Remove terminators from alists entirely.
(b) Account for space used by the terminator in blists.
Found-by: tuxillo / AI
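The class of bug described above can be sketched as follows. This is a
minimal illustration, not the actual alist/blist code: the helper name
table_bytes() and its parameters are hypothetical, and it only shows an
array-size calculation that reserves one extra slot for a terminator
entry and refuses to wrap.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the kernel sources): return the number
 * of bytes needed for `nodes` entries plus one terminator entry, or 0
 * if the calculation would overflow size_t.
 */
static size_t
table_bytes(size_t nodes, size_t node_size)
{
	size_t total;

	if (node_size == 0 || nodes == SIZE_MAX)
		return 0;		/* nodes + 1 would wrap */
	total = nodes + 1;		/* reserve the terminator slot */
	if (total > SIZE_MAX / node_size)
		return 0;		/* total * node_size would wrap */
	return total * node_size;
}
```

A caller would treat a return value of 0 as "request too large" and fail
the allocation instead of allocating a short array.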
kernel - Fix serious root vulnerabilities in the caps code
* The caps code was inadvertently allowing many root-only operations,
particularly mount/umount ops, to be run from user mode. The code
assumed a root creds check that was taking place in some of the API
calls but not in others.
* All API calls now check root creds by default unless passed the
appropriate flag.
Found-by: ivadasz (Imre Vadasz)
netstat(1): Use same width for 'Netif' column in IPv4/IPv6 cases
This utility used a narrower 'Netif' column for IPv4 than for IPv6,
which looks a bit strange and is actually insufficient nowadays, since
we support customizing the interface name. So just use the same column
width for both IPv4 and IPv6.
libstdc++: Fix unsigned wraparound in codecvt::do_length [PR105857]
When the max argument to std::codecvt<wchar_t, char, mbstate_t>::length
is SIZE_MAX/4+1 or greater the multiplication with sizeof(wchar_t) will
wrap to a small value, and the alloca call will have a buffer that's
smaller than requested. The call to mbsnrtowcs then has a buffer that is
smaller than the value passed as the buffer length. When libstdc++.so is
built with -D_FORTIFY_SOURCE=3 the mismatched buffer and length will get
detected and will abort inside Glibc.
When it doesn't abort, there's no buffer overflow because Glibc's
mbsnrtowcs has the same len * sizeof(wchar_t) calculation to determine
the size of the buffer in bytes, and that will wrap to the same small
number as the alloca argument. So luckily Glibc agrees with the caller
about the real size of the buffer, and won't overflow it.
Even when the max argument isn't large enough to wrap, it can still be
much too large to safely pass to alloca, so we should limit that. We
already have a loop that processes chunks so that we can handle null
[4 lines not shown]
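The wraparound described above can be reproduced in plain C. This is an
illustrative sketch only, not the libstdc++ change itself: mul_wraps(),
clamp_chunk() and the CHUNK_MAX value are hypothetical names chosen for
the example. It shows how count * elem_size wraps once count exceeds
SIZE_MAX / elem_size, and one way to cap a caller-supplied count before
sizing a stack buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap on elements processed per chunk (value is arbitrary). */
#define CHUNK_MAX 1024

/* Return 1 if count * elem_size cannot be represented in size_t. */
static int
mul_wraps(size_t count, size_t elem_size)
{
	return elem_size != 0 && count > SIZE_MAX / elem_size;
}

/* Limit a caller-supplied element count to a stack-safe chunk size. */
static size_t
clamp_chunk(size_t max)
{
	return max < CHUNK_MAX ? max : CHUNK_MAX;
}
```

With a 4-byte element, mul_wraps(SIZE_MAX / 4 + 1, 4) reports a wrap,
and the unchecked product (SIZE_MAX / 4 + 1) * 4 evaluates to 0.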
nvmm(4): Expose ARCH_CAP to guest only if the host CPU supports it
* Don't expose ARCH_CAP to guest on AMD CPUs, because the ARCH_CAP
feature bit and the IA32_ARCH_CAPABILITIES MSR are Intel-specific and
unavailable on AMD systems. I decided to not follow Linux KVM, which
chose to always provide ARCH_CAP and emulate the MSR for AMD CPUs.
* Check whether the host CPU supports the ARCH_CAP feature bit and only
expose it to the guest if the host supports it.
Credit to tuxillo and Claude Opus LLM for the analyses and initial
patches.
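The host check described above amounts to masking one feature bit. The
following is a rough sketch under stated assumptions, not the NVMM code:
filter_arch_cap() and the macro name are hypothetical. The bit position
is the architectural one, CPUID.(EAX=7,ECX=0):EDX bit 29, which
enumerates the IA32_ARCH_CAPABILITIES MSR.

```c
#include <stdint.h>

/* CPUID.(EAX=7,ECX=0):EDX[29] enumerates IA32_ARCH_CAPABILITIES. */
#define CPUID_7_0_EDX_ARCH_CAP	(UINT32_C(1) << 29)

/*
 * Hypothetical helper: given the leaf-7 EDX value about to be shown to
 * the guest and the value reported by the host CPU, clear ARCH_CAP in
 * the guest view unless the host supports it.
 */
static uint32_t
filter_arch_cap(uint32_t guest_edx, uint32_t host_edx)
{
	if ((host_edx & CPUID_7_0_EDX_ARCH_CAP) == 0)
		guest_edx &= ~CPUID_7_0_EDX_ARCH_CAP;
	return guest_edx;
}
```

On a host without the bit (for example an AMD CPU), the guest never sees
ARCH_CAP and so has no reason to read the MSR.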
cpu/x86_64: Fix do_cpuid() to explicitly set ECX=0
The old do_cpuid() did not initialize ECX before executing the CPUID
instruction, so the results could be incorrect when ECX contained a
non-zero garbage value.
This issue was observed on Intel CPUs when booting a FreeBSD 14.x/15.x
ISO under NVMM, where it caused a general protection fault (#GP) shortly
after the FreeBSD kernel was loaded:
qemu-system-x86_64: NVMM: Mem Assist Failed [gpa=0xbfff8]
qemu-system-x86_64: NVMM: Failed to execute a VCPU.
Abort trap (core dumped)
It occurred when NVMM tried to handle a read of the
IA32_ARCH_CAPABILITIES MSR but the second do_cpuid() call returned
incorrect results indicating that the MSR was unavailable.
The problem was first reported by mneumann in bug #3310 on 2025-11-26 [1].
[11 lines not shown]
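For illustration, a CPUID wrapper that always loads ECX might look like
the sketch below. This is not the DragonFly do_cpuid() implementation:
the names cpuid_count() and cpuid_leaf() are hypothetical, the inline
assembly assumes GCC or Clang, and on non-x86 targets the sketch simply
returns zeros so that it still compiles.

```c
#include <stdint.h>

/*
 * Execute CPUID with an explicit sub-leaf. regs[] receives EAX, EBX,
 * ECX and EDX in that order. ECX is always loaded from `subleaf`, so
 * the result never depends on a stale register value.
 */
static inline void
cpuid_count(uint32_t leaf, uint32_t subleaf, uint32_t regs[4])
{
#if defined(__x86_64__) || defined(__i386__)
	__asm__ __volatile__("cpuid"
	    : "=a" (regs[0]), "=b" (regs[1]), "=c" (regs[2]), "=d" (regs[3])
	    : "0" (leaf), "2" (subleaf));
#else
	(void)leaf;
	(void)subleaf;
	regs[0] = regs[1] = regs[2] = regs[3] = 0;
#endif
}

/* Convenience form for leaves without sub-leaves: ECX is forced to 0. */
static inline void
cpuid_leaf(uint32_t leaf, uint32_t regs[4])
{
	cpuid_count(leaf, 0, regs);
}
```

Leaves such as 7 select their output by ECX, which is why a garbage ECX
value can make a feature bit appear to be missing.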
KERN_PROC - Fix KERN_PROC_ARGS and KERN_PROC_CWD to return length if oldptr==NULL.
Sysctl handlers still have to compute the full output, even when
oldptr == NULL. This is necessary to implement the behavior documented in
sysctl(3), that it will return the required buffer length in "oldlenp", if
"oldptr" is NULL and "oldlenp" is not NULL.
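The contract from sysctl(3) can be modelled in a few lines. This is a
simplified userland sketch, not the kernel handler: copy_out() is a
hypothetical stand-in, and it returns the error code directly instead
of setting errno. The point is that the full output length must be known
even when no buffer is supplied.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of the "old value" half of a sysctl request.
 * `src`/`srclen` is the fully computed output.
 */
static int
copy_out(const void *src, size_t srclen, void *oldptr, size_t *oldlenp)
{
	if (oldlenp == NULL)
		return 0;		/* caller does not want the value */
	if (oldptr == NULL) {
		*oldlenp = srclen;	/* size probe: report required length */
		return 0;
	}
	if (*oldlenp < srclen) {
		memcpy(oldptr, src, *oldlenp);	/* copy what fits */
		return ENOMEM;
	}
	memcpy(oldptr, src, srclen);
	*oldlenp = srclen;
	return 0;
}
```

A typical caller probes with oldptr == NULL first, allocates the
reported length, and then repeats the call with a real buffer.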
procfs - Don't reset fd offset when reading regs, fpregs, or dbregs files.
* For repeated reads, we can simply use pread(2). This way, normal
shell tooling works correctly on these procfs files.
* This also matches the behavior on NetBSD.
wlan - Remove NULL free in fallback "none" ratectl code.
This avoids a kernel panic when detaching a wlan interface that was created
with the "none" ratectl code (i.e. when the wlan_amrr module wasn't loaded).
kernel - Do readonly check in .d_open method in mmcsd(4) and virtio_blk(4).
* Makes read-write open fail properly for read-only storage in mmcsd(4) and
virtio_blk(4), instead of only resulting in transfer errors for the
disk writes.
virtio_blk - Enable D_KVABIO API.
This should avoid some unnecessary page invalidations.
This driver was already compliant, since it never accesses any of the
data in the bio buffer.
fdisk(8): Support 4096 sector size and recognize pMBR of 4Kn disks
Tweak the fdisk(8) utility to support 4096-byte sector size, so it now
can read the pMBR on GPT-formatted 4Kn (aka 4K native) disks, e.g.,
# fdisk -s vbd0
/dev/vbd0: 14628 cyl 16 hd 56 sec
Part Start Size Type Flags
1: 1 13107199 0xee 0x00
In addition, tweak read_disk() to report the read error message.
rc.d/root: Add 'nojail' keyword to fix jail boot
A jail cannot remount the root filesystem, so it was failing to boot
with the error:
Mounting root filesystem rw failed; startup aborted.
Add the 'nojail' keyword to exclude this rc script in jail boot.
FreeBSD has also had this keyword on this script for 20+ years.
With this fix, a jail boots OK but there are still some errors during
the boot, so there are more rc scripts that need the 'nojail' keyword
or need tweaks for jail. Will look into this later.
Reported-by: fgudin (Francis GUDIN) on IRC
cd9660: Explicitly treat the timezone byte as a signed value
Otherwise, timezone information for time zones west of GMT gets
discarded.
Obtained-from: FreeBSD (PR kern/128934, commit 5c423e0640bcad0eb90d9c968658347228bc2818)
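The effect of the signedness can be shown with a small sketch. This is
not the cd9660 code: iso_tz_offset_seconds() is a hypothetical helper.
It relies on the ISO 9660 definition of the timezone byte as a signed
offset from GMT in 15-minute intervals (-48 west to +52 east).

```c
/*
 * Hypothetical helper: convert the raw ISO 9660 timezone byte into an
 * offset from GMT in seconds. The byte is interpreted as a two's
 * complement signed value without relying on the signedness of `char`.
 */
static int
iso_tz_offset_seconds(unsigned char raw)
{
	int quarters = (raw >= 128) ? (int)raw - 256 : (int)raw;

	return quarters * 15 * 60;
}
```

For a raw byte of 0xEC (GMT-5), the helper yields -18000 seconds, while
treating the byte as unsigned would give 236 * 900 = 212400 seconds,
which is far outside the valid range.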
cd9660: Fix ISO_SUSP_CFLAG_ROOT handling in RockRidge
When encountering an ISO_SUSP_CFLAG_ROOT element in Rock Ridge
processing, this actually means there's a double slash recorded in the
symbolic link's path name. We used to start over from / then, which
caused link targets like "../../bsdi.1.0/include//pathnames.h" to be
interpreted as /pathnames.h. This both contradicts our conventional
slash interpretation and is potentially dangerous.
The right thing to do is (obviously) to just ignore that element.
Obtained-from: FreeBSD (commit f7d5a5328faa1cb0b6ad60860e8f46d748507c88)
route(8): Fix routename() for AF_LINK addresses with sdl_index
The link_ntoa() function is unable to handle link addresses that have
zero lengths and only 'sdl_index' set, so route(8) would print an empty
string for some RTA_GATEWAY and RTA_IFP sockaddrs in the monitor output.
Fix routename() to correctly print such addresses by copying the code
from netstat(1).