Jail sysctls: deprecate generic sysctls in favour of allow-flags
- add a missing sysctl to the deprecated list
- add a comment to not add new generic sysctls and point to SYSCTL_JAIL_PARAM instead
Reviewed by: jamie
Differential Revision: https://reviews.freebsd.org/D51150
Makefile.inc1: Drop AS and RANLIB variables
These are not used in our world and kernel build targets. We use the
compiler driver for assembly, and ar adds the archive index (symbol
table) automatically.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D55964
_exit.2: Cross-reference atexit(3)
atexit(3) is one of the cases when _exit(2) must be used instead of
exit(3).
MFC after: 3 days
Reviewed by: mhorne, ziaee
Differential Revision: https://reviews.freebsd.org/D54467
rss: make toeplitz.c standard part of the kernel
This will fix the LINT-NOIP build. This actually adds very little to the
kernel text, e.g. 500 bytes on amd64. A perfect solution would be to
instead declare rss_config.c as 'optional inet | inet6', but that would
fail to build LINT-NOIP in several NIC drivers that use RSS and
absolutely ignore that both INET and INET6 are optional. It is very
unlikely that vendors who maintain these drivers will ever chase the
holy grail of a build that doesn't support IPv4 and IPv6.
Fixes: d9c55b2e8cd6b79f6926278e10a79f1bcca27a4b
nullfs: Fix handling of doomed vnodes in nullfs_unlink_lowervp()
nullfs_unlink_lowervp() is called with the lower vnode locked, so the
nullfs vnode is locked too. The following can occur:
1. the vunref() call decrements the usecount 2->1,
2. a different thread calls vrele() on the vnode, decrements the
usecount 1->0, then blocks on the vnode lock,
3. the first thread tests vp->v_usecount == 0 and observes that it is
true,
4. the first thread incorrectly unlocks the lower vnode.
Fix this by testing VN_IS_DOOMED directly. Since
nullfs_unlink_lowervp() holds the vnode lock, the value of the
VIRF_DOOMED flag is stable.
Thanks to leres@ for patiently helping to track this down.
PR: 288345
MFC after: 1 week
[4 lines not shown]
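The interleaving in the nullfs commit above can be replayed with a small userspace model. This is only a sketch: struct model_vnode and the helper names are hypothetical, and the kernel's struct vnode, vunref(), vrele() and VN_IS_DOOMED() are far more involved. It shows that the use count can reach zero while the vnode is still not doomed, which is why the use count is the wrong thing to test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified model; not the kernel's struct vnode. */
struct model_vnode {
	int	usecount;
	bool	doomed;		/* models VIRF_DOOMED */
};

/* Old heuristic: infer reclamation from the use count. */
static bool
looks_reclaimed_by_usecount(const struct model_vnode *vp)
{
	assert(vp != NULL);
	return (vp->usecount == 0);
}

/* New check: test the doomed flag directly (models VN_IS_DOOMED()). */
static bool
is_doomed(const struct model_vnode *vp)
{
	assert(vp != NULL);
	return (vp->doomed);
}

/*
 * Replay steps 1 and 2: two references are dropped, but the vnode has
 * not been reclaimed because the second thread is still blocked on
 * the vnode lock.
 */
static struct model_vnode
replay_race(void)
{
	struct model_vnode vp = { .usecount = 2, .doomed = false };

	vp.usecount--;	/* step 1: first thread, vunref() */
	vp.usecount--;	/* step 2: second thread, vrele(), then blocks */
	return (vp);
}
```

After replay_race(), the use-count heuristic reports the vnode as gone while the doomed flag says it is still live, so only the flag gives the right answer in step 3.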
bhyve: Fix unchecked stream I/O in RFB handler
Convert rfb_send_* helpers to return status codes and check their
results. Add missing checks for stream_read() and stream_write() returns
during the handshake in rfb_handle() to avoid acting on failed I/O.
Signed-off-by: Hayzam Sherif <hayzam at gmail.com>
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D55343
(cherry picked from commit 818971cc403e78d42b77eb6c18a2d2a073e5541f)
sysctl: Avoid calling priv_check() unnecessarily
After commit 7d1d9cc440f80 we only serialize large sysctl requests for
non-root users, but we should avoid calling priv_check() unless the
request actually is large, as that's not the common case. In
particular, priv_check() might not be cheap to evaluate if MAC hooks are
installed.
Reviewed by: olce, kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D55377
(cherry picked from commit 0fa6ce255661acc984a45deaf2d710149b957ce6)
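The reordering described in the sysctl commit above comes down to putting the cheap test first. Below is a minimal userspace sketch, not the kernel code: model_priv_check(), must_serialize() and the threshold parameter are hypothetical stand-ins, and the call counter exists only to make the effect visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static int priv_calls;	/* counts calls to the expensive check */

/* Hypothetical stand-in for priv_check(); returns 0 when privileged. */
static int
model_priv_check(bool is_root)
{
	priv_calls++;
	return (is_root ? 0 : 1);
}

/*
 * Decide whether a request must be serialized.  The size comparison
 * is cheap, so it goes first; && short-circuits, so the privilege
 * check only runs for large requests.
 */
static bool
must_serialize(size_t reqlen, size_t threshold, bool is_root)
{
	assert(threshold > 0);
	return (reqlen > threshold && model_priv_check(is_root) != 0);
}
```

With this ordering, a small request never reaches model_priv_check() at all, which is the common case the commit message is concerned with.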
bhyve: Propagate errors from rfb_recv_* functions
Update rfb_recv_* functions to return -1 on failure and 0 on success.
Update rfb_handle to check these return values and drop the connection
if an error occurs.
Signed-off-by: Hayzam Sherif <hayzam at gmail.com>
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
(cherry picked from commit 757b0bf5cf46230bcbeeb298f734b9bb7cde1817)
divert: Use a better source identifier for netisr_queue_src() calls
These opaque IDs are used by netisr to distribute work among threads.
The mapping function is simply SourceID % numthreads, so using socket
addresses as source IDs isn't going to distribute packets well due to
alignment.
Use the divert socket's generation number instead, as that suits this
purpose much better.
Reviewed by: zlei, glebius
MFC after: 1 week
Sponsored by: OPNsense
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D55537
(cherry picked from commit 5547a7bb39accd8f151b53e90b41d13b55f84c95)
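To see why aligned addresses make poor source IDs under a modulo mapping, consider the following userspace sketch for the divert commit above. Only the SourceID % numthreads rule comes from the commit message; pick_thread(), distinct_threads() and the 64-byte stride are illustrative assumptions and not netisr code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the mapping described above: id % nthreads. */
static unsigned
pick_thread(uintptr_t source_id, unsigned nthreads)
{
	assert(nthreads > 0);
	return ((unsigned)(source_id % nthreads));
}

/*
 * Count how many distinct threads are selected by n source IDs of the
 * form base + i * stride.  A stride of 64 models pointers to 64-byte
 * aligned objects (an assumption); a stride of 1 models a counter such
 * as a generation number.
 */
static unsigned
distinct_threads(uintptr_t base, uintptr_t stride, unsigned n,
    unsigned nthreads)
{
	unsigned char seen[64] = { 0 };
	unsigned count = 0;

	assert(nthreads > 0 && nthreads <= 64);
	for (unsigned i = 0; i < n; i++) {
		unsigned t = pick_thread(base + i * stride, nthreads);

		if (!seen[t]) {
			seen[t] = 1;
			count++;
		}
	}
	return (count);
}
```

With four threads, sixteen IDs starting at 0x1000 with a stride of 64 all land on one thread, whereas sixteen consecutive counter values cover all four.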
vmm: Deduplicate VM and vCPU state management code
Now that the machine-independent fields of struct vm and struct vcpu are
available in a header, we can move lots of duplicated code into
sys/dev/vmm/vmm_vm.c. This change does exactly that.
No functional change intended.
MFC after: 2 months
Sponsored by: The FreeBSD Foundation
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D53585
(cherry picked from commit ed85203fb7a0334041db6da07e45ddda4caef13d)
bhyve: Fix a misleading error message
The ioctl might fail because it's run in a jail which doesn't have
permission to invoke ppt ioctls.
Reviewed by: jhb
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D55070
(cherry picked from commit 7ab5e3f29a50bc9294a139cc0e8e661a7c036ba3)
vmm: Consolidate vm and vcpu definitions
There is quite a lot of duplication of code between amd64, arm64 and
riscv with respect to VM and vCPU state management. This is a bit
tricky to resolve since struct vm and struct vcpu are private to vmm.c
and both structures contain a mix of machine-dependent and
machine-independent fields.
To allow deduplication without also introducing a lot of churn, follow
the approach of struct pcpu and 1) lift the definitions of those
structures into a new header, sys/dev/vmm/vmm_vm.h, and 2) define
machine-dependent macros, VMM_VM_MD_FIELDS and VMM_VCPU_MD_FIELDS which
lay out the machine-dependent fields.
One disadvantage of this approach is that the two structures are no
longer private to vmm.c, but I think this is acceptable.
No functional change intended. A follow-up change will move a good deal
of machine/vmm/vmm.c into sys/dev/vmm/vmm_vm.c.
[7 lines not shown]
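The struct pcpu style layout mentioned in the vmm commit above can be sketched as follows. The macro name VMM_VM_MD_FIELDS is taken from the commit message; struct model_vm and every field name are invented for illustration and do not reflect the real contents of sys/dev/vmm/vmm_vm.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Machine-dependent part: each architecture would define this macro in
 * its own header.  The two fields here are made up for illustration.
 */
#define	VMM_VM_MD_FIELDS						\
	uint64_t	md_example_a;					\
	uint32_t	md_example_b

/* Machine-independent part: one shared definition of the structure. */
struct model_vm {
	char		name[32];	/* hypothetical MI field */
	uint16_t	maxcpus;	/* hypothetical MI field */
	VMM_VM_MD_FIELDS;
};

/* Shared code can touch MI fields without knowing the MD layout. */
static uint16_t
model_vm_maxcpus(const struct model_vm *vm)
{
	assert(vm != NULL);
	return (vm->maxcpus);
}
```

The trade-off is the one the commit message names: the structure becomes visible to every includer of the header, in exchange for a single definition that shared code can use.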
vmm: Move common accessors and vm_eventinfo into sys/dev/vmm
Now that struct vm and struct vcpu are defined in headers, provide
inline accessors. We could just remove the accessors outright, but they
don't hurt and it would result in unneeded churn.
As a part of this, consolidate definitions related to struct
vm_eventinfo as well. I'm not sure if struct vm_eventinfo is really
needed anymore, now that vmmops_run implementations can directly access
vm and vcpu fields, but this can be resolved later.
No functional change intended.
MFC after: 2 months
Sponsored by: The FreeBSD Foundation
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D53586
(cherry picked from commit 5f13d6b60740c021951ae0e4d096903cfa1679e2)
pipe: Avoid unnecessary priv_check() calls in pipespace_new()
Running out of pipe map KVA is a rare case, so reorder checks
accordingly, presuming that calling priv_check() is more expensive than
the calculation. In particular, priv_check() might not be cheap to
evaluate if MAC hooks are installed.
Reviewed by: olce, kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D55378
(cherry picked from commit fa77660a3ccbd5f30e88093703b0f93892ef35d7)
OptionalObsoleteFiles: Don't mark /usr/lib/debug/boot directory obsolete
The intent of the current code is to ignore anything under
/usr/lib/debug/boot/*. But we should also make sure that the
/usr/lib/debug/boot directory itself is ignored and is not marked
obsolete. If we don't do that, `make -DBATCH_DELETE_OLD_FILES
delete-old` will try to rmdir(1) this directory, which will cause an
error, since /usr/lib/debug/boot may have nested directories like
kernel/ and modules/.
Reviewed by: markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D55077
(cherry picked from commit c8191c3d613928d8bd6060aa2f7da349b4090cc1)
vmm: Allow the use of PCI passthrough in a jail
After commit e11768e94787 ("vmm: Add PRIV_DRIVER checks for passthru
ioctls"), it is not possible to use PCI passthru from jails, as
PRIV_DRIVER is not granted to jails. Apparently some users expect this
to work, understanding that jailing bhyve provides little security
benefit in this configuration.
I believe we should disable ppt access in jails even when allow.vmm is
configured. To provide an escape hatch for users, add a new
allow.vmm_ppt jail configuration knob, and check it when handling ppt
ioctls in jails. Also add a new PRIV_VMM_PPTDEV to replace the use of
PRIV_DRIVER.
PR: 292750
Reviewed by: corvink
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Sponsored by: Klara, Inc.
[3 lines not shown]
buf: Relax an assertion in BUF_UNLOCK
The BUF_UNLOCK macro asserts that B_REMFREE is not set, as it is up to
the lock owner to complete the dequeue from the free list before
releasing the lock. However, if the thread has acquired the lock
multiple times, then releasing the recursive lock should be ok. Modify
the assertion to reflect this.
This was triggered by an out-of-tree filesystem.
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D55418
(cherry picked from commit eaeb356ce3491f05b6a99ccd485180a42df22c46)
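The relaxed condition in the BUF_UNLOCK commit above can be expressed as a simple predicate. This is a sketch under assumptions: struct model_buf, MODEL_B_REMFREE and the lock_depth counter are hypothetical, and the real macro operates on struct buf and its lock rather than on an integer depth.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	MODEL_B_REMFREE	0x1	/* hypothetical stand-in for B_REMFREE */

/* Hypothetical model of a buffer protected by a recursive lock. */
struct model_buf {
	unsigned	flags;
	int		lock_depth;	/* times the owner holds the lock */
};

/*
 * Relaxed assertion: B_REMFREE must be clear on unlock, unless the
 * lock is held recursively, in which case this unlock does not
 * actually release it.
 */
static bool
unlock_assertion_holds(const struct model_buf *bp)
{
	assert(bp != NULL);
	return ((bp->flags & MODEL_B_REMFREE) == 0 || bp->lock_depth > 1);
}
```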
bhyve: Move the slirp backend out into a separate process
The previous implementation supported hostfwd rules, which allow
the host to connect to the guest via a NATed TCP connection. libslirp
also permits NAT in the other direction, but this was prevented by
bhyve's capsicum sandbox.
To make the slirp backend more useful, split the backend out into a
separate process which does not enter capability mode if outbound
connections are permitted (enabled by setting the new "open" keyword).
The process communicates with the bhyve network frontend (typically a
virtio network interface) using a unix SOCK_SEQPACKET socket pair. If
the bhyve process exits, the helper will automatically exit.
Aside from this restructuring, there is not much actual change. Many
slirp parameters are still hard-coded for now, though this may change.
The "restricted" feature is toggled by the new "open" keyword; in
particular, the backend is restricted by default for compatibility with
15.0 and 14.3.
[11 lines not shown]
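The slirp commit above uses a unix SOCK_SEQPACKET socket pair between the two processes. The sketch below shows, within a single process, the two properties that make this socket type convenient for carrying packets: message boundaries are preserved, and closing one end is seen as end-of-file by the peer. seqpacket_roundtrip() is a hypothetical name, and whether the helper detects the exit of bhyve this way is an assumption, since the commit message does not state the mechanism.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Send two packets over a SOCK_SEQPACKET socket pair and check that
 * the receiver gets them as two separate messages, then check that
 * closing one end is reported as EOF on the other.
 * Returns 0 on success, -1 on any failure.
 */
static int
seqpacket_roundtrip(void)
{
	int fds[2], ret = -1;
	char buf[64];

	if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, fds) != 0)
		return (-1);
	if (send(fds[0], "abc", 3, 0) != 3 ||
	    send(fds[0], "defgh", 5, 0) != 5)
		goto out;
	/* Each recv() returns exactly one message, never a merged stream. */
	if (recv(fds[1], buf, sizeof(buf), 0) != 3 ||
	    memcmp(buf, "abc", 3) != 0)
		goto out;
	if (recv(fds[1], buf, sizeof(buf), 0) != 5 ||
	    memcmp(buf, "defgh", 5) != 0)
		goto out;
	/* Closing one end makes the peer's recv() return 0 (EOF). */
	close(fds[0]);
	fds[0] = -1;
	if (recv(fds[1], buf, sizeof(buf), 0) != 0)
		goto out;
	ret = 0;
out:
	if (fds[0] != -1)
		close(fds[0]);
	close(fds[1]);
	return (ret);
}
```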
kmsan: Implement __msan_test_shadow()
This is needed when building OpenZFS with KMSAN enabled, as the bundled
zstd uses it.
MFC after: 1 week
(cherry picked from commit bf149f2e88cb3836e02ddabd9944eb58650a72ae)
ip_mroute: Make the routing socket private
I have some patches which make ip_mroute and ip6_mroute multi-FIB-aware.
This enables running per-FIB routing daemons, each of which has a
separate routing socket.
Several places in the network stack check whether multicast routing is
configured by checking whether the multicast routing socket is non-NULL.
This doesn't directly translate in my proposed scheme, as each FIB would
have its own socket. I'd like to modify the ip(6)_mroute code to store
all state, including the socket, in a per-FIB structure. So, take a
step towards that and 1) hide the socket, 2) add a boolean flag which
indicates whether a multicast router is registered.
Reviewed by: pouria, zlei, glebius, adrian
MFC after: 2 weeks
Sponsored by: Stormshield
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D55236
[2 lines not shown]
ndp tests: Fix flakiness in ndp_slaac_default_route
The test sends RAs in order to test SLAAC handling in another host.
The router needs to also set net.inet6.ip6.forwarding=1, otherwise NAs
sent from it have the ROUTER flag clear, and upon receiving such an NA
the host will automatically delete routes learned from the router.
Fixes: feda329622bc ("netinet6 tests: Add a regression test for default router handling")
MFC after: 1 week
Sponsored by: Klara, Inc.
(cherry picked from commit 1eb727727a9acb5f1e66e3f70b0146e7c9c5f710)
bhyve: support MTU configuration for SLIRP net backend
Support configuring MTU for the SLIRP net backend, for example:
-s 1:0,virtio-net,slirp,mtu=2048,open
Update the manual page accordingly. While here, also document
MAC address configuration.
Reviewed by: markj
Approved by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D54133
(cherry picked from commit 90b9a77ebeb8019fbd22b3cf992370cd9b0004a2)
bhyve: Simplify passthru_msix_addr()
It can use the passthru_mmio_map() helper function. Make that change,
and also make passthru_mmio_map() use EPRINTLN to fix formatting when
the guest console is stdio.
Reviewed by: corvink, jhb
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D55067
(cherry picked from commit 86150ed98b7903feaba942f01619e74894cd23c4)
ip_mroute: Try to make function pointer declarations more consistent
The ip_mroute and ip6_mroute modules hook into the network stack via
several function pointers. Declarations for these pointers are
scattered around several headers. Put them all in the same place,
ip(6)_mroute.h.
No functional change intended.
Reviewed by: glebius
MFC after: 2 weeks
Sponsored by: Stormshield
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D55058
(cherry picked from commit 9df6a7f9a60b76eda2ac82826528487ca43edf46)
ndp tests: Fix an assertion in ndp_prefix_lifetime_extend
Here we have two interface addresses sharing a v6 prefix with finite
lifetime. The intent was to make sure that adding the second address
didn't cause the prefix's valid lifetime to drop from 20s to 10s, but of
course, while the test is running it may drop from 20s to 19s, causing
the test to fail spuriously. Relax the check a bit to avoid this.
PR: 293152
Fixes: 74999aac5eff ("in6: Modify address prefix lifetimes when updating address lifetimes")
MFC after: 1 week
Sponsored by: Klara, Inc.
(cherry picked from commit eb425dfab19be8720cf29d560b4e778fc3531106)
krb5: Make the build a bit quieter
compile_et.sh is run during buildworld and prints a bunch of debug
output. It's intrusive and probably not needed, at least by default, so
let's make the build output a bit cleaner. This is an upstream script,
but it hasn't been modified in 15 years so the local modification is
unlikely to cause any pain.
Also remove a print that shows up in buildworld -s output.
Reviewed by: cy
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D55317
(cherry picked from commit 4c247f120492d999ac90efcfc73e5fea29206d1f)
ip_mroute: Use a local variable to store a VIF pointer
This is cleaner and will make it a bit easier to add some more
indirection to the VIF table, specifically, to add per-FIB tables.
No functional change intended.
Reviewed by: glebius
MFC after: 2 weeks
Sponsored by: Stormshield
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D55057
(cherry picked from commit 0a757ef9a79d101bb4b7429ab5802579888dce98)
amd64/vmm: Lock global PCI passthrough structures
There is a global list of ppt-claimed devices, accessed via several
vmm ioctls. The ioctls are locked by per-VM locks, but this isn't
sufficient to prevent multiple VMs from trying to bind a given device.
Add a sleepable lock and use that to synchronize all access to ppt
devices.
Reviewed by: corvink, jhb
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D55065
(cherry picked from commit 36b855f1892575cbfe1cd5455b989bfc8ae07502)
syslogd: Improve handling of configuration errors
Make parse_selector() print a warning to stderr and continue parsing the
config if it encounters an invalid facility or priority. Note that
because the parsing is done from a casper service, there isn't a good
mechanism to log errors; the warnings are visible only when syslogd is
started in debug mode.
Reported by: Doug Hardie <bc979 at lafn.org>
MFC after: 1 week
Fixes: f4b4a10abb26 ("syslogd: Move selector parsing into its own function")
Reviewed by: jfree, jlduran, eugen, delphij
Differential Revision: https://reviews.freebsd.org/D55033
(cherry picked from commit 29ec3907f193e205a1c2118c182ec43e51baf717)
vmm: Fix routines which create maps of the guest physical address space
In vm_mmap_memseg(), use vm_map_insert() instead of vm_map_find().
Existing callers expect to map the GPA that they passed, whereas
vm_map_find() merely treats the GPA as a hint. Also check for overflow
and remove a test for first < 0 since "first" is unsigned.
In vmm_mmio_alloc(), return an error number instead of an object
pointer, since the sole caller doesn't need the pointer. As in
vm_mmap_memseg(), use vm_map_insert() instead of vm_map_find() and
validate parameters. This function is not directly reachable via
ioctl(), but we ought to be careful anyway.
Reviewed by: corvink, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D53246
(cherry picked from commit 20a38e847251076b12c173d7aa0b37eef261fd32)
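The parameter validation mentioned in the vm_mmap_memseg() commit above can be illustrated with a small range check. gpa_range_ok() is a hypothetical helper, not the kernel function, and rejecting zero-length ranges and ranges that end exactly at the top of the address space are assumptions made here for simplicity.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Validate a guest-physical range [first, first + len).  Both values
 * are unsigned, so "first < 0" can never be true; the check that
 * matters is whether first + len wraps around.
 */
static bool
gpa_range_ok(uint64_t first, uint64_t len)
{
	if (len == 0)
		return (false);		/* assumption: empty ranges rejected */
	if (first + len < first)	/* unsigned wraparound */
		return (false);
	return (true);
}
```

Unsigned arithmetic wraps modulo 2^64 in C, so comparing the sum against first detects the overflow without invoking undefined behaviour.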