epoch: Don't idle CPUs when there's pending epoch work
The epoch(9) subsystem implements per-CPU queues of object destructors
which get invoked once it is safe to do so. These queues are polled via
hardclock().
When a CPU is about to go idle, we reduce the hardclock frequency to 1Hz
by default, to avoid unneeded wakeups. This means that if there is any
garbage in these destructor queues, it won't be cleared for at least 1s
(and possibly longer) even if it would otherwise be safe to do so.
epoch_drain_callbacks() is used in some places to provide a barrier,
ensuring that all garbage present in the destructor queues is cleaned up
before returning. It's implemented by adding a fake destructor in the
queues and blocking until it gets run on all CPUs. The above-described
phenomenon means that it can take a long time for these calls to return,
even (especially) when some CPUs are idle. This causes long delays when
destroying VNET jails, for instance, as epoch_drain_callbacks() is
invoked each time a network interface is destroyed.
[13 lines not shown]
eventhandler: Fix a race when pruning eventhandlers
By default, eventhandler_deregister() blocks until it reaches some point
where no threads are invoking the event. At this point, it knows that
1) no threads are currently executing the handler,
2) some thread has freed the eventhandler structure by virtue of having
called eventhandler_prune_list(),
so it is safe to return.
Suppose a thread is trying to deregister an event handler. A different
thread prunes it, and wakes up the first thread. Before the first
thread runs, a third thread grabs the event handler lock, and starts
executing handlers. The first thread observes el_runcount > 0, and goes
back to sleep. The third thread sees no event handlers to prune, and
doesn't wake up the first thread, which sleeps forever.
This change fixes the race and tries to make eventhandler_invoke() more
efficient: keep a count of the number of dead list entries and only
prune the list if there is at least one dead entry. Also, in
[9 lines not shown]
fts: Improve the description of FTS_NOSTAT
Note that we still need to stat directories and the roots.
MFC after: 1 week
Sponsored by: Klara, Inc.
Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D57325
fts: Check link count before using it
* Check the range of the link count before trying to use it.
* Rewrite the comment explaining what the link count is used for.
MFC after: 1 week
Sponsored by: Klara, Inc.
Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D57324
fts: Add some depth to the options test
MFC after: 1 week
Sponsored by: Klara, Inc.
Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D57323
spi: switch to switch
use recommended switch with default case to catch invalid values
Reviewed by: kevans, adrian
Differential Revision: https://reviews.freebsd.org/D54759
tools/test/stress2/misc: Add msdosfs tests (currently failing)
Test msdos22.sh creates 1000 files with long random names consisting
of only ASCII characters. The mount is performed without -L option,
therefore no use of iconv to convert between character sets.
Test msdos23.sh mixes some non-ASCII characters into the file names.
The file system is therefore mounted with -L C.UTF-8 to include tests
of the conversions between UTF-8 and UTF-16.
Test msdos24.sh adds emojis to the names to test the (not yet
committed) support of UTF-16 surrogate pairs in filenames.
All 3 tests succeed with a small number of files (e.g., 10), but fail
most of the time when testing with 1000 files.
The tests have been added to all.exclude since they are expected to
fail. They shall be enabled as regression tests, when the msdosfs code
has been fixed.
sys: Renumber MTE SEGV codes
Some third party software expects these to not conflict. As the MTE
support isn't fully in the tree, and these values aren't in a release
we can renumber them without any backwards compatibility issues.
Sponsored by: Arm Ltd
Revert "kern_time: Honor the precise option when counting diff"
This will not work because this kernel version does not support a
precise option. We handle the clock uniformly in all cases.
This reverts commit 3886f1b488e47eba98e1523f85cb570694e97385.
MAC/do: Add consistency tests
Test that:
1. Concurrent changes to different parameters on the same jail are
independent/atomic.
2. Inheritance works.
3. Relaxing only parent jail rules does not leak to a subjail thanks to
sequential consistency.
4. Sysctl knobs and jail parameters stay consistent.
Some of these tests may be extended in the future with several layers of
jails (there is only a single subjail currently).
Reviewed by: bapt
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Pull Request: https://ron-dev.freebsd.org/FreeBSD/src/pulls/38
MAC/do: Tests: Add support for exec paths, jail parameters, subjails
And also allow configuration of the mdo(1) executable path.
This commit only contains new or modified infrastructure. No functional
change intended at this point.
Reviewed by: bapt
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Pull Request: https://ron-dev.freebsd.org/FreeBSD/src/pulls/38
MAC/do: Tests: Quote the source directory
In a standard test suite installation, this is not necessary, but be
bullet-proof to custom ones, however improbable.
Reviewed by: bapt
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Pull Request: https://ron-dev.freebsd.org/FreeBSD/src/pulls/38
MAC/do: Tests: Fix copyrights
No comma needed after a single year. Add SPDX.
Reviewed by: bapt
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Pull Request: https://ron-dev.freebsd.org/FreeBSD/src/pulls/38
MAC/do: Tests: Remove shebang lines
They are automatically added by <bsd.test.mk>.
Reviewed by: bapt
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Pull Request: https://ron-dev.freebsd.org/FreeBSD/src/pulls/38
mac_do.4: Document executable paths, default jail values and consistency
While here, fix the bug of mentioning 'enable' as a possible value for
the 'mac.do' jail parameter whereas it is 'new' instead.
Reviewed by: bapt
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Pull Request: https://ron-dev.freebsd.org/FreeBSD/src/pulls/38
MAC/do: Update copyright
Update years for the Foundation.
While here, remove the initial '/*-' which has been useless for a long
time.
While here, add a missing space on bapt@'s copyright line (approved by
him).
Reviewed by: bapt
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Pull Request: https://ron-dev.freebsd.org/FreeBSD/src/pulls/38
MAC/do: Do not skip blanks when parsing executable paths
The kind of tolerance we apply to parsing rules, whose format we have
defined, cannot be applied to paths since blank characters are allowed
there.
There is still the limitation that no escape character is currently
supported, and so it is not possible to configure a path having a ':'
character.
Reviewed by: bapt
Fixes: 9818224174c4 ("MAC/do: Executable paths feature (GSoC 2025's final state)")
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Pull Request: https://ron-dev.freebsd.org/FreeBSD/src/pulls/38
MAC/do: Serialize installing/modifying some jail's configuration
See the immediately preceding commit for explanations on what this is
fixing.
When setting 'mac.do' to 'inherit' on a jail with 'mac.do.rules' and
'mac.do.exec_paths' also specified in the same call, ensure that the
check that these passed parameters are the same as those to be inherited
is atomic with respect to enabling the inheritance (i.e., removing the
jail's 'struct conf' object). (See previous commit "MAC/do: Fix the
recent logic to set jail parameters, make it more tolerant" as for why
this check exists.)
Because we currently only modify a single configuration object per
transaction, we introduce the parse_and_commit_conf() wrapper around
parse_and_set_conf() to remove duplicated code that would ensue from
calling the latter directly, namely, releasing the 'mac_do_rwl' lock and
freeing the old configuration object (if any).
[9 lines not shown]
MAC/do: Support for atomically modifying configurations
As mentioned in previous commits "MAC/do: parse_and_set_conf(): Require
the model configuration" and "MAC/do: Sequential consistency for
configuration retrieval", the introduction of the "executable path"
feature, more fundamentally, the fact that there is now more than one
per-jail parameter and that parameters can be independently modified or
copied, causes an atomicity problem in case of concurrent accesses to of
a jail's applicable configuration.
Partially modifying a configuration is indeed akin to
a read-modify-write operation, where the read is either to the current
or an inherited configuration. More precisely, once pointed to by
a jail, a configuration object is immutable, and changing the jail's
configuration means making the jail point to another configuration
object. To change a jail's configuration, a new configuration object is
thus built, and if only some parameters have been explicitly specified,
those that have not been are set by copying the corresponding values
from an existing configuration object (in case of partial modification
[34 lines not shown]
MAC/do: Sequential consistency for configuration retrieval
Since the inception of mac_do(4), find_conf(), used to retrieve the
applicable configuration, has been weakly consistent with respect to
concurrent modifications to configuration inheritance that influence its
result (and it has been sequentially consistent with respect to other
configuration modifications, which the initial executable paths feature
and introduction of implicit parameters broke and which will be fixed in
a subsequent commit).
Indeed, find_conf() climbs the jail tree to find an applicable
configuration, which is not an atomic operation. It examines the
current jail's configuration pointer for each browsed jail, which does
not prevent concurrent modifications of the configuration pointer for
jails below or above it. Modifications above the current jail are not
a problem, since if climbing needs to continue (i.e., the current jail
inherits), these modifications will be seen if performed before that
check (and may or may not be seen if performed after that check).
However, modifications below the current jail impair sequential
[48 lines not shown]
MAC/do: Comment to explain the main invariant for configurations
Once visible, configuration structures must *never* change.
Spell that out in a comment to help future readers/contributors
understand the design.
Reviewed by: bapt
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Pull Request: https://ron-dev.freebsd.org/FreeBSD/src/pulls/38
MAC/do: Allocate only one default configuration
When mac_do(4) is loaded, all jails get the same default configuration
(disabled, with only one allowed executable path: '/usr/bin/mdo').
Share it between all jails instead of creating a separate copy for each.
Reviewed by: bapt
Fixes: 9818224174c4 ("MAC/do: Executable paths feature (GSoC 2025's final state)")
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Pull Request: https://ron-dev.freebsd.org/FreeBSD/src/pulls/38
MAC/do: Visually separate some file sections
With additional empty lines.
No functional change (intended).
Reviewed by: bapt
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Pull Request: https://ron-dev.freebsd.org/FreeBSD/src/pulls/38
MAC/do: Fix reporting of "mac.do" post-"executable paths"
In mac_do_jail_get(), computation of 'jsys' had not been updated to take
into account executable paths.
Reviewed by: bapt
Fixes: 9818224174c4 ("MAC/do: Executable paths feature (GSoC 2025's final state)")
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Pull Request: https://ron-dev.freebsd.org/FreeBSD/src/pulls/38
MAC/do: Configuration: Fix default values: Remove jail creation method
mac_do_jail_create() would create a default configuration on the
just-created jail, erroneously causing mac_do_jail_set() to then
retrieve it and use it as a model when determining the default values
for not-specified parameters, instead of using the configuration
applicable to the parent jail.
Setting a default configuration in mac_do_jail_create() had been done as
a kind of defensive measure to prevent a created jail not to have
a configuration (effectively making it inherit from an ancestor jail,
which is a security hazard except if explicitly requested). However,
this measure was never really effective (osd_jail_call(PR_METHOD_CREATE)
in kern_jail_set() calls the PR_PETHOD_CREATE methods in an unspecified
order, and stops at the first error), so we are forced to rely in any
case on the fact that an error in a PR_METHOD_CREATE or PR_METHOD_SET
method leads to stopping the jail creation process (which is the case
today; see kern_jail_set()).
[5 lines not shown]
MAC/do: Fix the recent logic to set jail parameters, make it more tolerant
The logic introduced in the initial commit for the "executable paths"
feature did not match the specification we discussed in that specifying
an empty value (for rules or executable paths) on "mac.do" being "new"
would be treated as an absence of value and trigger a copy from the
currently applicable configuration, instead of being an override that
deactivates mac_do(4) in the jail. Fix that by distinguishing both
cases.
More generally, a non-explicitly specified parameter is set to the same
value it has in the currently applicable configuration (that of the
closest ancestor jail that has one; 'prison0' (the host) always has
one), with an exception in the disable case.
On disable (explicit: "mac.do" to "disable", implicit: no parameters
passed, or at least one is empty), now accept parameters with
a non-empty value as long as at least one of them is empty (which alone
is enough to disable mac_do(4)). If no parameters are passed, both are
[24 lines not shown]