Clean up embedded slog metaslab across txgs
On a read-write import, metaslab_set_fragmentation() can dirty a
metaslab via vdev_dirty() while still in the txg==0 load path when its
space map has an unexpected bonus size (e.g. a makefs-created pool
whose space-map dnodes use the boot loader's 24-byte space_map_phys_t
with nblkptr=3, giving db_size=64). If that metaslab is then selected
as the embedded slog, vdev_metaslab_init() only removed it from
vdev_ms_list when txg != 0, so the txg==0 case left it queued and
metaslab_fini() tripped VERIFY(!txg_list_member(&vd->vdev_ms_list,
msp, t)).
Remove slog_ms from the dirty list for every TXG_SIZE slot before
metaslab_fini() so the cleanup is correct regardless of txg.
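The difference between the conditional and the unconditional cleanup can be sketched with a toy model. This is only an illustration, not the patch itself: the `toy_*` names and the per-slot flag array are hypothetical stand-ins for the real txg list machinery.

```c
#include <stdbool.h>

#define TOY_TXG_SIZE 4	/* number of txg slots in this toy model */

/* Toy stand-in for a metaslab: one "queued" flag per txg slot. */
typedef struct toy_metaslab {
	bool queued[TOY_TXG_SIZE];
} toy_metaslab_t;

/* Mark the metaslab dirty in the slot that txg maps to. */
void
toy_dirty(toy_metaslab_t *ms, unsigned long long txg)
{
	ms->queued[txg % TOY_TXG_SIZE] = true;
}

/* Old-style cleanup: does nothing at all when txg == 0. */
void
toy_cleanup_conditional(toy_metaslab_t *ms, unsigned long long txg)
{
	if (txg != 0)
		ms->queued[txg % TOY_TXG_SIZE] = false;
}

/* New-style cleanup: clear every slot, whatever txg is. */
void
toy_cleanup_all(toy_metaslab_t *ms)
{
	for (int t = 0; t < TOY_TXG_SIZE; t++)
		ms->queued[t] = false;
}

/* Models the condition that the VERIFY in metaslab_fini() rejects. */
bool
toy_is_queued(const toy_metaslab_t *ms)
{
	for (int t = 0; t < TOY_TXG_SIZE; t++) {
		if (ms->queued[t])
			return (true);
	}
	return (false);
}
```

In this model, a metaslab dirtied during load and then cleaned up with txg == 0 is still queued after toy_cleanup_conditional(), while toy_cleanup_all() always leaves it unqueued.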
Reported on FreeBSD as PR 281520:
External-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281520
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
[2 lines not shown]
initramfs-zfs should not try to copy directories
From the beginning, we had find return only files for libgcc.so, but not
for libfetch/libcurl. This oversight affected a user when VMware
installed its own libcurl.so.4 inside a directory also called
libcurl.so.4: our code then tried to copy the directory, which fails.
Reviewed-by: Chris Longros <chris.longros at gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Suggested-by: Carsten Härle <carsten.haerle at straightec.de>
Signed-off-by: Richard Yao <richard at ryao.dev>
Closes #18582
Closes #18686
ZTS: remove send_delegation tests
These tests duplicate the coverage of delegate/zfs_allow_send, and are
hard to follow and maintain. There's no need for them now, so drop them.
Sponsored-by: TrueNAS
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18672
ZTS: delegate: add test for send sub-permissions
Regular send and raw send are actually separate operations with separate
permissions. This adds a test that exercises the combinations properly
using the existing permission test infrastructure.
Sponsored-by: TrueNAS
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18672
CI: Add a unit-tests workflow to our infrastructure
Run `make unit` on each PR so the unit-test suite (currently 64
tests) is exercised as it grows.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Rob Norris <rob.norris at truenas.com>
Signed-off-by: Christos Longros <chris.longros at gmail.com>
Closes #18670
CI: Re-allow workflow_dispatch on zfs-qemu
Allow zfs-qemu to be invoked from a workflow_dispatch event (a.k.a.
manually running a workflow). This may have been accidentally disabled
in 1916c2c55.
Reviewed-by: Chris Longros <chris.longros at gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Tony Hutter <hutter2 at llnl.gov>
Closes #18680
zfs_ioctl: fix EBUSY race between quota queries and mount
zfsvfs_hold() fell back to zfsvfs_create() -> dmu_objset_own()
(exclusive) for unmounted datasets. A concurrent zfs_domount()
also calls dmu_objset_own(), causing EBUSY on the same dataset.
Introduce zfsvfs_create_hold() using dmu_objset_hold() (shared
hold) instead. Shared holds do not conflict with exclusive owns,
eliminating the race. The release path (zfsvfs_rele,
zfsvfs_create_impl error) uses dmu_objset_ds()->ds_owner to
determine whether to disown or rele, avoiding the need for an
extra flag in zfsvfs_t.
Added tests userspace_005, groupspace_005, projectspace_006
(50 iter race test).
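The interaction between shared holds and exclusive owns described above can be sketched with a small model. Everything here is an assumption made for illustration: the `toy_*` types and functions are hypothetical and only mimic the idea of choosing between disown and rele by looking at the current owner.

```c
#include <errno.h>
#include <stddef.h>

/* Toy dataset: at most one exclusive owner, any number of shared holds. */
typedef struct toy_dataset {
	const void *owner;	/* non-NULL while exclusively owned */
	int holds;		/* number of shared holds */
} toy_dataset_t;

/* Exclusive own: fails with EBUSY if someone else already owns it. */
int
toy_own(toy_dataset_t *ds, const void *tag)
{
	if (ds->owner != NULL)
		return (EBUSY);
	ds->owner = tag;
	return (0);
}

/* Shared hold: never conflicts with an owner in this model. */
void
toy_hold(toy_dataset_t *ds)
{
	ds->holds++;
}

/*
 * Release path: inspect the owner to decide whether this caller
 * must disown or merely drop a shared hold, so no extra flag is needed.
 */
void
toy_release(toy_dataset_t *ds, const void *tag)
{
	if (ds->owner == tag)
		ds->owner = NULL;	/* we owned it: disown */
	else
		ds->holds--;		/* we only held it: rele */
}
```

With two callers both using toy_own(), the second one gets EBUSY; when the query side uses toy_hold() instead, the mount side's toy_own() succeeds.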
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Signed-off-by: HeonJe Lee <lhjnano at gmail.com>
Closes #18611
Fix handling of _PC_HAS_HIDDENSYSTEM for FreeBSD
The hidden and system flags are only supported for
ZFS pools if z_use_fuids is true. Fix
zfs_freebsd_pathconf() to check this.
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: Rick Macklem <rmacklem at uoguelph.ca>
Closes #18688
spa: make ccw_retry_interval tunable on Linux (#18681)
zfs_ccw_retry_interval sets the time interval after which a retry of a
failed write of the configuration cache file is attempted. It was only
exposed on FreeBSD. Make it a Linux tunable with ZFS_MODULE_PARAM and
document it in zfs.4.
Signed-off-by: Christos Longros <chris.longros at gmail.com>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: Richard Yao <richard at ryao.dev>
Linux 7.1 compat: META (#18682)
Update the META file to reflect compatibility with the 7.1
kernel.
Signed-off-by: Tony Hutter <hutter2 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Reviewed-by: Chris Longros <chris.longros at gmail.com>
Update our CI runners to the newest FreeBSD 15.1 RELEASE (#18667)
Signed-off-by: Christos Longros <chris.longros at gmail.com>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
CI: Have zfs-build-packages workflow build tarballs on Alma (#18662)
Previously, zfs-build-packages would only build source tarballs
on Fedora due to problems with building them on RHEL 7. That's
a relic of the past now, as we haven't supported RHEL 7 since
it went EOL in 2024. With this change, we now build the tarballs
on both Alma and Fedora.
Signed-off-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Olaf Faaland <faaland1 at llnl.gov>
Reviewed-by: Chris Longros <chris.longros at gmail.com>
abd: Fix stats asymmetry in case of Direct I/O
abd_alloc_from_pages() does not call abd_update_scatter_stats(),
since memory is not really allocated there. But abd_free_scatter(),
called by abd_free(), does. This asymmetry causes some ABD and
possibly ARC counters to underflow.
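The effect of an unmatched decrement on an unsigned counter can be shown with a toy model. The `toy_*` names and the `counted` flag are hypothetical; the guarded free below is just one way to restore symmetry in this model and is not claimed to be how the actual fix works.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy statistics counter, unsigned like a typical kstat value. */
uint64_t toy_scatter_cnt;

typedef struct toy_abd {
	bool counted;	/* did allocation bump the counter? */
} toy_abd_t;

/* Regular allocation: accounts for itself. */
toy_abd_t
toy_alloc(void)
{
	toy_abd_t abd = { .counted = true };
	toy_scatter_cnt++;
	return (abd);
}

/* Wrapping caller-provided pages: nothing allocated, nothing counted. */
toy_abd_t
toy_alloc_from_pages(void)
{
	toy_abd_t abd = { .counted = false };
	return (abd);
}

/* Asymmetric free: always decrements, so it can wrap below zero. */
void
toy_free_unbalanced(toy_abd_t *abd)
{
	(void) abd;
	toy_scatter_cnt--;
}

/* Symmetric free: only undo what allocation actually did. */
void
toy_free_balanced(toy_abd_t *abd)
{
	if (abd->counted)
		toy_scatter_cnt--;
}
```

Freeing a page-backed buffer with toy_free_unbalanced() while the counter is zero wraps it around to UINT64_MAX, which is the kind of bogus value the commit message describes.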
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Rob Norris <rob.norris at truenas.com>
Signed-off-by: Alexander Motin <alexander.motin at TrueNAS.com>
Closes #18390
Fix log vdev removal issues
When we clear the log, we should clear all the fields, not only
zh_log. Otherwise a remaining ZIL_REPLAY_NEEDED flag will prevent the
vdev removal. Also handle it from the other side, when zh_log
is already cleared but zh_flags is not.
spa_vdev_remove_log() asserts that allocated space on the removed log
device is zero. While that should hold in a perfect world, it might
not if space leaked at any point.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin at TrueNAS.com>
Closes #18277
Simplify dnode_level_is_l2cacheable()
We should not dereference through dn_handle->dnh_dnode once we
already have a dnode pointer. The result will be the same.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin at TrueNAS.com>
Closes #18212
ZVOL: Add encryption key check for block cloning
Somehow the check for identical encryption keys was missed when block
cloning was ported from file systems. As a result, blocks cloned
between unrelated ZVOLs produced authentication errors on later
reads. Having the same or a different encryption root does not matter.
This patch copies the dmu_objset_crypto_key_equal() call from the FS side.
Reviewed-by: Ameer Hamza <ahamza at ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin at TrueNAS.com>
Closes #18315
Remove parent ZIO from dbuf_prefetch()
I am not sure why it was added there 10 years ago, but it seems
unneeded now. According to my tests, removing it improves sequential
read performance with recordsize=4K by 5-10% by reducing the CPU
overhead in the prefetcher.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Rob Norris <robn at despairlabs.com>
Reviewed-by: Ameer Hamza <ahamza at ixsystems.com>
Reviewed-by: Akash B <akash-b at hpe.com>
Signed-off-by: Alexander Motin <alexander.motin at TrueNAS.com>
Closes #18214
Rename several printf attributes declarations to __printf__
For kernel builds on FreeBSD, we redefine `__printf__` to
`__freebsd_kprintf__`, to support FreeBSD kernel printf(9) extensions
with clang.
In OpenZFS various printf related functions are declared with
`__attribute__((format(printf, X, Y)))`, so these won't work with the
above redefinition. With clang 21 and higher, this leads to errors
similar to:
sys/contrib/openzfs/module/zfs/spa_misc.c:414:38: error: passing
'printf' format string where 'freebsd_kprintf' format string is
expected [-Werror,-Wformat]
414 | (void) vsnprintf(buf, sizeof (buf), fmt, adx);
| ^
Since attribute names can always be spelled with leading and trailing
double underscores, rename these instances.
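A minimal sketch of the double-underscore spelling follows. The function `toy_logf` is a hypothetical helper invented for this example; only the attribute syntax is the point. GCC and clang both accept `__printf__` as the format archetype, which is what lets a platform header redefine it.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical helper declared with the double-underscore archetype,
 * so a platform macro redefining __printf__ would apply to it.
 * Arguments 3 and 4 are the format string and the first variadic arg.
 */
int toy_logf(char *buf, size_t len, const char *fmt, ...)
    __attribute__((format(__printf__, 3, 4)));

int
toy_logf(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return (n);
}
```

The compiler still checks the format string against the arguments at each call site, exactly as it would with the plain `printf` spelling.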
[7 lines not shown]
unit/namecheck: test name validation
Add a test_namecheck unit suite covering zfs_namecheck name check
functions, including: pool, dataset, snapshot, bookmark, component,
permset and mountpoint, plus get_dataset_depth and dataset_nestcheck.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Rob Norris <rob.norris at truenas.com>
Signed-off-by: Christos Longros <chris.longros at gmail.com>
Closes #18643
* unit/namecheck: simplify name generation and helpers
1) introduce check_longname_invalid() and simplify the long-name helper
2) document that zfs_max_dataset_nesting is a tunable and drop the
unnecessary restore
3) drop snprintf and use fixed intervals for the delimiters in a
random string
[4 lines not shown]
Fix self-deadlock when setting the "allocating"/"path" vdev property
zfs_ioc_vdev_set_props() acquires the SCL_CONFIG lock as a reader and
holds it across the call to vdev_prop_set(). For the "allocating"
property, vdev_prop_set() calls spa_vdev_noalloc()/spa_vdev_alloc(),
which descend through spa_vdev_enter() into spa_config_enter(spa,
SCL_ALL, RW_WRITER); the "path" property does the same via
spa_vdev_setpath().
Acquiring SCL_CONFIG as a writer while the same thread already holds it
as a reader is a self-deadlock: the writer waits for scl_count to drain
to zero, but scl_count is the thread's own reader, which is not released
until vdev_prop_set() returns. As a result "zpool set allocating=off|on
<vdev>" hangs the calling thread, and txg_sync, which also needs
SCL_CONFIG as a reader, stalls behind it and freezes the pool.
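The self-deadlock can be illustrated with a single-threaded toy model of a reader/writer lock. The `toy_*` names are hypothetical, and the writer entry is a non-blocking probe: where it returns false, a real writer would sleep waiting for the reader count to drain.

```c
#include <stdbool.h>

/* Toy lock: a reader count (like scl_count in the text) and a writer flag. */
typedef struct toy_scl {
	int count;	/* active readers */
	bool writer;	/* true while a writer holds the lock */
} toy_scl_t;

void
toy_enter_reader(toy_scl_t *l)
{
	l->count++;
}

void
toy_exit_reader(toy_scl_t *l)
{
	l->count--;
}

/*
 * Probe for writer entry. Returns false where a real implementation
 * would block until all readers have left.
 */
bool
toy_try_enter_writer(toy_scl_t *l)
{
	if (l->count != 0 || l->writer)
		return (false);
	l->writer = true;
	return (true);
}
```

If the same thread calls toy_enter_reader() and then toy_try_enter_writer(), the probe fails because the only outstanding reader is the caller itself; with a blocking lock, that wait could never end.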
The SCL_CONFIG reader was added by commit d65015938e19 ("Vdev allocation
bias/class change", #18493) to keep the vdev tree stable across the guid
lookup and the property handling.
[28 lines not shown]
dsl_scan: close errorscrub cursor on pause
If the cursor were ever to actively hold resources, not finalising it
would mean leaking those resources whenever the scrub is paused.
The cursor is already reinitialized from the stored serialized form
if/when it is resumed, so there's nothing we need from the old one, just
to release it.
Sponsored-by: TrueNAS
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18603