ZTS: Add additional exceptions
The following tests have been observed to occasionally fail when
running under the CI. Updated our exceptions list to track them.
Signed-off-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Closes #18274
More consistent use of TREE_* macros in AVL comparators
Where is it appropriate and obvious, use TREE_CMP(), TREE_ISIGN() and
TREE_PCMP() instead or direct comparisons. It can make the code a lot
smaller, less error prone, and easier to read.
Sponsored-by: TrueNAS
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18259
Fix vdev_rebuild_range() tx commit
The spa_sync thread waits on ->spa_txg_zio and will set ZIO_WAIT_DONE
before running the sync tasks. The dmu_tx_commit() call must be done
after we add the child zio to the ->spa_txg_zio parent otherwise its
possible the child is added after txg_sync has waited.
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Closes #18276
Add zpool properties for allocation class space
The existing zpool properties accounting pool space (size, allocated,
fragmentation, expandsize, free, capacity) are based on the normal
metaslab class or are cumulative properties of several classes combined.
Add properties reporting the space accounting metrics for each metaslab
class individually.
Also introduce pool-wide AVAIL, USABLE, and USED properties reporting
values corresponding to FREE, SIZE, and ALLOC deflated for raidz.
Update ZTS to recognize the new properties and validate reported values.
While in zpool_get_parsable.cfg, add "fragmentation" to the list of
parsable properties.
Sponsored-by: Klara, Inc.
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
[4 lines not shown]
zcommon: Fix description of vdev capacity format
Capacity is reported as a percentage not a size.
Sponsored-by: Klara, Inc.
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Ameer Hamza <ahamza at ixsystems.com>
Signed-off-by: Ryan Moeller <ryan.moeller at klarasystems.com>
Closes #18238
Fix redundant declaration of dsl_pool_t
Remove redundant dsl_pool variable and duplicate spa_get_dsl()
call in vdev_rebuild_thread.
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Akash B <akash-b at hpe.com>
Closes #18263
Fix deadlock on dmu_tx_assign() from vdev_rebuild()
vdev_rebuild() is always called with spa_config_lock held in
RW_WRITER mode. However, when it tries to call dmu_tx_assign()
the latter may hang on dmu_tx_wait() waiting for available txg.
But that available txg may not happen because txg_sync takes
spa_config_lock in order to process the current txg. So we have
a deadlock case here:
- dmu_tx_assign() waits for txg holding spa_config_lock;
- txg_sync waits for spa_config_lock not progressing with txg.
Here are the stacks:
__schedule+0x24e/0x590
schedule+0x69/0x110
cv_wait_common+0xf8/0x130 [spl]
__cv_wait+0x15/0x20 [spl]
dmu_tx_wait+0x8e/0x1e0 [zfs]
[21 lines not shown]
zpl_super: prefer "new" mount API when available
This API has been available since kernel 5.2, and having it available
(almost) everywhere should give us a lot more flexibility for mount
management in the future.
Sponsored-by: TrueNAS
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18260
icp: add SHA512 implementation using Intel SHA512 extensions
Generated from crypto/sha/asm/sha512-x86_64.pl in
openssl/openssl at 241d4826f8.
Sponsored-by: TrueNAS
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Attila Fülöp <attila at fueloep.org>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18233
simd: detect and surface support for Intel SHA512 extensions
Recent Intel CPUs (starting with Arrow Lake and Lunar Lake) include new
vectorised SHA512 instructions. Detect them and make them available to
the rest of the system.
Note the internal name "sha512ext". This is to disambiguate from other
uses of "sha512".
Sponsored-by: TrueNAS
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Attila Fülöp <attila at fueloep.org>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18233
range_tree: use zfs_panic_recover() for partial-overlap remove
zfs_range_tree_remove_impl() used a bare panic() when a segment to be
removed was not completely overlapped by an existing tree entry. Every
other consistency check in range_tree.c uses zfs_panic_recover(), which
respects the zfs_recover tunable and allows pools with on-disk
corruption to be imported and recovered. This one call was
inconsistent, making the partial-overlap case unrecoverable regardless
of zfs_recover.
Replace panic() with zfs_panic_recover() so that operators can set
zfs_recover=1 to import a corrupted pool and reclaim data, consistent
with all other range tree error paths.
Related-to: https://github.com/openzfs/zfs/issues/13483
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Clemens Fruhwirth <clemens at endorphin.org>
Co-authored-by: Claude Sonnet 4.6 <noreply at anthropic.com>
Closes #18255
CI: Remove deprecated Fedora 41
Fedora 41 was deprecated on Dec 15 2025. Remove it from CI tests.
Reviewed-by: Rob Norris <robn at despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: George Melikov <mail at gmelikov.ru>
Signed-off-by: Tony Hutter <hutter2 at llnl.gov>
Closes #18261
Introduce dedupused/dedupsaved pool properties
Currently there is only a dedup ratio reported via pool properties.
If dedup is enabled only for some datasets, it is impossible to say
how much space the ratio actually covers. Fix this by introducing
dedupused/dedupsaved pool properties, similar to earlier added
block cloning ones. Combined with work to expose allocation classes
stats, it should give user-space enough visibility to correlate
`zpool list` and `zfs list` space numbers.
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Ryan Moeller <ryan.moeller at klarasystems.com>
Signed-off-by: Alexander Motin <alexander.motin at TrueNAS.com>
Closes #18245
zhack: Fix importing large allocation profiles on small pools (#18256)
This patch fixes a segmentation fault in zhack metaslab leak which might
be triggered by feeding zhack with a fragmentation profile that's
exported from a pool larger than the target pool.
Fixes: 8f15d2e4d58525e583277ccfef83f2056be4f72e
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Paul Dagnelie <paul.dagnelie at klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Mateusz Piotrowski <mateusz.piotrowski at klarasystems.com>
Linux 7.0: add shims for the fs_context-based mount API
The traditional mount API has been removed, so detect when its not
available and instead use a small adapter to allow our existing mount
functions to keep working.
Sponsored-by: TrueNAS
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18216
Linux 7.0: posix_acl_to_xattr() now allocates memory
Kernel devs noted that almost all callers to posix_acl_to_xattr() would
check the ACL value size and allocate a buffer before make the call. To
reduce the repetition, they've changed it to allocate this buffer
internally and return it.
Unfortunately that's not true for us; most of our calls are from
xattr_handler->get() to convert a stored ACL to an xattr, and that call
provides a buffer. For now we have no other option, so this commit
detects the new version and wraps to copy the value back into the
provided buffer and then free it.
Sponsored-by: TrueNAS
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18216
Linux 7.0: blk_queue_nonrot() renamed to blk_queue_rot()
It does exactly the same thing, just inverts the return. Detect its
presence or absence and call the right one.
Sponsored-by: TrueNAS
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18216
SIMD: libspl: test the correct CPUID bit for AVX512VL
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Attila Fülöp <attila at fueloep.org>
Closes #18254
Improve misleading error messages for ZPOOL_STATUS_CORRUPT_POOL
When devices are missing or claimed by another subsystem (e.g.
mdadm, LVM), zpool import reports "The pool metadata is corrupted"
and suggests destroying the pool. This is misleading because the
metadata is not necessarily corrupted -- it may simply be incomplete
due to inaccessible devices.
Update the status, action, and recovery messages to acknowledge
that missing devices can trigger this status, and suggest checking
device availability before resorting to pool destruction.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Chris Longros <chris.longros at gmail.com>
Closes #18251
Closes #8236
build: get objtool from $kernelbuild
On systems where `$kernelsrc` is different than `$kernelbuild`, the
objtool binary will be located in `$kernelbuild` as it's the result of
running `make prepare` during kernel build.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Attila Fülöp <attila at fueloep.org>
Signed-off-by: Louis Leseur <louis.leseur at gmail.com>
Closes #18248
Closes #18249
Add vdev property to disable vdev scheduler
Added vdev property to disable the vdev scheduler.
The intention behind this property is to improve IOPS
performance when using o_direct.
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: MigeljanImeri <ImeriMigel at gmail.com>
Closes #17358
Move range_tree, btree, highbit64 to common code
Break out the range_tree, btree, and highbit64/lowbit64 code from kernel
space into shared kernel and userspace code. This is needed for the
updated `zpool status -vv` error byte range reporting that will be
coming in a future commit. That commit needs the range_tree code in
kernel and userspace.
Reviewed-by: Rob Norris <robn at despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Tony Hutter <hutter2 at llnl.gov>
Closes #18133
Linux 7.0: explicitly set setlease handler to kernel implementation
The upcoming 7.0 kernel will no longer fall back to generic_setlease(),
instead returning EINVAL if .setlease is NULL. So, we set it explicitly.
To ensure that we catch any future kernel change, adds a sanity test for
F_SETLEASE and F_GETLEASE too. Since this is a Linux-specific test,
also a small adjustment to the test runner to allow OS-specific helper
programs.
Sponsored-by: TrueNAS
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18215
zdb: handle key load/derive failures a bit more gracefully
There's no real need to outright crash if key loading fails; we can
just unwind nicely.
Sponsored-by: TrueNAS
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18230
zdb: don't try to load key for unencrypted dataset
Previously using -K/--key on an unencrypted dataset would trip a VERIFY,
because the dataset has nowhere to load the key into.
Now, just ignore it. This makes zdb much easier to drive when there's a
mix of encrypt and non-encrypted datasets, as the key can provided for
all of them (at least, assuming the same encryption root, which is a
common enough case).
Sponsored-by: TrueNAS
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18230
ZTS: make get_same_blocks() fail harder if zdb fails
Because it's called in $(...), it will swallow all errors, so we have to
work harder to recognise falure and echo a string that can't ever match
what the test is expecting.
Sponsored-by: TrueNAS
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18230
sha2_test: do correctness checks for all implementations
Sponsored-by: TrueNAS
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Attila Fülöp <attila at fueloep.org>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18232
get_cpu_freq: handle CPUs with variable frequency
If a CPU has variable frequency, then lscpu will list separate "CPU min
freq" and "CPU max freq" values. In this case, take the maximum.
Sponsored-by: TrueNAS
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Attila Fülöp <attila at fueloep.org>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18232
CI: Test & fix Linux ZFS built-in build
ZFS can be built directly into the Linux kernel. Add a test build
of this to the CI to verify it works. The test build is only enabled
on Fedora runners (since they run the newest kernels) and is done in
parallel with ZTS. The test build is done on vm2, since it typically
finishes ~15min before vm1 and thus has time to spare.
In addition:
- Update 'copy-builtin' to check that $1 is a directory
- Fix some VERIFYs that were causing the built-in build to fail
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Tony Hutter <hutter2 at llnl.gov>
Closes #18234