Implement lzc_send_progress
This commit adds an implementation of lzc_send_progress, which
was declared in the libzfs_core header but was missing from the
ABI and had no actual implementation. The libzfs_send_progress
function is altered so that it wraps the lzc operation. This
fills a functional gap in libzfs_core.
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Ameer Hamza <ahamza at ixsystems.com>
Signed-off-by: Andrew Walker <andrew.walker at truenas.com>
Closes #18288
Fix check for .cfi_negate_ra_state on aarch64
Checking for LD_VERSION is unreliable, as not all distros define it in
the compiler's preprocessor.
Explicitly check it via autoconf.
This fixes support for Ubuntu 18.04 on arm64.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Juhyung Park <qkrwngud825 at gmail.com>
Closes #18262
libzpool: lift zfs_file ops out to separate source file
So it's easier to remove and replace on non-Unix platforms.
Sponsored-by: TrueNAS
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Jorgen Lundman <lundman at lundman.net>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18281
zstream: consolidate shared code
zstream currently contains three identical copies of dump_record(),
which appear to all be drawn from libzfs_sendrecv.c. The original
is marked internal.
This PR adds zstream_util.[hc] and puts the shared code there along with
a couple of other items in common.
No functional changes.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Garth Snyder <garth at garthsnyder.com>
Closes #18284
Add --no-preserve-encryption flag
* Add an option to send datasets with params or replicate
without preserving encryption
* Add a test case for the new functionality
Reviewed-by: Paul Dagnelie <paul.dagnelie at klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: Chris Jacobs <idefix2020dev at gmail.com>
Closes #18240
Add simd_config.h and HAVE_SIMD() selector
We need to select which SIMD variable to check based on the compilation
target: HAVE_KERNEL_xxx for the Linux kernel, HAVE_TOOLCHAIN_xxx for
other platforms.
This adds a HAVE_SIMD() macro that returns the right result depending on
whether the variable for this target is defined, or on its value.
The macro is in simd_config.h, which is forcibly included in every
compiler call (like zfs_config.h), to ensure that it can be used
directly without further includes.
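
As a rough illustration of the selector idea described above, here is a
minimal sketch. It is not the contents of simd_config.h. It assumes the
build system defines HAVE_KERNEL_<name> or HAVE_TOOLCHAIN_<name> to 1 when
a feature is usable and leaves it undefined otherwise; the demo_* names
and the hard-coded AVX2 define are made up for the example.

```c
/*
 * Sketch only. Assumption: configure defines HAVE_TOOLCHAIN_<name> (or
 * HAVE_KERNEL_<name> for Linux kernel builds) to 1 when the feature is
 * usable, and leaves it undefined otherwise.
 */
#define	HAVE_TOOLCHAIN_AVX2	1	/* pretend configure found AVX2 */
/* HAVE_TOOLCHAIN_AVX512F is deliberately left undefined. */

#if defined(__linux__) && defined(_KERNEL)
#define	HAVE_SIMD(name)	(HAVE_KERNEL_##name)
#else
#define	HAVE_SIMD(name)	(HAVE_TOOLCHAIN_##name)
#endif

/* In an #if, an undefined identifier evaluates to 0. */
#if HAVE_SIMD(AVX2)
static const int demo_have_avx2 = 1;
#else
static const int demo_have_avx2 = 0;
#endif

#if HAVE_SIMD(AVX512F)
static const int demo_have_avx512f = 1;
#else
static const int demo_have_avx512f = 0;
#endif
```

With this shape, call sites only ever write HAVE_SIMD(<name>) and never
need to know which of the two variable families applies to the current
target.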
Sponsored-by: TrueNAS
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18285
Convert all HAVE_<name> SIMD gates to HAVE_SIMD(<name>)
The original names no longer exist, and the new ones will need to be
selectable based on the current compilation target.
Sponsored-by: TrueNAS
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18285
config: also do SIMD checks on the kernel toolchain
The kernel may be built with a different compiler, and also includes
objtool, which may fail on unknown instruction sequences. So, we want
to run the checks a second time for that toolchain too.
Sponsored-by: TrueNAS
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18285
config: generate SIMD checks from table
No need to repeat all that boilerplate each time!
Sponsored-by: TrueNAS
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18285
config: remove checks for unused SIMD gates
Specifically, we don't have any code gated on:
HAVE_SSE
HAVE_SSE3
HAVE_SSE4_2
HAVE_AVX512CD
HAVE_AVX512DQ
HAVE_AVX512IFMA
HAVE_AVX512VBMI
HAVE_AVX512PF
HAVE_AVX512ER
So we can remove them and the checks that probe and generate them.
Sponsored-by: TrueNAS
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18285
linux/simd_x86: remove obsolete kernel feature gates
Most of the X86_FEATURE_* defines we use were introduced in kernels much
older than those we support, so there's no need to check for them.
For the history, these are the ones being removed, and the kernel
versions/commits where they were introduced:
<4.6 torvalds/linux at cd4d09ec6f6c (refactor/consolidation commit)
OSXSAVE
BMI1
BMI2
AES
PCLMULQDQ
MOVBE
SHA_NI
AVX512F
AVX512CD
AVX512ER
[19 lines not shown]
Fix log vdev removal issues
When we clear the log, we should clear all the fields, not only
zh_log. Otherwise a remaining ZIL_REPLAY_NEEDED flag will prevent
the vdev removal. Also handle it from the other side, when zh_log
is already cleared but zh_flags is not.
spa_vdev_remove_log() asserts that the allocated space on the removed
log device is zero. While that should hold in a perfect world, it
might not if space leaked at any point.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin at TrueNAS.com>
Closes #18277
ZTS: Adjust mmp_on_uberblocks threshold
Slightly decrease the number of required uberblock writes due
to observed variation when running in the CI. This should help
avoid future false positives.
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Closes #18280
ZTS: Add additional exceptions
The following tests have been observed to occasionally fail when
running under the CI. Update our exceptions list to track them.
Signed-off-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Closes #18274
More consistent use of TREE_* macros in AVL comparators
Where it is appropriate and obvious, use TREE_CMP(), TREE_ISIGN() and
TREE_PCMP() instead of direct comparisons. It can make the code a lot
smaller, less error prone, and easier to read.
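
To make the before/after concrete, here is a minimal sketch of the kind of
conversion described. The demo_node_t type and both comparator functions
are hypothetical, and TREE_CMP() is defined locally as a stand-in that is
assumed to behave like the helper in the AVL header.

```c
#include <stdint.h>

/* Local stand-in, assumed to match the real TREE_CMP() helper. */
#define	TREE_CMP(a, b)	(((a) > (b)) - ((a) < (b)))

/* Hypothetical node type, for illustration only. */
typedef struct demo_node {
	uint64_t dn_txg;
	uint64_t dn_offset;
} demo_node_t;

/* Before: open-coded three-way comparison on two keys. */
static int
demo_compare_open_coded(const void *x1, const void *x2)
{
	const demo_node_t *a = x1;
	const demo_node_t *b = x2;

	if (a->dn_txg < b->dn_txg)
		return (-1);
	if (a->dn_txg > b->dn_txg)
		return (1);
	if (a->dn_offset < b->dn_offset)
		return (-1);
	if (a->dn_offset > b->dn_offset)
		return (1);
	return (0);
}

/* After: the same ordering expressed with TREE_CMP(). */
static int
demo_compare_tree_cmp(const void *x1, const void *x2)
{
	const demo_node_t *a = x1;
	const demo_node_t *b = x2;
	int cmp = TREE_CMP(a->dn_txg, b->dn_txg);

	if (cmp != 0)
		return (cmp);
	return (TREE_CMP(a->dn_offset, b->dn_offset));
}
```

Both functions return -1, 0 or 1, which is what AVL comparators are
expected to return, but the second leaves fewer places to get a sign wrong.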
Sponsored-by: TrueNAS
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18259
Fix vdev_rebuild_range() tx commit
The spa_sync thread waits on ->spa_txg_zio and will set ZIO_WAIT_DONE
before running the sync tasks. The dmu_tx_commit() call must be done
after we add the child zio to the ->spa_txg_zio parent, otherwise it's
possible the child is added after txg_sync has waited.
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Closes #18276
Add zpool properties for allocation class space
The existing zpool properties accounting for pool space (size, allocated,
fragmentation, expandsize, free, capacity) are based on the normal
metaslab class or are cumulative properties of several classes combined.
Add properties reporting the space accounting metrics for each metaslab
class individually.
Also introduce pool-wide AVAIL, USABLE, and USED properties reporting
values corresponding to FREE, SIZE, and ALLOC deflated for raidz.
Update ZTS to recognize the new properties and validate reported values.
While in zpool_get_parsable.cfg, add "fragmentation" to the list of
parsable properties.
Sponsored-by: Klara, Inc.
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
[4 lines not shown]
zcommon: Fix description of vdev capacity format
Capacity is reported as a percentage not a size.
Sponsored-by: Klara, Inc.
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Ameer Hamza <ahamza at ixsystems.com>
Signed-off-by: Ryan Moeller <ryan.moeller at klarasystems.com>
Closes #18238
Fix redundant declaration of dsl_pool_t
Remove redundant dsl_pool variable and duplicate spa_get_dsl()
call in vdev_rebuild_thread.
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Akash B <akash-b at hpe.com>
Closes #18263
Fix deadlock on dmu_tx_assign() from vdev_rebuild()
vdev_rebuild() is always called with spa_config_lock held in
RW_WRITER mode. However, when it tries to call dmu_tx_assign()
the latter may hang in dmu_tx_wait() waiting for an available txg.
But no txg may ever become available, because txg_sync takes
spa_config_lock in order to process the current txg. So we have
a deadlock case here:
- dmu_tx_assign() waits for txg holding spa_config_lock;
- txg_sync waits for spa_config_lock not progressing with txg.
Here are the stacks:
__schedule+0x24e/0x590
schedule+0x69/0x110
cv_wait_common+0xf8/0x130 [spl]
__cv_wait+0x15/0x20 [spl]
dmu_tx_wait+0x8e/0x1e0 [zfs]
[21 lines not shown]
zpl_super: prefer "new" mount API when available
This API has been available since kernel 5.2, and having it available
(almost) everywhere should give us a lot more flexibility for mount
management in the future.
Sponsored-by: TrueNAS
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18260
icp: add SHA512 implementation using Intel SHA512 extensions
Generated from crypto/sha/asm/sha512-x86_64.pl in
openssl/openssl at 241d4826f8.
Sponsored-by: TrueNAS
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Attila Fülöp <attila at fueloep.org>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18233
simd: detect and surface support for Intel SHA512 extensions
Recent Intel CPUs (starting with Arrow Lake and Lunar Lake) include new
vectorised SHA512 instructions. Detect them and make them available to
the rest of the system.
Note the internal name "sha512ext". This is to disambiguate from other
uses of "sha512".
Sponsored-by: TrueNAS
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Attila Fülöp <attila at fueloep.org>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18233
range_tree: use zfs_panic_recover() for partial-overlap remove
zfs_range_tree_remove_impl() used a bare panic() when a segment to be
removed was not completely overlapped by an existing tree entry. Every
other consistency check in range_tree.c uses zfs_panic_recover(), which
respects the zfs_recover tunable and allows pools with on-disk
corruption to be imported and recovered. This one call was
inconsistent, making the partial-overlap case unrecoverable regardless
of zfs_recover.
Replace panic() with zfs_panic_recover() so that operators can set
zfs_recover=1 to import a corrupted pool and reclaim data, consistent
with all other range tree error paths.
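
The difference between the two behaviours can be sketched in userspace as
follows. Everything here is a stand-in: demo_recover models the
zfs_recover tunable, demo_panic_recover() models the recoverable-panic
pattern, and demo_remove_check() is a hypothetical, much simplified
version of the overlap check. None of it is the actual range_tree.c code.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the zfs_recover tunable (0 = panic, 1 = warn and go on). */
static int demo_recover = 0;
/* Counts how many times a would-be panic was downgraded to a warning. */
static int demo_warnings = 0;

/* Models the pattern: warn if recovery is enabled, otherwise abort. */
static void
demo_panic_recover(const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	fputc('\n', stderr);

	if (demo_recover) {
		demo_warnings++;
		return;
	}
	abort();
}

/*
 * Hypothetical check: removing [start, start + size) is only valid if the
 * existing segment [seg_start, seg_end) covers it completely.
 */
static int
demo_remove_check(unsigned long long seg_start, unsigned long long seg_end,
    unsigned long long start, unsigned long long size)
{
	unsigned long long end = start + size;

	if (!(seg_start <= start && seg_end >= end)) {
		demo_panic_recover("removing segment (offset=%llu size=%llu) "
		    "not completely overlapped by existing one", start, size);
		return (-1);
	}
	return (0);
}
```

With demo_recover left at 0 the partial-overlap case aborts, which models
the old bare panic(); setting it to 1 lets the caller see an error and
carry on.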
Related-to: https://github.com/openzfs/zfs/issues/13483
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Clemens Fruhwirth <clemens at endorphin.org>
Co-authored-by: Claude Sonnet 4.6 <noreply at anthropic.com>
Closes #18255
CI: Remove deprecated Fedora 41
Fedora 41 was deprecated on Dec 15 2025. Remove it from CI tests.
Reviewed-by: Rob Norris <robn at despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: George Melikov <mail at gmelikov.ru>
Signed-off-by: Tony Hutter <hutter2 at llnl.gov>
Closes #18261
Introduce dedupused/dedupsaved pool properties
Currently there is only a dedup ratio reported via pool properties.
If dedup is enabled only for some datasets, it is impossible to say
how much space the ratio actually covers. Fix this by introducing
dedupused/dedupsaved pool properties, similar to the block cloning
ones added earlier. Combined with work to expose allocation classes
stats, it should give user-space enough visibility to correlate
`zpool list` and `zfs list` space numbers.
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Ryan Moeller <ryan.moeller at klarasystems.com>
Signed-off-by: Alexander Motin <alexander.motin at TrueNAS.com>
Closes #18245
zhack: Fix importing large allocation profiles on small pools (#18256)
This patch fixes a segmentation fault in zhack metaslab leak which might
be triggered by feeding zhack with a fragmentation profile that's
exported from a pool larger than the target pool.
Fixes: 8f15d2e4d58525e583277ccfef83f2056be4f72e
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Paul Dagnelie <paul.dagnelie at klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Mateusz Piotrowski <mateusz.piotrowski at klarasystems.com>
Linux 7.0: add shims for the fs_context-based mount API
The traditional mount API has been removed, so detect when it's not
available and instead use a small adapter to allow our existing mount
functions to keep working.
Sponsored-by: TrueNAS
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18216
Linux 7.0: posix_acl_to_xattr() now allocates memory
Kernel devs noted that almost all callers to posix_acl_to_xattr() would
check the ACL value size and allocate a buffer before making the call. To
reduce the repetition, they've changed it to allocate this buffer
internally and return it.
Unfortunately that's not true for us; most of our calls are from
xattr_handler->get() to convert a stored ACL to an xattr, and that call
provides a buffer. For now we have no other option, so this commit
detects the new version and wraps to copy the value back into the
provided buffer and then free it.
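
The wrap-copy-free shape described above can be sketched in userspace like
this. The names are hypothetical and demo_acl_to_xattr_alloc() is only a
stand-in for an allocating converter; it does not reproduce the kernel's
posix_acl_to_xattr() signature. The "NULL buffer queries the size"
behaviour is likewise an assumption about the caller-supplied-buffer
contract.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical stand-in for an allocating converter: returns a freshly
 * allocated buffer and stores its length in *sizep, or returns NULL.
 */
static void *
demo_acl_to_xattr_alloc(const char *acl, size_t *sizep)
{
	size_t len = strlen(acl) + 1;
	void *buf = malloc(len);

	if (buf == NULL)
		return (NULL);
	memcpy(buf, acl, len);
	*sizep = len;
	return (buf);
}

/*
 * Wrapper that keeps a "caller supplies the buffer" contract: a NULL
 * buffer only reports the required size; otherwise the value is copied
 * into the caller's buffer. The temporary allocation is always freed.
 */
static ssize_t
demo_acl_to_xattr(const char *acl, void *buffer, size_t size)
{
	size_t need = 0;
	void *tmp = demo_acl_to_xattr_alloc(acl, &need);
	ssize_t ret;

	if (tmp == NULL)
		return (-ENOMEM);

	if (buffer == NULL) {
		ret = (ssize_t)need;
	} else if (size < need) {
		ret = -ERANGE;
	} else {
		memcpy(buffer, tmp, need);
		ret = (ssize_t)need;
	}

	free(tmp);
	return (ret);
}
```

The cost of this approach is an extra allocation and copy per call, which
is the trade-off the message alludes to with "for now we have no other
option".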
Sponsored-by: TrueNAS
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18216
Linux 7.0: blk_queue_nonrot() renamed to blk_queue_rot()
It does exactly the same thing, just inverts the return. Detect its
presence or absence and call the right one.
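
A compat shim of this kind might look roughly like the sketch below. It is
a userspace mock: struct demo_queue, the demo_blk_queue_*() helpers and the
HAVE_BLK_QUEUE_ROT configure result are all hypothetical names chosen for
the example, not the kernel's definitions or the names used in the tree.

```c
#include <stdbool.h>

/* Stand-in for the kernel's request queue; not the real layout. */
struct demo_queue {
	bool rotational;
};

/*
 * HAVE_BLK_QUEUE_ROT is a hypothetical configure-time result. Define it
 * to model a kernel with the new helper; leave it undefined to model an
 * older kernel that only has the "nonrot" helper.
 */
#ifdef HAVE_BLK_QUEUE_ROT
static bool
demo_blk_queue_rot(struct demo_queue *q)
{
	return (q->rotational);
}
#else
static bool
demo_blk_queue_nonrot(struct demo_queue *q)
{
	return (!q->rotational);
}
#endif

/* Callers use one name; the shim hides which helper exists. */
static bool
demo_queue_is_nonrot(struct demo_queue *q)
{
#ifdef HAVE_BLK_QUEUE_ROT
	return (!demo_blk_queue_rot(q));
#else
	return (demo_blk_queue_nonrot(q));
#endif
}
```

Either way the caller gets the same answer, which is the point of the
inversion being handled in exactly one place.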
Sponsored-by: TrueNAS
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18216