OpenZFS/src f5733f6tests/zfs-tests/tests/functional/dedup dedup_bclone.ksh dedup_fdt_create.ksh

Integrate DDT and BRT tests

Don't disable block cloning during dedup tests.  Instead, avoid
using cp so that cloning is not triggered.  Add a new test that
explicitly mixes dedup and cloning on the same file, which should
be handled by DDT.

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Rob Norris <rob.norris at truenas.com>
Signed-off-by: Alexander Motin <alexander.motin at TrueNAS.com>
Closes #18520
DeltaFile
+120-0tests/zfs-tests/tests/functional/dedup/dedup_bclone.ksh
+2-4tests/zfs-tests/tests/functional/dedup/dedup_fdt_create.ksh
+2-4tests/zfs-tests/tests/functional/dedup/dedup_legacy_fdt_upgrade.ksh
+2-4tests/zfs-tests/tests/functional/dedup/dedup_legacy_create.ksh
+1-3tests/zfs-tests/tests/functional/dedup/dedup_legacy_import.ksh
+2-2tests/zfs-tests/tests/functional/dedup/dedup_prune.ksh
+129-175 files not shown
+135-2711 files

OpenZFS/src 181e1b5include/sys zio_impl.h, man/man8 zpool-events.8

Fix double free for blocks cloned after DDT prune

Before this change, for blocks marked with the D flag but absent
from the DDT (pruned from it), zio_ddt_free() fell back to
ZIO_STAGE_DVA_FREE without trying ZIO_STAGE_BRT_FREE first.  At the
same time, such blocks might be present in the BRT, and not handling
that would result in a double/multiple free.

This change makes ZIO_DDT_FREE_PIPELINE include ZIO_FREE_PIPELINE,
just adding required ZIO_STAGE_ISSUE_ASYNC and ZIO_STAGE_DDT_FREE,
and moves DDT stages before BRT.  This way, if the block is found
in DDT by zio_ddt_free(), the pipeline is short-circuited to
ZIO_INTERLOCK_PIPELINE, similar to what zio_brt_free() does.  If
not, then BRT is checked, and if also no match, the block is freed.
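
The lookup order described above can be illustrated with a small
Python model.  This is only a sketch, not the code in zio.c: the
function name free_block and the dict-based tables are hypothetical
stand-ins for the DDT, the BRT and the final DVA free stage.

```python
def free_block(block, ddt, brt):
    """Return the name of the stage that handles the free of `block`.

    ddt and brt are dicts mapping a block id to a reference count
    (hypothetical stand-ins for the real tables).
    """
    if block in ddt:
        # Found in the DDT: drop one reference there and stop.
        ddt[block] -= 1
        if ddt[block] == 0:
            del ddt[block]
        return "ddt"
    if block in brt:
        # Not in the DDT (for example pruned) but cloned: the BRT
        # absorbs this free, so the space must not be released yet.
        brt[block] -= 1
        if brt[block] == 0:
            del brt[block]
        return "brt"
    # Neither table knows the block: release the space.
    return "dva"


# A block that was pruned from the DDT but has one BRT reference.
ddt, brt = {}, {"blk": 1}
print(free_block("blk", ddt, brt))  # brt
print(free_block("blk", ddt, brt))  # dva
```

In this model, skipping the BRT check would send both calls straight
to the final stage, which corresponds to the double free.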

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Rob Norris <rob.norris at truenas.com>
Signed-off-by: Alexander Motin <alexander.motin at TrueNAS.com>
Closes #18520
DeltaFile
+152-0tests/zfs-tests/tests/functional/dedup/dedup_bclone_pruned.ksh
+15-8module/zfs/zio.c
+6-7include/sys/zio_impl.h
+5-5man/man8/zpool-events.8
+4-4tests/runfiles/common.run
+1-1module/zcommon/zfs_valstr.c
+183-251 files not shown
+184-257 files

OpenZFS/src 58c8dc5module/os/linux/zfs zpl_super.c

linux/zpl_super: handle 'source' option directly

vfs_parse_fs_param_source() didn't appear until 5.14, and was not
backported to kernel.org LTS kernels. It's simple enough that it's
easier to just handle it ourselves rather than use a configure check.

Sponsored-by: TrueNAS
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18529
DeltaFile
+19-10module/os/linux/zfs/zpl_super.c
+19-101 files

OpenZFS/src 532760emodule/os/linux/zfs zfs_vfsops.c

Linux: avoid znode list lock inversion during resume

Lockdep reports a circular locking dependency during mounted filesystem
rollback.  zfs_resume_fs() walks z_all_znodes under z_znodes_lock and
calls zfs_rezget(), which takes the per-object znode hold lock via
zfs_znode_hold_enter().

The normal zget path takes these locks in the opposite order.
zfs_zget() takes the per-object hold lock before zfs_znode_alloc()
inserts the znode on z_all_znodes under z_znodes_lock.  Resume can
therefore establish z_znodes_lock -> zh_lock while normal lookup
creates zh_lock -> z_znodes_lock.

Pin the current and next znodes with igrab() while holding the list
lock, then drop the list lock before reloading the znode.  Existing
stale inode handling is preserved, and both the suspended reference
and temporary walk reference are released asynchronously.
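
The idea of pinning under the list lock and reloading outside it can
be sketched in Python.  This is a simplified illustration, not the
kernel code: the Znode class, the pins counter (standing in for the
reference taken by igrab()) and resume_walk() are hypothetical, and
the sketch pins the whole list at once for brevity, where the commit
pins only the current and next znodes.

```python
import threading

list_lock = threading.Lock()  # stands in for z_znodes_lock


class Znode:
    def __init__(self, obj):
        self.obj = obj
        self.hold_lock = threading.Lock()  # stands in for the per-object hold lock
        self.pins = 0                      # stands in for the igrab() reference
        self.reloaded = False


def resume_walk(all_znodes):
    """Reload every znode without taking hold_lock inside list_lock."""
    with list_lock:
        # Pin the entries while the list lock protects the list.
        pinned = list(all_znodes)
        for zp in pinned:
            zp.pins += 1
    for zp in pinned:
        # The list lock is dropped here, so it is never held while
        # the per-object lock is acquired.
        with zp.hold_lock:
            zp.reloaded = True  # placeholder for the actual reload
        with list_lock:
            zp.pins -= 1        # drop the temporary walk reference
```

Because the two locks are never nested in the list-lock-first order,
the walk can no longer form a cycle with the normal lookup path.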

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: ZhengYuan Huang <gality369 at gmail.com>
Closes #18517
DeltaFile
+39-6module/os/linux/zfs/zfs_vfsops.c
+39-61 files

OpenZFS/src 414ce4binclude/sys arc_impl.h, man/man4 zfs.4

Linux: expose zfs_arc_no_grow_shift as a module parameter

The zfs_arc_no_grow_shift variable is tunable via sysctl on FreeBSD
but had no module parameter registration on Linux.

Register it once in arc.c using param_get_uint and a per-platform
set handler, replacing the FreeBSD-only registration.

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alek Pinchuk <alek.pinchuk at connectwise.com>
Signed-off-by: Christos Longros <chris.longros at gmail.com>
Closes #18461
DeltaFile
+18-0module/os/linux/zfs/arc_os.c
+9-4module/zfs/arc.c
+2-2module/os/freebsd/zfs/sysctl_os.c
+0-3module/os/freebsd/zfs/arc_os.c
+2-1include/sys/arc_impl.h
+0-2man/man4/zfs.4
+31-121 files not shown
+31-137 files

OpenZFS/src 90a1740.github/workflows/scripts qemu-2-start.sh

CI: FreeBSD 15.1 STABLE

Update the freebsd15-1s builder to the released STABLE image.

Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Closes #18524
DeltaFile
+1-1.github/workflows/scripts/qemu-2-start.sh
+1-11 files

OpenZFS/src 59e10e7lib/libzfs libzfs_pool.c

libzfs_pool: document export and initialize functions

Add brief docstrings to zpool_export(), zpool_export_force(),
zpool_initialize() and zpool_initialize_wait().

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Christos Longros <chris.longros at gmail.com>
Closes #18514
DeltaFile
+16-0lib/libzfs/libzfs_pool.c
+16-01 files

OpenZFS/src eaaea55tests/zfs-tests/tests/functional/send_xdr_encoding xdr_resume_bookmark_raw_with_write.ksh xdr_bookmark_raw_with_write.ksh

Consistently encode DRR_BEGIN packed nvlist payloads with NV_ENCODE_XDR

Currently, zfs send generates a mix of nvlist encodings in DRR_BEGIN
records, some XDR and some in native byte order. The result is that
most streams currently can't be zfs received on opposite-endian systems.

zfs send generates the outer wrappers for compound streams in userspace,
and it explicitly requests NV_ENCODE_XDR format for those records. But
the BEGIN records for individual datasets are generated on the kernel
side, in dmu_send.c, where fnvlist_pack() is used for encoding. That
routine hard-wires NV_ENCODE_NATIVE format.

This PR replaces the fnvlist_pack() call with a direct call to
nvlist_pack() that specifies NV_ENCODE_XDR.
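
Why a native encoding breaks on opposite-endian receivers can be shown
with a toy Python example.  It encodes a single 32-bit integer, not a
real packed nvlist, and the helper names are hypothetical.  XDR fixes
the byte order to big-endian, while a native encoding follows the
sender.

```python
import struct


def encode_xdr_u32(value):
    """XDR integers are 4 bytes, big-endian, regardless of the host."""
    return struct.pack(">I", value)


def decode_xdr_u32(data):
    return struct.unpack(">I", data)[0]


def encode_native_u32(value, little_endian):
    """A 'native' encoding follows the byte order of the sender."""
    return struct.pack("<I" if little_endian else ">I", value)


value = 0x01020304

# Little-endian sender, big-endian receiver reading it as its own native order.
wire = encode_native_u32(value, little_endian=True)
misread = struct.unpack(">I", wire)[0]
print(hex(misread))  # 0x4030201

# With XDR both sides agree on the byte order.
print(hex(decode_xdr_u32(encode_xdr_u32(value))))  # 0x1020304
```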

Tests are included to verify that native-encoded nvlists are not
generated by any kernel path that attaches nvlists to BEGIN records.
There's also a check for XDR encoding in the outer wrapper of
replication streams in case there is ever a regression there.

    [15 lines not shown]
DeltaFile
+116-0tests/zfs-tests/tests/functional/send_xdr_encoding/xdr_resume_bookmark_raw_with_write.ksh
+107-0tests/zfs-tests/tests/functional/send_xdr_encoding/xdr_bookmark_raw_with_write.ksh
+103-0tests/zfs-tests/tests/functional/send_xdr_encoding/xdr_resume_bookmark_raw.ksh
+97-0tests/zfs-tests/tests/functional/send_xdr_encoding/xdr_redacted_received_raw.ksh
+96-0tests/zfs-tests/tests/functional/send_xdr_encoding/xdr_incr_from_redacted.ksh
+93-0tests/zfs-tests/tests/functional/send_xdr_encoding/xdr_bookmark_raw.ksh
+612-017 files not shown
+1,481-223 files

OpenZFS/src 00a941einclude/sys zap_impl.h, module/zfs zap_impl.c

zap: internal interface cleanup

Similar to previous, though a much lighter touch because these are not
"public" interfaces.

- reorganising functions into groups, by rough function class.
- matching header order to source order, to make it a little easier to
  find things.
- adding light documentation to functions that had none.

Note that I've not added any documentation for the mzap_* and fzap_*
functions, as part of this commit series is laying the groundwork to
hide those functions in their backend modules; such documentation would
become obsolete very quickly.

No actual code changes.

Sponsored-by: TrueNAS
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>

    [4 lines not shown]
DeltaFile
+263-256module/zfs/zap_impl.c
+62-15include/sys/zap_impl.h
+325-2712 files

OpenZFS/src bb304d3include/sys zap.h, module/zfs zap.c

zap: public interface cleanup

- reorganising functions into groups, collections of the variants of the
  same function.
- matching header order to source order, to make it a little easier to
  find things.
- moving per-function documentation from source to header.
- adding light documentation to functions that had none.

No actual code changes.

Sponsored-by: TrueNAS
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Akash B <akash-b at hpe.com>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18516
DeltaFile
+126-97module/zfs/zap.c
+123-70include/sys/zap.h
+249-1672 files

OpenZFS/src 8ff6400include/sys zap_impl.h, module Makefile.bsd

zap: split implementation out into more files

The ZAP code is mixed up across a few files without clear separation of
concerns. This splits it out from three source files to five:

- zap.c: the bulk of the "public" interface
- zap_impl.c: internals shared across all backends
- zap_micro.c: microzap backend
- zap_fat.c: fatzap backend: core logic
- zap_leaf.c: fatzap backend: leaf blocks

Note that this does not change any code; it just moves functions around.
Also note that right now the microzap and fatzap backends know more
about each other than is healthy. This change is simply marking out
where different things should live in the end, to make it easier for
that refactoring work to begin.

Sponsored-by: TrueNAS
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>

    [4 lines not shown]
DeltaFile
+1,071-1,440module/zfs/zap.c
+6-1,596module/zfs/zap_micro.c
+1,502-0module/zfs/zap_fat.c
+527-0module/zfs/zap_impl.c
+22-1include/sys/zap_impl.h
+3-1module/Makefile.bsd
+3,131-3,0382 files not shown
+3,135-3,0388 files

OpenZFS/src d50f5b6module/zfs dsl_dir.c

dsl_dir: avoid dd_lock during snapshots_changed updates

Avoid holding dd_lock while updating the on-disk
snapshots_changed timestamp.

Both dsl_dir_zapify() and zap_update() may dirty buffers
and recurse into space accounting, which can take dd_lock.
Holding dd_lock across either operation can therefore
preserve the lock-order inversion reported by lockdep.

Only protect the in-memory dd_snap_cmtime update
with dd_lock. Perform the zapify and ZAP update without
dd_lock held, and retry the on-disk write if another updater
advanced dd_snap_cmtime while the write was in progress.
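
The update-then-retry pattern can be sketched in Python.  This is an
illustration under assumptions, not the dsl_dir.c code: the Dir class,
store() and snapshots_changed() are hypothetical names, and store()
stands in for the zapify and ZAP update.

```python
import threading


class Dir:
    def __init__(self):
        self.lock = threading.Lock()  # stands in for dd_lock
        self.snap_cmtime = 0          # in-memory timestamp
        self.on_disk = 0              # stands in for the on-disk value


def store(dd, t):
    """Placeholder for the on-disk write; called without dd.lock."""
    dd.on_disk = t


def snapshots_changed(dd, now, write=store):
    # Only the in-memory update is protected by the lock.
    with dd.lock:
        dd.snap_cmtime = now
        t = now
    while True:
        write(dd, t)  # may recurse into accounting; dd.lock is not held
        with dd.lock:
            if dd.snap_cmtime == t:
                return  # nobody advanced the timestamp meanwhile
            t = dd.snap_cmtime  # another updater advanced it; write again
```

The loop ends once the value written matches the in-memory timestamp,
so the on-disk state does not lag behind a concurrent update.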

Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: ZhengYuan Huang <gality369 at gmail.com>
Co-authored-by: gality369 <gality369 at example.com>
Closes #18472
DeltaFile
+18-11module/zfs/dsl_dir.c
+18-111 files

OpenZFS/src 968f4dbman/man8 zpool-attach.8 zpool.8

zpool-attach.8: add EXAMPLES section

Mirror-attach (shared with zpool.8 example 5) and raidz expansion.

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Reviewed-by: Tino Reichardt <milky-zfs at mcmilk.de>
Signed-off-by: Christos Longros <chris.longros at gmail.com>
Closes #18508
DeltaFile
+30-1man/man8/zpool-attach.8
+1-0man/man8/zpool.8
+31-12 files

OpenZFS/src 35853ac.github/workflows zfs-qemu.yml, .github/workflows/scripts generate-ci-type.py

CI: skip qemu matrix for documentation-only pull requests

Add a new "docs" CI type, selected when every file modified by a
pull request matches a documentation pattern (man pages, .md,
AUTHORS, COPYRIGHT, LICENSE, NOTICE, .gitignore). For this type the
os_selection is empty and the qemu matrix runs no jobs.

This affects only pull requests whose entire diff is documentation.
Any change touching a non-documentation file continues to be
classified as full, quick, linux, or freebsd by the existing
file-path rules, and a manual ZFS-CI-Type commit tag still overrides
that classification.
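
A rough sketch of the "every file matches a documentation pattern"
rule in Python.  The pattern list and the function name is_docs_only
are assumptions made for illustration; the actual patterns and logic
in generate-ci-type.py may differ.

```python
import fnmatch

# Hypothetical patterns modelled on the list in the commit message.
DOC_PATTERNS = [
    "man/*", "*.md", "AUTHORS", "COPYRIGHT", "LICENSE", "NOTICE",
    ".gitignore",
]


def is_docs_only(changed_files):
    """True only if there are changes and all of them are documentation."""
    return bool(changed_files) and all(
        any(fnmatch.fnmatchcase(path, pat) for pat in DOC_PATTERNS)
        for path in changed_files
    )


print(is_docs_only(["man/man8/zpool-attach.8", "README.md"]))   # True
print(is_docs_only(["man/man4/zfs.4", "module/zfs/arc.c"]))     # False
```

A single non-documentation path makes the check fail, so such a pull
request falls through to the existing classification.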

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Christos Longros <chris.longros at gmail.com>
Closes #18518
DeltaFile
+22-0.github/workflows/scripts/generate-ci-type.py
+4-0.github/workflows/zfs-qemu.yml
+26-02 files

OpenZFS/src 45dddc4man/man4 zfs.4

zfs.4: Fix documentation of zfs_arc_dnode_reduce_percent

Fixes: 25458cbef Limit the amount of dnode metadata in the ARC
Fixes: 5b9f3b766 Soften pruning threshold on not evictable metadata

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Mateusz Piotrowski <0mp at FreeBSD.org>
Closes #18513
DeltaFile
+11-4man/man4/zfs.4
+11-41 files

OpenZFS/src 6330a45. META

Tag zfs-2.4.2

META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2 at llnl.gov>
DeltaFile
+1-1META
+1-11 files

OpenZFS/src f074587contrib/initramfs/scripts zfs

initramfs: fix incorrect variable rename

Fixes regression introduced by 61ab032ae0391bce38aef1e43b5b930724ecdb55.

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Co-Authored-By: Claude Sonnet 4.6 <noreply at anthropic.com>
Signed-off-by: Joel Low <joel at joelsplace.sg>
Closes #18442
DeltaFile
+4-4contrib/initramfs/scripts/zfs
+4-41 files

OpenZFS/src 9ae9f2emodule/os/linux/zfs zfs_vnops_os.c

Linux: annotate nested xattr setattr znode locks

zfs_setattr() updates both the target znode and its hidden xattr
directory when ownership, mode, or project ID changes. The xattr
directory uses the same z_acl_lock and z_lock classes as the
parent znode, so lockdep reports recursive locking when the
second znode's mutexes are acquired.

This is a lockdep false positive rather than a real deadlock.
attrzp is the target file's hidden xattr directory, and the code
does not acquire these znode mutexes in the reverse order.
Acquire the attrzp mutexes with mutex_enter_nested() so lockdep
treats them as nested.

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: ZhengYuan Huang <gality369 at gmail.com>
Co-authored-by: gality369 <gality369 at example.com>
Closes #18506
DeltaFile
+6-2module/os/linux/zfs/zfs_vnops_os.c
+6-21 files

OpenZFS/src c7cfe08cmd zarcstat.in zarcsummary, include/sys arc_impl.h

zarcstat: detect attached L2ARC device with no data

zarcstat and zarcsummary detected L2ARC presence using the l2_size
kstat, which is data held in L2ARC, not whether a cache device is
attached. When a cache device was attached but empty (freshly added,
or fully evicted):

  - zarcstat rejected "-f l2*" with "Incompatible field specified!"
  - zarcsummary printed "L2ARC not detected, skipping section",
    hiding cumulative I/O history and health counters

Expose the existing l2arc_ndev counter as a new kstat l2_dev_count.
It is maintained by l2arc_add_vdev() and l2arc_remove_vdev(), so it
tracks attachment in real time. Use it in both tools, falling back to
l2_size for compatibility with older kernel modules.
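
The detection with fallback might look roughly like this in Python.
The function name l2arc_present and the plain-dict kstats argument are
assumptions for illustration, not the code in zarcstat or zarcsummary.

```python
def l2arc_present(kstats):
    """Decide whether an L2ARC device is attached.

    Prefer l2_dev_count when the module exports it; otherwise fall
    back to the older l2_size heuristic.
    """
    if "l2_dev_count" in kstats:
        return kstats["l2_dev_count"] > 0
    return kstats.get("l2_size", 0) > 0


# Newer module, cache device attached but still empty.
print(l2arc_present({"l2_dev_count": 1, "l2_size": 0}))  # True
# Older module without the new kstat.
print(l2arc_present({"l2_size": 0}))                     # False
```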

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: Ameer Hamza <ahamza at ixsystems.com>
Closes #18499
DeltaFile
+3-3cmd/zarcstat.in
+4-1cmd/zarcsummary
+2-0include/sys/arc_impl.h
+2-0module/zfs/arc.c
+11-44 files

OpenZFS/src 956debacmd/zdb zdb.c, tests/zfs-tests/tests/functional/bclone bclone_samefs_data.ksh

zdb: detect BRT and DDT leaks during block traversal

During -b traversal, track BRT and DDT reference counts.  When a
block is claimed more times than its reference table accounts for
and that causes claim errors, report it instead of just asserting.
Also report entries whose references were not fully consumed by the
traversal.

Add zdb leak checks to cloning and dedup tests. This should make
sure the pools are in a sane state after completing the functional
tests.
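
The accounting idea can be sketched in Python: count the claims seen
during traversal and compare them with what the table expects.  The
function check_refs and its inputs are hypothetical and much simpler
than what zdb actually tracks.

```python
from collections import Counter


def check_refs(claimed, expected):
    """Compare traversal claims against reference-table counts.

    claimed:  iterable of block ids, one per claim made by the traversal
    expected: dict mapping block id -> claims the table accounts for
    Returns (over_claimed, unconsumed), each a dict of id -> difference.
    """
    seen = Counter(claimed)
    over_claimed = {b: seen[b] - n for b, n in expected.items() if seen[b] > n}
    unconsumed = {b: n - seen[b] for b, n in expected.items() if seen[b] < n}
    return over_claimed, unconsumed


over, left = check_refs(["a", "a", "a", "b"], {"a": 2, "b": 1, "c": 1})
print(over)  # {'a': 1}
print(left)  # {'c': 1}
```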

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin at TrueNAS.com>
Closes #18494
DeltaFile
+114-51cmd/zdb/zdb.c
+4-0tests/zfs-tests/tests/functional/block_cloning/block_cloning_after_device_removal.ksh
+3-0tests/zfs-tests/tests/functional/bclone/bclone_samefs_data.ksh
+3-0tests/zfs-tests/tests/functional/block_cloning/block_cloning_lwb_buffer_overflow.ksh
+3-0tests/zfs-tests/tests/functional/block_cloning/block_cloning_replay.ksh
+3-0tests/zfs-tests/tests/functional/block_cloning/block_cloning_replay_encrypted.ksh
+130-5112 files not shown
+156-5118 files

OpenZFS/src 6a25950tests/zfs-tests/tests/functional/redundancy redundancy.kshlib

ZTS: redundancy_draid_spare1

Preserve the 'zpool status' output used to calculate the number of
checksum errors so it can be logged on failure.  Several instances have
been observed in the CI where cksum was set to a non-zero value, yet a
subsequent run of 'zpool status' on failure showed no checksum errors.

Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Closes #18500
DeltaFile
+14-8tests/zfs-tests/tests/functional/redundancy/redundancy.kshlib
+14-81 files

OpenZFS/src a2d0533cmd/zdb zdb.c, man/man8 zdb.8

Add some more file layout output, triggered by -v

With one -v, the block type (parity or data) is printed (matching
the ASCII-art version); with two -v, the offset into the file is
also printed.

This also updates the man page, and adds some simple
test scripts.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Sean Fagan <sean.fagan at klarasystems.com>
Signed-off-by: Sean Fagan <sean.fagan at klarasystems.com>
Closes #18470
DeltaFile
+78-0tests/zfs-tests/tests/functional/cli_root/zdb/zdb_file_layout_002.ksh
+78-0tests/zfs-tests/tests/functional/cli_root/zdb/zdb_file_layout_001.ksh
+78-0tests/zfs-tests/tests/functional/cli_root/zdb/zdb_file_layout_003.ksh
+48-16cmd/zdb/zdb.c
+57-0tests/zfs-tests/tests/functional/cli_root/zdb/zdb_file_layout_neg.ksh
+6-1man/man8/zdb.8
+345-172 files not shown
+353-208 files

OpenZFS/src 439b802module/zfs sa.c

sa: fix sa_add_projid lock ordering

sa_add_projid() currently acquires hdl->sa_lock before zp->z_lock.
Several same-znode update paths take zp->z_lock and then call
sa_update() or sa_bulk_update() on the same SA handle.

On Linux, FS_IOC_FSSETXATTR reaches zfs_setattr() through
zpl_ioctl_setxattr() without outer inode serialization. This makes
the reversed lock order a real ABBA deadlock rather than a lockdep
false positive when projid is added to an old-format inode while
another thread updates the same znode.

Acquire zp->z_lock before hdl->sa_lock in sa_add_projid() to match
the existing znode update ordering.

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: ZhengYuan Huang <gality369 at gmail.com>
Co-authored-by: gality369 <gality369 at example.com>
Closes #18503
DeltaFile
+2-2module/zfs/sa.c
+2-21 files

OpenZFS/src 4bb7592include/sys dmu.h, module/os/freebsd/zfs zfs_vnops_os.c

Add support for POSIX_FADV_DONTNEED

For now make it only evict the specified data from the dbuf cache.
Even though dbuf cache is small, this may still reduce eviction of
more useful data from there, and slightly accelerate ARC evictions
by making the blocks there evictable a bit sooner.
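
From userspace the advice is issued through posix_fadvise().  A
minimal Python sketch follows; the helper name drop_cached_range is
hypothetical, and os.posix_fadvise is only available on platforms that
provide the call.

```python
import os


def drop_cached_range(path, offset=0, length=0):
    """Hint that [offset, offset + length) of path is no longer needed.

    A length of 0 means "through the end of the file".
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        os.posix_fadvise(fd, offset, length, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
```

The call is only a hint: the file contents are unchanged, and how much
cached data is actually dropped is up to the filesystem.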

On FreeBSD this also adds support for POSIX_FADV_NOREUSE, since the
kernel translates it into POSIX_FADV_DONTNEED after every read/write.
This is not as efficient as it could be for ZFS, but it is currently
the only way the FreeBSD kernel allows POSIX_FADV_NOREUSE to be handled.

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin at TrueNAS.com>
Closes #18399
DeltaFile
+63-0tests/zfs-tests/tests/functional/fadvise/fadvise_dontneed.ksh
+60-0module/zfs/dbuf.c
+29-0module/zfs/dmu.c
+8-2module/os/linux/zfs/zpl_file.c
+3-1module/os/freebsd/zfs/zfs_vnops_os.c
+2-0include/sys/dmu.h
+165-33 files not shown
+169-49 files

OpenZFS/src 38501e1module/zfs dmu.c, tests/zfs-tests/tests/functional/fadvise fadvise_dontneed.ksh

Fix long POSIX_FADV_DONTNEED for single block files

dbuf_whichblock() is not made to handle offsets beyond the block
end for single-block objects.  Handle it in dmu_evict_range(),
similar to dmu_prefetch_by_dnode().
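
The kind of clamping involved can be sketched in Python.  This is an
assumption about the general shape of the fix, not the code in
dmu_evict_range(); block_range and its parameters are hypothetical.

```python
def block_range(offset, length, blksz, single_block):
    """Return (first, last) block ids covered by the range, or None.

    For a single-block object every valid offset lies in block 0, so
    the range is clamped there instead of computing ids past the end.
    """
    if length == 0:
        return None
    if single_block:
        if offset >= blksz:
            return None  # the range starts beyond the only block
        return (0, 0)
    first = offset // blksz
    last = (offset + length - 1) // blksz
    return (first, last)


# A long range on a single-block object still maps to block 0 only.
print(block_range(0, 1 << 20, 4096, single_block=True))      # (0, 0)
print(block_range(4096, 8192, 4096, single_block=False))     # (1, 2)
```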

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin at TrueNAS.com>
Closes #18399
Closes #18489
DeltaFile
+10-3tests/zfs-tests/tests/functional/fadvise/fadvise_dontneed.ksh
+8-2module/zfs/dmu.c
+18-52 files

OpenZFS/src 6f14581module/zfs spa_misc.c dmu.c

Cleanup allocation class selection

- For multilevel gang blocks it seemed possible to fall back from
  normal to special class, since they don't have a proper object type,
  and DMU_OT_NONE is a "metadata".  They should never fall back.
- Fix a possible inversion with zfs_user_indirect_is_special = 0,
  when indirects are written to a normal vdev while small data goes
  to special.  Make small indirect blocks also follow
  special_small_blocks there.
- With special_small_blocks now applying to both files and ZVOLs,
  make it apply to all non-metadata without extra checks, since there
  are no other non-metadata types.

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin at TrueNAS.com>
Closes #18208
DeltaFile
+16-26module/zfs/spa_misc.c
+1-3module/zfs/dmu.c
+17-292 files

OpenZFS/src 500b44etests/zfs-tests/tests/functional/cli_user/zpool_iostat zpool_iostat_002_pos.ksh

ZTS: zpool_iostat_002_pos remove sleep

In the CI environment commands may occasionally take longer than
expected.  For zpool_iostat_002_pos this can cause a failure if fewer
than the expected number of lines are logged in time.  To prevent
this issue relax the time constraint and simply verify the command
ran to completion and generated the correct number of lines.

Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Closes #18501
DeltaFile
+3-5tests/zfs-tests/tests/functional/cli_user/zpool_iostat/zpool_iostat_002_pos.ksh
+3-51 files

OpenZFS/src d650159include/sys/fs zfs.h, man/man7 vdevprops.7

Vdev allocation bias/class change

Normal, special and dedup vdevs differ only by space allocation
bias.  Normal and special vdevs might even legally store blocks
targeted to other classes.  Dedup vdevs don't normally do it, but
there is no real reason why they can't.  Considering this, it is
not impossible to change the allocation bias for those vdevs.

This change introduces a new top-level vdev property, alloc_bias,
which reports the current bias for the vdev and allows changing it.
This makes it easy to change a vdev's role in a pool, especially if
vdev removal is impossible.  To avoid complicating the code, changes
take effect only on the next pool import.

Changes to/from log vdev could also be theoretically possible, but
they are artificially blocked for now, partially due to additional
complications, and partially due to potential danger of placing
other blocks on log vdevs, that would otherwise be non-fatal.


    [3 lines not shown]
DeltaFile
+109-0tests/zfs-tests/tests/functional/alloc_class/alloc_class_014_pos.ksh
+91-0tests/zfs-tests/tests/functional/alloc_class/alloc_class_015_neg.ksh
+77-0module/zfs/vdev.c
+15-0man/man7/vdevprops.7
+12-0module/zcommon/zpool_prop.c
+11-0include/sys/fs/zfs.h
+315-06 files not shown
+334-1112 files

OpenZFS/src bdb8e8atests/zfs-tests/tests/functional/removal removal_with_export.ksh

ZTS: removal_with_export.ksh busy export

If the pool is active 'zpool export' will fail resulting in
a test failure.  Swap log_must with log_must_busy so the export
is retried when reported as busy before failing the test.

Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Closes #18498
DeltaFile
+1-1tests/zfs-tests/tests/functional/removal/removal_with_export.ksh
+1-11 files

OpenZFS/src 8fdc866module/zfs dsl_dir.c

zfs: annotate nested dd_lock in reservation sync accounting

When reservation sync updates a child's reserved space, it rolls the
delta into ancestor space accounting while still holding the child's
dd_lock.  That locking order is intentional, but Linux lockdep sees
the ancestor acquisition as recursive because it lacks a nested lock
subclass annotation.

Teach the reservation-sync space-accounting path to acquire ancestor
dd_lock instances with a nested subclass.  Keep the existing public
interfaces and accounting behavior unchanged by routing only the
ancestor rollup through local helpers.

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: ZhengYuan Huang <gality369 at gmail.com>
Signed-off-by: gality369 <gality369 at example.com>
Closes #18497
DeltaFile
+50-14module/zfs/dsl_dir.c
+50-141 files