OpenZFS/src 301da59 module/zfs vdev_removal.c

Fix lock reversal on device removal cancel

The FreeBSD kernel's WITNESS code detected a lock ordering violation in
spa_vdev_remove_cancel_sync(): it took svr_lock while holding ms_lock,
the opposite of the order used elsewhere.  I considered resolving it
the same way as #17145, but on closer inspection svr_lock is not needed
at that point at all: we have already asserted that svr_allocd_segs is
empty, and the segments we are about to pass to free_mapped_segment_cb
do not need to be added to it.
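
To illustrate the kind of check that flags such a reversal, here is a
minimal single-threaded sketch in Python.  It is not the WITNESS
implementation; the LockOrderChecker class and its methods are
hypothetical names used only for this example.  It records every
"held, then acquired" pair it sees and complains when the opposite pair
shows up later.

```python
class LockOrderChecker:
    """Toy lock-order tracker (illustrative only, not WITNESS itself)."""

    def __init__(self):
        self.held = []      # locks currently held, in acquisition order
        self.order = set()  # (first, second) pairs observed so far

    def acquire(self, name):
        for h in self.held:
            # If we ever saw `name` taken before `h`, this is a reversal.
            if (name, h) in self.order:
                raise RuntimeError(f"lock order reversal: {h} -> {name}")
            self.order.add((h, name))
        self.held.append(name)

    def release(self, name):
        self.held.remove(name)


def demo():
    chk = LockOrderChecker()
    # Order used "elsewhere": svr_lock first, then ms_lock.
    chk.acquire("svr_lock")
    chk.acquire("ms_lock")
    chk.release("ms_lock")
    chk.release("svr_lock")
    # The cancel path took them the other way round.
    chk.acquire("ms_lock")
    try:
        chk.acquire("svr_lock")
    except RuntimeError as err:
        return str(err)
    return None
```

Dropping the second acquisition entirely, as this commit does for
svr_lock, removes the reversed pair without having to reorder anything.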

Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Allan Jude <allan at klarasystems.com>
Signed-off-by:  Alexander Motin <mav at FreeBSD.org>
Sponsored by:   iXsystems, Inc.
Closes #17164
Delta       File
+22 -28     module/zfs/vdev_removal.c
+22 -28     1 file

OpenZFS/src 367d34b include/sys vdev.h, module/zfs vdev.c vdev_removal.c

Fix dspace underflow bug

Since spa_dspace accounts only for normal allocation class space,
spa_nonallocating_dspace should do the same.  Otherwise we may get an
underflow, or trip the corresponding assertion in spa_update_dspace(),
if the special/dedup vdev being removed is bigger than all normal class
space combined.
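
The arithmetic behind the underflow can be sketched as follows.  This
is a simplified Python model, not the code in spa_update_dspace(): it
assumes the pool-wide value is computed as "normal class space minus
non-allocating space" in an unsigned 64-bit integer, and the function
name and sizes are made up for the example.

```python
U64_MASK = (1 << 64) - 1
TIB = 1 << 40


def dspace_after_subtract(normal_dspace, nonallocating_dspace):
    """Unsigned 64-bit subtraction, as C would perform it on uint64_t."""
    return (normal_dspace - nonallocating_dspace) & U64_MASK


normal_class = 1 * TIB    # all normal-class space in the pool (assumed)
special_vdev = 2 * TIB    # special vdev being removed (assumed)

# Before the fix: the special vdev is counted as non-allocating space,
# although it never contributed to the normal-class total.
wrapped = dspace_after_subtract(normal_class, special_vdev)

# After the fix: only normal-class vdevs are counted, so a special
# vdev contributes nothing to the subtrahend.
fixed = dspace_after_subtract(normal_class, 0)
```

Here `wrapped` is a huge bogus value close to 2^64, while `fixed` stays
at the normal-class total.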

Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Reviewed-by: Allan Jude <allan at klarasystems.com>
Signed-off-by: Paul Dagnelie <paul.dagnelie at klarasystems.com>
Closes #17183
Delta       File
+20 -2      module/zfs/vdev.c
+3 -14      module/zfs/vdev_removal.c
+1 -0       include/sys/vdev.h
+24 -16     3 files

OpenZFS/src 11ca12d include/os/freebsd/spl/sys simd_powerpc.h

simd_powerpc.h: enable FPU on FreeBSD

FreeBSD nowadays supports FPU in the kernel on powerpc*, so enable it.

Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Piotr Kubaj <pkubaj at FreeBSD.org>
Closes #17191
Delta       File
+10 -3      include/os/freebsd/spl/sys/simd_powerpc.h
+10 -3      1 file

OpenZFS/src 75e921d module/os/linux/spl spl-kstat.c

kstat: silence "maybe uninitialized" warnings

Firmly in the "shouldn't happen" camp, but at least GCC 7.4 (Ubuntu
18.04) complained about them, and it's easy to shut up, so do so.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Signed-off-by: Rob Norris <rob.norris at klarasystems.com>
Closes #17189
Delta       File
+2 -3       module/os/linux/spl/spl-kstat.c
+2 -3       1 file

OpenZFS/src 5b29e70 include/sys metaslab.h metaslab_impl.h, module/zfs metaslab.c vdev.c

Remove mg_allocators (#17192)

The previous code allowed each metaslab group to have a different
number of allocators.  But in practice it worked only for embedded
SLOGs, relying on a number of conditions and creating a significant
minefield if any of those change.  I just stepped on one myself.

This change makes all groups have spa_alloc_count allocators.
It may cost us an extra 192 bytes of memory per normal top-level vdev
on large systems, but I find it a small price for cleaner and more
reliable code.

Signed-off-by:  Alexander Motin <mav at FreeBSD.org>
Sponsored by:   iXsystems, Inc.
Fixes #17188
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Paul Dagnelie <pcd at delphix.com>
Delta       File
+13 -12     module/zfs/metaslab.c
+2 -3       module/zfs/vdev.c
+1 -1       include/sys/metaslab.h
+0 -1       include/sys/metaslab_impl.h
+16 -17     4 files

OpenZFS/src 30cc233 cmd/zed/agents zfs_retire.c, include/sys spa.h

zed: Ensure spare activation after kernel-initiated device removal

In addition to hotplug events, the kernel may also mark a failing vdev
as REMOVED. This was observed in a customer report and reproduced by
forcing the NVMe host driver to disable the device after a failed reset
due to command timeout. In such cases, the spare was not activated
because the device had already transitioned to a REMOVED state before
zed processed the event.
To address this, explicitly attempt hot spare activation when the
kernel marks a device as REMOVED.

Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Signed-off-by: Ameer Hamza <ahamza at ixsystems.com>
Closes #17187
Delta       File
+15 -4      cmd/zed/agents/zfs_retire.c
+12 -7      module/zfs/spa.c
+16 -2      module/zfs/zfs_fm.c
+2 -1       include/sys/spa.h
+1 -1       module/zfs/vdev.c
+46 -15     5 files

OpenZFS/src dd2a46b config kernel.m4

config: cache results of kernel checks (#17106)

Kernel checks are the heaviest part of the configure checks. This allows
the results to be cached through the normal autoconf cache.

Since we don't want to reuse cached values for different kernels, but
don't want to discard the entire cache on every kernel, we instead add a
short checksum to kernel config cache keys, based on the version and
path, so the cache can hold results for multiple different kernels.
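
The keying idea can be sketched like this.  It is a Python
illustration only; the real change lives in config/kernel.m4, and the
key prefix, the hash function and the helper name below are all
assumptions made for the example.

```python
import hashlib


def kernel_cache_key(check_name, kernel_version, kernel_path):
    """Build a cache key that is unique per (check, kernel) pair.

    A short digest of the kernel version and source path is appended,
    so results for different kernels never collide in one cache file.
    """
    ident = f"{kernel_version}:{kernel_path}".encode()
    short = hashlib.sha256(ident).hexdigest()[:8]
    return f"zfs_cv_kernel_{check_name}_{short}"


# One cache can then hold results for several kernels side by side.
cache = {}
cache[kernel_cache_key("sb_dying", "6.6.0", "/usr/src/linux-6.6")] = "yes"
cache[kernel_cache_key("sb_dying", "6.1.0", "/usr/src/linux-6.1")] = "no"
```

Re-running configure against either kernel finds its own entry, and
switching kernels does not invalidate the other kernel's results.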

Sponsored-by: https://despairlabs.com/sponsor/

Signed-off-by: Rob Norris <robn at despairlabs.com>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Delta       File
+44 -17     config/kernel.m4
+44 -17     1 file

OpenZFS/src 4abc21b module/zfs metaslab.c

Block remap for cloned blocks on device removal

When handling block pointer remapping after device removal, skip blocks
that might be cloned.  BRTs are indexed by the vdev id and offset from
the block pointer's DVA[0], so if we start addressing the same block by
a different DVA, we won't find the proper reference counter.  As a
result, we might remap the block twice, which may trigger an assertion
during indirect mapping condense; free it prematurely, which may result
in data being overwritten; or free it twice, which may trigger an
assertion in the spacemap code.
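
A toy model of the lookup problem, in Python.  The dictionary stands
in for the BRT and the tuples for DVA[0]; the names, offsets and the
should_remap() helper are hypothetical and only mirror the rule stated
above.

```python
# Reference counts keyed by (vdev id, offset), as taken from DVA[0].
brt = {(1, 0x4000): 2}


def brt_refcount(vdev_id, offset):
    """Return the recorded reference count, or 0 if no entry exists."""
    return brt.get((vdev_id, offset), 0)


def should_remap(might_be_cloned):
    """Rule from the commit message: leave possibly cloned blocks alone."""
    return not might_be_cloned


original_dva = (1, 0x4000)   # block as first written on vdev 1
remapped_dva = (0, 0x9000)   # same data, addressed after vdev 1 was removed

refs_via_original = brt_refcount(*original_dva)
refs_via_remapped = brt_refcount(*remapped_dva)
```

Looked up through the remapped address the block appears to have no
references at all, which is how a premature or double free could occur
if the remap were allowed.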

Reviewed-by: Ameer Hamza <ahamza at ixsystems.com>
Reviewed-by: Paul Dagnelie <pcd at delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by:  Alexander Motin <mav at FreeBSD.org>
Sponsored by:   iXsystems, Inc.
Closes #15604
Closes #17180
Delta       File
+8 -0       module/zfs/metaslab.c
+8 -0       1 file

OpenZFS/src 50d87fe .github/workflows/scripts qemu-4-build-vm.sh

runners: Fix tarball build for zfs-qemu-packages workflow (#17158)

The initial tarballs we built for zfs-2.3.1 were incorrect since
they did not have a ./configure script, and their files were not
in a top level zfs-2.3.1/ directory.  This commit copies the way we
built them on buildbot so the tarballs are created as expected.

Signed-off-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs at mcmilk.de>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Delta       File
+18 -10     .github/workflows/scripts/qemu-4-build-vm.sh
+18 -10     1 file

OpenZFS/src 240fc4a .github/workflows/scripts qemu-4-build-vm.sh

runners: Fix zfs-release RPM creation (#17173)

The zfs-qemu-packages workflow was incorrectly copying the built
zfs-release RPMs to ~/zfsonlinux.github.com rather than ~/zfs.  This
meant that the RPMs were not being picked up correctly in the artifact
files.  This fixes the issue.

Signed-off-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: @ImAwsumm
Reviewed-by: Tino Reichardt <milky-zfs at mcmilk.de>
Delta       File
+3 -2       .github/workflows/scripts/qemu-4-build-vm.sh
+3 -2       1 file

OpenZFS/src a0e6271 config kernel-sb-dying.m4 kernel.m4, module/os/linux/zfs zpl_super.c

Linux: Fix zfs_prune panics v2 (#17121)

It turns out that the approach taken in the original version of the
patch was wrong.  We now take an approach in line with how the kernel
actually does it: when sb is being torn down, access to it is
serialized via the sb->s_umount rwsem, and only when that lock is taken
is it okay to work with s_flags.  The other mistake was trying to make
SB_ACTIVE work, but apparently the kernel checks the negative variant:
not SB_DYING and not SB_BORN.

Kernels pre-6.6 don't have SB_DYING, but check if sb is hashed
instead.

Signed-off-by: Pavel Snajdr <snajpa at snajpa.net>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Delta       File
+17 -9      module/os/linux/zfs/zpl_super.c
+19 -0      config/kernel-sb-dying.m4
+2 -0       config/kernel.m4
+38 -9      3 files

OpenZFS/src 9611dfd. META

Linux 6.14 compat: META (#17098) (#17172)

Update the META file to reflect compatibility with the 6.14
kernel.

Signed-off-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Rob Norris <robn at despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: @ImAwsumm
Delta       File
+1 -1       META
+1 -1       1 file

OpenZFS/src 885f87f scripts zfs-helpers.sh Makefile.am

ZTS: Fix zpool_status_features_001_pos local test (#17174)

Update 'zfs-helpers.sh -i' to install the compatibility.d/ file
symlinks. These are needed to run the zpool_status_features_001_pos test
from a local workspace (as opposed to running ZTS from a formal
'make install' or install from RPMs, which are unaffected).

Signed-off-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: @ImAwsumm
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Delta       File
+2 -0       scripts/zfs-helpers.sh
+1 -0       scripts/Makefile.am
+3 -0       2 files

OpenZFS/src fd01824 man/man7 zfsprops.7

Disambiguate reference to kibibytes, not kilobytes

A minor nitpick that is kind of obvious based on the surrounding context
and reference to powers of two. It's better to be explicit, though.

Signed-off-by: Simon Howard <fraggle at gmail.com>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Delta       File
+2 -2       man/man7/zfsprops.7
+2 -2       1 file

OpenZFS/src ef81812 man/man4 zfs.4, man/man7 vdevprops.7

Fix spelling errors

Unlike some of my other fixes which are more subtle, these are
unambiguously spelling errors.

Signed-off-by: Simon Howard <fraggle at gmail.com>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Delta       File
+4 -4       man/man4/zfs.4
+2 -2       man/man7/vdevprops.7
+1 -1       man/man8/zpool-events.8
+7 -7       3 files

OpenZFS/src e759a86 man/man7 zfsprops.7, man/man8 zfs-allow.8

Correct "umount" to "unmount" in a couple of places

This is admittedly a nitpicky change, but `umount` is the command that
performs an *unmount*. So if we are talking about unmounting something
we should phrase it that way.

Signed-off-by: Simon Howard <fraggle at gmail.com>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Delta       File
+1 -1       man/man7/zfsprops.7
+1 -1       man/man8/zfs-allow.8
+2 -2       2 files

OpenZFS/src 1d4505d man/man4 spl.4 zfs.4, man/man7 zfsprops.7

Capitalize in various places where appropriate

These are mostly acronyms (CPUs; ZILs) but also proper nouns such as
"Unix" and "Unicode" which should also be capitalized.

Signed-off-by: Simon Howard <fraggle at gmail.com>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Delta       File
+3 -3       man/man4/spl.4
+3 -3       man/man7/zfsprops.7
+2 -2       man/man4/zfs.4
+1 -1       man/man8/zfs-hold.8
+1 -1       man/man8/zpool.8
+10 -10     5 files

OpenZFS/src b386bf8 man/man7 zfsprops.7, man/man8 zfs-allow.8 zfs-send.8

Fix cases where "descendent" is used as a noun

As per Wiktionary: "descendent" may be used as an adjective (e.g.
"a descendent dataset") but for nouns (e.g. "descendants of this
dataset"), "descendant" is the correct spelling.

Signed-off-by: Simon Howard <fraggle at gmail.com>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Delta       File
+11 -11     man/man7/zfsprops.7
+3 -3       man/man8/zfs-allow.8
+1 -1       man/man8/zfs-send.8
+15 -15     3 files

OpenZFS/src 73494f3 man/man4 spl.4 zfs.4, man/man7 zpool-features.7

Make use of "i.e." (id est) consistent

This is the most common way it is written throughout the manpages, but
there are a few cases where it is written slightly differently.

Signed-off-by: Simon Howard <fraggle at gmail.com>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Delta       File
+1 -1       man/man4/spl.4
+1 -1       man/man4/zfs.4
+1 -1       man/man7/zpool-features.7
+3 -3       3 files

OpenZFS/src 530ddcd man/man1 ztest.1, man/man4 zfs.4

Harmonize on American spelling in several places

Most of the documentation is written in American English, so it makes
sense to be consistent.

Signed-off-by: Simon Howard <fraggle at gmail.com>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Delta       File
+3 -3       man/man4/zfs.4
+1 -1       man/man8/zpool-attach.8
+1 -1       man/man1/ztest.1
+1 -1       man/man7/zpoolconcepts.7
+1 -1       man/man8/zed.8.in
+1 -1       man/man8/zpool-remove.8
+8 -8       1 file not shown
+9 -9       7 files

OpenZFS/src 5f70370 cmd/zinject zinject.c, include/sys zfs_ioctl.h

Revert "zinject: count matches and injections for each handler" (#17137)

Adding fields to zinject_record_t unexpectedly extended zfs_cmd_t,
preventing some things working properly with 2.3.1 userspace tools
against 2.3.0 kernel module.
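
Why adding fields to an embedded record changes the outer structure
can be shown with a small ctypes sketch in Python.  The structures and
field names below are stand-ins invented for the example; they are not
the real layouts of zinject_record_t or zfs_cmd_t.

```python
import ctypes


class RecordOld(ctypes.Structure):
    # Stand-in for the record before fields were added (hypothetical).
    _fields_ = [("timer", ctypes.c_uint64),
                ("nlanes", ctypes.c_uint64)]


class RecordNew(ctypes.Structure):
    # Same record with two extra counters appended (hypothetical).
    _fields_ = [("timer", ctypes.c_uint64),
                ("nlanes", ctypes.c_uint64),
                ("match_count", ctypes.c_uint64),
                ("inject_count", ctypes.c_uint64)]


class CmdOld(ctypes.Structure):
    # Stand-in for the ioctl argument that embeds the record by value.
    _fields_ = [("cookie", ctypes.c_uint64),
                ("record", RecordOld)]


class CmdNew(ctypes.Structure):
    _fields_ = [("cookie", ctypes.c_uint64),
                ("record", RecordNew)]


growth = ctypes.sizeof(CmdNew) - ctypes.sizeof(CmdOld)
```

Because the record is embedded by value, the outer structure grows by
the same 16 bytes, so userspace built against one layout and a kernel
module built against the other disagree about the size of the ioctl
argument.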

This reverts commit fabdd502f4f04e27d057aedc7fb7697e7bd95b74.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.

Signed-off-by: Rob Norris <rob.norris at klarasystems.com>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Delta       File
+0 -142     tests/zfs-tests/tests/functional/cli_root/zinject/zinject_counts.ksh
+26 -40     cmd/zinject/zinject.c
+10 -48     module/zfs/zio_inject.c
+1 -1       tests/runfiles/common.run
+0 -2       include/sys/zfs_ioctl.h
+0 -1       tests/zfs-tests/tests/Makefile.am
+37 -234    6 files

OpenZFS/src 94a3fab include/sys metaslab_impl.h, man/man4 zfs.4

Unified allocation throttling (#17020)

The existing allocation throttling aimed to improve write speed by
allocating more data to vdevs that are able to write it faster.  But in
the process it completely broke the original mechanism, which was
designed to balance vdev space usage.  With a severe vdev space use
imbalance, it is possible that vdevs with higher use start growing
fragmentation sooner than others and, after getting full, stop
accepting writes at all.  Also, after vdev addition it might take a
very long time for the pool to restore the balance, since the new vdev
does not get any real preference unless the old one is already much
slower due to fragmentation.  Finally, the old throttling was
request-based, which was unpredictable with block sizes varying from
512B to 16MB.  Nor did it make much sense in the case of I/O
aggregation, where its 32-100 requests could be aggregated into a few,
leaving the device underutilized by submitting fewer and/or shorter
requests, or, on the contrary, it could try to queue up to 1.6GB of
writes per device.
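
The spread quoted above follows from simple arithmetic, sketched here
in Python.  The sketch assumes "16MB" means 16 MiB and uses a made-up
helper name; it only multiplies a request limit by a block size.

```python
MIN_BLOCK = 512                  # smallest block size, in bytes
MAX_BLOCK = 16 * 1024 * 1024     # largest block size, assumed 16 MiB
MIB = 1024 * 1024


def queued_bytes(request_limit, block_size):
    """Bytes in flight when the limit is counted in requests."""
    return request_limit * block_size


smallest = queued_bytes(100, MIN_BLOCK)   # 100 requests of 512 bytes
largest = queued_bytes(100, MAX_BLOCK)    # 100 requests of 16 MiB
spread = largest // smallest              # ratio between the extremes
```

The same limit of 100 requests thus covers anything from 50 KiB to
1600 MiB of queued writes, a factor of 32768, which is why a limit
counted in requests says little about how much work a device has.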

This change presents a completely new throttling algorithm. Unlike

    [28 lines not shown]
Delta       File
+363 -404   module/zfs/metaslab.c
+110 -155   module/zfs/zio.c
+8 -71      module/zfs/spa.c
+16 -45     include/sys/metaslab_impl.h
+17 -27     man/man4/zfs.4
+1 -32      module/zfs/vdev_queue.c
+515 -734   6 files not shown
+536 -786   12 files

OpenZFS/src 3862ebb .github/workflows zfs-qemu.yml, .github/workflows/scripts qemu-2-start.sh

CI: Remove FreeBSD 13.3 and 14.1 tests (#17162)

They are out of support and we are really low on CI resources.

Signed-off-by:  Alexander Motin <mav at FreeBSD.org>
Sponsored by:   iXsystems, Inc.
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Reviewed-by: George Melikov <mail at gmelikov.ru>
Delta       File
+0 -13      .github/workflows/scripts/qemu-2-start.sh
+3 -3       .github/workflows/zfs-qemu.yml
+3 -16      2 files

OpenZFS/src 45e9b54 module/os/freebsd/spl spl_kstat.c

freebsd/kstat: allow multi-level module names

This extends the existing special-case for zfs/poolname to split and
create any number of intermediate sysctl names, so that multi-level
module names are possible.

Sponsored-by: Klara, Inc.
Sponsored-by: Syneto
Signed-off-by: Rob Norris <rob.norris at klarasystems.com>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Delta       File
+45 -42     module/os/freebsd/spl/spl_kstat.c
+45 -42     1 file

OpenZFS/src d28d2e3 include/os/linux/spl/sys kstat.h, module/os/linux/spl spl-kstat.c

linux/kstat: allow multi-level module names

Module names are mapped directly to directory names in procfs, but
nothing is done to create the intermediate directories, or remove them.
This makes it impossible to sensibly present kstats about sub-objects.

This commit loops through '/'-separated names in the full module name,
creates a separate module for each, and hooks them up with a parent
pointer and child counter, and then unrolls this on the other side when
deleting a module.
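
The create-and-unroll scheme described above can be modelled in a few
lines of Python.  This is a sketch of the idea only, not the SPL code;
the Module and Registry classes and their method names are hypothetical.

```python
class Module:
    def __init__(self, path, parent):
        self.path = path        # full '/'-separated name so far
        self.parent = parent    # parent Module, or None at the top
        self.children = 0       # number of direct child modules


class Registry:
    def __init__(self):
        self.modules = {}

    def get(self, full_name):
        """Create every missing level of full_name; return the leaf."""
        parent, path = None, ""
        for part in full_name.split("/"):
            path = f"{path}/{part}" if path else part
            mod = self.modules.get(path)
            if mod is None:
                mod = Module(path, parent)
                self.modules[path] = mod
                if parent is not None:
                    parent.children += 1
            parent = mod
        return parent

    def delete(self, mod):
        """Remove mod, then any ancestors left without children."""
        while mod is not None and mod.children == 0:
            del self.modules[mod.path]
            parent = mod.parent
            if parent is not None:
                parent.children -= 1
            mod = parent
```

Creating "zfs/tank/a" and "zfs/tank/b" yields four modules; deleting
the first leaf leaves "zfs/tank" in place because it still has a child,
and deleting the second removes the whole chain.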

Sponsored-by: Klara, Inc.
Sponsored-by: Syneto
Signed-off-by: Rob Norris <rob.norris at klarasystems.com>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Delta       File
+81 -25     module/os/linux/spl/spl-kstat.c
+6 -0       include/os/linux/spl/sys/kstat.h
+87 -25     2 files

OpenZFS/src 5b5a514 tests/zfs-tests/tests/functional/gang_blocks cleanup.ksh gang_blocks.kshlib

zts: add spdx license tags to gang_blocks tests (#17160)

Missed in #17073, probably because that PR was branched before #17001
was landed and never rebased.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.

Signed-off-by: Rob Norris <rob.norris at klarasystems.com>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Delta       File
+1 -0       tests/zfs-tests/tests/functional/gang_blocks/cleanup.ksh
+1 -0       tests/zfs-tests/tests/functional/gang_blocks/gang_blocks.kshlib
+1 -0       tests/zfs-tests/tests/functional/gang_blocks/gang_blocks_redundant.ksh
+1 -0       tests/zfs-tests/tests/functional/gang_blocks/setup.ksh
+4 -0       4 files

OpenZFS/src 9250403 module/zfs zio.c dmu.c, tests/zfs-tests/tests/functional/gang_blocks gang_blocks.kshlib gang_blocks_redundant.ksh

Make ganging redundancy respect redundant_metadata property (#17073)

The redundant_metadata setting in ZFS allows users to trade resilience
for performance and space savings. This applies to all data and metadata
blocks in ZFS, with one exception: gang blocks. Gang blocks currently
just take the copies property of the IO being ganged and, if it's 1,
set it to 2. This means that we always make at least two copies of a
gang header, which is good for resilience. However, if users care
more about performance than resilience, their gang blocks will be even
more of a penalty than usual.

We add logic to calculate the number of gang header copies directly,
and store it as a separate IO property. This is stored in the IO
properties and not calculated when we decide to gang because by that
point we may not have easy access to the relevant information about what
kind of block is being stored. We also check the redundant_metadata
property when doing so, and use that to decide whether to store an extra
copy of the gang headers, compared to the underlying blocks.


    [6 lines not shown]
Delta       File
+120 -0     tests/zfs-tests/tests/functional/gang_blocks/gang_blocks.kshlib
+88 -0      tests/zfs-tests/tests/functional/gang_blocks/gang_blocks_redundant.ksh
+31 -0      tests/zfs-tests/tests/functional/gang_blocks/cleanup.ksh
+30 -0      tests/zfs-tests/tests/functional/gang_blocks/setup.ksh
+12 -11     module/zfs/zio.c
+20 -1      module/zfs/dmu.c
+301 -12    9 files not shown
+327 -20    15 files

OpenZFS/src 94b9cbb tests/zfs-tests/tests/functional/direct dio_read_verify.ksh

Updating dio_read_verify ZTS test (#16830)

There was a recent CI ZTS test failure on FreeBSD 14 for the
dio_read_verify test case. The failure reported that there were no ARC
reads while the buffer was being manipulated. All checksum verify errors for
Direct I/O reads are rerouted through the ARC, so there should be ARC
reads accounted for. In order to help debug any future failures of this
test case, the order of checks has been changed. First there is a check
for DIO verify failures for the reads and then ARC read counts are
checked.

This PR also contains general cleanup of the comments in the test
script.

Signed-off-by: Brian Atkinson <batkinson at lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Delta       File
+10 -9      tests/zfs-tests/tests/functional/direct/dio_read_verify.ksh
+10 -9      1 file

OpenZFS/src 676b7ef module/zfs vdev_removal.c

Fix deadlock on I/O errors during device removal

spa_vdev_remove_thread() should not hold svr_lock while loading a
metaslab.  Doing so may block the ZIO threads required to handle
metaslab loading, at least when read errors cause recovery writes.

Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Rob Norris <robn at despairlabs.com>
Signed-off-by:  Alexander Motin <mav at FreeBSD.org>
Sponsored by:   iXsystems, Inc.
Closes #17145
Delta       File
+39 -19     module/zfs/vdev_removal.c
+39 -19     1 file

OpenZFS/src 83fa051 module/os/freebsd/spl spl_vfs.c

spl_vfs: fix vrele task runner signature mismatch

Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Signed-off-by: SHENGYI HONG <aokblast at FreeBSD.org>
Closes #17101
Delta       File
+8 -2       module/os/freebsd/spl/spl_vfs.c
+8 -2       1 file