Make zpool status dedup table support raw bytes -p output
Check if -p flag is enabled, and if so print dedup table with raw
bytes. Restructure the logic in zutil_pool to check if -p flag is
enabled before printing either the bytes or raw numbers.
Calls to print the data for DDT now all use
zfs_nicenum_format(). Increased DDT histogram column buffers
to 32 bytes to prevent truncation when -p is enabled.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: Adi Gollamudi <adigollamudi at gmail.com>
Closes #11626
Closes #17926
README: add FreeBSD 14.4-RELEASE alongside 15.0
14.4 was announced yesterday. Also link to part of
FreeBSD's security page.
https://www.freebsd.org/security/#sup
presents a fairly comprehensive table of supported releases.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Rob Norris <robn at despairlabs.com>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: Graham Perrin <grahamperrin at gmail.com>
Closes #18304
ZTS: redundancy_draid_spare{1,3} exceptions
Update the redundancy_draid_spare1 exception to reference an issue
which describes the failure.
Remove the exception for the redundancy_draid_spare3 test. I have
not observed it in local testing. If it reproduces in the CI we
can create a new issue for it and put back the exception.
Signed-off-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Closes #18308
ZVOL: Restrict cloning with different properties
While technically its not a problem to clone between ZVOLs with
different properties, it might create expectation of new properties
being applied during data move, while actually it won't happen.
For copies and checksum it may mean incorrect safety expectations.
For dedup, compression and special_small_blocks -- performance and
space usage.
This is a replica of #18180 from FS.
Reviewed-by: Ameer Hamza <ahamza at ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin at TrueNAS.com>
Closes #18315
ZVOL: Add encryption key check for block cloning
Somehow during block cloning porting from file systems was missed
the check for identical encryption keys. As result, blocks cloned
between unrelated ZVOLs produced authentication errors on later
reads. Having same or different encryption root does not matter.
This patch copies dmu_objset_crypto_key_equal() call from FS side.
Reviewed-by: Ameer Hamza <ahamza at ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin at TrueNAS.com>
Closes #18315
Fix the send --exclude option to work with encryption
When using --exclude, filtering needs to take place in two places:
in zfs_main.c via the callback previously added to support the
options, and in libzfs_sendrecv.c because it generates the nvlist
during a first pass, and that results in it complaining if the
excluded dataset is not available for sending. (eg, excluding an
encrypted dataset so you don't have to use --raw wouldn't work,
because the first pass would look at the dataset and decide you
couldn't use it.) Add send --exclude tests, including one that tests
excluding an encrypted hierarchy.
Reviewed-by: Allan Jude <allan at klarasystems.com>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Sean Eric Fagan <sef at kithrup.ie>
Closes #18278
zstream: add a drop_record subcommand
It can be used to drop extraneous records in a send stream caused by a
corrupt dataset, as in issue #18239.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Alan Somers <asomers at gmail.com>
Sponsored by: ConnectWise
Closes #18275
libzfs: scrub: only include start and end nv pairs if needed for scrub
This patch addresses running `zpool scrub <pool>` with ZFS 2.4 userspace
while the loaded kernel module is still 2.3, failing with:
```
cannot scrub <pool>: the loaded zfs module does not support an option
for this operation. A reboot may be required to enable this option.
```
Checking for the source of the message via `strace` showed the scrub
ioctl failing and setting errno to ZFS_ERR_IOC_ARG_UNAVAIL[0]. With
that and the comments in `module/zfs/zfs_ioctl.c`[1] commit: 894edd084
seemed like a likely cause for the backward incompatibility.
The corresponding kernelspace code in `module/zfs/zfs_ioctl.c` defaults
to a setting of 0 if either parameter is not set, so not providing the
nvpairs in case both are 0 should not make a semantic difference.
Tested by:
[19 lines not shown]
Add the --file-layout (-f) option to zdb(8)
Displays the physical raidz block layout for a given file.
This leverages the internal vdev_raidz_map_alloc() function to find
the map of how the block data is laid out across the child disks.
The column entry for each row looks like:
+------------+
| D2 43 |
| 6020da |
+------------+
representing here the logical data column 2 that is 43 sectors high
starting at sector 0x6020da.
With -H, the output is a list of disks, LBAs, and block counts,
given in 512 byte block values.
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
[4 lines not shown]
config: fix STATX_MNT_ID detection
statx(2) requires _GNU_SOURCE to be defined in order for sys/stat.h to
produce a definition for struct statx and the STATX_* defines. We get
that at compile time because we pass -D_GNU_SOURCE through to
everything, but in the configure check we aren't setting _GNU_SOURCE, so
we don't find STATX_MNT_ID, and so don't set HAVE_STATX_MNT_ID.
(This was fine before ccf5a8a6fc, because linux/stat.h does not require
_GNU_SOURCE).
Simple fix: in the check, define _GNU_SOURCE before including
sys/stat.h.
Sponsored-by: TrueNAS
Reviewed-by: Ameer Hamza <ahamza at ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18312
contrib/debian: add zilstat.1 manpage to installation list
Missed in 65165df129, which caused Debian packaging to fail.
Sponsored-by: TrueNAS
Reviewed-by: Ameer Hamza <ahamza at ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18313
zpool clear: remove undocumented rewind flags
Remove the -F, -n, and -X flags from zpool clear. These flags were
inherited from OpenSolaris but are not applicable in this context.
Unlike zpool import, where the pool is not yet loaded and a specific
TXG can be selected, zpool clear operates on an already imported pool
whose in-memory state is ahead of what is on disk. Rewinding
transactions would require force-exporting the pool first.
The rewind policy passed to zpool_clear() is now always
ZPOOL_NO_REWIND.
Tested on FreeBSD 16.0-CURRENT (amd64). Verified that -F, -n, and
-X are properly rejected as invalid options and that the usage output
reflects the change.
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Christos Longros <chris.longros at gmail.com>
Closes #13825
Closes #18300
zilstat: add man page
The zilstat command has no man page. Add zilstat.1 documenting all
options and field definitions based on the source in cmd/zilstat.in.
Reviewed-by: Ameer Hamza <ahamza at ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Christos Longros <chris.longros at gmail.com>
Closes #18303
draid: fix data corruption after disk clear
Currently, when there there are several faulted disks with attached
dRAID spares, and one of those disks is cleared from errors (zpool
clear), followed by its spare being detached, the data in all the
remaining spares that were attached while the cleared disk was in
FAULTED state might get corrupted (which can be seen by running scrub).
In some cases, when too many disks get cleared at a time, this can
result in data corruption/loss.
dRAID spare is a virtual device whose blocks are distributed among
other disks. Those disks can be also in FAULTED state with attached
spares on their own. When a disk gets sequentially resilvered (rebuilt),
the changes made by that resilvering won't get captured in the DTL
(Dirty Time Log) of other FAULTED disks with the attached spares to
which the data is written during the resilvering (as it would normally
be done for the changes made by the user if a new file is written or
some existing one is deleted). It is because sequential resilvering
works on the block level, without touching or looking into metadata,
[40 lines not shown]
libspl/mnttab: remove struct extmnttab
The two additional fields are never used by calling code, and we can
replace their sole internal use with an extra stack param.
Sponsored-by: TrueNAS
Reviewed-by: Ameer Hamza <ahamza at ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18296
libzfs/mnttab: shorten names, reorg a bit
We can't change the public interface, but internally we don't need so
much redundant naming.
Sponsored-by: TrueNAS
Reviewed-by: Ameer Hamza <ahamza at ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18296
libzfs/mnttab: lift mnttab cache into separate file
Sponsored-by: TrueNAS
Reviewed-by: Ameer Hamza <ahamza at ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18296
libzfs/mnttab: always enable the cache
There's no real reason not to enable it always; the `zfs` command always
enables it anyway, and right now there's multiple places that do mount
work that don't go through the cache anyway. Having it always be on lets
us remove a bunch of the fallback code.
Sponsored-by: TrueNAS
Reviewed-by: Ameer Hamza <ahamza at ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18296
libspl/mnttab: make mnttab source filenames consistent
FreeBSD's getextmntent.c is only separate because it has a different
license to mnttab.c, otherwise it would go there too.
Sponsored-by: TrueNAS
Reviewed-by: Ameer Hamza <ahamza at ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18296
libzfs/mnttab: lift node alloc/free
Sponsored-by: TrueNAS
Reviewed-by: Ameer Hamza <ahamza at ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18296
abi: updates for mnttab cleanup
Sponsored-by: TrueNAS
Reviewed-by: Ameer Hamza <ahamza at ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18296
libspl/mnttab: remove getmntany()
Only used for when the mount cache was disabled, but since its always
enabled now, we don't need it.
Sponsored-by: TrueNAS
Reviewed-by: Ameer Hamza <ahamza at ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18296
libzfs/mnttab: use SPL mutexes
More consistent, less typing, and we can check ownership.
Sponsored-by: TrueNAS
Reviewed-by: Ameer Hamza <ahamza at ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <rob.norris at truenas.com>
Closes #18296
fix libzfs diff mem leak in an error path
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: Alek Pinchuk <apinchuk at axcient.com>
Closes #18301
L2ARC: Fix prev_hdr use-after-free in l2arc_write_sublist
prev_hdr is dereferenced after the sublist lock is dropped for write I/O
but nothing prevents it from being freed during that window. Eliminate
prev_hdr entirely and simplify persistent marker repositioning logic.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: Ameer Hamza <ahamza at ixsystems.com>
Closes #18289
man: Update L2ARC documentation for depth cap and write budget fairness
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: Ameer Hamza <ahamza at ixsystems.com>
Closes #18289
L2ARC: Write budget fairness for metadata monopolization
Under heavy metadata load, metadata passes can monopolize the write
budget every cycle while data passes get nothing written. Track
consecutive monopolized cycles per device in l2ad_meta_cycles. After
l2arc_meta_cycles (default 2) consecutive cycles where metadata fills
the write budget, skip metadata for one cycle to let data run. Reset
the counter when nothing is written.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: Ameer Hamza <ahamza at ixsystems.com>
Closes #18289
L2ARC: Scan-based depth cap for persistent markers
With persistent markers and inclusive scanning, the marker traverses the
entire ARC state across many feed cycles, writing buffers far from the
tail that may no longer be relevant.
Track cumulative bytes scanned per pass in l2arc_ext_scanned. When scans
reach l2arc_ext_headroom_pct (default 25%) of the ARC state size, reset
the pass markers to the tail via lazy reset flags. This keeps markers
focused on the tail zone where buffers soon to be evicted have the most
value for L2ARC.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: Ameer Hamza <ahamza at ixsystems.com>
Closes #18289
L2ARC: Lazy sublist reset flags for persistent markers
Replace direct marker-to-tail manipulation with per-sublist boolean
flags consumed lazily by feed threads. Each scanning thread resets its
own marker when it sees the flag, rather than having another thread
manipulate the marker directly.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: Ameer Hamza <ahamza at ixsystems.com>
Closes #18289
L2ARC: Even sublist headroom distribution with round-robin selection
The dynamic headroom redistribution formula gave later sublists
progressively larger scanning budgets, and random sublist selection
caused uneven coverage across sublists. For depth cap to work
effectively, each sublist should be equally and fairly treated.
Use equal per-sublist headroom (headroom / num_sublists) for even
distribution and deterministic round-robin selection for fair
coverage across cycles.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: Ameer Hamza <ahamza at ixsystems.com>
Closes #18289