OpenZFS/src ef81812 man/man4 zfs.4, man/man7 vdevprops.7

Fix spelling errors

Unlike some of my other fixes, which are more subtle, these are
unambiguously spelling errors.

Signed-off-by: Simon Howard <fraggle at gmail.com>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
DeltaFile
+4 -4 man/man4/zfs.4
+2 -2 man/man7/vdevprops.7
+1 -1 man/man8/zpool-events.8
+7 -7 3 files

OpenZFS/src 1d4505d man/man4 spl.4 zfs.4, man/man7 zfsprops.7

Capitalize in various places where appropriate

These are mostly acronyms (CPUs; ZILs) but also proper nouns such as
"Unix" and "Unicode" which should also be capitalized.

Signed-off-by: Simon Howard <fraggle at gmail.com>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
DeltaFile
+3 -3 man/man4/spl.4
+3 -3 man/man7/zfsprops.7
+2 -2 man/man4/zfs.4
+1 -1 man/man8/zpool.8
+1 -1 man/man8/zfs-hold.8
+10 -10 5 files

OpenZFS/src 73494f3 man/man4 spl.4 zfs.4, man/man7 zpool-features.7

Make use of "i.e." (id est) consistent

This is the most common way it is written throughout the manpages, but
there are a few cases where it is written slightly differently.

Signed-off-by: Simon Howard <fraggle at gmail.com>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
DeltaFile
+1 -1 man/man4/spl.4
+1 -1 man/man4/zfs.4
+1 -1 man/man7/zpool-features.7
+3 -3 3 files

OpenZFS/src 530ddcd man/man1 ztest.1, man/man4 zfs.4

Harmonize on American spelling in several places

Most of the documentation is written in American English, so it makes
sense to be consistent.

Signed-off-by: Simon Howard <fraggle at gmail.com>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
DeltaFile
+3 -3 man/man4/zfs.4
+1 -1 man/man7/zpoolconcepts.7
+1 -1 man/man1/ztest.1
+1 -1 man/man8/zed.8.in
+1 -1 man/man8/zpool-attach.8
+1 -1 man/man8/zpool-remove.8
+8 -8 1 files not shown
+9 -9 7 files

OpenZFS/src 94a3fab include/sys metaslab_impl.h, man/man4 zfs.4

Unified allocation throttling (#17020)

Existing allocation throttling had a goal to improve write speed
by allocating more data to vdevs that are able to write it faster.
But in the process it completely broke the original mechanism,
designed to balance vdev space usage.  With severe vdev space use
imbalance it is possible that some vdevs with higher use start growing
fragmentation sooner than others and, after getting full, will stop
any writes at all.  Also, after vdev addition it might take a very
long time for the pool to restore the balance, since the new vdev does
not have any real preference, unless the old one is already much
slower due to fragmentation.  The old throttling was also request-
based, which was unpredictable with block sizes varying from 512B
to 16MB, nor did it make much sense in case of I/O aggregation,
when its 32-100 requests could be aggregated into a few, leaving the
device underutilized, submitting fewer and/or shorter requests,
or in the opposite case trying to queue up to 1.6GB of writes per device.

This change presents a completely new throttling algorithm. Unlike

    [28 lines not shown]
DeltaFile
+363 -404 module/zfs/metaslab.c
+110 -155 module/zfs/zio.c
+8 -71 module/zfs/spa.c
+16 -45 include/sys/metaslab_impl.h
+17 -27 man/man4/zfs.4
+1 -32 module/zfs/vdev_queue.c
+515 -734 6 files not shown
+536 -786 12 files

OpenZFS/src eb9098e tests/zfs-tests/tests/functional/inheritance inherit_001_pos.ksh state001.cfg

SPDX: license tags: CDDL-1.0

Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn at despairlabs.com>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
DeltaFile
+1 -0 tests/zfs-tests/tests/functional/inheritance/inherit_001_pos.ksh
+1 -0 tests/zfs-tests/tests/functional/inheritance/state001.cfg
+1 -0 tests/zfs-tests/tests/functional/inheritance/state002.cfg
+1 -0 tests/zfs-tests/tests/functional/inheritance/state003.cfg
+1 -0 tests/zfs-tests/tests/functional/inheritance/state004.cfg
+1 -0 tests/zfs-tests/tests/functional/inheritance/state005.cfg
+6 -0 2,910 files not shown
+2,916 -0 2,916 files

OpenZFS/src 1b495ee include/sys ddt.h, man/man4 zfs.4

FDT dedup log sync  -- remove incremental

This PR condenses the FDT dedup log syncing into a single sync
pass. This reduces the overhead of modifying indirect blocks for the
dedup table multiple times per txg. In addition, changes were made to
the formula for how much to sync per txg. We now also consider the
backlog we have to clear, to prevent it from growing too large, or
remaining large on an idle system.
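The pacing idea above can be sketched in a few lines (illustrative Python only; the names, constants, and formula are invented here, not taken from module/zfs/ddt.c):

```python
def entries_to_sync(incoming, backlog, clear_txgs=10):
    """Toy pacing model: each txg, sync at least the newly logged
    entries (so the backlog stops growing), plus a slice of the
    existing backlog sized to clear it within ~clear_txgs txgs
    (so it also drains on an idle system)."""
    backlog_slice = -(-backlog // clear_txgs) if backlog else 0  # ceil division
    return incoming + backlog_slice

# A busy system keeps pace with incoming entries while chipping at the backlog:
assert entries_to_sync(1000, 5000) == 1500
# An idle system (no incoming entries) still drains a lingering backlog:
assert entries_to_sync(0, 37) == 4
```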

Sponsored-by: Klara, Inc.
Sponsored-by: iXsystems, Inc.
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Authored-by: Don Brady <don.brady at klarasystems.com>
Authored-by: Paul Dagnelie <paul.dagnelie at klarasystems.com>
Signed-off-by: Paul Dagnelie <paul.dagnelie at klarasystems.com>
Closes #17038
DeltaFile
+179 -163 module/zfs/ddt.c
+109 -0 tests/zfs-tests/tests/functional/dedup/dedup_fdt_pacing.ksh
+50 -31 man/man4/zfs.4
+10 -0 module/zfs/vdev_queue.c
+2 -5 include/sys/ddt.h
+2 -2 tests/runfiles/common.run
+352 -201 7 files not shown
+366 -202 13 files

OpenZFS/src 2adca17 man/man4 zfs.4, module/zfs metaslab.c

Expand fragmentation table to reflect larger possible allocation sizes

When you are using large recordsizes in conjunction with raidz, with
incompressible data, you can pretty reliably be making 21 MB
allocations. Unfortunately, the fragmentation metric in ZFS considers
any metaslabs with 16 MB free chunks completely unfragmented, so you can
have a metaslab report 0% fragmented and be unable to satisfy an
allocation. When using the segment-based metaslab weight, this is
inconvenient; when using the space-based one, it can seriously degrade
performance.

We expand the fragmentation table to extend up to 512MB, and redefine
the table size based on the actual table, rather than having a static
define. We also tweak the one variable that depends on fragmentation
directly.
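The saturation problem can be shown with a toy model of the metric (illustrative only; the real stepped lookup table and its values live in module/zfs/metaslab.c):

```python
OLD_TABLE_MAX = 16 << 20   # old table: a 16 MiB free segment == 0% fragmented
NEW_TABLE_MAX = 512 << 20  # expanded table extends the axis to 512 MiB

def fragmentation_pct(largest_free_segment, table_max):
    """Toy model: fragmentation falls linearly to 0% as the largest free
    segment approaches the table's top bucket (ZFS really uses a stepped
    lookup table, but the saturation effect is the same)."""
    if largest_free_segment >= table_max:
        return 0
    return round(100 * (1 - largest_free_segment / table_max))

# A metaslab whose largest free chunk is 16 MiB reports 0% fragmentation
# under the old table, yet cannot satisfy a 21 MiB raidz allocation:
assert fragmentation_pct(16 << 20, OLD_TABLE_MAX) == 0
# With the expanded axis, the same metaslab reports real fragmentation:
assert fragmentation_pct(16 << 20, NEW_TABLE_MAX) == 97
```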

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Allan Jude <allan at klarasystems.com>

    [2 lines not shown]
DeltaFile
+31 -26 module/zfs/metaslab.c
+1 -1 man/man4/zfs.4
+32 -27 2 files

OpenZFS/src e6c98d1 man/man4 zfs.4, man/man7 zpool-features.7 vdevprops.7

Fix several typos in the man pages

Reviewed-by: George Amanakis <gamanakis at gmail.com>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Signed-off-by: Tim Smith <tsmith84 at gmail.com>
Closes #16965
DeltaFile
+4 -4 man/man4/zfs.4
+2 -2 man/man7/zpool-features.7
+1 -1 man/man7/vdevprops.7
+1 -1 man/man8/zfs.8
+1 -1 man/man8/zpool-initialize.8
+1 -1 man/man8/zpool-status.8
+10 -10 6 files

OpenZFS/src 4049651 man/man4 zfs.4, module/zfs metaslab.c

Expand fragmentation table to reflect larger possible allocation sizes

When you are using large recordsizes in conjunction with raidz, with
incompressible data, you can pretty reliably be making 21 MB
allocations. Unfortunately, the fragmentation metric in ZFS considers
any metaslabs with 16 MB free chunks completely unfragmented, so you can
have a metaslab report 0% fragmented and be unable to satisfy an
allocation. When using the segment-based metaslab weight, this is
inconvenient; when using the space-based one, it can seriously degrade
performance.

We expand the fragmentation table to extend up to 512MB, and redefine
the table size based on the actual table, rather than having a static
define. We also tweak the one variable that depends on fragmentation
directly.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Allan Jude <allan at klarasystems.com>

    [2 lines not shown]
DeltaFile
+31 -26 module/zfs/metaslab.c
+1 -1 man/man4/zfs.4
+32 -27 2 files

OpenZFS/src b8c0c15 man/man4 zfs.4, man/man7 zpool-features.7 vdevprops.7

Fix several typos in the man pages

Reviewed-by: George Amanakis <gamanakis at gmail.com>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Signed-off-by: Tim Smith <tsmith84 at gmail.com>
Closes #16965
DeltaFile
+4 -4 man/man4/zfs.4
+2 -2 man/man7/zpool-features.7
+1 -1 man/man8/zfs.8
+1 -1 man/man7/vdevprops.7
+1 -1 man/man8/zpool-initialize.8
+1 -1 man/man8/zpool-status.8
+10 -10 6 files

OpenZFS/src c2d9494 man/man4 zfs.4, module/os/linux/zfs arc_os.c

set zfs_arc_shrinker_limit to 0 by default

zfs_arc_shrinker_limit was introduced to avoid ARC collapse due to
aggressive kernel reclaim. While useful, the current default (10000) is
too prone to OOM especially when MGLRU-enabled kernels with default
min_ttl_ms are used. Even when no OOM happens, it often causes too much
swap usage.

This patch sets zfs_arc_shrinker_limit=0 to not ignore kernel reclaim
requests. ARC now plays better with both kernel shrinker and pagecache
but, should ARC collapse happen again, MGLRU behavior can be tuned or
even disabled.

Anyway, zfs should not cause OOM when ARC can be released.
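The new default can also be applied to a running system, or the old one restored, through the module parameter; a sketch, assuming a Linux system with the zfs module loaded:

```shell
# Follow kernel reclaim requests without a cap (the new default):
echo 0 > /sys/module/zfs/parameters/zfs_arc_shrinker_limit

# Revert to the previous default if ARC collapse is observed:
echo 10000 > /sys/module/zfs/parameters/zfs_arc_shrinker_limit
```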

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Signed-off-by: Gionatan Danti <g.danti at assyoma.it>
Closes #16909
DeltaFile
+2 -2 man/man4/zfs.4
+2 -2 module/os/linux/zfs/arc_os.c
+4 -4 2 files

OpenZFS/src 54126fd man/man4 zfs.4, module/os/linux/zfs arc_os.c

set zfs_arc_shrinker_limit to 0 by default

zfs_arc_shrinker_limit was introduced to avoid ARC collapse due to
aggressive kernel reclaim. While useful, the current default (10000) is
too prone to OOM especially when MGLRU-enabled kernels with default
min_ttl_ms are used. Even when no OOM happens, it often causes too much
swap usage.

This patch sets zfs_arc_shrinker_limit=0 to not ignore kernel reclaim
requests. ARC now plays better with both kernel shrinker and pagecache
but, should ARC collapse happen again, MGLRU behavior can be tuned or
even disabled.

Anyway, zfs should not cause OOM when ARC can be released.

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Signed-off-by: Gionatan Danti <g.danti at assyoma.it>
Closes #16909
DeltaFile
+2 -2 man/man4/zfs.4
+2 -2 module/os/linux/zfs/arc_os.c
+4 -4 2 files

OpenZFS/src 022bf86 man/man4 zfs.4, module/zfs arc.c

Increase L2ARC write rate and headroom

Current L2ARC write rate and headroom parameters are very conservative:
l2arc_write_max=8M and l2arc_headroom=2 (i.e. a full L2ARC writes at
8 MB/s, scanning 16/32 MB of ARC tail each time; a warming L2ARC runs
at 2x these rates).

These values were selected 15+ years ago based on then-current SSD
size, performance, and endurance. Today we have multi-TB, fast and
cheap SSDs which can sustain much higher read/write rates.

For this reason, this patch increases l2arc_write_max to 32M and
l2arc_headroom to 8 (4x increase for both).
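The new rates can be tried on a live system before upgrading by setting the tunables at runtime; a sketch, assuming Linux module parameter paths (l2arc_write_max is in bytes):

```shell
echo $((32 * 1024 * 1024)) > /sys/module/zfs/parameters/l2arc_write_max
echo 8 > /sys/module/zfs/parameters/l2arc_headroom
```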

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Signed-off-by: Gionatan Danti <g.danti at assyoma.it>
Closes #15457 
DeltaFile
+3 -3 man/man4/zfs.4
+2 -2 module/zfs/arc.c
+5 -5 2 files

OpenZFS/src 2bd540d man/man4 zfs.4, man/man7 zfsprops.7

man: update recordsize max size info

Reflect https://github.com/openzfs/zfs/commit/f2330bd1568489ae1fb16d975a5a9bcfe12ed219
change in our man pages and add some context.

Wording is primarily copy-pasted from code comments.

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs at mcmilk.de>
Signed-off-by: George Melikov <mail at gmelikov.ru>
Closes #16581 
DeltaFile
+12 -1 man/man7/zfsprops.7
+5 -0 man/man4/zfs.4
+17 -1 2 files

OpenZFS/src 880b739 man/man4 zfs.4, module/zfs zfs_vnops.c

zfs(4): remove "experimental" from zfs_bclone_enabled

I think we've done enough experiments.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Reviewed-by: George Melikov <mail at gmelikov.ru>
Signed-off-by: Rob Norris <robn at despairlabs.com>
Closes #16189 
Closes #16712 
DeltaFile
+4 -3 man/man4/zfs.4
+3 -3 module/zfs/zfs_vnops.c
+7 -6 2 files

OpenZFS/src b0cfb48 cmd/zed/zed.d deadman-slot_off.sh zed.rc, man/man4 zfs.4

zed: Add deadman-slot_off.sh zedlet

Optionally turn off a disk's enclosure slot if a hung I/O
triggers the deadman.

It's possible for outstanding I/O to a misbehaving SCSI disk to
neither promptly complete nor return an error.  This can occur due
to retry and recovery actions taken by the SCSI layer, driver, or
disk.  When this occurs the pool will be unresponsive even though
there may be sufficient redundancy configured to proceed without
this single disk.

When a hung I/O is detected by the kmods it will be posted as a
deadman event.  By default an I/O is considered to be hung after
5 minutes.  This value can be changed with the zfs_deadman_ziotime_ms
module parameter.  If ZED_POWER_OFF_ENCLOSURE_SLOT_ON_DEADMAN is set
the disk's enclosure slot will be powered off causing the outstanding
I/O to fail.  The ZED will then handle this like a normal disk failure.
By default ZED_POWER_OFF_ENCLOSURE_SLOT_ON_DEADMAN is not set.
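Opting in is then a matter of setting the variable in zed.rc (the value 1 shown here is an assumption; the commit only says the variable must be set):

```shell
# /etc/zfs/zed.d/zed.rc
ZED_POWER_OFF_ENCLOSURE_SLOT_ON_DEADMAN=1
```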

    [10 lines not shown]
DeltaFile
+71 -0 cmd/zed/zed.d/deadman-slot_off.sh
+12 -9 man/man4/zfs.4
+9 -1 module/zfs/vdev.c
+4 -4 tests/zfs-tests/tests/functional/deadman/deadman_ratelimit.ksh
+7 -0 cmd/zed/zed.d/zed.rc
+2 -0 cmd/zed/zed.d/Makefile.am
+105 -14 1 files not shown
+106 -14 7 files

OpenZFS/src 91bd12d man/man4 zfs.4, module/zfs zfs_vnops.c

zfs(4): remove "experimental" from zfs_bclone_enabled

I think we've done enough experiments.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Reviewed-by: George Melikov <mail at gmelikov.ru>
Signed-off-by: Rob Norris <robn at despairlabs.com>
Closes #16189 
Closes #16712 
DeltaFile
+4 -3 man/man4/zfs.4
+3 -3 module/zfs/zfs_vnops.c
+7 -6 2 files

OpenZFS/src 26ecd8b include/sys zio.h, module/os/freebsd/zfs abd_os.c

Always validate checksums for Direct I/O reads

This fixes an oversight in the Direct I/O PR. There is nothing that
stops a process from manipulating the contents of a buffer for a
Direct I/O read while the I/O is in flight. This can lead to checksum
verification failures. However, the disk contents are still correct, and this
would lead to false reporting of checksum validation failures.

To remedy this, all Direct I/O reads that have a checksum verification
failure are treated as suspicious. In the event a checksum validation
failure occurs for a Direct I/O read, then the I/O request will be
reissued through the ARC. This allows for actual validation to happen and
removes any possibility of the buffer being manipulated after the I/O
has been issued.

Just as with Direct I/O write checksum validation failures, Direct I/O
read checksum validation failures are reported through zpool status -d in
the DIO column. The zevent has also been updated to have both:
1. dio_verify_wr -> Checksum verification failure for writes

    [35 lines not shown]
DeltaFile
+121 -59 tests/zfs-tests/cmd/manipulate_user_buffer.c
+85 -35 module/zfs/zio.c
+107 -0 tests/zfs-tests/tests/functional/direct/dio_read_verify.ksh
+39 -5 module/zfs/vdev_raidz.c
+37 -4 module/os/freebsd/zfs/abd_os.c
+15 -14 include/sys/zio.h
+404 -117 18 files not shown
+510 -146 24 files

OpenZFS/src b4e4cbe include/sys zio.h, module/os/freebsd/zfs abd_os.c

Always validate checksums for Direct I/O reads

This fixes an oversight in the Direct I/O PR. There is nothing that
stops a process from manipulating the contents of a buffer for a
Direct I/O read while the I/O is in flight. This can lead to checksum
verification failures. However, the disk contents are still correct, and this
would lead to false reporting of checksum validation failures.

To remedy this, all Direct I/O reads that have a checksum verification
failure are treated as suspicious. In the event a checksum validation
failure occurs for a Direct I/O read, then the I/O request will be
reissued through the ARC. This allows for actual validation to happen and
removes any possibility of the buffer being manipulated after the I/O
has been issued.

Just as with Direct I/O write checksum validation failures, Direct I/O
read checksum validation failures are reported through zpool status -d in
the DIO column. The zevent has also been updated to have both:
1. dio_verify_wr -> Checksum verification failure for writes

    [35 lines not shown]
DeltaFile
+121 -59 tests/zfs-tests/cmd/manipulate_user_buffer.c
+85 -35 module/zfs/zio.c
+107 -0 tests/zfs-tests/tests/functional/direct/dio_read_verify.ksh
+39 -5 module/zfs/vdev_raidz.c
+37 -4 module/os/freebsd/zfs/abd_os.c
+15 -14 include/sys/zio.h
+404 -117 18 files not shown
+510 -146 24 files

OpenZFS/src 0d77e73 man/man4 zfs.4, module/zfs dsl_scan.c

Defer resilver only when progress is above a threshold

Restart a resilver from scratch if the current one's progress is
below a new tunable, zfs_resilver_defer_percent (defaulting to 10%).

The original rationale for deferring additional resilvers, when there is
already one in progress, was to help achieve data redundancy sooner
for the data that gets scanned at the end of the resilver.

But when an admin wants to attach multiple disks to a single vdev,
it wasn't immediately obvious that they are supposed to run
`zpool resilver` afterwards to reset the deferred resilvers and start
a new one from scratch.
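Both knobs involved can be exercised from the shell; a sketch, assuming a Linux module parameter path and an illustrative pool name:

```shell
# Threshold (in percent) below which an in-progress resilver is
# restarted rather than a new one being deferred:
echo 10 > /sys/module/zfs/parameters/zfs_resilver_defer_percent

# Manually restart deferred resilvers from scratch, as before:
zpool resilver tank
```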

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Pavel Snajdr <snajpa at snajpa.net>
Closes #15810 
DeltaFile
+44 -19 module/zfs/dsl_scan.c
+10 -1 tests/zfs-tests/tests/functional/replacement/resilver_restart_001.ksh
+7 -0 man/man4/zfs.4
+1 -0 tests/zfs-tests/include/tunables.cfg
+62 -20 4 files

OpenZFS/src 224393a lib/libzfs libzfs.abi, man/man7 zpool-features.7

feature: large_microzap

In a4b21eadec we added the zap_micro_max_size tuneable to raise the size
at which "micro" (single-block) ZAPs are upgraded to "fat" (multi-block)
ZAPs. Before this, a microZAP was limited to 128KiB, which was the old
largest block size. The side effect of raising the max size past 128KiB
is that it must be stored in a large block, requiring the large_blocks
feature.

Unfortunately, this means that a backup stream created without the
--large-block (-L) flag to zfs send would split the microZAP block into
smaller blocks and send those, as is normal behaviour for large blocks.
This would be received correctly, but since microZAPs are limited to the
first block in the object by definition, the entries in the later blocks
would be inaccessible. For directory ZAPs, this gives the appearance of
files being lost.

This commit adds a feature flag, large_microzap, that must be enabled
for microZAPs to grow beyond 128KiB, and which will be activated the

    [38 lines not shown]
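As with other feature flags, the new feature is enabled per pool with the standard zpool syntax (pool name is illustrative):

```shell
zpool set feature@large_microzap=enabled tank
zpool get feature@large_microzap tank
```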
DeltaFile
+53 -2 module/zfs/zap_micro.c
+20 -3 man/man7/zpool-features.7
+22 -1 module/zfs/dmu_recv.c
+14 -1 module/zcommon/zfeature_common.c
+12 -1 module/zfs/dmu_send.c
+6 -5 lib/libzfs/libzfs.abi
+127 -13 10 files not shown
+162 -22 16 files

OpenZFS/src d34d4f9 man/man4 zfs.4, man/man7 zfsprops.7

snapdir: add 'disabled' value to make .zfs inaccessible

In some environments, just making the .zfs control dir hidden from sight
might not be enough. In particular, the following scenarios might
warrant not allowing access at all:
- old snapshots with wrong permissions/ownership
- old snapshots with exploitable setuid/setgid binaries
- old snapshots with sensitive contents

Introducing a new 'disabled' value that not only hides the control dir,
but prevents access to its contents by returning ENOENT solves all of
the above.

The new property value takes advantage of 'iuv' semantics ("ignore
unknown value") to automatically fall back to the old default value when
a pool is accessed by an older version of ZFS that doesn't yet know
about 'disabled' semantics.
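The new value slots in next to the existing ones (dataset name is illustrative):

```shell
# Existing values are hidden (the default) and visible; the new one
# additionally returns ENOENT for the control dir's contents:
zfs set snapdir=disabled tank/home
zfs get snapdir tank/home
```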

I think that technically the zfs_dirlook change is enough to prevent

    [16 lines not shown]
DeltaFile
+17 -5 module/os/linux/zfs/zfs_ctldir.c
+9 -0 man/man4/zfs.4
+3 -3 man/man7/zfsprops.7
+5 -0 module/os/linux/zfs/zfs_vfsops.c
+4 -0 module/os/linux/zfs/zpl_ctldir.c
+4 -0 module/zfs/dsl_prop.c
+42 -8 10 files not shown
+56 -15 16 files

OpenZFS/src 5591505 man/man4 zfs.4, man/man7 zfsprops.7

man: update recordsize max size info

Reflect https://github.com/openzfs/zfs/commit/f2330bd1568489ae1fb16d975a5a9bcfe12ed219
change in our man pages and add some context.

Wording is primarily copy-pasted from code comments.

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs at mcmilk.de>
Signed-off-by: George Melikov <mail at gmelikov.ru>
Closes #16581 
DeltaFile
+12 -1 man/man7/zfsprops.7
+5 -0 man/man4/zfs.4
+17 -1 2 files

OpenZFS/src a10e552 lib/libzfs libzfs.abi, module/os/linux/zfs zfs_uio.c

Adding Direct IO Support

Adding O_DIRECT support to ZFS to bypass the ARC for writes/reads.

O_DIRECT support in ZFS will always ensure there is coherency between
buffered and O_DIRECT IO requests. This ensures that all IO requests,
whether buffered or direct, will see the same file contents at all
times. Just as in other filesystems, O_DIRECT does not imply O_SYNC. While
data is written directly to VDEV disks, metadata will not be synced
until the associated TXG is synced.
For both O_DIRECT read and write requests, the offset and request sizes,
at a minimum, must be PAGE_SIZE aligned. In the event they are not,
then EINVAL is returned unless the direct property is set to always (see
below).
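The property mentioned above is set with the usual zfs syntax (dataset name is illustrative):

```shell
# standard: honor O_DIRECT and its alignment requirements (default)
# always:   treat eligible I/O as direct even without O_DIRECT
# disabled: silently ignore O_DIRECT
zfs set direct=always tank/scratch
```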

For O_DIRECT writes:
The request also must be block aligned (recordsize) or the write
request will take the normal (buffered) write path. In the event that
request is block aligned and a cached copy of the buffer in the ARC,

    [66 lines not shown]
DeltaFile
+351 -289 lib/libzfs/libzfs.abi
+395 -0 module/zfs/dmu_direct.c
+331 -0 tests/zfs-tests/tests/functional/direct/dio.kshlib
+231 -88 module/zfs/dbuf.c
+293 -2 module/os/linux/zfs/zfs_uio.c
+274 -20 module/zfs/zfs_vnops.c
+1,875 -399 105 files not shown
+5,989 -726 111 files

OpenZFS/src cd42e99 man/man4 zfs.4, module/zfs arc.c

Enable L2 cache of all (MRU+MFU) metadata but MFU data only

`l2arc_mfuonly` was added to avoid wasting L2 ARC on read-once MRU
data and metadata. However it can be useful to cache as much
metadata as possible while, at the same time, restricting data
cache to MFU buffers only.

This patch allows for such behavior by setting `l2arc_mfuonly` to 2
(or higher). The list of possible values is the following:
0: cache both MRU and MFU for both data and metadata;
1: cache only MFU for both data and metadata;
2: cache both MRU and MFU for metadata, but only MFU for data.
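The new value is selected the same way as the existing ones; a sketch, assuming the Linux module parameter path:

```shell
# Cache MRU+MFU metadata, but only MFU data, in L2ARC:
echo 2 > /sys/module/zfs/parameters/l2arc_mfuonly
```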

Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Gionatan Danti <g.danti at assyoma.it>
Closes #16343 
Closes #16402 
DeltaFile
+10 -4 man/man4/zfs.4
+8 -3 module/zfs/arc.c
+18 -7 2 files

OpenZFS/src bbe8512 man/man4 zfs.4, module/os/linux/zfs arc_os.c

Ignore zfs_arc_shrinker_limit in direct reclaim mode

zfs_arc_shrinker_limit (default: 10000) avoids ARC collapse
due to excessive memory reclaim. However, when the kernel is
in direct reclaim mode (i.e. low on memory), limiting ARC reclaim
increases OOM risk. This is especially true on systems without
(or with inadequate) swap.

This patch ignores zfs_arc_shrinker_limit when the kernel is in
direct reclaim mode, avoiding most OOM. It also restores
"echo 3 > /proc/sys/vm/drop_caches" ability to correctly drop
(almost) all ARC.

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Adam Moss <c at yotes.com>
Signed-off-by: Gionatan Danti <g.danti at assyoma.it>
Closes #16313 
DeltaFile
+3 -3 module/os/linux/zfs/arc_os.c
+1 -0 man/man4/zfs.4
+4 -3 2 files

OpenZFS/src 77a797a man/man4 zfs.4, module/zfs arc.c

Enable L2 cache of all (MRU+MFU) metadata but MFU data only

`l2arc_mfuonly` was added to avoid wasting L2 ARC on read-once MRU
data and metadata. However it can be useful to cache as much
metadata as possible while, at the same time, restricting data
cache to MFU buffers only.

This patch allows for such behavior by setting `l2arc_mfuonly` to 2
(or higher). The list of possible values is the following:
0: cache both MRU and MFU for both data and metadata;
1: cache only MFU for both data and metadata;
2: cache both MRU and MFU for metadata, but only MFU for data.

Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Gionatan Danti <g.danti at assyoma.it>
Closes #16343 
Closes #16402 
DeltaFile
+10 -4 man/man4/zfs.4
+8 -3 module/zfs/arc.c
+18 -7 2 files

OpenZFS/src a60e15d man/man4 zfs.4

Man page updates for dmu_ddt_copies

Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Allan Jude <allan at klarasystems.com>
Closes #15895
DeltaFile
+11 -0 man/man4/zfs.4
+11 -0 1 files

OpenZFS/src cd69ba3 cmd/zdb zdb.c, include/sys ddt_impl.h ddt.h

ddt: dedup log

Adds a log/journal to dedup. At the end of txg, instead of writing
entries directly to the ZAP, they are added to an in-memory tree and
appended to an on-disk object. The on-disk object is only read at
import, to reload the in-memory tree.

Lookups first go to the log tree before going to the ZAP, so
recently-used entries will remain close by in memory. This vastly
reduces overhead from dedup IO, as it will not have to do so many
read/update/write cycles on ZAP leaf nodes.

A flushing facility is added at end of txg, to push logged entries out
to the ZAP. There are actually two separate "logs" (in-memory tree and
on-disk object), one active (receiving updated entries) and one flushing
(writing out to disk). These are swapped (i.e. flushing begins) based on
memory used by the in-memory log trees and time since we last flushed
something.
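The lookup/flush flow described above can be sketched in a few lines (illustrative Python; the names are invented, not ZFS APIs):

```python
class DedupTableSketch:
    """Toy model of the two-tier dedup lookup: an in-memory log tree in
    front of the on-disk ZAP, flushed in bulk at end of txg."""

    def __init__(self):
        self.log_tree = {}  # recently logged entries, kept in memory
        self.zap = {}       # stands in for the on-disk ZAP

    def update(self, key, entry):
        # Updates land in the log, not the ZAP, avoiding a
        # read/update/write cycle on ZAP leaf blocks per entry.
        self.log_tree[key] = entry

    def lookup(self, key):
        # Recently-used entries hit the in-memory tree first.
        if key in self.log_tree:
            return self.log_tree[key]
        return self.zap.get(key)

    def flush(self):
        # Push logged entries out to the ZAP in bulk.
        self.zap.update(self.log_tree)
        self.log_tree.clear()

ddt = DedupTableSketch()
ddt.update(("pool", 0xDEADBEEF), {"refcnt": 2})
assert ddt.lookup(("pool", 0xDEADBEEF)) == {"refcnt": 2}  # from the log tree
ddt.flush()
assert ddt.lookup(("pool", 0xDEADBEEF)) == {"refcnt": 2}  # now from the ZAP
assert not ddt.log_tree
```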


    [16 lines not shown]
DeltaFile
+760 -0 module/zfs/ddt_log.c
+524 -122 module/zfs/ddt.c
+130 -1 include/sys/ddt_impl.h
+82 -0 man/man4/zfs.4
+36 -3 include/sys/ddt.h
+32 -1 cmd/zdb/zdb.c
+1,564 -127 11 files not shown
+1,621 -131 17 files