ZFS on Linux/src 900d09b cmd/ztest ztest.c, include/sys zil_impl.h

OpenZFS 9962 - zil_commit should omit cache thrash

As a result of the changes made in 8585, it's possible for an excessive
number of vdev flush commands to be issued under some workloads.

Specifically, when the workload consists of mostly async write activity,
interspersed with some sync write and/or fsync activity, we can end up
issuing more flush commands to the underlying storage than is actually
necessary. As a result of these flush commands, the write latency and
overall throughput of the pool can be negatively impacted (latency
increases, throughput decreases).

Currently, any time an lwb completes, the vdev(s) written to as a result
of that lwb will be issued a flush command. The intention is to ensure
the data written to that vdev is on stable storage before communicating
to any waiting threads that their data is safe on disk.

The problem with this scheme is that sometimes an lwb will not have any
threads waiting for it to complete. This can occur when there's async
activity that gets "converted" to sync requests, as a result of calling
the zil_async_to_sync() function via zil_commit_impl(). When this
occurs, the current code may issue many lwbs that don't have waiters
associated with them, resulting in many flush commands, potentially to
the same vdev(s).


    [45 lines not shown]

ZFS on Linux/src 53b1f5e man/man5 zfs-module-parameters.5, module/zfs zil.c vdev.c

OpenZFS 9963 - Separate tunable for disabling ZIL vdev flush

Porting Notes:
* Add options to zfs-module-parameters(5) man page.
* zfs_nocacheflush moved to vdev.c instead of vdev_disk.c, since
  the latter doesn't get built for user space.

Authored by: Prakash Surya <prakash.surya at delphix.com>
Reviewed by: Matt Ahrens <matt at delphix.com>
Reviewed by: Brad Lewis <brad.lewis at delphix.com>
Reviewed by: Patrick Mooney <patrick.mooney at joyent.com>
Reviewed by: Tom Caputi <tcaputi at datto.com>
Reviewed by: George Melikov <mail at gmelikov.ru>
Approved by: Dan McDonald <danmcd at joyent.com>
Ported-by: Brian Behlendorf <behlendorf1 at llnl.gov>

OpenZFS-issue: https://www.illumos.org/issues/9963
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/f8fdf68125
Closes #8186

ZFS on Linux/src 18b14b1 module/zfs zio.c

OpenZFS 9993 - zil writes can get delayed in zio pipeline

Authored by: George Wilson <george.wilson at delphix.com>
Reviewed by: Prakash Surya <prakash.surya at delphix.com>
Reviewed by: Brad Lewis <brad.lewis at delphix.com>
Reviewed by: Matt Ahrens <matt at delphix.com>
Reviewed by: Tom Caputi <tcaputi at datto.com>
Reviewed by: George Melikov <mail at gmelikov.ru>
Approved by: Dan McDonald <danmcd at joyent.com>
Ported-by: Brian Behlendorf <behlendorf1 at llnl.gov>

OpenZFS-issue: https://www.illumos.org/issues/9993
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/2258ad0b
Closes #8185
Delta   File
+2 -1   module/zfs/zio.c
+2 -1   1 files

ZFS on Linux/src e63ac16 lib/libzfs libzfs_mount.c

OpenZFS 9880 - Race in ZFS parallel mount

Porting Notes:
* Not required for Linux since the zone is always global.  But
  we'll want this change if we start using the zones code.

Authored by: Andy Fiddaman <omnios at citrus-it.co.uk>
Reviewed by: Jason King <jason.king at joyent.com>
Reviewed by: Sebastien Roy <sebastien.roy at delphix.com>
Reviewed by: Tom Caputi <tcaputi at datto.com>
Approved by: Joshua M. Clulow <josh at sysmgr.org>
Ported-by: Brian Behlendorf <behlendorf1 at llnl.gov>

OpenZFS-issue: https://www.illumos.org/issues/9880
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/bc4c0ff134
Closes #8189

ZFS on Linux/src 4b61176 lib/libzfs libzfs_util.c

Fix error message when zfs module is not loaded

This patch corrects a small issue where the wrong error message
was being displayed when the zfs kernel module was not loaded.
This also avoids waiting for the (by default) 10s timeout to see
if the /dev/zfs device appears.

Reviewed-by: George Melikov <mail at gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Tom Caputi <tcaputi at datto.com>
Closes #8187 

ZFS on Linux/src ef57371 scripts zfs-tests.sh zfs.sh

Do not enable stack tracer for ZFS performance test

The Linux ZFS test suite runs with /proc/sys/kernel/stack_tracer_enabled=1,
set via the zfs.sh script, which has a negative performance impact of up
to 40%.

Since large stack usage is a rare issue now, the preferred behavior is:
- make the stack tracer an opt-in feature for zfs.sh
- have zfs-tests.sh enable the stack tracer only when requested

Reviewed-by: George Melikov <mail at gmelikov.ru>
Reviewed-by: Richard Elling <Richard.Elling at RichardElling.com>
Reviewed-by: John Kennedy <john.kennedy at delphix.com>
Signed-off-by: Tony Nguyen <tony.nguyen at delphix.com>
#8173 

ZFS on Linux/src d649604 module/zfs dsl_scan.c

Ensure dsl scan prefetch queue is emptied

This patch simply ensures that scn->scn_prefetch_queue is emptied
before the kernel module is unloaded and when scanning completes.

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alek Pinchuk <apinchuk at datto.com>
Signed-off-by: Tom Caputi <tcaputi at datto.com>
Closes #8178 
Delta   File
+20 -0  module/zfs/dsl_scan.c
+20 -0  1 files

ZFS on Linux/src b53cb02 lib/libzfs libzfs_sendrecv.c

Fix 'zfs receive -F' message when destination has snapshots

When receiving a send stream with forced rollback on a dataset with
snapshots, zfs suggests that said snapshots must be removed to
successfully receive the stream; however, the message is misleading
because it prints the dataset name instead of one of its snapshots.

   $ sudo zfs snap pp/recvfs@snap-orig
   $ sudo zfs recv -F pp/recvfs < sendstream
   cannot receive new filesystem stream: destination has snapshots (eg. pp/recvfs)
   must destroy them to overwrite it

This change simply restores the snapshot name in the error message.

Reviewed-by: George Melikov <mail at gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu at gmail.com>
Closes #8167 

ZFS on Linux/src 2aa398f config kernel.m4

Use autoconf variable for C preprocessor

This fixes the build when cross-compiling, where the preprocessor might
be prefixed.

Reviewed-by: George Melikov <mail at gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Signed-off-by: Ben Wolsieffer <benwolsieffer at gmail.com>
Closes #8180 
Delta   File
+1 -1   config/kernel.m4
+1 -1   1 files

ZFS on Linux/src e3c85c0 cmd/zdb zdb.c

Move assert in dump_dir() in zdb

This one line patch moves an assert in the function dump_dir()
below an error check that ensures it ran correctly. This ensures
zdb dumps the error that actually caused the problem, as opposed
to one of its symptoms.
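
The ordering matters because an assertion on the result of a failed call
aborts before the real error is reported. A minimal sketch in C of the
"check the error, then assert" pattern follows; open_object() and
dump_object() are hypothetical stand-ins for illustration, not zdb
functions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical object handle; not a zdb structure. */
struct object {
    int valid;
};

/* Hypothetical open call: fails with ENOENT when given a NULL name. */
static int
open_object(const char *name, struct object *obj)
{
    if (name == NULL) {
        obj->valid = 0;
        return (ENOENT);
    }
    obj->valid = 1;
    return (0);
}

/*
 * Check the error first, then assert. If the assert came first, a failed
 * open would trip it and hide the ENOENT that explains the failure.
 */
static int
dump_object(const char *name)
{
    struct object obj;
    int error = open_object(name, &obj);

    if (error != 0)
        return (error);    /* report the root cause */

    assert(obj.valid);     /* only meaningful after a successful open */
    return (0);
}
```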

Reviewed-by: George Melikov <mail at gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Tom Caputi <tcaputi at datto.com>
Closes #8171 
Delta   File
+3 -3   cmd/zdb/zdb.c
+3 -3   1 files

ZFS on Linux/src 78e2139 module/zfs dnode.c

Fix dnode_hold() freeing dnode behavior

Commit 4c5b89f59 refactored dnode_hold() and in the process
accidentally introduced a slight change in behavior which was
not intended.  The required behavior is that once the ZPL,
or other consumer, declares its intent to free a dnode then
dnode_hold() should immediately start failing.  This updated
code wouldn't return the failure until after it was freed.

When DNODE_MUST_BE_ALLOCATED is set it must return ENOENT, and
when DNODE_MUST_BE_FREE is set it must return EEXIST.
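
The required return values can be sketched as a small decision function.
This is an illustration only: the enum names and hold_check() are
hypothetical and do not mirror the actual dnode_hold() implementation.

```c
#include <errno.h>

/* Hypothetical stand-ins for the dnode states and hold flags. */
enum slot_state { SLOT_FREE, SLOT_ALLOCATED, SLOT_FREEING };
enum hold_flag { MUST_BE_ALLOCATED, MUST_BE_FREE };

/*
 * Return 0 if the hold may proceed, or an errno value otherwise.
 * A slot that is being freed satisfies neither request: it is no longer
 * usable as an allocated object and is not yet available as a free one.
 */
static int
hold_check(enum slot_state state, enum hold_flag flag)
{
    if (flag == MUST_BE_ALLOCATED)
        return (state == SLOT_ALLOCATED ? 0 : ENOENT);
    return (state == SLOT_FREE ? 0 : EEXIST);
}
```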

This issue was uncovered by ztest_remap() which attempted
to remap a freeing object which should have been skipped as
described by the comment in dmu_objset_remap_indirects_impl().

Reviewed-by: George Melikov <mail at gmelikov.ru>
Reviewed-by: Tom Caputi <tcaputi at datto.com>
Reviewed-by: Olaf Faaland <faaland1 at llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Closes #8172 
Delta   File
+4 -2   module/zfs/dnode.c
+4 -2   1 files

ZFS on Linux/src c5eea0a cmd/zpool zpool_main.c, module/zcommon zprop_common.c

Fix 'zpool list -v' alignment

The verbose output of 'zpool list' was not correctly aligned due
to differences in the vdev name lengths.  Minimally update the
code to correct the alignment using the same strategy employed
by 'zpool status'.

Missing dashes were added for the empty defaults columns, and
the vdev state is now printed for all vdevs.

Reviewed-by: Tom Caputi <tcaputi at datto.com>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Closes #7308 
Closes #8147 

ZFS on Linux/src a0cc372 etc/init.d zfs-functions.in

zfs-functions.in: is_mounted() always returns 1

The 'while read line; ...; done' loop is run in a piped subshell,
therefore the 'return 0' would not cause a return from the
is_mounted() function.  As a result, this function always
returns 1.

The fix is to 'return 1' from the subshell on a successful match
(no match == return 0), and then negate the final return value.

Reviewed-by: George Melikov <mail at gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: TerraTech <TerraTech at users.noreply.github.com>
Closes #8151 

ZFS on Linux/src fedef6d module/zfs vdev_removal.c

Fix ztest deadlock in spa_vdev_remove()

This patch corrects an issue where spa_vdev_remove() would
call spa_history_log_internal() while holding the spa config
lock. This function may decide to block until the next txg if
the current one seems too full. However, since the thread is
holding the config lock, the txg sync thread cannot progress
and the system ends up deadlocked. This patch simply moves
all calls to spa_history_log_internal() outside of the config
lock.

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Tom Caputi <tcaputi at datto.com>
Closes #8162 

ZFS on Linux/src 0b606cb cmd/ztest ztest.c

Fix ztest deadlock in ztest_zil_remount()

This patch fixes a small race condition in ztest_zil_remount()
that could result in a deadlock. ztest_device_removal() calls
spa_vdev_remove() which may eventually call spa_reset_logs().
If ztest_zil_remount() attempts to call zil_close() while this
is happening, it may fail when it asserts !zilog_is_dirty(zilog).
This patch simply adds locking to correct the issue.

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Tom Caputi <tcaputi at datto.com>
Closes #8154 
Delta   File
+8 -0   cmd/ztest/ztest.c
+8 -0   1 files

ZFS on Linux/src bdbd547 cmd/zfs zfs_main.c, lib/libzfs libzfs_sendrecv.c

Fix ASSERT in zfs_receive_one()

This commit fixes the following ASSERT in zfs_receive_one() when
receiving a send stream from a root dataset with the "-e" option:

    $ sudo zfs snap source@snap
    $ sudo zfs send source@snap | sudo zfs recv -e destination/recv
    chopprefix > drrb->drr_toname
    ASSERT at libzfs_sendrecv.c:3804:zfs_receive_one()

Reviewed-by: Tom Caputi <tcaputi at datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed by: Paul Dagnelie <pcd at delphix.com>
Signed-off-by: loli10K <ezomori.nozomu at gmail.com>
Closes #8121 

ZFS on Linux/src 7c9a429 man/man5 zfs-module-parameters.5, man/man8 zpool.8

Detect IO errors during device removal

* Detect IO errors during device removal

Although device removal cannot verify the checksums of individual
blocks, it can reasonably detect hard IO errors from the leaf vdevs.
Failure to perform this error checking can result in device removal
completing successfully but moving no data, which will permanently
corrupt the pool.

Situation 1: faulted/degraded vdevs

In the configuration shown below, the removal of mirror-0 will
permanently corrupt the pool.  Device removal will preferentially
copy data from 'vdev1 -> vdev3' and from 'vdev2 -> vdev4', which
in this case will result in nothing being copied since one vdev
in each of those groups is unavailable.  However, device removal
will complete successfully since all IO errors are ignored.

  tank                DEGRADED     0     0     0
    mirror-0          DEGRADED     0     0     0
      /var/tmp/vdev1  FAULTED      0     0     0  external fault
      /var/tmp/vdev2  ONLINE       0     0     0
    mirror-1          DEGRADED     0     0     0
      /var/tmp/vdev3  ONLINE       0     0     0

    [26 lines not shown]

ZFS on Linux/src c40a112 cmd/ztest ztest.c, module/zfs vdev_removal.c spa_checkpoint.c

Fix consistency of ztest_device_removal_active

ztest currently uses the boolean flag ztest_device_removal_active
to protect some tests that may not run successfully if they occur
at the same time as ztest_device_removal(). Unfortunately, in the
event that ztest is in the middle of a device removal when it
decides to issue a SIGKILL, the device removal will be
automatically restarted (without setting the flag) when the pool
is re-imported on the next run. This patch corrects this by
ensuring that any in-progress removals are completed before running
further tests after the re-import.

This patch also makes a few small changes to prevent race conditions
involving the creation and destruction of spa->spa_vdev_removal,
since this field is not protected by any locks. Some checks that
may run concurrently with setting / unsetting this field have been
updated to check spa->spa_removing_phys.sr_state instead. The most
significant change here is that spa_removal_get_stats() no longer
accounts for in-flight work done, since that could result in a NULL
pointer dereference.

Reviewed by: Matthew Ahrens <mahrens at delphix.com>
Reviewed-by: Serapheim Dimitropoulos <serapheim at delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Tom Caputi <tcaputi at datto.com>
Closes #8105 

ZFS on Linux/src c71c8c7 module/zfs vdev_disk.c

zfs_dbgmsg() is not safe from every context

This commit reverts to using printk() instead of zfs_dbgmsg() to log
messages in vdev_disk_error(): this is necessary because the latter can
be called from interrupt context where we are not allowed to sleep.
Unfortunately zfs_dbgmsg() performs its allocations calling kmalloc()
with the KM_SLEEP flag which may result in the following oops:

   BUG: scheduling while atomic: swapper/4/0/0x10000100
        Call Trace:
        <IRQ>  [<0>] dump_stack+0x19/0x1b
        ...
        [<0>] spl_kmem_alloc+0xdf/0x140 [spl] <-- kmem_alloc(size, KM_SLEEP)
        [<0>] __dprintf+0x69/0x150 [zfs]
        [<0>] ? kmem_cache_free+0x1e2/0x200
        [<0>] vdev_disk_error.part.15+0x5f/0x70 [zfs]
        [<0>] vdev_disk_io_flush_completion+0x48/0x70 [zfs]
        [<0>] bio_endio+0x67/0xb0
        [<0>] blk_update_request+0x90/0x360
        ...
        [<0>] scsi_finish_command+0xdc/0x140
        [<0>] scsi_softirq_done+0x132/0x160
        [<0>] blk_done_softirq+0x96/0xc0
        [<0>] __do_softirq+0xf5/0x280
        [<0>] call_softirq+0x1c/0x30

    [16 lines not shown]

ZFS on Linux/src cef48f1 module/zfs dsl_scan.c, tests/zfs-tests/tests/functional/cli_root/zpool_import import_cachefile_device_replaced.ksh import_rewind_device_replaced.ksh

Remove races from scrub / resilver tests

Currently, several tests in the ZFS Test Suite that attempt to
test scrub and resilver behavior occasionally fail. A big reason
for this is that these tests use a combination of zinject and
zfs_scan_vdev_limit to attempt to slow these operations enough
to verify their test commands. This method works most of the time,
but provides no guarantees and leads to flaky behavior. This patch
adds a new tunable, zfs_scan_suspend_progress, that ensures that
scans make no progress, guaranteeing that tests can be run without
racing.

This patch also changes zfs_remove_max_bytes_pause to match this
new tunable. This provides some consistency between these two
similar tunables and ensures that the tunable will not misbehave
on 32-bit systems.

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Giuseppe Di Natale <guss80 at gmail.com>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Signed-off-by: Tom Caputi <tcaputi at datto.com>
Closes #8111

ZFS on Linux/src 00369f3 tests/zfs-tests/tests/functional/channel_program/lua_core tst.nvlist_to_lua.ksh tst.return_nvlist_pos.ksh, tests/zfs-tests/tests/functional/channel_program/synctask_core tst.destroy_snap.ksh tst.destroy_fs.ksh

ZTS: fix "not found" errors

This commit fixes several "not found" errors caused by calling undefined
or incorrect shell functions in the following ZFS Test Suite groups:

   * alloc_class
   * channel_program/lua_core
   * channel_program/synctask_core
   * cli_root/zpool_import
   * cli_user/misc

Reviewed-by: George Melikov <mail at gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Giuseppe Di Natale <guss80 at gmail.com>
Reviewed-by: bunder2015 <omfgbunder at gmail.com>
Signed-off-by: loli10K <ezomori.nozomu at gmail.com>
Closes #8152 

ZFS on Linux/src 62ee31a man/man5 zfs-module-parameters.5

Fix typo in update to zfs-module-parameters(5)

Reviewed-by: George Melikov <mail at gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Giuseppe Di Natale <guss80 at gmail.com>
Reviewed-by: bunder2015 <omfgbunder at gmail.com>
Signed-off-by: Rich Ercolani <rincebrain at gmail.com>
Closes #8153 

ZFS on Linux/src 8005ca4 lib/libspl strlcat.c strlcpy.c, lib/libspl/include string.h

Move strlcat, strlcpy, and strnlen

Move strlcat() and strlcpy() from .c source files into the libspl
string.h header.  By changing these compatibility functions to static
inline functions they can be included as needed without requiring
linking with the libspl.so library.

Remove strnlen() which is barely used in the source, and has been
provided by glibc since v2.10.

Finally, convert four instances of strncpy() to strlcpy() in
libzfs_input_check.c which were causing build warnings when compiling
with gcc 8.2.1.  For example:

  libzfs_input_check.c: In function ‘zfs_destroy’:
  libzfs_input_check.c:651:9: error: ‘strncpy’ specified bound \
      4096 equals destination size [-Werror=stringop-truncation]
    (void) strncpy(zc.zc_name, dataset, sizeof (zc.zc_name));
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
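
The warning arises because strncpy() with a bound equal to the destination
size leaves the buffer unterminated when the source is at least that long.
A minimal sketch of a header-style static inline strlcpy() follows. The
name sketch_strlcpy() is hypothetical, chosen to avoid clashing with any
libc-provided strlcpy(), and the body is an illustration of the usual
strlcpy() semantics rather than the libspl code.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy at most dstsize - 1 bytes from src and always NUL-terminate when
 * dstsize is non-zero. Returns strlen(src), so a return value that is
 * greater than or equal to dstsize tells the caller truncation occurred.
 */
static inline size_t
sketch_strlcpy(char *dst, const char *src, size_t dstsize)
{
    size_t srclen = strlen(src);

    if (dstsize != 0) {
        size_t n = (srclen < dstsize - 1) ? srclen : dstsize - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return (srclen);
}
```

Because the function is static inline, each translation unit that includes
the header gets its own copy and no extra library needs to be linked.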

Reviewed-by: Olaf Faaland <faaland1 at llnl.gov>
Reviewed-by: Richard Laager <rlaager at wiktel.com>
Signed-off-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Closes #8116 

ZFS on Linux/src 0cd5c94 cmd/zpool zpool_vdev.c, tests/runfiles linux.run

zpool: allow split with whole-disk devices

This change allows 'zpool split' to work with whole-disk devices and
updates the ZFS Test Suite with a new script to exercise this
functionality.

Reviewed by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu at gmail.com>
Closes #6643 
Closes #8133 

ZFS on Linux/src bd9c195 man/man8 zfs.8

man/zfs.8: document 'received' property source

Reviewed-by: George Melikov <mail at gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Giuseppe Di Natale <guss80 at gmail.com>
Reviewed-by: Richard Laager <rlaager at wiktel.com>
Signed-off-by: Christian Schwarz <me at cschwarz.com>
Closes #8134 
Delta   File
+3 -2   man/man8/zfs.8
+3 -2   1 files

ZFS on Linux/src 70621ff tests/zfs-tests/tests/functional/checksum filetest_001_pos.ksh

ZTS: Fix parsing of zpool status in checksum test

filetest_001_pos consumes the output using read -r, assigning each
field to a variable. The problem comes when a vdev is marked degraded,
which appends extra fields to the line. This causes the trailing text
to be treated as part of the `cksum` variable. Using awk instead of
read -r allows us to extract the checksum error count from the output
whether the vdev is degraded or not.

Reviewed-by: loli10K <ezomori.nozomu at gmail.com>
Reviewed-by: George Melikov <mail at gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: John Wren Kennedy <john.kennedy at delphix.com>
Closes #8136 

ZFS on Linux/src ebb8735 tests/zfs-tests/include commands.cfg

ZTS: "checksum" test group needs "lscpu"

This change adds "lscpu" to the list of commands used by the ZFS Test
Suite: this is required by the "checksum" test group to read the CPU
frequency which is used in EdonR, Skein and SHA2 performance tests.

Reviewed-by: George Melikov <mail at gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Richard Laager <rlaager at wiktel.com>
Signed-off-by: loli10K <ezomori.nozomu at gmail.com>
Closes #8139 

ZFS on Linux/src a10d50f cmd/zfs zfs_main.c, include libzfs_impl.h

OpenZFS 8115 - parallel zfs mount

Porting Notes:
* Use thread pools (tpool) API instead of introducing taskq interfaces
  to libzfs.
* Use pthread_mutex_t for locks as mutex_t isn't available.
* Ignore alternative libshare initialization since OpenZFS-7955 is
  not present on zfsonlinux.

Authored by: Sebastien Roy <seb at delphix.com>
Reviewed by: Matthew Ahrens <mahrens at delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov at delphix.com>
Reviewed by: Brad Lewis <brad.lewis at delphix.com>
Reviewed by: George Wilson <george.wilson at delphix.com>
Reviewed by: Paul Dagnelie <pcd at delphix.com>
Reviewed by: Prashanth Sreenivasa <pks at delphix.com>
Authored by: Brian Behlendorf <behlendorf1 at llnl.gov>
Approved by: Matt Ahrens <mahrens at delphix.com>
Ported-by: Don Brady <don.brady at delphix.com>

OpenZFS-issue: https://www.illumos.org/issues/8115
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/a3f0e2b569
Closes #8092

ZFS on Linux/src af2e841 . META

Tag 0.8.0-rc2

Signed-off-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Delta   File
+2 -2   META
+2 -2   1 files

ZFS on Linux/src eb1a0b6 contrib/dracut/90zfs parse-zfs.sh.in

Allow spaces in pool names for cmdline argument

PR #8114 quoted the ${ENCRYPTIONROOT} parameter to ensure we don't
lose spaces when unlocking root filesystem in the off chance that 
it has a space in its name.

Unfortunately, dracut and initramfs-tools do not actually get the 
quotes from the cmdline. If we use root=ZFS="root pool/filesystem 
name" the script still only sees root=ZFS=root and no quotation 
marks.

Because + is a reserved character in ZFS, it's used as a 
placeholder for spaces in the kernel cmdline.  In this way,
root=ZFS=root+pool/filesystem+name will properly expand by 
replacing the character with sed (POSIX compliant method).
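
The substitution itself is simple. The script performs it with sed; the
following C function is only an illustration of the same '+' to space
mapping, and plus_to_space() is a hypothetical name.

```c
/* Replace every '+' in a NUL-terminated string with a space, in place. */
static void
plus_to_space(char *s)
{
    for (; *s != '\0'; s++) {
        if (*s == '+')
            *s = ' ';
    }
}
```

This only round-trips safely because '+' cannot appear in a valid ZFS
name, so every '+' on the command line must have been a placeholder.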

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: bunder2015 <omfgbunder at gmail.com>
Signed-off-by: Kash Pande <kash at tripleback.net>
Issue #8114 
Closes #8117 

ZFS on Linux/src c8fd652 module/zfs vdev_label.c

Fix coverity defects: CID 184285

CID 184285: Read from pointer after free (USE_AFTER_FREE)

This patch fixes a use-after-free in vdev_config_generate_stats()
by moving the kmem_free() call to the end of the function.
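
The general shape of this kind of fix is sketched below in user-space C,
with malloc()/free() standing in for the kernel allocator. The struct and
sum_stats() are hypothetical and unrelated to the real
vdev_config_generate_stats() code.

```c
#include <stdlib.h>

/* Hypothetical stats record; not the structure used in vdev_label.c. */
struct stats {
    unsigned long reads;
    unsigned long writes;
};

static unsigned long
sum_stats(unsigned long reads, unsigned long writes)
{
    struct stats *st = malloc(sizeof (*st));
    unsigned long total;

    if (st == NULL)
        return (0);
    st->reads = reads;
    st->writes = writes;

    /* Every read of *st happens here, before the buffer is released. */
    total = st->reads + st->writes;

    free(st);    /* last statement that touches st */
    return (total);
}
```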

Reviewed-by: George Melikov <mail at gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Giuseppe Di Natale <guss80 at gmail.com>
Signed-off-by: loli10K <ezomori.nozomu at gmail.com>
Closes #8120 

ZFS on Linux/src ecd3728 config user-systemd.m4, rpm/generic zfs.spec.in zfs-kmod.spec.in

Fix systemd spec file macros

Ensure that the _unitdir, _presetdir, _modulesloaddir, and
_systemdgeneratordir macros are always defined.  If not, set
them to the expected default values.  Pass all of these options
to ./configure and package the resulting files in those locations.

Additionally, set __brp_mangle_shebangs_exclude_from until the
conversion to Python 3 is complete so they may be built cleanly
under mock.

Reviewed-by: Neal Gompa <ngompa at datto.com>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Closes #7567
Closes #8119

ZFS on Linux/src 0500bfd contrib/initramfs/scripts zfs.in

Make initramfs-tools script encryption aware

Changed decrypt_fs zfs command to "load-key"
Plymouth case code based on "contrib/dracut/90zfs/zfs-lib.sh.in"
Systemd case based on "contrib/dracut/90zfs/zfs-load-key.sh.in"
Cleaned up misspelling of "available" throughout

Code style fixes
Single quote for ${ENCRYPTIONROOT}
Changed "${DECRYPT_CMD}"  to "eval ${DECRYPT_CMD}"

Reviewed-by: Kash Pande <kash at tripleback.net>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Tom Caputi <tcaputi at datto.com>
Reviewed-by: Richard Laager <rlaager at wiktel.com>
Signed-off-by: Garrett Fields <ghfields at gmail.com>
Closes #8093 

ZFS on Linux/src d48091d cmd/zed/agents zfs_agents.c zfs_retire.c, tests/runfiles linux.run

zed: detect and offline physically removed devices

This commit adds a new test case to the ZFS Test Suite to verify ZED
can detect when a device is physically removed from a running system:
the device will be offlined if a spare is not available in the pool.

We implement this by using the existing libudev functionality and
without relying solely on the FM kernel module capabilities which have
been observed to be unreliable with some kernels.

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Don Brady <don.brady at delphix.com>
Signed-off-by: loli10K <ezomori.nozomu at gmail.com>
Closes #1537
Closes #7926

ZFS on Linux/src 13c59bb contrib/dracut/90zfs mount-zfs.sh.in

Add quotations for ${ENCRYPTIONROOT}

Add quotations for ${ENCRYPTIONROOT} to avoid breaking systems
with a space in the name.

Reviewed-by: bunder2015 <omfgbunder at gmail.com>
Reviewed-by: Tom Caputi <tcaputi at datto.com>
Reviewed-by: George Melikov <mail at gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Kash Pande <kash at tripleback.net>
Related-to: #8093 
Closes #8114 

ZFS on Linux/src ad796b8 cmd/zpool zpool_main.c, man/man5 zfs-module-parameters.5

Add zpool status -s (slow I/Os) and -p (parseable)

This patch adds a new slow I/Os (-s) column to zpool status to show the
number of VDEV slow I/Os. This is the number of I/Os that didn't
complete in zio_slow_io_ms milliseconds. It also adds a new parseable
(-p) flag to display exact values.

        NAME         STATE     READ WRITE CKSUM  SLOW
        testpool     ONLINE       0     0     0     -
          mirror-0   ONLINE       0     0     0     -
            loop0    ONLINE       0     0     0    20
            loop1    ONLINE       0     0     0     0
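
As a rough illustration of what the SLOW column counts, the sketch below
tallies the I/Os whose completion time exceeded a threshold. The function
and its array input are hypothetical; only the threshold parameter is
meant to model zio_slow_io_ms, and the real accounting in the kernel is
not shown in this message.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count the I/Os that took longer than slow_io_ms milliseconds.
 * latency_ms holds one completion time per I/O.
 */
static uint64_t
count_slow_ios(const uint64_t *latency_ms, size_t n, uint64_t slow_io_ms)
{
    uint64_t slow = 0;

    for (size_t i = 0; i < n; i++) {
        if (latency_ms[i] > slow_io_ms)
            slow++;
    }
    return (slow);
}
```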

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed by: Matthew Ahrens <mahrens at delphix.com>
Signed-off-by: Tony Hutter <hutter2 at llnl.gov>
Closes #7756
Closes #6885

ZFS on Linux/src 877d925 module/zfs zfs_ctldir.c, tests/zfs-tests/tests/functional/delegate delegate_common.kshlib setup.ksh

Update zfs_admin_snapshot value (disabled)

It's disabled by default, update code and tests to reflect
the documentation.

Minor cleanup in delegate_common.kshlib.

Reviewed-by: Gregor Kopka <gregor at kopka.net>
Reviewed-by: John Kennedy <john.kennedy at delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: George Melikov <mail at gmelikov.ru>
Closes #7835 
Closes #8045 

ZFS on Linux/src 16d2981 . META, rpm/generic zfs.spec.in zfs-kmod.spec.in

Tag zfs-0.7.12

META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2 at llnl.gov>

ZFS on Linux/src 55f39a0 module/zfs arc.c

Fix arc_release() refcount

Update arc_release to use arc_buf_size().  This hunk was accidentally
dropped when porting compressed send/recv, 2aa34383b.

Reviewed-by: Matthew Ahrens <mahrens at delphix.com>
Signed-off-by: Tom Caputi <tcaputi at datto.com>
Signed-off-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Closes #8000
Delta   File
+1 -1   module/zfs/arc.c
+1 -1   1 files

ZFS on Linux/src b32f127 include/sys dnode.h dmu_impl.h, module/zfs dnode.c dmu_objset.c

Fix race in dnode_check_slots_free()

Currently, dnode_check_slots_free() works by checking dn->dn_type
in the dnode to determine if the dnode is reclaimable. However,
there is a small window of time between dnode_free_sync() in the
first call to dsl_dataset_sync() and when the useraccounting code
is run when the type is set to DMU_OT_NONE, but the dnode is not yet
evictable, leading to crashes. This patch adds the ability for
dnodes to track which txg they were last dirtied in and adds a
check for this before performing the reclaim.
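
One way to picture the added check is sketched below. The struct, the
constant, and in particular the exact txg comparison are assumptions made
for illustration; the message only states that the last dirtied txg is
tracked and consulted before reclaiming.

```c
#include <stdbool.h>
#include <stdint.h>

#define TYPE_NONE 0    /* hypothetical stand-in for DMU_OT_NONE */

/* Hypothetical, simplified view of a dnode slot. */
struct slot {
    int type;              /* TYPE_NONE once the object has been freed */
    uint64_t dirty_txg;    /* txg in which the slot was last dirtied */
};

/*
 * A slot is treated as reclaimable only if it has no type AND it was last
 * dirtied in a txg older than the one currently syncing. Checking the type
 * alone would allow reclaim inside the window described above.
 */
static bool
slot_reclaimable(const struct slot *s, uint64_t syncing_txg)
{
    if (s->type != TYPE_NONE)
        return (false);
    return (s->dirty_txg < syncing_txg);
}
```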

This patch also corrects several instances when dn_dirty_link was
treated as a list_node_t when it is technically a multilist_node_t.

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Tom Caputi <tcaputi at datto.com>
Closes #7147
Closes #7388

ZFS on Linux/src b884768 include/sys refcount.h, module/zfs arc.c dbuf.c

Prefix all refcount functions with zfs_

Recent changes in the Linux kernel made it necessary to prefix
the refcount_add() function with zfs_ due to a name collision.

To bring the other functions in line with that and to avoid future
collisions, prefix the other refcount functions as well.

Reviewed by: Matthew Ahrens <mahrens at delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Tim Schumacher <timschumi at gmx.de>
Closes #7963

ZFS on Linux/src f42f870 rpm/generic zfs-kmod.spec.in zfs.spec.in

Add BuildRequires gcc, make, elfutils-libelf-devel

This adds a BuildRequires for gcc, make, and elfutils-libelf-devel
into our spec files.  gcc has been a packaging requirement for
a while now:

https://fedoraproject.org/wiki/Packaging:C_and_C%2B%2B

These additional BuildRequires allow us to mock build in
Fedora 29.

Reviewed-by: Neal Gompa <ngompa at datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by:  Tony Hutter <hutter2 at llnl.gov>
Closes #8095
Closes #8102

ZFS on Linux/src 9014da2 cmd/zdb zdb.c, tests/runfiles linux.run

Skip import activity test in more zdb code paths

Since zdb opens the pools read-only, it cannot damage the pool in the
event the pool is already imported either on the same host or on
another one.

If the pool vdev structure is changing while zdb is importing the
pool, it may cause zdb to crash.  However this is unlikely, and in any
case it's a user space process and can simply be run again.

For this reason, zdb should disable the multihost activity test on
import that is normally run.

This commit fixes a few zdb code paths where that had been overlooked.
It also adds tests to ensure that several common use cases handle this
properly in the future.

Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Gu Zheng <guzheng2331314 at 163.com>
Signed-off-by: Olaf Faaland <faaland1 at llnl.gov>
Closes #7797
Closes #7801

ZFS on Linux/src 262275a contrib/initramfs/scripts zfs

Allow use of pool GUID as root pool

This is helpful when there are pools with the same name
but only one of them should be used.

The main use case is twin servers, where some software
requires the pools to have the same name (e.g. Proxmox).

Reviewed-by: Kash Pande <kash at tripleback.net>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: George Melikov <mail at gmelikov.ru>
Signed-off-by: Igor ‘guardian’ Lidin of Moscow, Russia
Closes #8052

ZFS on Linux/src b2f003c config kernel-in-compat-syscall.m4 kernel.m4, include/linux vfs_compat.h

Fix statfs(2) for 32-bit user space

When handling a 32-bit statfs() system call, the returned fields,
although 64-bit in the kernel, must be limited to 32 bits or an
EOVERFLOW error will be returned.

This is less of an issue for block counts since the default
reported block size is 128KiB. But since it is possible to
set a smaller block size, these values will be scaled as
needed to fit in a 32-bit unsigned long.

Unlike most other filesystems the total possible file counts
are more likely to overflow because they are calculated based
on the available free space in the pool. In order to prevent
this the reported value must be capped at 2^32-1. This is
only for statfs(2) reporting, there are no changes to the
internal ZFS limits.
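
The two adjustments can be sketched as follows. Both helpers are
hypothetical; in particular, scaling by doubling the reported block size
while halving the block count is an assumption about how the values might
be made to fit, not a description of the actual patch.

```c
#include <stdint.h>

/* Cap a 64-bit count at 2^32-1, as described for the file counts. */
static uint64_t
cap_u32(uint64_t value)
{
    return (value > UINT32_MAX ? UINT32_MAX : value);
}

/*
 * Assumed scaling scheme: halve the block count and double the block
 * size until the count fits in 32 bits. Total capacity (blocks * bsize)
 * is preserved up to rounding.
 */
static void
scale_blocks(uint64_t *blocks, uint64_t *bsize)
{
    while (*blocks > UINT32_MAX) {
        *blocks >>= 1;
        *bsize <<= 1;
    }
}
```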

Reviewed-by: Andreas Dilger <andreas.dilger at whamcloud.com>
Reviewed-by: Richard Yao <ryao at gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Issue #7927
Closes #7122
Closes #7937

ZFS on Linux/src f8f4e13 include/sys refcount.h arc_impl.h, module/zfs refcount.c arc.c

Linux 4.19-rc3+ compat: Remove refcount_t compat

torvalds/linux@59b57717f ("blkcg: delay blkg destruction until
after writeback has finished") added a refcount_t to the blkcg
structure. Due to the refcount_t compatibility code, zfs_refcount_t
was used by mistake.

Resolve this by removing the compatibility code and replacing the
occurrences of refcount_t with zfs_refcount_t.

Reviewed-by: Franz Pletz <fpletz at fnordicwalking.de>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Tim Schumacher <timschumi at gmx.de>
Closes #7885
Closes #7932

ZFS on Linux/src 5f07d51 cmd/zpool zpool_main.c

Zpool iostat: remove latency/queue scaling

Bandwidth and iops are averages per second, while *_wait are averages
per request for latency or, for queue depths, an instantaneous
measurement at the end of an interval (according to man zpool).

When calculating the first two it makes sense to do
x/interval_duration (x being the increase in total bytes or number of
requests over the duration of the interval, interval_duration in
seconds) to 'scale' from amount/interval_duration to amount/second.

But applying the same math for the latter (*_wait latencies/queue) is
wrong as there is no interval_duration component in the values (these
are time/requests to get to average_time/request or already an
absolute number).

Because of this bug, the continuous *_wait figures for both latencies
and queue depths from 'zpool iostat -l/q' are only correct with
duration=1, as the wrong math then cancels itself out (x/1 is a no-op).

This removes temporal scaling from latency and queue depth figures.
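
The difference between the two kinds of figures can be sketched as
below. The function names and parameters are hypothetical and are not
taken from zpool_main.c; the sketch only shows why a per-second rate
needs the interval term while a per-request average does not.

```c
#include <stdint.h>

/*
 * Per-second rate (bandwidth, iops): the increase in a running total
 * over the interval, divided by the interval length in seconds.
 */
double
rate_per_second(uint64_t old_total, uint64_t new_total, double interval_sec)
{
	return ((double)(new_total - old_total) / interval_sec);
}

/*
 * Average wait per request: time accumulated during the interval
 * divided by the requests completed in it. The interval length does
 * not appear, so no further scaling must be applied to the result.
 */
double
avg_wait_per_request(uint64_t old_time_ns, uint64_t new_time_ns,
    uint64_t old_reqs, uint64_t new_reqs)
{
	uint64_t reqs = new_reqs - old_reqs;

	if (reqs == 0)
		return (0.0);
	return ((double)(new_time_ns - old_time_ns) / (double)reqs);
}
```

With a 5 second interval, 10 requests taking 5000 ns in total average
500 ns each. Dividing that result by the interval again would report
100 ns, which is the kind of error this change removes.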

Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Gregor Kopka <gregor at kopka.net>

    [2 lines not shown]

ZFS on Linux/src 9e58d5e cmd/arcstat arcstat.py

Fix flake8 "invalid escape sequence 'x'" warning

From, https://lintlyci.github.io/Flake8Rules/rules/W605.html

As of Python 3.6, a backslash-character pair that is not a valid
escape sequence now generates a DeprecationWarning. Although this
will eventually become a SyntaxError, that will not be for several
Python releases.

Note 'float_pobj' was simply removed from arcstat.py since it
was entirely unused.

Reviewed-by: John Kennedy <john.kennedy at delphix.com>
Reviewed-by: Richard Elling <Richard.Elling at RichardElling.com>
Signed-off-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Closes #8056

ZFS on Linux/src 320f9de tests/zfs-tests/tests/functional/tmpfile setup.ksh tmpfile_test.c

ZTS: Update O_TMPFILE support check

In CentOS 7.5 the kernel provided a compatibility wrapper to support
O_TMPFILE.  This results in the test setup script correctly detecting
kernel support.  But the ZFS module was built without O_TMPFILE
support due to the non-standard CentOS kernel interface.

Handle this case by updating the setup check to fail when either
the kernel or the ZFS module fails to provide support.  The reason
will be clearly logged in the test results.

Reviewed-by: Chunwei Chen <tuxoko at gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Closes #7528

ZFS on Linux/src 45579c9 include/sys zio.h, module/zfs zio.c

Reduce taskq and context-switch cost of zio pipe

When doing a read from disk, ZFS creates 3 ZIO's: a zio_null(), the
logical zio_read(), and then a physical zio. Currently, each of these
results in a separate taskq_dispatch(zio_execute).

On high-read-iops workloads, this causes a significant performance
impact. By processing all 3 ZIO's in a single taskq entry, we reduce the
overhead on taskq locking and context switching.  We accomplish this by
allowing zio_done() to return a "next zio to execute" to zio_execute().
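
The "return the next item to execute" idea can be sketched in
isolation as below. The job structure and both functions are
hypothetical simplifications, not the zio code itself, which has many
pipeline stages and its own locking.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a zio. */
struct job {
	struct job *parent;	/* job waiting on this one, or NULL */
	int pending_children;	/* children that have not finished yet */
	int executed;		/* set once the job has run */
};

/*
 * Finish a job. If that makes its parent runnable, hand the parent
 * back to the caller instead of dispatching it to a task queue.
 */
struct job *
job_done(struct job *j)
{
	struct job *p = j->parent;

	if (p != NULL && --p->pending_children == 0)
		return (p);
	return (NULL);
}

/*
 * Run a job and every job it unblocks in the current context.
 * Returns the number of jobs that were run by this call.
 */
int
job_execute(struct job *j)
{
	int ran = 0;

	while (j != NULL) {
		j->executed = 1;
		ran++;
		j = job_done(j);
	}
	return (ran);
}
```

For a chain of three jobs (root, logical, physical), a single
job_execute() call on the physical job runs all three in one context.
A parent that still has other children pending is left alone until
its last child finishes.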

This results in a ~12% performance increase for random reads, from
96,000 iops to 108,000 iops (with recordsize=8k, on SSD's).

Reviewed by: Pavel Zakharov <pavel.zakharov at delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed by: George Wilson <george.wilson at delphix.com>
Signed-off-by: Matthew Ahrens <mahrens at delphix.com>
External-issue: DLPX-59292
Closes #7736
Delta      File
+137 -115  module/zfs/zio.c
+2 -2      include/sys/zio.h
+139 -117  2 files