15977 smbadm option to show group member SIDs
Reviewed by: Jerry Jelinek <gjelinek at racktopsystems.com>
Reviewed by: Matt Barden <mbarden at racktopsystems.com>
Approved by: Robert Mustacchi <rm at fingolfin.org>
15976 List users connected via SMB
Reviewed by: Gordon Ross <gwr at racktopsystems.com>
Reviewed by: Matt Barden <mbarden at racktopsystems.com>
Approved by: Robert Mustacchi <rm at fingolfin.org>
15975 Let smbsrv driver allow multiple opens
Reviewed by: Jason King <jking at racktopsystems.com>
Reviewed by: Toomas Soome <tsoome at me.com>
Approved by: Robert Mustacchi <rm at fingolfin.org>
[X86] LowerFunnelShift - improve handling of vXi8 constant splat funnel shifts
This patch moves the promotion to vXi16 shifts and the upper/lower bit masking into LowerFunnelShift for targets that have a bit-select instruction (XOP's VPCMOV and AVX512's VPTERNLOG).
This prevents the regressions in #89115 due to the masking of ((X << V) | (Y >> (8-V))) vXi8 shifts.
[mlir][test] Shard and reorganize the test dialect
Shard the test dialect by 4. This patch also reorganizes the manually-written
op hooks into `TestOpDefs.cpp` and format custom directive parser and printers
into `TestFormatUtils`, adds missing comment blocks, and moves around where
generated source files are included for types, attributes, enums, etc.
In my case, the compilation time of the test dialect drops from >60s to ~10s.
[flang] Remove obsolete flang-to-external-fc tool (#88904)
It seems like the `flang-to-external-fc` tool is no longer needed,
because Flang is now a full compiler in its own right. After PR #85249
has landed, this tool will not be able to pick up the `.f18.mod` files.
[mlir][tensor] Fold `tensor.reshape` for dynamic reshape (#88961)
If `tensor.reshape` occurs with `d0, d1, d2, ...` for the dimensions we
know that the reshape is a no-op. Checking for this case lets us fold
away the computation.
[libc++] Fix usage of 'exclude_from_explicit_instantiation' attribute on local class members (#89377)
#88777 adds a warning for when the `exclude_from_explicit_instantiation`
attribute is applied to local classes and members thereof. This patch
addresses the few instances of `exclude_from_explicit_instantiation`
being applied to local class members in libc++.
bdev_discard_supported: understand discard_granularity=0
Kernel documentation for the discard_granularity property says:
A discard_granularity of 0 means that the device does not support
discard functionality.
Some older kernels had drivers (notably loop, but also some USB-SATA
adapters) that would set the QUEUE_FLAG_DISCARD capability flag, but
have discard_granularity=0. Since 5.10 (torvalds/linux at b35fd7422c2f) the
discard entry point blkdev_issue_discard() has had a check for this,
which would immediately reject the call with EOPNOTSUPP, and throw a
scary diagnostic message into the log. See #16068.
Since 6.8, the block layer sets a non-zero default for
discard_granularity (torvalds/linux at 3c407dc723bb), and a future kernel
will remove the check entirely[1].
As such, there's no good reason for us to enable discard when
[10 lines not shown]
Add zfetch stats in arcstats
arc_summary also reports zfetch stats but it's inconvenient to monitor
contiguously incrementing numbers. Adding them in arcstats allows us to
observe streams more conveniently.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Signed-off-by: Ameer Hamza <ahamza at ixsystems.com>
Closes #16094
Fix: FreeBSD Arm64 does not build currently
The define LD_VERSION isn't defined on FreeBSD Arm64 when OpenZFS is
build with the default compiler: clang.
I used only gcc for testing - my fault.
Fast fix as suggested by @mmatuska
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Martin Matuska <mm at FreeBSD.org>
Signed-off-by: Tino Reichardt <milky-zfs at mcmilk.de>
Closes #16103
L2ARC: Relax locking during write
Previous code held ARC state sublist lock throughout all L2ARC
write process, which included number of allocations and even ZIO
issues. Being blocked in any of those places the code could also
block ARC eviction, that could cause OOM activation or even dead-
lock if system is low on memory or one is too fragmented.
Fix it by dropping the lock as soon as we see a block eligible
for L2ARC writing and pick it up later using earlier inserted
marker. While there, also reduce scope of hash lock, moving
ZIO allocation and other operations not requiring header access
out of it. All operations requiring header access move under
hash lock, since L2_WRITING flag does not prevent header eviction
only transition to arc_l2c_only state with L1 header.
To be able to manipulate sublist lock and marker as needed add few
more multilist functions and modify one.
[3 lines not shown]
BRT: Skip getting length in brt_entry_lookup()
Unlike DDT, where ZAP values may have different lengths due to
compression, all BRT entries are identical 8-byte counters. It
does not make sense to first fetch the length only to assert it.
zap_lookup_uint64() is specifically designed to work with counters
of different size and should return error if something odd found.
Calling it straight allows to save some measurable CPU time.
Reviewed-by: Pawel Jakub Dawidek <pawel at dawidek.net>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Rob Norris <robn at despairlabs.com>
Signed-off-by: Alexander Motin <mav at FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15950
BRT: Make BRT block sizes configurable
Similar to DDT make BRT data and indirect block sizes configurable
via module parameters. I am not sure what would be the best yet,
but similar to DDT 4KB blocks kill all chances of compression on
vdev with ashift=12 or more, that on my tests reaches 3x.
While here, fix documentation for respective DDT parameters.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Alexander Motin <mav at FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15967
ZAP: Massively switch to _by_dnode() interfaces
Before this change ZAP called dnode_hold() for almost every block
access, that was clearly visible in profiler under heavy load, such
as BRT. This patch makes it always hold the dnode reference between
zap_lockdir() and zap_unlockdir(). It allows to avoid most of dnode
operations between those. It also adds several new _by_dnode() APIs
to ZAP and uses them in BRT code. Also adds dmu_prefetch_by_dnode()
variant and uses it in the ZAP code.
After this there remains only one call to dmu_buf_dnode_enter(),
which seems to be unneeded. So remove the call and the functions.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Alexander Motin <mav at FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15951
BRT: Skip duplicate BRT prefetches
If there is a pending entry for this block, then we've already
issued BRT prefetch for it within this TXG, so don't do it again.
BRT vdev lookup and following zap_prefetch_uint64() call can be
pretty expensive and should be avoided when not necessary.
Reviewed-by: Pawel Jakub Dawidek <pawel at dawidek.net>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Alexander Motin <mav at FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15941
Update resume token at object receive.
Before this change resume token was updated only on data receive.
Usually it is enough to resume replication without much overlap.
But we've got a report of a curios case, where replication source
was traversed with recursive grep, which through enabled atime
modified every object without modifying any data. It produced
several gigabytes of replication traffic without a single data
write and so without a single resume point.
While the resume token was not designed to resume from an object,
I've found that the send implementation always sends object before
any data. So by requesting resume from offset 0 we are effectively
resuming from the object, followed (or not) by the data at offset
0, just as we need it.
Reviewed-by: Allan Jude <allan at klarasystems.com>
Reviewed-by: Paul Dagnelie <pcd at delphix.com>
Signed-off-by: Alexander Motin <mav at FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15927
Linux: Cleanup taskq threads spawn/exit
This changes taskq_thread_should_stop() to limit maximum exit rate
for idle threads to one per 5 seconds. I believe the previous one
was broken, not allowing any thread exits for tasks arriving more
than one at a time and so completing while others are running.
Also while there:
- Remove taskq_thread_spawn() calls on task allocation errors.
- Remove extra taskq_thread_should_stop() call.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain at gmail.com>
Signed-off-by: Alexander Motin <mav at FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15873
ZIL: Update Linux tracing after #15635
While picking parts from #14909 I've missed Linux tracing specific
ones, that went unnoticed in default configurations, but breaks the
build in some.
Reviewed-by: Ameer Hamza <ahamza at ixsystems.com>
Reviewed-by: Brian Atkinson <batkinson at lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Alexander Motin <mav at FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15730
ZIO: Optimize zio_flush()
- Generalize vdev_nowritecache handling by traversing through the
VDEV tree and skipping children ZIOs where not supported.
- Remove intermediate zio_null() in case of several VDEV children.
- Remove children handling from zio_ioctl(). There are no other
use cases for this code beside DKIOCFLUSHWRITECACHED, and would there
be, I doubt they would so straightforward apply to all VDEV children.
Comparing to removed previous optimization this should improve cases
of redundant ZILs/SLOGs.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: George Wilson <george.wilson at delphix.com>
Signed-off-by: Alexander Motin <mav at FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15515
ZIL: Detect single-threaded workloads
... by checking that previous block is fully written and flushed.
It allows to skip commit delays since we can give up on aggregation
in that case. This removes zil_min_commit_timeout parameter, since
for single-threaded workloads it is not needed at all, while on very
fast devices even some multi-threaded workloads may get detected as
single-threaded and still bypass the wait. To give multi-threaded
workloads more aggregation chances increase zfs_commit_timeout_pct
from 5 to 10%, as they should suffer less from additional latency.
Also single-threaded workloads detection allows in perspective better
prediction of the next block size.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Prakash Surya <prakash.surya at delphix.com>
Signed-off-by: Alexander Motin <mav at FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15381
Small fix to prefetch ranges aggregation
When after #16022 adding new range we aggregate more than two
existing ranges, that should be very rare, only if several streams
overlap, we may need to zero not the last range, but some earlier.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Alexander Motin <mav at FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #16072