DDT: Reduce global DDT lock scope during writes
Before this change the DDT lock was taken 4 times per written block,
and since it is effectively a pool-wide lock it can be highly congested.
This change introduces a new per-entry dde_io_lock, protecting some
fields during I/O ready and done stages, so that we don't need the
global lock there.
According to my write tests on a 64-thread system with 4KB blocks, this
significantly reduces the global lock contention, reducing CPU usage
from 100% to the expected ~80% and increasing write throughput by 10%.
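
The locking split described above can be sketched in userland C roughly
as follows. This is only an illustration under stated assumptions: the
types, fields and functions (entry_t, table_t, e_io_lock, table_lookup()
and so on) are hypothetical stand-ins built on pthread mutexes, not the
real DDT structures or the actual patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for a dedup table entry; not the real ddt_entry_t. */
typedef struct entry {
	pthread_mutex_t	e_io_lock;	/* per-entry lock, in the spirit of dde_io_lock */
	uint64_t	e_ready;	/* writes that passed the "ready" stage */
	uint64_t	e_done;		/* writes that passed the "done" stage */
} entry_t;

/* Hypothetical stand-in for the table and its pool-wide lock. */
typedef struct table {
	pthread_mutex_t	t_lock;
	entry_t		t_entry;	/* a single entry keeps the sketch short */
} table_t;

static void
table_init(table_t *t)
{
	pthread_mutex_init(&t->t_lock, NULL);
	pthread_mutex_init(&t->t_entry.e_io_lock, NULL);
	t->t_entry.e_ready = 0;
	t->t_entry.e_done = 0;
}

/* Lookup still takes the global lock, but only for the lookup itself. */
static entry_t *
table_lookup(table_t *t)
{
	pthread_mutex_lock(&t->t_lock);
	entry_t *e = &t->t_entry;
	pthread_mutex_unlock(&t->t_lock);
	return (e);
}

/* "Ready" stage: touches per-entry state only, so no global lock. */
static void
entry_io_ready(entry_t *e)
{
	pthread_mutex_lock(&e->e_io_lock);
	e->e_ready++;
	pthread_mutex_unlock(&e->e_io_lock);
}

/* "Done" stage: likewise serialized only against the same entry. */
static void
entry_io_done(entry_t *e)
{
	pthread_mutex_lock(&e->e_io_lock);
	assert(e->e_done < e->e_ready);
	e->e_done++;
	pthread_mutex_unlock(&e->e_io_lock);
}
```

With this shape, writers working on different entries contend with each
other only during the lookup, not during the ready and done stages.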
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Rob Norris <robn at despairlabs.com>
Signed-off-by: Alexander Motin <alexander.motin at TrueNAS.com>
Closes #17960
[TySan] Attempt to unbreak build after #169036
If tysan was not in COMPILER_RT_SANITIZERS_TO_BUILD, we used to
get an error after #169036, see comments there for details.
Reapply "[clangd] Make lit tests work with the internal shell" (#169972)
This reverts commit bd04ef6df50e8e6e5212762fc798ea9fbdcfc897.
This reapply fixes the broken case where we would fail at CMake
configuration time if LLVM_INCLUDE_BENCHMARKS was explicitly turned off.
DDT: Switch to using wmsums for lookup stats
ddt_lookup() is a very hot code path running under a highly congested
global lock. Anything we can save here is very important.
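
As a rough illustration of why a write-mostly sum is cheap on the update
side, here is a userland sketch with C11 atomics. It is not the wmsum
implementation: wm_counter_t, wm_init(), wm_add(), wm_value() and NSLOTS
are hypothetical names, and the fixed slot array only stands in for
per-CPU storage.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define	NSLOTS	8	/* hypothetical; stands in for per-CPU slots */

/* Each slot gets its own cache line so writers do not false-share. */
typedef struct slot {
	_Alignas(64) atomic_uint_fast64_t s_val;
} slot_t;

typedef struct wm_counter {
	slot_t wc_slots[NSLOTS];
} wm_counter_t;

static void
wm_init(wm_counter_t *c)
{
	for (unsigned i = 0; i < NSLOTS; i++)
		atomic_init(&c->wc_slots[i].s_val, 0);
}

/* Hot path: one relaxed add on the caller's own slot, no shared lock. */
static void
wm_add(wm_counter_t *c, unsigned slot, uint64_t delta)
{
	assert(slot < NSLOTS);
	atomic_fetch_add_explicit(&c->wc_slots[slot].s_val, delta,
	    memory_order_relaxed);
}

/* Cold path: readers pay the cost of summing every slot. */
static uint64_t
wm_value(wm_counter_t *c)
{
	uint64_t sum = 0;

	for (unsigned i = 0; i < NSLOTS; i++)
		sum += atomic_load_explicit(&c->wc_slots[i].s_val,
		    memory_order_relaxed);
	return (sum);
}
```

The trade-off is that updates are cheap and uncontended while reads are
slower, which suits statistics that are bumped constantly and read rarely.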
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Rob Norris <robn at despairlabs.com>
Signed-off-by: Alexander Motin <alexander.motin at TrueNAS.com>
Closes #17980
[HLSL] Update indexed vector elements individually (#169144)
When an individual element of a vector is updated via indexing into the vector, it needs to be handled as a store operation on that one vector element.
Clang treats vectors as one unit, so when a vector element needs to be updated, the whole vector is loaded, the element is modified, and then the whole vector is stored. In HLSL vector elements are handled individually. We need to avoid this load/modify/store sequence to prevent overwriting other vector elements that might be getting updated in parallel.
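
The hazard can be replayed by hand in plain C. This is only a sketch of the lost update, not of the Clang codegen change: float4, store_element() and the two replay functions are hypothetical, and the "parallel" writers are interleaved manually in a single thread so the outcome is deterministic.

```c
#include <assert.h>

/* Hypothetical 4-element vector; stands in for an HLSL float4. */
typedef struct float4 {
	float v[4];
} float4;

/* Element-wise update: a store to exactly one element. */
static void
store_element(float4 *dst, int idx, float val)
{
	assert(idx >= 0 && idx < 4);
	dst->v[idx] = val;
}

/*
 * Writer A updates element 0 with load/modify/store of the whole vector,
 * while writer B updates element 1 between A's load and A's store.
 * Returns the final value of element 1.
 */
static float
replay_whole_vector_update(void)
{
	float4 shared = {{0.0f, 0.0f, 0.0f, 0.0f}};
	float4 tmp = shared;			/* A: load whole vector */

	store_element(&shared, 1, 2.0f);	/* B: update element 1 */
	tmp.v[0] = 1.0f;			/* A: modify element 0 */
	shared = tmp;				/* A: store whole vector */
	return (shared.v[1]);			/* B's update is gone */
}

/* Same two updates, each done as a single-element store. */
static float
replay_element_update(void)
{
	float4 shared = {{0.0f, 0.0f, 0.0f, 0.0f}};

	store_element(&shared, 1, 2.0f);	/* B: update element 1 */
	store_element(&shared, 0, 1.0f);	/* A: update element 0 */
	return (shared.v[1]);			/* B's update survives */
}
```

In the first replay the whole-vector store writes back a stale copy of element 1; in the second, neither writer touches the other's element.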
Fixes #167729
Contributes to #160208.
[lldb] Fix a bug when disabling the statusline. (#169127)
Currently, disabling the statusline with `settings set show-statusline
false` leaves LLDB in a broken state. The same is true when trying to
toggle the setting again.
The issue was that setting the scroll window to 0 is apparently not
identical to setting it to the correct number of rows, even though some
documentation online incorrectly claims so.
Fixes #166608
DDT: Make children writes inherit allocator
Even though, unlike for gang children, it is not so critical for dedup
children to inherit the parent's allocator, there is still no reason
for them to have an allocation policy different from normal writes.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Rob Norris <robn at despairlabs.com>
Signed-off-by: Alexander Motin <alexander.motin at TrueNAS.com>
Closes #17961
[LLDB][NativePDB] Look for PDBs in `target.debug-file-search-paths` (#169719)
Similar to DWARF's DWO, we should look for PDBs in
`target.debug-file-search-paths` if the PDB isn't at the original
location or next to the executable.
With this PR, the search order is as follows:
1. PDB path specified in the PE/COFF file
2. Next to the executable
3. In `target.debug-file-search-paths`
This roughly matches [the order Visual Studio
uses](https://learn.microsoft.com/en-us/visualstudio/debugger/specify-symbol-dot-pdb-and-source-files-in-the-visual-studio-debugger?view=vs-2022#where-the-debugger-looks-for-symbols),
except that we don't have a project folder and don't support symbol
servers.
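
The search order above amounts to a "first existing candidate wins"
loop. A minimal C sketch follows; find_first(), exists_fn and
demo_exists() are hypothetical and unrelated to LLDB's real
implementation, and the predicate is a fixed stub so the example does
not touch the file system.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical predicate type: returns nonzero if the path exists. */
typedef int (*exists_fn)(const char *path);

/*
 * Return the first candidate accepted by the predicate, or NULL.
 * The caller lists candidates in priority order, for example:
 * path from the PE/COFF file, next to the executable, then the
 * entries from the search paths.
 */
static const char *
find_first(const char *const *candidates, size_t n, exists_fn exists)
{
	assert(exists != NULL);
	for (size_t i = 0; i < n; i++) {
		if (candidates[i] != NULL && exists(candidates[i]))
			return (candidates[i]);
	}
	return (NULL);
}

/* Demo stub: pretends only one made-up path exists. */
static int
demo_exists(const char *path)
{
	return (strcmp(path, "/symbols/app.pdb") == 0);
}
```

A missing candidate (NULL) is skipped, so a caller that has no path for
one of the steps can still pass a fixed-size list.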
Closes #125355 (though I think this is already fixed in the native
plugin).
CI: zfs-test-packages: Add in new repos
Test install from our new repos: zfs-latest, zfs-legacy,
zfs-2.3, zfs-2.2, from the zfs-test-packages workflow.
This on-demand workflow is used to verify that the zfs RPMs
in the repos are correct.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Tony Hutter <hutter2 at llnl.gov>
Closes #17956
[flang][cuda][NFC] Split allocation related operation conversion from other cuf operations (#169740)
Split AllocOp, FreeOp, AllocateOp and DeallocateOp from the other
conversions. Patterns are currently added to the base CUFOpConversion
when the option is enabled.
This split is a prerequisite for being more flexible about where in the
pipeline we do the allocation-related operation conversion.
config/kmap_atomic: initialise test data
6.18 changes kmap_atomic() to take a const pointer. This is no problem
for the places we use it, but Clang fails the test due to a warning
about being unable to guarantee that uninitialised data will definitely
not change. Easily solved by forcibly initialising it.
Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <robn at despairlabs.com>
Closes #17954
zvol_id: make array length properly known at compile time
Using strlen() in a static array declaration is a GCC extension. Clang
calls it "gnu-folding-constant" and warns about it, which breaks the
build. If it were widespread we could just turn off the warning, but
since there's only one case, let's just change the array to an explicit
size.
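
For illustration, the pattern looks roughly like the sketch below. Note
that it shows sizeof on a string literal as one portable way to get a
compile-time length, which is a different technique from the explicit
size this commit uses. DEV_PREFIX, NAME_MAX_LEN and dev_path are
hypothetical names, not the ones in zvol_id.

```c
#include <assert.h>
#include <string.h>

#define	DEV_PREFIX	"/dev/"		/* hypothetical */
#define	NAME_MAX_LEN	256		/* hypothetical */

/*
 * Non-portable form, accepted by GCC only through constant folding:
 *
 *	static char dev_path[strlen(DEV_PREFIX) + NAME_MAX_LEN + 1];
 *
 * sizeof on a string literal is a true integer constant expression.
 * It counts the trailing NUL, hence the "- 1".
 */
#define	DEV_PREFIX_LEN	(sizeof (DEV_PREFIX) - 1)

static char dev_path[DEV_PREFIX_LEN + NAME_MAX_LEN + 1];

/* The size is fixed at compile time; verify the arithmetic there too. */
_Static_assert(sizeof (dev_path) == 5 + NAME_MAX_LEN + 1,
    "dev_path has the expected size");

static size_t
dev_path_size(void)
{
	return (sizeof (dev_path));
}
```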
Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <robn at despairlabs.com>
Closes #17954
Linux: bump -std to gnu11
Linux switched from -std=gnu89 to -std=gnu11 in 5.18
(torvalds/linux at e8c07082a810f). We've always overridden that with gnu99
because we use some newer features.
More recent kernels are using C11 features in headers that we include.
GCC generally doesn't seem to care, but more recent versions of Clang
seem to be enforcing our gnu99 override more strictly, which breaks the
build in some configurations.
Just bumping our "override" to match the kernel seems to be the easiest
workaround. It's an effective no-op since 5.18, while still allowing us
to build on older kernels.
Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <robn at despairlabs.com>
Closes #17954
releases/15.0R/relnotes: Add notes for commits mentioning relnotes, batch 4
Add more content coming from RELNOTES.
Content in this commit corresponds to stopping at fe86d923f83f included.
Sponsored by: The FreeBSD Foundation
chksum: run 256K benchmark on demand, preserve chksum_stat_data
Reviewed-by: Tino Reichardt <milky-zfs at mcmilk.de>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Alexx Saver <lzsaver.eth at ethermail.io>
Co-authored-by: Adam Moss <c at yotes.com>
Closes #17945
Closes #17946
AMDGPU/GlobalISel: Report RegBankLegalize errors using reportGISelFailure
Use standard GlobalISel error reporting with reportGISelFailure
and have the pass return false instead of calling llvm_unreachable.
Also enables -global-isel-abort=0 or 2 for -global-isel -new-reg-bank-select.
Note: new-reg-bank-select with abort 0 or 2 runs LCSSA,
while "intended use" without abort or with abort 1 does not run LCSSA.
[AMDGPU] Allow hazard checks for WMMA co-exec
Now we are just inserting V_NOP instructions; try to schedule
something into the shadow.
It is still somewhat imprecise, for example AdvanceCycle() will
use TII.getNumWaitStates() anyway, but in a scheduling mode
we are not required to be precise. We must be finally precise
in the hazard recognizer mode. Then EmittedInstrs buffer is also
limited to MaxLookAhead even though VALU only hazards may actually
never expire and require an endless buffer. But that's OK, we can
at least mitigate what the buffer can hold. The buffer is also
currently much bigger than any of VALU hazards may need.
That said, the rest of the 'fix*' functions here that use V_NOPs can
be changed the same way. This one is just the worst because it may
require up to 9 nops.
[AMDGPU] Refactor hazard recognizer for VALU-pipeline hazards. NFCI.
This is in preparation for handling these in the scheduler. I do not
expect any changes to the produced code here; it is just infrastructure.
Our current problem with the VALU pipeline hazards is that we only
insert V_NOP instructions in the hazard recognizer mode, but ignore
it during scheduling. This patch is meant to create a mechanism to
actually account for that during scheduling.
releases/15.0R/relnotes: Add notes for commits mentioning relnotes, batch 4
Add more content coming from RELNOTES.
Content in this commit corresponds to stopping at eeb04a736cb9 included.
Remove duplicated "virtual channels" info.
Sponsored by: The FreeBSD Foundation
GlobalISel: Stop using TPC to check if GlobalISelAbort is enabled
The new pass manager does not use TargetPassConfig.
GlobalISel requires TargetPassConfig for reportGISelFailure,
and its only actual use is to check if GlobalISelAbort is enabled.
TargetPassConfig uses TargetMachine to check if GlobalISelAbort is
enabled, but TargetMachine is also available from MachineFunction.
15.0/relnote: Tidy application changes and contributed software
+ remove expat, this is a private lib and doesn't have a manual
+ move the thing that is changing to the beginning-ish of the sentences
+ s/foo/the foo frobber/ to match usual freebsd style
+ fix some markup and punctuation typos
These need to happen before the entries can be alphabetized.
Discussed with: jhb, olce (loosely)