zpool-list.8: clarify that only imported pools are listed (#18352)
The man page stated "all pools in the system are listed" which is
misleading, as only imported pools are shown. Clarify this and
add a cross-reference to zpool-import(8) for discovering pools
available for import.
Signed-off-by: Christos Longros <chris.longros at gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Linux 7.0: autoconf: Remove copy-from-user-inatomic API checks (#18348) (#18354)
This function was removed in c6442bd3b643: "Removing old code outside
of 4.18 kernsls", but the API check currently fails on PowerPC builds
due to the recent inclusion of 6bc9c0a90522: "powerpc: fix KUAP warning
in VMX usercopy path" in the upstream kernel, which introduces a use of
cpu_feature_keys[], a GPL-only symbol. Remove the API check, as it
doesn't appear necessary.
Signed-off-by: John Cabaj <john.cabaj at canonical.com>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
[DAG] ComputeKnownBits - set low bit to zero for ADD(X,X) (#186461)
ADD(X,X) is equivalent to SHL(X,1), so bit[0] is always zero.
This allows downstream folds like `and(add(x,x), 1) -> 0`.
Fixes #186091
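The reasoning above can be illustrated with a minimal sketch. `KnownBits8` and `knownBitsOfAddSelf` are hypothetical names for this illustration, not LLVM's KnownBits API; the sketch assumes an 8-bit value and only models the ADD(X,X) case.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 8-bit stand-in for a known-bits record (not LLVM's KnownBits).
// A set bit in Zero means "known to be 0"; a set bit in One means "known to be 1".
struct KnownBits8 {
  uint8_t Zero = 0;
  uint8_t One = 0;
};

// Known bits of X + X, derived by treating the sum as X << 1:
// every known bit moves up one position and bit 0 becomes a known zero.
KnownBits8 knownBitsOfAddSelf(KnownBits8 X) {
  assert((X.Zero & X.One) == 0 && "a bit cannot be known 0 and known 1");
  KnownBits8 R;
  R.Zero = static_cast<uint8_t>((X.Zero << 1) | 1u);
  R.One = static_cast<uint8_t>(X.One << 1);
  return R;
}
```

Because bit 0 of the result is always in `Zero`, a fold such as `and(add(x,x), 1) -> 0` can be justified even when nothing is known about `x`.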
[compiler-rt][ARM] cmake properties for complicated builtin sources (#179919)
In the builtins library, most functions have a portable C implementation
(e.g. `mulsf3.c`), and platforms might provide an optimized assembler
implementation (e.g. `arm/mulsf3.S`). The cmake script automatically
excludes the C source file corresponding to each assembly source file it
includes. Additionally, each source file name is automatically
translated into a flag that lit tests can query, with a name like
`librt_has_mulsf3`, to indicate that a function is available to be
tested.
In future commits I plan to introduce cases where a single .S file
provides more than one function (so that they can share code easily),
and therefore must supersede more than one existing source file.
I've introduced the `crt_supersedes` cmake property, which you can set
on a .S file to name a list of .c files that it should supersede. Also,
the `crt_provides` property can be set on any source file to indicate a
list of functions it makes available for testing, in addition to the one
implied by its name.
[AMDGPU] Add structural stall heuristic to scheduling strategies
Implements a structural stall heuristic that considers both resource
hazards and latency constraints when selecting instructions. In coexec,
this changes the pending queue from a binary “not ready to issue”
distinction into part of a unified candidate comparison. Pending
instructions still identify structural stalls in the current cycle, but
they are now evaluated directly against available instructions by stall
cost, making the heuristics both more intuitive and more expressive.
- Add getStructuralStallCycles() to GCNSchedStrategy that computes the
number of cycles an instruction must wait due to:
- Resource conflicts on unbuffered resources (from the SchedModel)
- Sequence-dependent hazards (from GCNHazardRecognizer)
- Add getHazardWaitStates() to GCNHazardRecognizer that returns the number
of wait states until all hazards for an instruction are resolved,
providing cycle-accurate hazard information for scheduling heuristics.
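A rough sketch of how such a stall cost could feed a candidate comparison follows. Everything here is hypothetical: the `Candidate` struct, the function names, and in particular the choice to combine the two wait sources with `max` are assumptions made for illustration, not the patch's actual logic.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical candidate record; the fields stand in for values that the
// SchedModel and the hazard recognizer would supply.
struct Candidate {
  int Id;
  unsigned ResourceWaitCycles; // wait caused by unbuffered resource conflicts
  unsigned HazardWaitStates;   // wait caused by sequence-dependent hazards
};

// Assumption: both waits elapse concurrently, so the stall is the larger one.
unsigned structuralStallCycles(const Candidate &C) {
  return std::max(C.ResourceWaitCycles, C.HazardWaitStates);
}

// Compare all candidates (available or pending) in one pass by stall cost.
// Ties keep the earliest candidate in the list.
int pickLowestStall(const std::vector<Candidate> &Cands) {
  assert(!Cands.empty() && "need at least one candidate");
  const Candidate *Best = &Cands.front();
  for (const Candidate &C : Cands)
    if (structuralStallCycles(C) < structuralStallCycles(*Best))
      Best = &C;
  return Best->Id;
}
```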
wasm: recognize `any_true` and `all_true` (#155885)
fixes https://github.com/llvm/llvm-project/issues/129441
cc @lukel97 @badumbatish
https://github.com/llvm/llvm-project/pull/145108
I've been learning a bit about LLVM, trying to make progress on some of
these issues. The code below is based on
https://github.com/llvm/llvm-project/pull/145108#issuecomment-3004561085,
by implementing `shouldExpandReduction`.
The implementation works for the test cases I added, but (obviously)
fails for any existing cases. `ISD::VECREDUCE_AND` and
`ISD::VECREDUCE_OR` are now marked as legal, which is required for the
`Pat`s to fire, but when no pattern matches, instruction selection fails.
So, I'm wondering: what is the right approach here? Should I mark these
intrinsics as `Custom` instead and manually perform the transformation
[2 lines not shown]
[NFC][Support] Minor code cleanup in APFloat.cpp (#187526)
Minor code cleanup: define variables at their first assignment as
opposed to at the start of functions, and use `[[maybe_unused]]` for
variables used in assert only.
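As an illustration of the two cleanup patterns, here is a small standalone sketch. `sumPositive` is a made-up function for this example and is not taken from APFloat.cpp.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper showing both patterns from the cleanup.
int sumPositive(const std::vector<int> &Values) {
  int Sum = 0;
  for (int V : Values) {
    // Declared at its first assignment rather than at the top of the function.
    // [[maybe_unused]] avoids an unused-variable warning when NDEBUG
    // compiles the assert away.
    [[maybe_unused]] const bool IsPositive = V > 0;
    assert(IsPositive && "expected only positive inputs");
    Sum += V;
  }
  return Sum;
}
```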
[HLSL] Implement Texture2D::operator[] (#186110)
Implements the Texture2D::operator[] method. It uses the same design as
Buffer::operator[]. However, this requires us to change the
resource_getpointer intrinsic to accept integer vectors for the index.
Assisted-by: Gemini
[flang][flang-rt] Implement F202X leading-zero control edit descriptors LZ, LZS, and LZP for formatted output (F, E, D, and G editing) (#183500)
LZ: processor-dependent (default, flang prints leading zero); LZS:
suppress the optional leading zero before the decimal point; LZP: print
the optional leading zero before the decimal point. Changes span the
source parser, compile-time format validator, runtime format processing,
and runtime output formatting. Includes semantic test (io18.f90) and
documentation updates.
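The effect of the three descriptors on a value below one can be sketched as follows. This is a simplified C++ model of F-style output only; `LeadingZero` and `formatFixed` are hypothetical names and do not come from flang-rt.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical mode enum mirroring the three edit descriptors.
enum class LeadingZero { LZ, LZS, LZP };

// Format Value with a fixed number of decimals, then apply the mode.
// LZ follows the processor default (a leading zero, as the commit says flang
// prints), LZP prints the zero, and LZS removes the optional zero.
std::string formatFixed(double Value, int Decimals, LeadingZero Mode) {
  assert(Decimals >= 0 && Decimals <= 30 && "sketch supports 0..30 decimals");
  char Buffer[400];
  std::snprintf(Buffer, sizeof(Buffer), "%.*f", Decimals, Value);
  std::string Text(Buffer);
  if (Mode == LeadingZero::LZS) {
    // Skip an optional sign, then drop the '0' of a leading "0." if present.
    std::size_t Pos = (!Text.empty() && Text[0] == '-') ? 1 : 0;
    if (Text.compare(Pos, 2, "0.") == 0)
      Text.erase(Pos, 1);
  }
  return Text;
}
```

Values with a magnitude of one or more are unaffected in this model, since their leading digit is not optional.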
[TargetLowering] Avoid unnecessary nodes in the chunk loop in expandDIVREMByConstant (#187967)
We don't need an AND on the last iteration. If we shifted the dividend
due to trailing zeros in the divisor, we don't need a chunk that only
contains shifted-in zeros.
[ADT] Add predicate based match support to StringSwitch
This introduces `Predicate` and `IfNotPredicate` case selection to
StringSwitch to allow use cases like
```
StringSwitch<...>(..)
.Case("foo", FooTok)
.Predicate(isAlpha, IdentifierTok)
...
```
This is mostly useful for improving conciseness and clarity when
processing generated strings, diagnostics, and similar.
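To make the idea concrete without depending on LLVM headers, here is a self-contained miniature. `MiniStringSwitch`, `isAlphaWord`, and `classify` are hypothetical; the sketch assumes that the predicate receives the whole string and that the first matching case wins, and it does not model `IfNotPredicate`.

```cpp
#include <cassert>
#include <cctype>
#include <string>

enum Token { FooTok, IdentifierTok, UnknownTok };

// Hypothetical miniature of a string switch; not LLVM's StringSwitch.
template <typename T> class MiniStringSwitch {
  std::string Str;
  bool HasResult = false;
  T Result{};

public:
  explicit MiniStringSwitch(const std::string &S) : Str(S) {}

  // Exact-match case; ignored once an earlier case has matched.
  MiniStringSwitch &Case(const std::string &Literal, T Value) {
    if (!HasResult && Str == Literal) {
      Result = Value;
      HasResult = true;
    }
    return *this;
  }

  // Predicate-based case; P is called with the string being switched on.
  template <typename Pred> MiniStringSwitch &Predicate(Pred P, T Value) {
    if (!HasResult && P(Str)) {
      Result = Value;
      HasResult = true;
    }
    return *this;
  }

  T Default(T Value) const { return HasResult ? Result : Value; }
};

// True for non-empty strings made only of alphabetic characters.
inline bool isAlphaWord(const std::string &S) {
  if (S.empty())
    return false;
  for (unsigned char C : S)
    if (!std::isalpha(C))
      return false;
  return true;
}

inline Token classify(const std::string &S) {
  return MiniStringSwitch<Token>(S)
      .Case("foo", FooTok)
      .Predicate(isAlphaWord, IdentifierTok)
      .Default(UnknownTok);
}
```

Here `"foo"` hits the literal case before the predicate is consulted, while any other alphabetic word falls through to the predicate.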
[AMDGPU] Add ML-oriented coexec scheduler selection and queue handling
This patch adds the initial coexec scheduler scaffold for machine
learning workloads on gfx1250.
It introduces function and module-level controls for selecting the
AMDGPU preRA and postRA schedulers, including an `amdgpu-workload-type`
module flag that maps ML workloads to coexec preRA scheduling and a nop
postRA scheduler by default.
It also updates the coexec scheduler to use a simplified top-down
candidate selection path that considers both available and pending
queues through a single flow, setting up follow-on heuristic work.
[mlir][spirv] Add first 3 data layout ops in TOSA Ext Inst Set (#187714)
This patch introduces the following data layout operators:
spirv.Tosa.Concat
spirv.Tosa.Pad
spirv.Tosa.Reshape
Dialect and serialization round-trip tests have also been added.
Signed-off-by: Davide Grohmann <davide.grohmann at arm.com>
[flang][NFC] Converted five tests from old lowering to new lowering (part 37) (#188009)
Tests converted from test/Lower/Intrinsics: minval.f90, modulo.f90,
move_alloc.f90, mvbits.f90, not.f90
Thread Safety Analysis: Support guarded_by/pt_guarded_by with multiple capabilities (#186838)
Previously, `guarded_by` and `pt_guarded_by` only accepted a single
capability argument. Introduce support for declaring that a variable is
guarded by multiple capabilities, which exploits the following property:
any writer must hold all capabilities, so holding any one of them
(exclusive or shared) guarantees at least shared (read) access.
Therefore, writing requires all listed capabilities to be held
exclusively, while reading only requires at least one to be held.
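The access rule can be written down as a small standalone model. This is only a sketch of the rule as stated above, with hypothetical names (`Held`, `canRead`, `canWrite`); it is not how the analysis is implemented and it does not use the attributes themselves.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// How the current thread holds one of the capabilities guarding a variable.
enum class Held { None, Shared, Exclusive };

// Reading is allowed when at least one guarding capability is held,
// either shared or exclusive.
bool canRead(const std::vector<Held> &Caps) {
  assert(!Caps.empty() && "a guarded variable lists at least one capability");
  return std::any_of(Caps.begin(), Caps.end(),
                     [](Held H) { return H != Held::None; });
}

// Writing is allowed only when every guarding capability is held exclusively.
bool canWrite(const std::vector<Held> &Caps) {
  assert(!Caps.empty() && "a guarded variable lists at least one capability");
  return std::all_of(Caps.begin(), Caps.end(),
                     [](Held H) { return H == Held::Exclusive; });
}
```

With two lock shards, for example, holding one shard in shared mode is enough to read, but a writer has to take both exclusively.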
This synchronization pattern is frequently used where the underlying
lock implementation does not support real reader locking, and instead
several lock "shards" are used to reduce contention for readers. For
example, the Linux kernel makes frequent use of this pattern [1].
Backwards compatibility is not affected by this change: for the time
being we deliberately do not change the semantics of multiple stacked
attributes (this retains existing semantics precisely, while giving a
way to choose the "stricter" semantics if needed).
[2 lines not shown]
[mlir][python] Fix PyObjectRef copy/move assignment for MSVC (#186758)
PyObjectRef has a user-declared move constructor but no explicit
copy/move assignment operators. On at least some versions of MSVC,
instantiation of operator= is forced, causing a compile error:
```
In file included from mlir/lib/Bindings/Python/Globals.cpp:9:
In file included from mlir/include/mlir/Bindings/Python/IRCore.h:16:
<MSVC>/include/vector(1461,27): error: object of type 'value_type' (aka 'mlir::python::mlir::PyDiagnostic::DiagnosticInfo') cannot be assigned because its copy assignment operator is implicitly deleted
1461 | *_Mid = *_First;
| ^
<MSVC>/include/vector(1539,9): note: in instantiation of function template specialization 'std::vector<mlir::python::mlir::PyDiagnostic::DiagnosticInfo>::_Assign_counted_range<mlir::python::mlir::PyDiagnostic::DiagnosticInfo *>' requested here
1539 | _Assign_counted_range(_Right_data._Myfirst, static_cast<size_type>(_Right_data._Mylast - _Right_data._Myfirst));
| ^
mlir/include/mlir/Bindings/Python/IRCore.h(1317,33): note: in instantiation of member function 'std::vector<mlir::python::mlir::PyDiagnostic::DiagnosticInfo>::operator=' requested here
1317 | struct MLIR_PYTHON_API_EXPORTED MLIRError {
| ^
mlir/include/mlir/Bindings/Python/IRCore.h(369,16): note: copy assignment operator of 'DiagnosticInfo' is implicitly deleted because field 'location' has a deleted copy assignment operator
[20 lines not shown]