NAS-138709 / 26.04 / Add SED details in disk.details endpoint (#17725)
This commit adds SED details to the disk.details endpoint, which the UI
uses on the zpool creation screens to list disks. SED info comes from
the DB, and SED status (for SED disks only) is fetched in real time from
the disk.query endpoint, which already handles this efficiently.
Fix NVMe-oF failover test reliability and add diagnostics
- Fix flush method: send_flush() -> flush_namespace()
- Add read retry loop for namespaces not ready after failover
- Increase teardown sleep from 5s to 15s for cleanup
- Add fixture lifecycle logging to diagnose teardown issues
- Verify service state and port release after stop
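The read retry loop mentioned in the list above could look roughly like the sketch below. It is only an illustration: `read_with_retry`, its parameters, and the choice of `OSError` as the "not ready" signal are assumptions, not names taken from the test suite.

```python
import time


def read_with_retry(read_fn, attempts=5, delay=1.0, sleep=time.sleep):
    """Call read_fn until it succeeds or the attempts run out.

    read_fn is a hypothetical callable that raises OSError while the
    namespace is not ready after failover.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return read_fn()
        except OSError as exc:
            last_error = exc
            # Only wait if another attempt will follow.
            if attempt < attempts:
                sleep(delay)
    raise TimeoutError(f"read failed after {attempts} attempts") from last_error
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real delays.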
Add TestFailback and fix test isolation
- Add TestFailback crash->orderly failback cycle tests (4 tests)
- Change TestFailover fixtures to class scope to fix backend switching
- Add restore_original_master fixture to restore HA state after tests
- Add MAX_FAILOVER_TIME checks for both failover and failback operations
Add large-scale HA failover test (51 subsystems/70 namespaces)
- Add TestFailoverScale class with 8 parametric variations
- Use ThreadPoolExecutor for parallel connection/verification
- Add MAX_FAILOVER_TIME check (60s limit)
- Set 15-minute timeout for test (ZVOL overhead)
- Update docstring to document both test suites
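As a rough sketch of the parallel verification and timing check listed above: `verify_all` and `timed` are hypothetical helpers, and `verify_fn` stands in for whatever per-namespace check the tests actually perform. Only `ThreadPoolExecutor` and the 60s limit come from the list itself.

```python
import time
from concurrent.futures import ThreadPoolExecutor

MAX_FAILOVER_TIME = 60  # seconds, the limit quoted in the list above


def verify_all(targets, verify_fn, max_workers=8):
    """Run verify_fn on every target in parallel; return {target: result}."""
    targets = list(targets)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order and re-raises worker exceptions.
        results = list(pool.map(verify_fn, targets))
    return dict(zip(targets, results))


def timed(fn, *args):
    """Return (result, elapsed_seconds) for a single call."""
    start = time.monotonic()
    result = fn(*args)
    return result, time.monotonic() - start
```

A test could then call `timed(verify_all, targets, verify_fn)` and assert both that every result is truthy and that the elapsed time stays under `MAX_FAILOVER_TIME`.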
NAS-138706 / 26.04 / Fix typo in NFS bindip validator. (#17724)
The validator had a typo. The intent is to convert `None` to the string
`'None'`.
Fix associated CI test.
NAS-138701 / 26.04 / Only expose available GPUs as valid choices for container device (#17723)
## Context
A GPU should only be exposed as a valid choice for a container GPU
device if it is also available to the host (i.e. it has not been
isolated from the host).
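The filtering described above might be sketched as follows. The function name and the dictionary keys (`pci_slot`, `description`, `available_to_host`) are assumptions made for illustration and may not match the structures the middleware really uses.

```python
def available_gpu_choices(gpus):
    """Map PCI slot -> description for GPUs still available to the host.

    Each entry in gpus is assumed to be a dict with the hypothetical keys
    'pci_slot', 'description' and 'available_to_host'.
    """
    return {
        gpu["pci_slot"]: gpu["description"]
        for gpu in gpus
        # Isolated GPUs (not available to the host) are left out.
        if gpu.get("available_to_host")
    }
```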
NAS-138692 / 26.04 / Fix ipv6 interface autoconf CI check (#17719)
Moving from `dhclient` to `dhcpcd` introduced a change in that the
`sysctl` `autoconf` setting is updated to reflect the current status of
the interface.
By default the IPv6 `autoconf` setting is `1` (enabled). The `dhcpcd`
daemon dynamically updates this based on what routing capability it
senses from the interface. We don't currently route IPv6 in our CI
environment, so the `IPv6` `autoconf` setting is disabled, i.e. set to
`0`.
This PR updates the test to be reactive to the current `IPv6` routing
capability.
This has been tested in a CI run.
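One way a test can react to the live setting is to read the kernel value instead of hard-coding it. The sketch below assumes the standard Linux path `/proc/sys/net/ipv6/conf/<interface>/autoconf`; the helper name and the `proc_root` parameter are hypothetical and are not taken from the CI test.

```python
from pathlib import Path


def ipv6_autoconf(interface, proc_root="/proc/sys/net/ipv6/conf"):
    """Return the kernel's current IPv6 autoconf value for interface.

    proc_root is a parameter only so the helper can be pointed at a
    fake directory tree in unit tests.
    """
    return int(Path(proc_root, interface, "autoconf").read_text().strip())
```

A test would then compare the value it observes through the API against `ipv6_autoconf(...)` rather than against a fixed `1`.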
NOTE: The VMs under test start with DHCP and are then configured with a
static IP in `test_005_interface.test_002_configure_interface`.
NAS-138569 / 26.04 / NVIDIA support for LXC containers (#17691)
## Context
NVIDIA support was requested for LXC containers and has been added. LXC
containers using NVIDIA also require the NVIDIA drivers to be available
on the host. That driver handling previously lived in the docker plugin;
it has been moved to the system advanced service so that it can be
configured there and used by both containers and the docker plugin.
NAS-137899 / 26.04 / Fix edge case where ix-apps dataset can have invalid mountpoint (#17695)
This commit fixes an issue where, after a pool was exported and then
imported, we set the mountpoint of the ix-apps dataset to where we
expect ix-apps to be mounted. If apps are already configured with a
different pool, this causes problems. The step is not required at import
time because when apps are started or configured we already ensure that
the ix-apps dataset has the correct mountpoint (for the pool that is
actually configured for apps).
NAS-138691 / 25.10.1 / Add more handling for transient ENOENT on cred check (by anodos325) (#17717)
During health checks for machine account password, the third-party
python-gssapi library may fail to create a ccache object after
successful kinit. In this situation a CallError with errno set to ENOENT
will be raised. This commit converts the error to be non-fatal since by
this point we've validated that the credential is correct. This prevents
the activedirectory service from becoming FAULTED for spurious reasons.
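The error handling described above can be sketched as follows. The `CallError` class here is a minimal stand-in defined only for the example, and `check_machine_credential` and `get_cred` are hypothetical names; the real middleware code is likely structured differently.

```python
import errno


class CallError(Exception):
    """Stand-in for the middleware's CallError; only errno matters here."""

    def __init__(self, errmsg, errno_=errno.EFAULT):
        super().__init__(errmsg)
        self.errno = errno_


def check_machine_credential(get_cred):
    """Return True if the check passes or fails only with ENOENT.

    get_cred is a hypothetical callable that builds the ccache object
    after a successful kinit.
    """
    try:
        get_cred()
    except CallError as exc:
        if exc.errno != errno.ENOENT:
            raise
        # kinit already succeeded, so a missing ccache is treated as
        # transient and does not fail the health check.
    return True
```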
Original PR: https://github.com/truenas/middleware/pull/17716
Co-authored-by: Andrew Walker <andrew.walker at truenas.com>
NAS-138691 / 25.10.2 / Add more handling for transient ENOENT on cred check (by anodos325) (#17718)
During health checks for machine account password, the third-party
python-gssapi library may fail to create a ccache object after
successful kinit. In this situation a CallError with errno set to ENOENT
will be raised. This commit converts the error to be non-fatal since by
this point we've validated that the credential is correct. This prevents
the activedirectory service from becoming FAULTED for spurious reasons.
Original PR: https://github.com/truenas/middleware/pull/17716
Co-authored-by: Andrew Walker <andrew.walker at truenas.com>