[lld][Hexagon] Fix findMaskR8 missing duplex support (#183936)
findMaskR8() lacked an isDuplex() check, unlike findMaskR6(),
findMaskR11(), and findMaskR16() which all handle duplex instructions.
When the assembler generates R_HEX_8_X on a duplex SA1_addi instruction
(e.g. `{ r0 = add(r0, ##target); memw(r1+#0) = r2 }`), the wrong mask
0x00001fe0 placed relocation bits at [12:5] instead of [25:20],
corrupting the low sub-instruction (e.g. memw became memb).
Add the isDuplex() check returning 0x03f00000, and add a comprehensive
test covering all duplex instruction x relocation type combinations
across findMaskR6, findMaskR8, findMaskR11, and findMaskR16.
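
As an illustration of why the mask choice matters, here is a minimal Python
sketch (not the lld source) of how a relocation value is scattered into the
set bits of a mask. `is_duplex`, `find_mask_r8`, and `apply_mask` are
simplified stand-ins; the duplex test on parse bits [15:14] and the two
example instruction words are assumptions, and the real findMaskR8() may
handle further encodings that are omitted here.

```python
def is_duplex(insn: int) -> bool:
    # Assumption: a duplex word is recognised by parse bits [15:14] == 0.
    return (insn & 0x0000C000) == 0


def find_mask_r8(insn: int) -> int:
    # Simplified: only the two masks named in the commit message are modelled.
    if is_duplex(insn):
        return 0x03F00000  # relocation bits at [25:20]
    return 0x00001FE0      # relocation bits at [12:5]


def apply_mask(mask: int, value: int) -> int:
    """Scatter the low bits of value into the set-bit positions of mask."""
    result = 0
    src = 0  # index of the next unused bit of value
    for bit in range(32):
        if (mask >> bit) & 1:
            result |= ((value >> src) & 1) << bit
            src += 1
    return result


duplex_word = 0x00000000      # made-up encoding with parse bits 00
non_duplex_word = 0x0000C000  # made-up encoding with parse bits 11

print(hex(apply_mask(find_mask_r8(duplex_word), 0b101)))
print(hex(apply_mask(find_mask_r8(non_duplex_word), 0b101)))
```

With the non-duplex mask applied to a duplex word, the same value bits would
land in [12:5], which belongs to the low sub-instruction.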
arm64/vmm: Support PMU v3p9
The only new register is read-only. Because the kernel passes the
registers directly to the guest, no further change should be needed.
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D51764
arm64: Treat the PMUVer field of ID_AA64DFR0 as unsigned
The PMUVer field of ID_AA64DFR0 holds the version of the Performance
Monitors Extension as an unsigned value, but it is currently treated as
signed. Change it to unsigned.
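
A minimal Python sketch of the difference, not the kernel code: the field
position (bits [11:8]) and the two field values are assumptions chosen for
illustration. Any 4-bit value with the top bit set turns negative under a
signed interpretation, so a "version >= N" comparison would reject it.

```python
PMUVER_SHIFT = 8  # assumed position of PMUVer: bits [11:8]
PMUVER_WIDTH = 4


def field_unsigned(reg: int, shift: int, width: int) -> int:
    """Extract a register field as an unsigned integer."""
    return (reg >> shift) & ((1 << width) - 1)


def field_signed(reg: int, shift: int, width: int) -> int:
    """Extract a register field as a two's-complement signed integer."""
    raw = field_unsigned(reg, shift, width)
    sign_bit = 1 << (width - 1)
    return raw - (1 << width) if raw & sign_bit else raw


older = 0b0111 << PMUVER_SHIFT  # illustrative field value, top bit clear
newer = 0b1001 << PMUVER_SHIFT  # illustrative field value, top bit set

print(field_unsigned(newer, PMUVER_SHIFT, PMUVER_WIDTH))  # 9
print(field_signed(newer, PMUVER_SHIFT, PMUVER_WIDTH))    # -7
```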
Reviewed by: andrew
Sponsored by: Arm Ltd
Signed-off-by: Kajetan Puchalski <kajetan.puchalski at arm.com>
Pull Request: https://github.com/freebsd/freebsd-src/pull/2062
[AArch64] Fold zero-high vector inserts in MI peephole optimisation
Summary
This patch follows on from #178227.
The previous ISel fold lowers the 64-bit case to:
fmov d0, x0
fmov d0, d0
which is not ideal: a single fmov d0, x0 would suffice.
A redundant copy comes from the INSERT_SUBREG/INSvi64lane.
This peephole detects <2 x i64> vectors made of a zeroed upper lane and a
low lane produced by FMOVXDr/FMOVDr, then removes the redundant copy.
Also updated existing tests and added MIR tests.
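
The idea can be illustrated with a toy Python peephole over assembly text.
This is not the MachineInstr pass from the patch: it works on strings, the
helper name `fold_redundant_fmov` is hypothetical, and it only models the
fact that an fmov writing a d register leaves the upper 64 bits of the
vector register zero, so a following self-copy adds nothing.

```python
import re

# Matches "fmov dN, xM" and "fmov dN, dM" only.
FMOV = re.compile(r"fmov\s+d(\d+),\s*([dx])(\d+)$")


def fold_redundant_fmov(lines):
    """Drop 'fmov dN, dN' when dN's upper lane is already known to be zero."""
    out = []
    upper_zero = set()  # d registers whose upper 64 bits are known zero
    for line in lines:
        m = FMOV.match(line.strip())
        if not m:
            # Unknown instruction: conservatively forget everything.
            upper_zero.clear()
            out.append(line)
            continue
        dst, kind, src = int(m.group(1)), m.group(2), int(m.group(3))
        if kind == "d" and src == dst and dst in upper_zero:
            continue  # the self-copy would only re-zero a zero upper lane
        upper_zero.add(dst)  # any fmov into dN zeroes the upper lane
        out.append(line)
    return out


print(fold_redundant_fmov(["fmov d0, x0", "fmov d0, d0"]))
```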
NAS-140009 / 26.0.0-BETA.2 / prevent blocking sqlite write thread (by yocalebo) (#18350)
During a failover MASTER event, the inline hook_datastore_execute_write
hook blocks the SQLite thread for ~70 seconds:
1. 10s — call_remote('failover.datastore.sql') times out trying to
replicate SQL to the (unreachable) remote node
2. 60s — set_failure() calls call_remote('failover.status') with no
explicit timeout, inheriting the 60s CALL_TIMEOUT default
This stalls any middleware operation that needs DB writes (e.g.
pool.dataset.unlock) for the entire duration, directly delaying failover
completion.
To try and remedy the issue:
1. Short-circuit the hook when failover is in progress — Added a
failover.in_progress check to hook_datastore_execute_write. During a
failover event there is no reason to replicate individual SQL statements
to the remote node, so we skip the remote call entirely.
[24 lines not shown]
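
The short-circuit described in step 1 might look roughly like the following
Python sketch. It is not the middleware source: `FakeMiddleware`, the
`call`/`call_remote` signatures, and the string return values are
hypothetical, and the unreachable peer is simulated by raising
`TimeoutError` immediately instead of waiting.

```python
import asyncio


class FakeMiddleware:
    """Hypothetical stand-in for the middleware object, for illustration."""

    def __init__(self, in_progress: bool):
        self.in_progress = in_progress
        self.remote_calls = []

    async def call(self, method: str, *args):
        if method == "failover.in_progress":
            return self.in_progress
        raise KeyError(method)

    async def call_remote(self, method: str, args, timeout: int):
        # Record the attempt, then simulate an unreachable remote node.
        self.remote_calls.append((method, timeout))
        raise TimeoutError(method)


async def hook_datastore_execute_write(middleware, sql, params):
    # Short-circuit: during failover, do not replicate to the remote node.
    if await middleware.call("failover.in_progress"):
        return "skipped"
    try:
        await middleware.call_remote(
            "failover.datastore.sql", [sql, params], timeout=10
        )
    except TimeoutError:
        return "remote-failed"
    return "replicated"


during = FakeMiddleware(in_progress=True)
print(asyncio.run(hook_datastore_execute_write(during, "UPDATE t SET a=?", [1])))
```

With the check in place the write path returns without touching the remote
node, so the SQLite thread is not held for the duration of a timeout.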
NAS-140009 / 26.0.0-BETA.1 / prevent blocking sqlite write thread (by yocalebo) (#18349)
During a failover MASTER event, the inline hook_datastore_execute_write
hook blocks the SQLite thread for ~70 seconds:
1. 10s — call_remote('failover.datastore.sql') times out trying to
replicate SQL to the (unreachable) remote node
2. 60s — set_failure() calls call_remote('failover.status') with no
explicit timeout, inheriting the 60s CALL_TIMEOUT default
This stalls any middleware operation that needs DB writes (e.g.
pool.dataset.unlock) for the entire duration, directly delaying failover
completion.
To try and remedy the issue:
1. Short-circuit the hook when failover is in progress — Added a
failover.in_progress check to hook_datastore_execute_write. During a
failover event there is no reason to replicate individual SQL statements
to the remote node, so we skip the remote call entirely.
[24 lines not shown]
[AArch64] Add lowering for misc NEON intrinsics (#183050)
This patch adds custom lowering for the following NEON intrinsics to
enable better codegen for convert and load/store operations:
- suqadd
- usqadd
- abs
- sqabs
- sqneg
NAS-140009 / 25.10.2.2 / prevent blocking sqlite write thread (by yocalebo) (#18348)
During a failover MASTER event, the inline hook_datastore_execute_write
hook blocks the SQLite thread for ~70 seconds:
1. 10s — call_remote('failover.datastore.sql') times out trying to
replicate SQL to the (unreachable) remote node
2. 60s — set_failure() calls call_remote('failover.status') with no
explicit timeout, inheriting the 60s CALL_TIMEOUT default
This stalls any middleware operation that needs DB writes (e.g.
pool.dataset.unlock) for the entire duration, directly delaying failover
completion.
To try and remedy the issue:
1. Short-circuit the hook when failover is in progress — Added a
failover.in_progress check to hook_datastore_execute_write. During a
failover event there is no reason to replicate individual SQL statements
to the remote node, so we skip the remote call entirely.
[24 lines not shown]
[mlir][xegpu] Add support for accessing the default order of a layout. (#184451)
Currently, `getOrder` returns null if the user does not provide an
`order` in the xegpu layout. This behavior is undesirable when coupled
with utility functions that work on top of layouts (like `isTransposeOf`).
This PR introduces `getEffectiveOrder`, which always returns the true
order, even if the user omits it.
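
A minimal Python sketch of the fallback pattern, not the MLIR C++ API. The
function name mirrors the PR, but the signature is hypothetical, and the
default used here (last dimension first, i.e. `[rank-1, ..., 0]`) is an
assumption that the commit message does not state.

```python
from typing import List, Optional, Sequence


def get_effective_order(
    rank: int, order: Optional[Sequence[int]] = None
) -> List[int]:
    """Return the explicit order if given, otherwise an assumed default."""
    if order is not None:
        return list(order)
    # Assumed default: dimensions listed from last to first.
    return list(range(rank - 1, -1, -1))


print(get_effective_order(2))          # [1, 0]
print(get_effective_order(2, [0, 1]))  # [0, 1]
```

Callers that compare orders can then work with a concrete list in both
cases instead of special-casing a missing `order`.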