mana: refill the rx mbuf in batch
Set the default refill threshold to one quarter of the rx queue
length. Users can change this value with hw.mana.rx_refill_thresh
in loader.conf. This change improves rx completion handling,
saving 10% to 15% of the overall processing time.
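For illustration, a loader.conf line overriding the threshold (the tunable
name is from this commit; the value shown is an arbitrary example, not a
recommendation):

    # refill rx mbufs once this many rx queue slots are pending
    hw.mana.rx_refill_thresh=256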
Tested by: whu
MFC after: 2 weeks
Sponsored by: Microsoft
(cherry picked from commit 9b8701b81f14f0fa0787425eb9761b765d5faab0)
mana: Increase default tx and rx ring size to 1024
TCP performance tests show a high number of retries under heavy tx
traffic. The number of queue stops and wakeups also increases.
Further analysis suggests the FreeBSD network stack tends to send
TSO packets with multiple sg entries, typically ranging from
10 to 16. On mana, every two sgs take one unit of the tx ring.
Therefore, adding one unit for the head, it takes 6 to 9 units
of the tx ring to send a typical TSO packet.
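As a rough sketch of that accounting (a hypothetical helper, not code from
this commit; howmany() is the ceiling-division macro from sys/param.h):

    #include <sys/param.h>

    /* tx ring units for one TSO packet: one for the head, one per two sgs */
    static inline int
    tso_tx_units(int sg_count)
    {
            return (1 + howmany(sg_count, 2));  /* 10 sgs -> 6, 16 sgs -> 9 */
    }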
The current default tx ring size is 256, which can fill up
quickly under heavy load. When the tx ring is full, the send queue
is stopped until ring space is freed. This can cause the network
stack to drop packets and lead to TCP retransmissions.
Increase the default tx and rx ring size to 1024 units. Also
introduce two tunables allowing users to request the tx and rx ring
sizes in loader.conf:
[14 lines not shown]
Hyper-V: hn: rewrite hn rsc switch to avoid sysctl hang
Changing the rsc_switch flag with sysctl to turn rsc on or off
could hang. The original code sends a request to the host to get
the mtu setting. Sometimes the host fails to reply, causing the
thread to sleep forever waiting for the response.
Use the mtu already cached in the hn device instead, to avoid
calling the host.
Reported by: whu
Tested by: whu
MFC after: 1 week
(cherry picked from commit da1deb784d9ad3a4015a3f91fa1a5ce394fd3fdb)
mana: refill the rx mbuf in batch
Set the default refill threshold to one quarter of the rx queue
length. Users can change this value with hw.mana.rx_refill_thresh
in loader.conf. This change improves rx completion handling,
saving 10% to 15% of the overall processing time.
Tested by: whu
MFC after: 2 weeks
Sponsored by: Microsoft
mana: Increase default tx and rx ring size to 1024
TCP performance tests show a high number of retries under heavy tx
traffic. The number of queue stops and wakeups also increases.
Further analysis suggests the FreeBSD network stack tends to send
TSO packets with multiple sg entries, typically ranging from
10 to 16. On mana, every two sgs take one unit of the tx ring.
Therefore, adding one unit for the head, it takes 6 to 9 units
of the tx ring to send a typical TSO packet.
The current default tx ring size is 256, which can fill up
quickly under heavy load. When the tx ring is full, the send queue
is stopped until ring space is freed. This can cause the network
stack to drop packets and lead to TCP retransmissions.
Increase the default tx and rx ring size to 1024 units. Also
introduce two tunables allowing users to request the tx and rx ring
sizes in loader.conf:
[12 lines not shown]
Hyper-V: hn: rewrite hn rsc switch to avoid sysctl hang
Changing the rsc_switch flag with sysctl to turn rsc on or off
could hang. The original code sends a request to the host to get
the mtu setting. Sometimes the host fails to reply, causing the
thread to sleep forever waiting for the response.
Use the mtu already cached in the hn device instead, to avoid
calling the host.
Reported by: whu
Tested by: whu
MFC after: 1 week
Hyper-V: move memory alloc call for tlb hypercall out of smp_rendezvous
The allocation call could result in a sleep lock violation if it is
made inside smp_rendezvous. Move it out. Also move the pcpu memory
pointer to vmbus_pcpu_data since it is only used on Hyper-V.
PR: 279738
Reported by: gbe
Fixes: 2b887687edc25bb4553f0d8a1183f454a85d413d
MFC after: 2 weeks
Sponsored by: Microsoft
(cherry picked from commit d0cb4674df97aa638d5d17861c364b1625f79401)
Hyper-V: add a boot parameter to tlb flush hypercall
Add the boot parameter hw.vmbus.tlb_hcall for the tlb flush
hypercall. By default it is set to 1 to allow hypercall-based tlb
flush. It can be set to 0 in loader.conf to turn off the hypercall
and use the system-provided tlb flush routine.
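For illustration, disabling the hypercall path in loader.conf (the tunable
name and its 0/1 meaning are described above):

    # use the system-provided tlb flush routine instead of the hypercall
    hw.vmbus.tlb_hcall=0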
The change also switches the flag for the per-cpu contiguous memory
allocation to no-wait, to avoid a panic seen in some cases where not
enough contiguous memory is available at boot time.
Reported by: gbe
Tested by: whu
MFC after: 1 week
Fixes: 2b887687edc25bb4553f0d8a1183f454a85d413d
Sponsored by: Microsoft
(cherry picked from commit e02d20ddff7f9f9509b28095459327bc183dab8a)
Hyper-V: remove unused alloc_pcpu_ptr()
Fixes: 2b887687edc25bb4553f0d8a1183f454a85d413d
Sponsored by: Microsoft
(cherry picked from commit fd911ae609247ef5c91493fb5506e77aa6e497bc)
Hyper-V: TLB flush enlightenment using hypercall
Currently FreeBSD uses IPI-based TLB flushing for remote
TLB flushes. Hyper-V allows hypercalls to flush local and
remote TLBs, and the use of Hyper-V hypercalls gives a significant
performance improvement in TLB operations.
In testing, this patch set has shown close to a 40 percent
TLB performance improvement.
This patch also adds a rep hypercall implementation.
Reviewed by: whu, kib
Tested by: whu
Authored-by: Souradeep Chakrabarti <schakrabarti at microsoft.com>
Co-Authored-by: Erni Sri Satya Vennela <ernis at microsoft.com>
MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D45521
[2 lines not shown]
pmap: move the smp_targeted_tlb_shutdown pointer stuff to amd64 pmap.h
Fixes: bec000c9c1ef409989685bb03ff0532907befb4a
Sponsored by: The FreeBSD Foundation
(cherry picked from commit 9c5d7e4a0c02bc45b61f565586da2abcc65d70fa)
amd64: add a func pointer to tlb shootdown function
Make the tlb shootdown function a pointer. By default, it still
points to the system function smp_targeted_tlb_shootdown(). This
allows other implementations to override it in the future.
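A minimal sketch of the pattern, with simplified types and illustrative
names (the real hook keeps smp_targeted_tlb_shootdown() as its default):

    /* a replaceable shootdown hook, self-contained for illustration */
    typedef void tlb_shootdown_func_t(unsigned long start_va,
        unsigned long end_va);

    static void
    native_tlb_shootdown(unsigned long start_va, unsigned long end_va)
    {
            /* stand-in for the stock IPI-based implementation */
    }

    /* default points at the native implementation ... */
    static tlb_shootdown_func_t *tlb_shootdown_handler = native_tlb_shootdown;

    /* ... and a hypervisor driver may later install its own routine: */
    /* tlb_shootdown_handler = hyperv_tlb_flush_hypercall; */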
Reviewed by: kib
Tested by: whu
Authored-by: Souradeep Chakrabarti <schakrabarti at microsoft.com>
Co-Authored-by: Erni Sri Satya Vennela <ernis at microsoft.com>
MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D45174
(cherry picked from commit bec000c9c1ef409989685bb03ff0532907befb4a)
Hyper-V: move memory alloc call for tlb hypercall out of smp_rendezvous
The allocation call could result in a sleep lock violation if it is
made inside smp_rendezvous. Move it out. Also move the pcpu memory
pointer to vmbus_pcpu_data since it is only used on Hyper-V.
PR: 279738
Reported by: gbe
Fixes: 2b887687edc25bb4553f0d8a1183f454a85d413d
MFC after: 2 weeks
Sponsored by: Microsoft
Hyper-V: add a boot parameter to tlb flush hypercall
Add the boot parameter hw.vmbus.tlb_hcall for the tlb flush
hypercall. By default it is set to 1 to allow hypercall-based tlb
flush. It can be set to 0 in loader.conf to turn off the hypercall
and use the system-provided tlb flush routine.
The change also switches the flag for the per-cpu contiguous memory
allocation to no-wait, to avoid a panic seen in some cases where not
enough contiguous memory is available at boot time.
Reported by: gbe
Tested by: whu
MFC after: 1 week
Fixes: 2b887687edc25bb4553f0d8a1183f454a85d413d
Sponsored by: Microsoft
Hyper-V: remove unused alloc_pcpu_ptr()
Fixes: 2b887687edc25bb4553f0d8a1183f454a85d413d
Sponsored by: Microsoft
Hyper-V: TLB flush enlightenment using hypercall
Currently FreeBSD uses IPI-based TLB flushing for remote
TLB flushes. Hyper-V allows hypercalls to flush local and
remote TLBs, and the use of Hyper-V hypercalls gives a significant
performance improvement in TLB operations.
In testing, this patch set has shown close to a 40 percent
TLB performance improvement.
This patch also adds a rep hypercall implementation.
Reviewed by: whu, kib
Tested by: whu
Authored-by: Souradeep Chakrabarti <schakrabarti at microsoft.com>
Co-Authored-by: Erni Sri Satya Vennela <ernis at microsoft.com>
MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D45521
amd64: add a func pointer to tlb shootdown function
Make the tlb shootdown function a pointer. By default, it still
points to the system function smp_targeted_tlb_shootdown(). This
allows other implementations to override it in the future.
Reviewed by: kib
Tested by: whu
Authored-by: Souradeep Chakrabarti <schakrabarti at microsoft.com>
Co-Authored-by: Erni Sri Satya Vennela <ernis at microsoft.com>
MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D45174
Hyper-V: vPCI: limit to 64 cpus for msi/msix interrupt handling
On the older Hyper-V vPCI protocol version 1.1, only the first 64
cpus are able to handle msi/msix. This is the case on FreeBSD 13.x
and earlier releases. If an MSI IRQ is assigned to a cpu id greater
than 63, interrupts are missed. Add a check in
set_interrupt_apic_ids() to only add the first 64 cpus to the
interrupt cpu set.
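A sketch of the intended restriction (loop shape and variable names are
illustrative, not the function's actual code):

    /* vPCI protocol 1.1: only the first 64 cpus may be interrupt targets */
    CPU_FOREACH(cpu) {
            if (cpu >= 64)
                    continue;
            CPU_SET(cpu, &intr_cpus);
    }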
Reported by: NetApp
Tested by: NetApp
Reviewed by: kib
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D44297
Hyper-V: vPCI: fix cpu id mis-mapping in vmbus_pcib_map_msi()
The msi address contains the apic id. The code in vmbus_pcib_map_msi()
treats it as a cpu id, which could cause misconfiguration of msix
IRQs, leading to missed interrupts for SR-IOV devices. This happens
when the apic id is not the same as the cpu id, on certain large VM
sizes with multiple numa domains in Azure. Fix this issue by
correctly mapping apic ids to cpu ids.
vPCI versions before 1.4 only support up to 64 vcpus for msi/msix
interrupts. This change also adds a check and returns an error if
the vcpu_id is greater than 63.
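A sketch of the mapping step (variable and helper names below are
illustrative; apic_cpuid() is the existing x86 helper that translates a
local APIC id to a cpu id):

    /* the MSI address encodes an APIC id; translate it to a cpu id first */
    cpu = apic_cpuid(apic_id);
    vcpu_id = cpu_to_vcpu_id(cpu);          /* hypothetical lookup */

    /* vPCI versions before 1.4 can only target the first 64 vcpus */
    if (vcpu_id > 63)
            return (EINVAL);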
(cherry picked from commit 999174ba03642fa63c9654752993a62db461afc9)
Reported by: NetApp
Tested by: whu
Sponsored by: Microsoft
Hyper-V: vPCI: fix cpu id mis-mapping in vmbus_pcib_map_msi()
The msi address contains the apic id. The code in vmbus_pcib_map_msi()
treats it as a cpu id, which could cause misconfiguration of msix
IRQs, leading to missed interrupts for SR-IOV devices. This happens
when the apic id is not the same as the cpu id, on certain large VM
sizes with multiple numa domains in Azure. Fix this issue by
correctly mapping apic ids to cpu ids.
vPCI versions before 1.4 only support up to 64 vcpus for msi/msix
interrupts. This change also adds a check and returns an error if
the vcpu_id is greater than 63.
Reported by: NetApp
Tested by: whu
Sponsored by: Microsoft
(cherry picked from commit 999174ba03642fa63c9654752993a62db461afc9)
Hyper-V: vPCI: fix cpu id mis-mapping in vmbus_pcib_map_msi()
The msi address contains the apic id. The code in vmbus_pcib_map_msi()
treats it as a cpu id, which could cause misconfiguration of msix
IRQs, leading to missed interrupts for SR-IOV devices. This happens
when the apic id is not the same as the cpu id, on certain large VM
sizes with multiple numa domains in Azure. Fix this issue by
correctly mapping apic ids to cpu ids.
vPCI versions before 1.4 only support up to 64 vcpus for msi/msix
interrupts. This change also adds a check and returns an error if
the vcpu_id is greater than 63.
Reported by: NetApp
Tested by: whu
MFC after: 1 week
mana: fix leaking pci resource problem detaching mana devices
Fix the error message shown in dmesg when detaching the mana gdma
devices: "Device leaked memory resources".
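That message typically means a resource allocated in attach was never
released in detach; a hedged sketch of the kind of cleanup involved (softc
field names and the rid are illustrative):

    /* in the detach path, release the memory BAR mapped during attach */
    if (sc->bar0_res != NULL)
            bus_release_resource(dev, SYS_RES_MEMORY, sc->bar0_rid,
                sc->bar0_res);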
Reported by: NetApp
MFC after: 3 days
Sponsored by: Microsoft
(cherry picked from commit 47e99e5bc5bcfa621fe6a3e62386f227c47e8cff)
mana: fix leaking pci resource problem detaching mana devices
Fix the error message shown in dmesg when detaching the mana gdma
devices: "Device leaked memory resources".
Reported by: NetApp
MFC after: 3 days
Sponsored by: Microsoft
mana: Fix TX CQE error handling
For an unknown TX CQE error type (probably from newer hardware),
still free the mbuf, update the queue tail, etc.; otherwise the
accounting will be wrong.
Also, TX errors can be triggered by injecting corrupted packets, so
replace the mana_err logging with mana_dbg.
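A sketch of the control-flow change (switch structure and names are
illustrative, not the driver's exact code):

    switch (cqe_type) {
    case CQE_TX_OKAY:
            break;
    /* ... known error types handled here ... */
    default:
            /* unknown error type: log at debug level (mana_dbg rather
             * than mana_err) and still fall through to the cleanup */
            break;
    }
    /* common completion path: free the mbuf, advance the queue tail */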
Reported by: NetApp
MFC after: 1 week
Sponsored by: Microsoft
(cherry picked from commit 516b5059705b6b8bbba28821dbe05964c128f9a9)
mana: Fix TX CQE error handling
For an unknown TX CQE error type (probably from newer hardware),
still free the mbuf, update the queue tail, etc.; otherwise the
accounting will be wrong.
Also, TX errors can be triggered by injecting corrupted packets, so
replace the mana_err logging with mana_dbg.
Reported by: NetApp
MFC after: 1 week
Sponsored by: Microsoft
Hyper-V: vmbus: check if signaling host is needed in vmbus_rxbr_read
It is observed that netvsc's send rings could stall on the latest
Azure Boost platforms. This is because the vmbus_rxbr_read() routine
doesn't check whether the host is waiting for more room to put data,
which leads to the host side sleeping forever on this vmbus channel.
The problem was only observed on the latest platforms because the
host requests more buffer ring room to be available, which makes the
issue much easier to trigger.
Fix this by adding a check in vmbus_rxbr_read() and signaling the
host in the callers if the check is positive.
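A sketch of the added check (names are illustrative; the idea is the usual
pending-send-size test on the ring buffer):

    /* after the read, signal the host only if the newly freed space just
     * crossed the write size the host said it is waiting for */
    need_sig = (old_write_avail < pending_snd_sz &&
        new_write_avail >= pending_snd_sz);
    if (need_sig)
            vmbus_chan_signal_host(chan);   /* hypothetical caller helper */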
Approved by: re (gjb)
Reported by: NetApp
Tested by: whu
Sponsored by: Microsoft
(cherry picked from commit 49fa9a64372b087cfd66459a20f4ffd25464b6a3)
(cherry picked from commit c81166b018acfbe521f52415ff37b8c2696d77c6)
Hyper-V: vmbus: check if signaling host is needed in vmbus_rxbr_read
It is observed that netvsc's send rings could stall on the latest
Azure Boost platforms. This is because the vmbus_rxbr_read() routine
doesn't check whether the host is waiting for more room to put data,
which leads to the host side sleeping forever on this vmbus channel.
The problem was only observed on the latest platforms because the
host requests more buffer ring room to be available, which makes the
issue much easier to trigger.
Fix this by adding a check in vmbus_rxbr_read() and signaling the
host in the callers if the check is positive.
Reported by: NetApp
Tested by: whu
Sponsored by: Microsoft
(cherry picked from commit 49fa9a64372b087cfd66459a20f4ffd25464b6a3)
Hyper-V: vmbus: check if signaling host is needed in vmbus_rxbr_read
It is observed that netvsc's send rings could stall on the latest
Azure Boost platforms. This is because the vmbus_rxbr_read() routine
doesn't check whether the host is waiting for more room to put data,
which leads to the host side sleeping forever on this vmbus channel.
The problem was only observed on the latest platforms because the
host requests more buffer ring room to be available, which makes the
issue much easier to trigger.
Fix this by adding a check in vmbus_rxbr_read() and signaling the
host in the callers if the check is positive.
Reported by: NetApp
Tested by: whu
Sponsored by: Microsoft
(cherry picked from commit 49fa9a64372b087cfd66459a20f4ffd25464b6a3)
Hyper-V: vmbus: check if signaling host is needed in vmbus_rxbr_read
It is observed that netvsc's send rings could stall on the latest
Azure Boost platforms. This is because the vmbus_rxbr_read() routine
doesn't check whether the host is waiting for more room to put data,
which leads to the host side sleeping forever on this vmbus channel.
The problem was only observed on the latest platforms because the
host requests more buffer ring room to be available, which makes the
issue much easier to trigger.
Fix this by adding a check in vmbus_rxbr_read() and signaling the
host in the callers if the check is positive.
Reported by: NetApp
Tested by: whu
MFC after: 3 days
Sponsored by: Microsoft
mana: add lro and tso stat counters
Add a few stat counters for tso and lro.
Approved by: re (gjb)
Sponsored by: Microsoft
(cherry picked from commit b167e449c8db01f082691503fb5c1255ad5750eb)
(cherry picked from commit a72a0af8194effa5f7e7a88d09673ed798819e88)