speed: remove unused counters and dead parameters
In the speed implementation, a number of unused variables and
parameters (save_count, c[][], rsa_c, dsa_c, ecdsa_c, ecdh_c, and
the num argument of print_message()/pkey_print_message()) were
still left behind.
These values are no longer referenced and cannot affect the
time-based benchmark logic, so remove them.
Functional behaviour of speed remains unchanged.
ok tb@
let tun pretend it's a softnet thread with its own tun_input_process.
this largely reimplements if_vinput and if_input_process in tun so
packets pushed through the stack from a tun/tap write can operate
largely like they're being processed by a softnet thread.
there's a couple of important differences between tun/tap and softnet
though. firstly, multiple threads/processes can write to a single
tun/tap descriptor concurrently, so each thread has its own netstack
struct on the stack. secondly, these tun/tap threads are not the
softnet threads, so they can't avoid taking real interface references
when processing requeued packets.
the alternative to this would be letting tun/tap writes queue packets
for processing in a softnet thread, but that adds latency and
requires a lot of thought about a backpressure mechanism when a
thread writes too fast for the stack to process.
let if_vinput and if_input_proto requeue packets on a struct netstack.
this moves us from directly calling into different layers of the
network stack to moving the call back up to if_input_process to
dispatch. this reduces the kernel thread stack usage, but also makes
it safe(r) to dispatch this work from an smr critical section. it
also allows us to dispatch work without holding netlock, and
eventually getting if_input_process to amortise the locking over
bundles of these different dispatch calls.
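
as a rough illustration of the requeue idea, here is a minimal userland
sketch. struct stack_ctx, layer_input() and ctx_process() are hypothetical
names standing in for struct netstack, a protocol handler and
if_input_process. they are not the kernel code. the point is only that a
handler puts the packet back on the queue instead of calling the next
layer, so the call depth stays constant however many layers are stacked.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

#define NLAYERS 4	/* eg, physical port, aggr, vlan, carp */

struct pkt {
	int			layer;	/* next layer to process */
	TAILQ_ENTRY(pkt)	entry;
};
TAILQ_HEAD(pktq, pkt);

/* hypothetical stand-in for struct netstack: work pending in this context */
struct stack_ctx {
	struct pktq	queue;
	int		delivered;
};

/* a layer handler: instead of calling the next layer, requeue the packet */
static void
layer_input(struct stack_ctx *ctx, struct pkt *p)
{
	assert(p->layer < NLAYERS);

	if (++p->layer == NLAYERS) {
		/* no layers left, the packet has reached its destination */
		ctx->delivered++;
		return;
	}

	TAILQ_INSERT_TAIL(&ctx->queue, p, entry);
}

/* the dispatch loop: every handler runs exactly one frame below this */
void
ctx_process(struct stack_ctx *ctx)
{
	struct pkt *p;

	while ((p = TAILQ_FIRST(&ctx->queue)) != NULL) {
		TAILQ_REMOVE(&ctx->queue, p, entry);
		layer_input(ctx, p);
	}
}
```

because the loop owns the dispatch, it is also the natural place to take
a lock once around a batch of handlers rather than once per call.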
extend struct netstack to queue packet processing in the existing context
at the moment if_input_process runs packets in an mbuf_list, generally
produced by an ifiq, through the network stack. as the headers on
the packet are parsed, subsequent protocol handlers are called to
process the next layer of the packet. currently these handlers are
dispatched by directly calling functions, which consumes the
stack on the kernel threads running the network stack. if you have
a deep topology of virtual interfaces (eg, carp on vlan on aggr on
physical ports), you have a deep call stack.
the usual alternative to this is to queue packets handled by virtual
interfaces and get them processed by their own ifiq and their own
if_input_process call. this is what the stack used to do, but the
cost of locking and queueing and dispatching it to a softnet thread
to process (even if it was the same thread) adds significant overhead,
so we moved to direct dispatch to speed things up.
this change is kind of a hybrid approach, where input handling is
[21 lines not shown]
turn tun_input into a wrapper around p2p_input.
tun packets have the address family as a 4 byte prefix on their
payload which is used to decide which address family input handler
to call. p2p_input does the same thing except it looks at
m_pkthdr.ph_family.
this makes tun_input use its 4 byte prefix to set m_pkthdr.ph_family
and then call p2p_input to use it.
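
a minimal userland sketch of the prefix handling, under stated
assumptions: struct pkt and pkt_strip_af() are hypothetical stand-ins for
an mbuf and the tun code, and the prefix is assumed to be carried in
network byte order. it only shows the shape of the step: read the 4 byte
family, record it on the packet, and strip the prefix before handing the
payload on.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* hypothetical stand-in for the parts of an mbuf used here */
struct pkt {
	const uint8_t	*data;
	size_t		 len;
	uint32_t	 family;	/* plays the role of m_pkthdr.ph_family */
};

/*
 * strip the 4 byte address family prefix and record it on the packet.
 * returns 0 on success, -1 if the packet is too short to carry a prefix.
 */
int
pkt_strip_af(struct pkt *p)
{
	uint32_t af;

	assert(p != NULL);

	if (p->len < sizeof(af))
		return (-1);

	memcpy(&af, p->data, sizeof(af));	/* prefix may be unaligned */
	p->family = ntohl(af);			/* assumed network byte order */
	p->data += sizeof(af);
	p->len -= sizeof(af);

	return (0);
}
```

a caller would then switch on p->family to pick the input handler, which
is the part the commit hands off to p2p_input.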
call ip input handlers for pf diverted packets via if_input_proto.
this is a step toward being able to run tpmr and veb without the
net lock. right now ip input needs net lock, so if if_input_proto
can move their calls to a locked context, tpmr and veb won't need
to be locked first.
add if_input_proto() as a wrapper around calls to mbuf proto handling.
this version directly calls the proto handler, but it will be used
in the future in combination with struct netstack to move the proto
handler call around.
let the softnet threads use ifnet refs without accounting for them.
currently you need a real ifnet refcnt via if_get/if_unit, or you
can use if_get_smr in an smr read critical section, but this allows
code in the softnet threads to use an ifnet ref simply by virtue
of running in the softnet thread. this means softnet can avoid the
atomic ops against ifnet refcnts like smr critical sections can
do, but still sleep, which you can't do within an smr critical
section.
this is implemented by having if_remove call net_tq_barriers() before
letting interface teardown proceed.
populate the enchdr in network byte order instead of host byte order.
this prepends the packet payloads you can see via enc(4) interfaces,
and should have been populated consistently from the beginning.
better late than never.
i've already fixed tcpdump to cope with these fields in either
order, so this is more about setting a good example in the kernel
than anything else.
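
a minimal sketch of what populating a header in network byte order looks
like. struct enchdr_sketch is an assumed layout of three 32 bit fields
and enchdr_fill() is a hypothetical helper, not the kernel function. all
inputs are taken in host byte order here. a value that is already held in
network order would be stored as-is rather than converted again.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* assumed layout: three 32 bit fields prepended to the payload */
struct enchdr_sketch {
	uint32_t	af;
	uint32_t	spi;
	uint32_t	flags;
};

/* fill the header so a reader sees the same bytes on any architecture */
void
enchdr_fill(struct enchdr_sketch *hdr, uint32_t af, uint32_t spi,
    uint32_t flags)
{
	assert(hdr != NULL);

	hdr->af = htonl(af);
	hdr->spi = htonl(spi);
	hdr->flags = htonl(flags);
}
```

a reader such as a capture tool can then apply ntohl() to each field
without having to guess the byte order of the machine that wrote it.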
if pf can't find a parent for a carp interface, don't process the packet.
pf tries hard to pretend carp doesn't exist by mapping carp interfaces
back to their parents for the application of policy (ie, state/ruleset
evaluation). if a carp parent detaches, it's (very unlikely but
still) possible for a packet received by a carp interface to go
through pf.
previously pf would handle this situation by passing the packet
through as if it were received by the carp interface, which is
inconsistent with it trying to use the parent instead.
this change has it drop packets in this situation instead.
ok sashan@ claudio@ henning@
Simplify argument move using TAILQ_CONCAT()
Replace the manual loop moving each argument from cmd->arguments to
last->arguments with a single TAILQ_CONCAT() call. This makes the code
clearer and more efficient, while preserving identical behavior.
OK nicm@
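
For illustration, a minimal sketch of the two forms using the queue(3)
macros. struct arg, struct args and both function names are hypothetical
and are not the actual tmux types. Both functions leave dst holding its
old elements followed by those of src, in order, and leave src empty.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

struct arg {
	int			value;
	TAILQ_ENTRY(arg)	entry;
};
TAILQ_HEAD(args, arg);

/* The manual form: unlink each element and append it to dst. */
void
args_move_loop(struct args *dst, struct args *src)
{
	struct arg *a;

	assert(dst != src);

	while ((a = TAILQ_FIRST(src)) != NULL) {
		TAILQ_REMOVE(src, a, entry);
		TAILQ_INSERT_TAIL(dst, a, entry);
	}
}

/* The same move as a single splice; src is left empty and reusable. */
void
args_move_concat(struct args *dst, struct args *src)
{
	assert(dst != src);

	TAILQ_CONCAT(dst, src, entry);
}
```

The loop touches every element, while TAILQ_CONCAT() only rewrites the
head and tail pointers, so its cost does not depend on the list length.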
Add a scroll-to-mouse command for copy mode to scroll to the mouse
position and bind it to the scrollbar. This brings the scrollbar keys into
line with the other mouse keys. From Michael Grant, GitHub issue 4731.
On kernels compiled for both 88100 and 88110, replace the CPU_IS881[01]0
logic to no longer check the cputyp variable, but directly check bits in the
processor identification register; loading this value produces faster and
smaller code than accessing memory, and the compiler can be instructed that
the value is a constant.
vmm(4): don't return EIO from ioctl(2) on vcpu halt.
In the current design, if a vcpu halts without interrupts enabled,
the vcpu run loop returns EIO. This was then being returned as the
result of the ioctl(2) call, which is incorrect. The VMM_IOC_RUN
ioctl is successful and this isn't an error condition. vmm(4) already
associates this vcpu state with vcpu termination and communicates
this to vmd(8) in the returned vcpu state.
This is observed primarily with Linux guests: because vmd(8) does not
emulate an ACPI method to power off, the guest kernel disables interrupts
and halts the cpu. vmd(8) ends up logging some noise because of the
EIO return value.
ok mlarkin@
Implement a per-peer pending prefix queue and lookup table and
a pending attribute queue and lookup table.
Withdraws just end up in the peer pending withdraw prefix queue.
For updates the prefix is queued on a pending attribute entry, which
itself is queued on the peer pending update queue.
For updates this allows aggregating multiple prefixes into a single
UPDATE message.
All prefixes are also stored in the per-peer lookup table and this table
is checked before adding an entry. If the object already exists the prefix
is first dequeued and then requeued at the tail of its queue.
pend_prefix_add() is therefore a bit fiddly.
Similarly, all attrs are added to the per-peer attribute lookup table and this
is used to locate the update queue where the prefix is queued on.
Once queued an attr is not requeued to ensure updates are sent in FIFO order.
If the attr pointer in struct pend_prefix is NULL then it is a withdraw.
[8 lines not shown]
Push `pageqlock' dances inside uvm_page{de,}activate() & uvm_pagewire().
Tested during multiple bulks on amd64, i386, arm64 and sparc64 by jca@,
phessler@ and sthen@.
use an smr crit section instead of real iface refs in the uRPF check
the uRPF check tries to use interface indexes, but if the index doesn't
match it'll resolve to a real interface and do an IFT_CARP and
carpdev index check.
it's easy to do this lookup from an smr crit section and avoid the
refcnt ops.
use an smr crit section instead of real interface refs in pf_match_rcvif
this is only used during ruleset evaluation, so it is less hot than
the carpdev resolution done in pf_test, but it's an easy change to
avoid unnecessary atomic ops.
use an smr crit section to get the parent of carp interfaces.
pf maps packets "received" on carp interfaces back to the actual
interface it was received on and applies policy to that parent.
eg, if you have carp0 on top of em0 and packets destined to the carp0
mac address, the network stack will think that the packets were
received by carp0. pf maps carp0 back to em0 and applies policy on
em0 though.
previously pf used if_get/if_put to do this lookup of the carp
parent, which is a couple of atomic ops on what can be a contended
cacheline for every packet destined for a carp interface. now the
lookup is done in an SMR critical section, against what will
hopefully be a shared cacheline.