Extend ptrace(2) PT_GET_THREAD_* to include thread names.
Use a new define larger than _MAXCOMLEN to keep that define from
propagating to ptrace.h. Ensure that pts_name is large enough with
a compile time assert.
okay claudio@ jca@
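A minimal sketch of the compile-time check described above. The struct layout, the EX_ prefixed names and both constant values are assumptions made for illustration; only pts_name and _MAXCOMLEN come from the commit message.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical values for illustration; the real constants may differ. */
#define EX_MAXCOMLEN	24	/* stands in for the kernel-internal _MAXCOMLEN */
#define EX_PT_NAMELEN	32	/* stands in for the new, larger ptrace.h define */

/* Hypothetical stand-in for the thread state returned by ptrace(2). */
struct ex_thread_state {
	int	pts_tid;
	char	pts_name[EX_PT_NAMELEN];
};

/* Fails the build if pts_name could not hold a full thread name. */
static_assert(sizeof(((struct ex_thread_state *)0)->pts_name) >= EX_MAXCOMLEN,
    "pts_name too small for thread names");

/* Copy a name into the state, always NUL-terminating the result. */
static void
ex_set_name(struct ex_thread_state *pts, const char *name)
{
	strncpy(pts->pts_name, name, sizeof(pts->pts_name) - 1);
	pts->pts_name[sizeof(pts->pts_name) - 1] = '\0';
}
```

Because the public header only sees the larger define, the internal limit can stay private while the assert catches any future mismatch at build time.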
Introduce a bitmap API that scales dynamically up but is also minimal for
the common case.
Functions include:
- set, test, clear: set, test and clear a bit in the map
- empty: check if a bitmap is empty (has no bit set).
- id_get: return the lowest free id in map
- id_put: return an id to the map, aka clear
- init, reset: initialize and free a map
The first 127 elements are put directly into struct bitmap without further
allocation. For maps with more than 127 elements external memory is allocated
in the set function. This memory is only freed by reset, which must be called
before an object containing a bitmap is removed.
It is not possible to set bit 0 of a bitmap since that bit is used to
differentiate between access modes. In my use cases this is perfectly fine
since most code already treats 0 in a special way.
OK tb@
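The inline case can be sketched roughly as follows. This is not the committed code: the ex_ names, the two-word layout and the return conventions are assumptions, and growth to external memory is deliberately left out. It only illustrates why 127 ids fit inline when bit 0 is reserved, and how id_get hands out the lowest free id.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: two 64-bit words, with bit 0 reserved as a mode flag. */
#define EX_INLINE_BITS	128

struct ex_bitmap {
	uint64_t data[2];
};

static void
ex_init(struct ex_bitmap *map)
{
	assert(map != NULL);
	map->data[0] = map->data[1] = 0;
}

/* Returns 0 on success, -1 if the bit cannot be stored in this sketch. */
static int
ex_set(struct ex_bitmap *map, uint32_t bit)
{
	if (bit == 0 || bit >= EX_INLINE_BITS)
		return -1;	/* bit 0 is reserved; growth is not modelled */
	map->data[bit / 64] |= (uint64_t)1 << (bit % 64);
	return 0;
}

static int
ex_test(const struct ex_bitmap *map, uint32_t bit)
{
	if (bit == 0 || bit >= EX_INLINE_BITS)
		return 0;
	return (map->data[bit / 64] >> (bit % 64)) & 1;
}

static void
ex_clear(struct ex_bitmap *map, uint32_t bit)
{
	if (bit == 0 || bit >= EX_INLINE_BITS)
		return;
	map->data[bit / 64] &= ~((uint64_t)1 << (bit % 64));
}

/* True when no usable bit is set (bit 0 is never set in this sketch). */
static int
ex_empty(const struct ex_bitmap *map)
{
	return map->data[0] == 0 && map->data[1] == 0;
}

/* Claim the lowest free id, or return 0 when the inline map is full. */
static uint32_t
ex_id_get(struct ex_bitmap *map)
{
	uint32_t bit;

	for (bit = 1; bit < EX_INLINE_BITS; bit++) {
		if (!ex_test(map, bit)) {
			ex_set(map, bit);
			return bit;
		}
	}
	return 0;
}
```

Returning 0 for "no id available" works in this sketch only because 0 can never be a valid id, which matches the restriction on bit 0 described above.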
Remove unused algorithms from speed.c
Removed unused algorithms (MD2, SEED, RC5) from the algorithm
enum and the `names[]` table.
The current results for these algorithms were always:
md2 0.00 0.00 0.00 0.00 0.00
seed cbc 0.00 0.00 0.00 0.00 0.00
rc5-32/12 cbc 0.00 0.00 0.00 0.00 0.00
indicating that they are no longer used.
ok tb@
Convert D_, R_ macro indices to enums in speed.c
Replaced many `#define` based index constants with enums by adding ALGOR_NUM,
DSA_NUM, RSA_NUM, and EC_NUM to the enum definitions.
This makes it easier to add or remove entries.
ok tb@
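The pattern of ending an enum with a count entry can be sketched as below. The EX_ names and the three-entry list are hypothetical; the real enums in speed.c are longer and use their own identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical subset of algorithms, for illustration only. */
enum ex_algor {
	EX_D_MD5,
	EX_D_SHA1,
	EX_D_SHA256,
	EX_ALGOR_NUM	/* always last: counts the entries above */
};

/* The table is sized by the enum, so it grows or shrinks with it. */
static const char *ex_names[EX_ALGOR_NUM] = {
	[EX_D_MD5] = "md5",
	[EX_D_SHA1] = "sha1",
	[EX_D_SHA256] = "sha256",
};

/* Bounds-checked lookup; returns NULL for an index outside the enum. */
static const char *
ex_algor_name(int idx)
{
	if (idx < 0 || idx >= EX_ALGOR_NUM)
		return NULL;
	return ex_names[idx];
}
```

Adding or removing an entry only touches the enum and the table; the count follows automatically instead of needing a hand-maintained `#define`.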
speed: remove unused counters and dead parameters
In the speed implementation, a number of unused variables and
parameters (save_count, c[][], rsa_c, dsa_c, ecdsa_c, ecdh_c, and
the num argument of print_message()/pkey_print_message()) were
still left behind.
These values are no longer referenced and cannot affect the
time-based benchmark logic, so remove them.
Functional behaviour of speed remains unchanged.
ok tb@
let tun pretend it's a softnet thread with its own tun_input_process.
this largely reimplements if_vinput and if_input_process in tun so
packets pushed through the stack from a tun/tap write can operate
largely like they're being processed by a softnet thread.
there are a couple of important differences between tun/tap and softnet
though. firstly, multiple threads/processes can write to a single
tun/tap descriptor concurrently, so each thread has its own netstack
struct on the stack. secondly, these tun/tap threads are not the
softnet threads, so they can't avoid taking real interface references
when processing requeued packets.
the alternative to this would be letting tun/tap writes queue packets
for processing in a softnet thread, but that adds latency and
requires a lot of thought about a backpressure mechanism when a
thread writes too fast for the stack to process.
let if_vinput and if_input_proto requeue packets on a struct netstack.
this moves us from directly calling into different layers of the
network stack to moving the call back up to if_input_process to
dispatch. this reduces the kernel thread stack usage, but also makes
it safe(r) to dispatch this work from an smr critical section. it
also allows us to dispatch work without holding netlock, and
eventually to get if_input_process to amortise the locking over
bundles of these different dispatch calls.
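As a rough userland illustration of requeueing instead of calling directly, the sketch below has each handler strip one layer and hand the packet back to a dispatch loop. All ex_ names, the fixed-size queue and the "layers" counter are assumptions made for the example and do not reflect the kernel's struct netstack.

```c
#include <assert.h>
#include <stddef.h>

#define EX_QLEN	8

/* Hypothetical packet: counts the encapsulation layers still to strip. */
struct ex_pkt {
	int layers;
	int delivered;
};

/* Hypothetical per-call context holding packets awaiting dispatch. */
struct ex_netstack {
	struct ex_pkt	*q[EX_QLEN];
	size_t		 len;
};

static void
ex_requeue(struct ex_netstack *ns, struct ex_pkt *p)
{
	assert(ns->len < EX_QLEN);
	ns->q[ns->len++] = p;
}

/* Handle one layer, then hand the packet back rather than recursing. */
static void
ex_input(struct ex_netstack *ns, struct ex_pkt *p)
{
	if (p->layers == 0) {
		p->delivered = 1;
		return;
	}
	p->layers--;
	ex_requeue(ns, p);
}

/* Dispatch loop: returns how many handler calls were made. */
static int
ex_input_process(struct ex_pkt *p)
{
	struct ex_netstack ns = { .len = 0 };	/* lives on the caller's stack */
	int calls = 0;

	ex_requeue(&ns, p);
	while (ns.len > 0) {
		struct ex_pkt *next = ns.q[--ns.len];

		ex_input(&ns, next);
		calls++;
	}
	return calls;
}
```

However deep the nesting, the call stack never grows beyond the loop plus one handler, because every handler returns before the next one runs.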
extend struct netstack to queue packet processing in the existing context
at the moment if_input_process runs packets in an mbuf_list, generally
produced by an ifiq, through the network stack. as the headers on
the packet are parsed, subsequent protocol handlers are called to
process the next layer of the packet. currently these handlers are
dispatched by directly calling functions, which consumes the
stack on the kernel threads running the network stack. if you have
a deep topology of virtual interfaces (eg, carp on vlan on aggr on
physical ports), you have a deep call stack.
the usual alternative to this is to queue packets handled by virtual
interfaces and get them processed by their own ifiq and their own
if_input_process call. this is what the stack used to do, but the
cost of locking and queueing and dispatching it to a softnet thread
to process (even if it was the same thread) adds significant overhead,
so we moved to direct dispatch to speed things up.
this change is kind of a hybrid approach, where input handling is
[21 lines not shown]
turn tun_input into a wrapper around p2p_input.
tun packets have the address family as a 4 byte prefix on their
payload which is used to decide which address family input handler
to call. p2p_input does the same thing except it looks at
m_pkthdr.ph_family.
this makes tun_input use its 4 byte prefix to set m_pkthdr.ph_family
and then call p2p_input to use it.
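The wrapper relationship can be sketched in userland C as below. The ex_ functions and struct are hypothetical stand-ins for the kernel's mbuf handling, the return values are invented for the example, and the prefix is assumed to be a 4 byte integer in network byte order.

```c
#include <sys/types.h>
#include <sys/socket.h>		/* AF_INET, AF_INET6 */
#include <arpa/inet.h>		/* ntohl, htonl */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packet: a byte buffer plus the family recorded for it. */
struct ex_pkt {
	const uint8_t	*data;
	size_t		 len;
	uint32_t	 ph_family;
};

/* Pick a handler from ph_family; returns 4, 6, or -1 if unsupported. */
static int
ex_p2p_input(struct ex_pkt *p)
{
	switch (p->ph_family) {
	case AF_INET:
		return 4;
	case AF_INET6:
		return 6;
	default:
		return -1;
	}
}

/* Strip the 4 byte prefix, record it as ph_family, then defer. */
static int
ex_tun_input(struct ex_pkt *p)
{
	uint32_t af;

	if (p->len < sizeof(af))
		return -1;	/* too short to carry the prefix */
	memcpy(&af, p->data, sizeof(af));
	p->ph_family = ntohl(af);
	p->data += sizeof(af);
	p->len -= sizeof(af);
	return ex_p2p_input(p);
}
```

Keeping the family decision in one place means the tun path only has to translate its prefix, and any change to the per-family dispatch is made once.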