get rid of atomic_foo ops in the tx start and completion paths.
atomics were used to coordinate updates to the number of available
slots on the tx ring. start would use what was available, and txeof
(completion) would add back freed slots. start and completion
update a producer and consumer index respectively, so we can use
those with the size of the ring to calculate space instead.
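
as a rough sketch of that calculation (not the driver's actual code): the
names ring_space, prod, cons and size are made up for illustration, and it
assumes prod == cons means the ring is empty, so start has to leave at
least one slot unused to keep a full ring distinguishable from an empty one.

```c
/*
 * hypothetical helper: number of free slots on a ring of `size`
 * descriptors, derived only from the two indexes.
 *
 * prod is the next slot start will fill, cons is the next slot
 * completion will reclaim. both are assumed to be < size.
 */
static unsigned int
ring_space(unsigned int prod, unsigned int cons, unsigned int size)
{
	unsigned int space = cons;

	/* the consumer is "behind" the producer, so unwrap it */
	if (space <= prod)
		space += size;

	return (space - prod);
}
```

start only writes prod and completion only writes cons, so each side can
read the other's index and work out the space without a shared counter.
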
while here i simplified what txeof does a fair bit, which combined
with the removal of the atomics gives us a bit of a speed improvement.
hrvoje popovski reports up to a 20% improvement in one environment,
but 5 to 10% is probably more realistic.
i've had this in a tree since 2017, but mpi's "Faster vlan(4)
forwarding?" post made me dig it out and clean it up.
ok jmatthew@