iperf 2 downloads — a widely used network performance measurement tool
1989
FDDI for the International Space Station. Foundational work on a real-time fiber transport.
Cisco
Catalyst RSM routing module. Integrated Layer 3 into a hardware Ethernet switch. Shipped worldwide.
Broadcom
Wi-Fi chipset testing. Statistical process controls applied to silicon characterization.
Today
Umber Networks · Fi-Wi. Centralized, software-defined Wi-Fi over PCIe-over-fiber fronthaul.
End of an Era
The leash gave out
The cable doesn't reach the device anymore. 25B+ Wi-Fi devices and growing.
Forty years of wired infrastructure ended at this connector. Wi-Fi is what comes next — and it inherited none of the determinism.
End of an Era
Ethernet inside buildings has run its course
For station-facing traffic. The wires now run between APs — not to desks.
The False Belief
Is Wi-Fi just wireless Ethernet?
CSMA/CA vs CSMA/CD — one letter, very different machine.
The Gap
802.3 isn't the boundary (it's the payload)
The control point has shifted. Adapt using DPDK.
802.11 Mechanics
802.11 spans three domains
Continuous math → discrete math → logic. The kind of work in each is fundamentally different.
Each boundary is an architectural cut point — including the one that matters here.
802.11 Mechanics
Transmit Opportunity (TXOP) is the unit of work
One contention. One BlockAck. Many frames inside. The scheduler reasons about TXOPs — not packets.
An A-MPDU = one radio transmission
MPDU (802.3 frame) · MPDU (802.3 frame) · MPDU (802.3 frame) · ···
TXOP = the airtime envelope around an A-MPDU
Consequence
A collision burns the TXOP. A failed MPDU burns its airtime — recovery rides a later aggregate. Either way, the host can’t reschedule what it never saw.
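A minimal sketch of the recovery step described above: after a BlockAck, the sender works out which MPDUs from the aggregate still need to ride a later one. The function name and the LSB-first bitmap convention are assumptions made for illustration; the only protocol fact relied on is that 802.11 sequence numbers are 12 bits wide and wrap at 4096.

```python
SEQ_MODULO = 4096  # 802.11 sequence numbers are 12 bits wide


def failed_mpdus(start_seq, sent_count, ba_bitmap):
    """Return sequence numbers of MPDUs in one A-MPDU that were not acknowledged.

    Assumed convention: bit i of ba_bitmap (least significant bit first)
    is 1 when the MPDU with sequence number start_seq + i was received.
    """
    failed = []
    for i in range(sent_count):
        if not (ba_bitmap >> i) & 1:
            failed.append((start_seq + i) % SEQ_MODULO)
    return failed


# Five MPDUs sent starting at sequence 4094; the third one (index 2) was lost.
retry = failed_mpdus(start_seq=4094, sent_count=5, ba_bitmap=0b11011)
print(retry)  # [0]  (4096 wraps to 0)
```

The retry list is all the host would learn, and only if the firmware reports it. The airtime already spent on the failed MPDU is not recoverable.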
802.11 Mechanics
One Wi-Fi transmission carries many Ethernet frames
An A-MPDU is a single radio burst with N payloads inside.
A-MPDU (one radio transmission)
MPDU (802.3 frame) · MPDU (802.3 frame) · MPDU (802.3 frame) · ···
↓
One 802.3 frame (the IP stack's view)
DST MAC 6 B
SRC MAC 6 B
EtherType 2 B
IP / TCP / UDP / payload 46 – 1500 B
FCS 4 B
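The field layout above can be read directly with Python's `struct` module. This is a sketch, assuming the frame has already had its 4-byte FCS stripped (as most NICs do before handing frames to software); `parse_ethernet` is an illustrative name, not part of any library.

```python
import struct

# DST MAC (6 B), SRC MAC (6 B), EtherType (2 B), network byte order.
ETH_HEADER = struct.Struct("!6s6sH")


def parse_ethernet(frame: bytes):
    """Split one 802.3 frame (FCS already stripped) into header fields and payload."""
    if len(frame) < ETH_HEADER.size:
        raise ValueError("frame shorter than the 14-byte Ethernet header")
    dst, src, ethertype = ETH_HEADER.unpack_from(frame)
    return {
        "dst": dst.hex(":"),
        "src": src.hex(":"),
        "ethertype": ethertype,
        "payload": frame[ETH_HEADER.size:],
    }


# A broadcast IPv4 frame with a 46-byte placeholder payload.
frame = bytes.fromhex("ffffffffffff" "020000000001" "0800") + bytes(46)
fields = parse_ethernet(frame)
```

Nothing in these 14 bytes says which A-MPDU carried the frame, at what MCS, or after how many retries. That is the information the IP stack never sees.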
802.11 Mechanics
CSMA/CA's hidden birthday paradox
Collision risk grows quadratically with station count: n stations form n(n−1)/2 pairs that can draw the same backoff slot. A small AP isn't small.
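A rough sketch of the arithmetic behind the birthday-paradox framing. It assumes every station draws one backoff slot uniformly from a window of 16 slots (a contention window of 15, a common best-effort default) and ignores retries, window doubling, and hidden nodes. Both function names are illustrative.

```python
def any_shared_slot(n, w):
    """Birthday-style bound: P(at least two of n stations draw the same slot of w)."""
    if n > w:
        return 1.0
    p_distinct = 1.0
    for i in range(n):
        p_distinct *= (w - i) / w
    return 1.0 - p_distinct


def winner_collides(n, w):
    """P(the earliest drawn slot is shared), i.e. the next transmission collides."""
    if n < 2:
        return 0.0
    # A unique winner exists when one station draws slot k and all others draw later.
    p_unique = sum(n * (1 / w) * ((w - 1 - k) / w) ** (n - 1) for k in range(w))
    return 1.0 - p_unique


for stations in (2, 5, 10, 20):
    print(stations,
          round(any_shared_slot(stations, 16), 3),
          round(winner_collides(stations, 16), 3))
```

The first function is the classic birthday calculation; the second is the narrower event that actually burns a TXOP. Under these assumptions both climb quickly with station count, well before the window looks full.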
802.11 Mechanics
The PHY chain — antenna to MPDU
Continuous → discrete → logic. Hardware does the first two.
802.11 Dynamics
A population of state machines, one airtime
Each station walks this lattice independently. PER feedback steers every walk — in parallel.
Per-station TXOP feedback — three stations sharing the BSS
A · B · C
At building scale: thousands of these walks. One medium. No shared scheduler.
802.11 Dynamics
Many APs. No shared time, no shared RF state
APs contend for airtime — they don't coordinate it. No shared scheduler exists.
The Verdict
Wi-Fi today cannot be scheduled
Not at the boundary DPDK controls. Not across APs. Not per TXOP with shared RF state.
LBT
Random access. Contended, never granted.
TXOP
Bounded but uncontrollable. Physics decides.
MCS
Discrete and probabilistic. A draw, not a guarantee.
CONTENTION
Non-stationary. Every join changes the math.
There is no shared host scheduler.
Distributed, stochastic contention. Firmware-owned policy. By design.
The Evolution
The Forwarding Planes
IP
Software · early Cisco router
IOS on commodity hardware
Moved IP packets in software
→
802.3
Hardware · merchant silicon
Port ASICs · switch fabric
Switched Ethernet frames · new silicon industry
→
802.11
Software · commodity x86
802.11-aware DPDK
Works on A-MPDUs · TXOPs
Each protocol needed a forwarding plane.
IP got software. 802.3 got hardware. 802.11 gets 802.11-aware DPDK — moving A-MPDUs.
The Divergence
One ECN-marked packet, two fates
The same packet entering an Ethernet driver and a Wi-Fi driver. The control loop only stays predictable in one of them.
Ethernet
predictable dequeue
t + 0
DPDK enqueues the marked packet on tx ring.
~µs
DMA pulls it. Frame goes on the wire.
RX
Receiver sees the ECN bit at the same offset, same byte, same packet.
RTT
Sender's congestion control sees the mark within one RTT.
Service time = wire time. Variance bounded. The control loop converges.
Wi-Fi today
cascade of distortions
t + 0
DPDK enqueues. Driver hands to firmware.
+ agg
Firmware aggregates it into an A-MPDU with N other packets — chosen by firmware policy DPDK can't see.
+ cont
Aggregate waits for medium clear. Backoff is stochastic.
+ rtx
One MPDU fails. Its airtime is gone. Recovery rides a later aggregate — reordered with new neighbors.
+ BA
BlockAck arrives. Some packets ACKed in this TXOP. Some won't be ACKed for several TXOPs.
Service time = airtime × MCS × retry. Variance no longer bounded by the software queue. The control loop oscillates or diverges.
The scheduler moved below the software boundary.
ECN/L4S assumes predictable dequeue. Wi-Fi firmware delivers stochastic dequeue with reordering. Same packet. Different physics.
Proposed split
Implementation
If we built Wi-Fi silicon today
A clean separation: silicon does RF and PHY, everything else lives on the host over PCIe.
The Architecture
DPDK runs the wired edge. Time to do the wireless one
Three architectural moves. Borrowed from cellular. DPDK has the substrate; Wi-Fi silicon needs the hooks.
1
Move the MAC out of the chip.
Into userspace. Into DPDK. Into something you can debug.
2
Cut the chain at the right place.
Split 6: MPDU-over-fronthaul. Decoded frames cross the wire.
3
Give every radio shared time.
TSF coherence across radios — within ~1 µs.
Architecture
The Payoff
Software-defined 802.11
DPDK drives scheduling, media access, and building-wide RF conditions.
Scope
per-TXOP
Scheduling.
Which station gets the next TXOP. Deadlines. Airtime fairness. EDCA priorities decided in userspace, per packet.
Scope
per-AP MAC
Media access.
CSMA/CA contention itself becomes tunable. Backoff windows, RTS/CTS, BlockAck windows — tested and changed without a firmware update.
Scope
whole building
Building RF conditions.
Channel and frequency reuse coordinated across all RRHs. The building becomes one RF system — not a fleet of contention islands.
Hooks needed
An Open Letter
To Wi-Fi silicon vendors
The architecture only works if the firmware lets it.
Let the host program the policy. Let firmware execute the timing-critical path.
Today's Wi-Fi chips lock policy in firmware. We need silicon that exposes policy hooks — not silicon that hides policy decisions.
Defer 1
EDCA per packet.
Backoff windows, AC selection, contention strategy — all programmable per outgoing MPDU.
Defer 2
MCS edges.
Primary and fallback ladder set by the host. Probing cadence too. Firmware just executes the program.
Defer 3
Sounding & CSI.
Sounding triggered by host. CSI exposed to host. No internal beamforming policy held private.
Firmware does timing-critical work the host can't reach. Keep that.
Firmware becomes just a node in the graph.
Not the policy engine. Not the bottleneck. Just a node.
For DPDK Programmers
Three places to start
Not Wi-Fi-only. The substrate any hardware-clocked peripheral needs.
If you work on
eventdev
Land deadline-aware events.
An rte_eventdev backend that schedules TXOPs against absolute deadlines — not just queue priorities.
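To make "deadlines, not just priorities" concrete, here is a minimal earliest-deadline-first queue. It is a plain-Python sketch of the scheduling idea only; it does not use or model the real rte_eventdev API, and `DeadlineQueue` with its methods is a hypothetical name.

```python
import heapq
import itertools


class DeadlineQueue:
    """Earliest-deadline-first queue of pending TXOP requests (illustrative only)."""

    def __init__(self):
        self._heap = []
        self._tiebreak = itertools.count()  # keeps ordering stable for equal deadlines

    def push(self, deadline_us, station, payload):
        heapq.heappush(self._heap, (deadline_us, next(self._tiebreak), station, payload))

    def pop_next(self, now_us):
        """Return (station, payload, slack_us) for the earliest live deadline.

        Entries whose deadline has already passed are dropped, not served late.
        """
        while self._heap:
            deadline_us, _, station, payload = heapq.heappop(self._heap)
            if deadline_us >= now_us:
                return station, payload, deadline_us - now_us
        return None


q = DeadlineQueue()
q.push(500, "A", "video")
q.push(200, "B", "voice")
q.push(50, "C", "stale")
print(q.pop_next(now_us=100))  # ('B', 'voice', 100)
```

A priority-only scheduler would have no reason to drop station C's entry. Dropping work that can no longer meet its deadline is the behavioral difference this sketch is meant to show.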
If you work on
mbuf / PMDs
Define the MPDU mbuf.
A canonical mbuf shape carrying TXOP/MCS/BA metadata, so PMDs and schedulers speak the same language about an A-MPDU.
If you work on
timing / PTP
Bridge the clock domains.
A clean way to relate host TSC, fronthaul PTP, and 802.11 TSF — affine offsets exposed as a first-class API.
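A sketch of what "affine offsets" could mean in practice: each clock pair is related by a rate and an offset, fitted from two paired readings, and mappings compose. The function names are hypothetical, and floating point is used for brevity; a real implementation would more likely use fixed-point arithmetic and refit continuously to track drift.

```python
def fit_affine(sample_a, sample_b):
    """Fit y = rate * x + offset from two (x, y) paired clock readings."""
    (x0, y0), (x1, y1) = sample_a, sample_b
    if x1 == x0:
        raise ValueError("samples must be taken at different source times")
    rate = (y1 - y0) / (x1 - x0)
    offset = y0 - rate * x0
    return rate, offset


def convert(mapping, x):
    rate, offset = mapping
    return rate * x + offset


def compose(outer, inner):
    """Mapping equivalent to applying inner first, then outer."""
    r_out, o_out = outer
    r_in, o_in = inner
    return r_out * r_in, r_out * o_in + o_out


# Host TSC ticks -> fronthaul PTP nanoseconds (illustrative readings).
tsc_to_ptp = fit_affine((2000, 1000), (4000, 2000))
# PTP nanoseconds -> 802.11 TSF microseconds (illustrative readings).
ptp_to_tsf = fit_affine((1000, 51), (3000, 53))
# One mapping straight from TSC to TSF.
tsc_to_tsf = compose(ptp_to_tsf, tsc_to_ptp)
```

Exposing the composed mapping lets a scheduler express a deadline once, in whichever clock domain it prefers.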
A dedicated Rx element per RRH · ED/NAV state DMA’d into host memory · DPDK PMD polls DRAM
Per RRH
Dedicated sensing Rx
Passive Rx-only element on each radio head. Continuously monitors the RF medium. Decodes Duration fields to derive NAV expiry. No airtime cost.
PCIe-over-fiber
ED/NAV state DMA push
On threshold crossing, RRH DMA-writes ED/NAV state to a pre-registered host memory address. Energy · NAV remaining · RSSI · timestamp. ~2–4 µs latency.
DPDK concentrator
PMD polls host DRAM
64 RRHs × 64 B = 4 KB array, one cache line per RRH. The PMD lcore polls host memory, not PCIe. With Intel DDIO, RRH DMA writes land in L3 rather than DRAM: ~30 ns visibility vs ~100 ns. Each DMA write invalidates one cache line; the PMD re-fetches it from L3.
Each RRH senses independently. DPDK sees everything centrally.
DMA start carries A-MPDU + EDCA params back to the RRH — this is the critical return path. Scheduler closes the loop.
Intel DDIO: RRH DMA writes land in L3 (~30ns) not DRAM (~100ns) — well within one 802.11 slot time (9µs).
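A small model of the polled state array, using a `bytearray` as a stand-in for the DMA target region. The record fields follow the slide (energy, NAV remaining, RSSI, timestamp); the sequence counter, field widths, and function names are assumptions added so the poller can detect updates.

```python
import struct

RRH_COUNT = 64
RECORD_SIZE = 64  # one cache line per RRH, 64 x 64 B = 4096 B total

# Hypothetical layout at the start of each 64-byte record:
# seq (u32), timestamp_ns (u64), nav_remaining_us (u32), energy_dbm (i16), rssi_dbm (i16)
RECORD = struct.Struct("<IQIhh")


def write_record(table, rrh, seq, timestamp_ns, nav_us, energy_dbm, rssi_dbm):
    """Stand-in for the RRH's DMA write into its pre-registered slot."""
    RECORD.pack_into(table, rrh * RECORD_SIZE, seq, timestamp_ns,
                     nav_us, energy_dbm, rssi_dbm)


def poll_changes(table, last_seen):
    """One polling pass: return the records whose sequence counter moved."""
    changed = []
    for rrh in range(RRH_COUNT):
        seq, ts, nav, energy, rssi = RECORD.unpack_from(table, rrh * RECORD_SIZE)
        if seq != last_seen[rrh]:
            last_seen[rrh] = seq
            changed.append((rrh, ts, nav, energy, rssi))
    return changed


table = bytearray(RRH_COUNT * RECORD_SIZE)
last_seen = [0] * RRH_COUNT
write_record(table, rrh=5, seq=1, timestamp_ns=123_456, nav_us=300,
             energy_dbm=-62, rssi_dbm=-48)
updates = poll_changes(table, last_seen)
```

This model ignores torn reads and memory ordering, which a real poller sharing cache lines with a DMA engine would have to handle.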
The same channel measurements that drive scheduling drive sensing. Modern infrastructure does both — on one radio.
Job 1
comms
Frames ride the channel — wrapped in impulses, tones, and pilots.
PHY decodes the data. Per-subcarrier SNR and timestamps flow up to the host. The thing this talk has been about.
Job 2
sensing
The channel's impulse response comes back.
Every preamble carries known training tones. The receiver must estimate H(f) per subcarrier to demodulate — and that estimate is CSI. Motion, presence, gait, breathing fall out of the multipath the radio already measured.
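The per-subcarrier estimate can be sketched in a few lines: with a known training symbol X on each subcarrier and a received value Y, the simplest estimate is H = Y / X. This ignores noise and any smoothing a real receiver applies, and the function name is illustrative.

```python
import cmath


def estimate_csi(received, training):
    """Per-subcarrier channel estimate H[k] = Y[k] / X[k] from known training tones."""
    return [y / x for y, x in zip(received, training)]


# Three subcarriers: known training tones and a made-up channel.
training = [1 + 0j, -1 + 0j, 1 + 0j]
true_channel = [0.5 + 0.5j, 1j, 2 + 0j]
received = [h * x for h, x in zip(true_channel, training)]

csi = estimate_csi(received, training)
amplitude = [abs(h) for h in csi]
phase = [cmath.phase(h) for h in csi]
```

Sensing applications look at how amplitude and phase change over time across subcarriers. The receiver computes the estimate anyway in order to demodulate.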
The cut
one control plane
Reach the PHY, reach both.
When DPDK reaches the PHY for scheduling, it reaches CSI too. IEEE 802.11bf is the amendment that standardizes WLAN sensing. The architecture costs nothing extra.
Backup · B2
Borrow O-RAN's vocabulary
Cellular has been running this experiment for a decade.
Backup · B3
TX and RX over PCIe
DMA scatter-gather. Header + metadata in one segment, payloads in N more. Same pattern as any modern NIC.
Per-station MCS. Per-station airtime cost. Mark to aggregate and pace per airtime — not byte-depth.
The control loop is what matters. Mark airtime per TXOP and per aggregate — senders pull back before queues form.
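A sketch of marking on queued airtime instead of queued bytes. Airtime is approximated as payload bits divided by PHY rate, ignoring preambles, aggregation overhead, and retries. The class, its threshold, and its method names are hypothetical.

```python
def airtime_us(frame_bytes, phy_rate_mbps):
    """Approximate payload airtime: bits / (Mbit/s) gives microseconds."""
    return frame_bytes * 8 / phy_rate_mbps


class AirtimeMarker:
    """Decide ECN marking from per-station queued airtime (illustrative only)."""

    def __init__(self, threshold_us):
        self.threshold_us = threshold_us
        self.queued_us = {}

    def enqueue(self, station, frame_bytes, phy_rate_mbps):
        """Account for one frame; return True if it should carry an ECN mark."""
        backlog = self.queued_us.get(station, 0.0) + airtime_us(frame_bytes, phy_rate_mbps)
        self.queued_us[station] = backlog
        return backlog > self.threshold_us

    def txop_done(self, station, drained_us):
        """Credit back the airtime a completed TXOP actually delivered."""
        self.queued_us[station] = max(self.queued_us.get(station, 0.0) - drained_us, 0.0)


marker = AirtimeMarker(threshold_us=1000)
fast = [marker.enqueue("fast", 1500, 600) for _ in range(10)]  # 20 us per frame
slow = marker.enqueue("slow", 1500, 6)                         # 2000 us for one frame
```

Ten frames to the fast station and one frame to the slow station are very different byte depths, yet only the slow station's single frame crosses the airtime threshold.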
Backup · B7
But that's not how today's silicon works
An MCU runs the decision logic in firmware on the chip — one chip, one AP, no coordination.
Backup · B8
The MAC lives in the radio chip
DMA crosses PCIe. Scheduling does not. Every vendor's firmware acts alone.
Backup · B9
"Wi-Fi is just wireless Ethernet"
The assumption that built — and broke — the last 25 years of Wi-Fi architecture.
The Assumption
"It's a NIC that sends Ethernet frames over the air. Hand it 802.3 frames; it figures out the rest."
What it produced
MAC state locked in vendor firmware.
APs that don't coordinate with each other.
Transport pacing defeated by hidden queue dynamics.
A userland fast path that stops at PCIe.
802.11 is not 802.3 with a different L1.
It's a scheduled, contended, probabilistic medium — and the host needs to see it.
Backup · B10
Three orthogonal domains
Every Wi-Fi control decision lives on one of three axes. Centralize them and they compose.
Which RRH transmits. Antenna selection. Spatial reuse. The building stops being a fleet of contention islands.
Axis
on what
Carrier.
Channel, bandwidth, MCS, frequency reuse. Per-packet rate selection — not an AP-wide policy frozen in firmware.
Today's Wi-Fi controls each axis locally, in firmware. DPDK lets us schedule across all three from one place.
Backup · B11
TSF coherence across 24 radios
The hardest problem in the prototype. A worked example, not hand-waving.
within
~1 µs
Concentrator owns the time origin. PCIe-over-fiber timestamps RX/TX events on arrival.
Each RRH's TSF is slaved to a known offset.
PTP alone isn't enough — TSF is exposed via beacons, and beacons are MAC-emitted.
Backup · B12
NAV reset, by scope
802.11 already has the primitive (CF-End). The concentrator chooses where it lands.
Backup · B13
Single channel, or joint decoding?
A scope note on the orthogonal-domains framing.
What today's talk assumes
One transmitter per airtime group at a time. Shannon-bounded single-user decoding at the receiver. The scheduler's job is to serialize cleverly.
What this framing does not cover
Joint decoding across radios — Slepian-Wolf regime. Multiple simultaneous transmissions, jointly recovered at the concentrator. A capacity story, not a determinism story.
The architecture allows it
Spatially separated RRHs · phase-coherent fronthaul timing · centralized PHY processing.
The substrate is there. We’re not exercising it yet.
Determinism comes from centralized scheduling alone. Capacity comes from joint decoding. Today's talk is the first; the second is on the same control plane.
Backup · B14
The channel as a matrix
Why centralized PHY improves MU-MIMO: the H-matrix condition number κ.
S — transmitted signal (or energy) per antenna
R — received signal per antenna
D, D' — diagonal: cable & attenuator path (source and sink)
H — full MIMO channel (signal mixing across all antenna pairs)
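A worked 2×2 example of the condition number κ, the ratio of the largest to the smallest singular value. The composition R = D′·H·D·S is an assumption about how the symbols defined above fit together; the slide text does not state the equation. Function names are illustrative, and the closed form used here applies only to real 2×2 matrices.

```python
import math


def matmul2(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def cond2(m):
    """Condition number of a real 2x2 matrix via its singular values."""
    (a, b), (c, d) = m
    frob = a * a + b * b + c * c + d * d   # sum of squared singular values
    det = a * d - b * c                    # product of singular values, up to sign
    disc = math.sqrt(max(frob * frob - 4 * det * det, 0.0))
    s_max_sq = (frob + disc) / 2
    s_min_sq = (frob - disc) / 2
    if s_min_sq <= 0:
        return math.inf
    return math.sqrt(s_max_sq / s_min_sq)


identity = ((1.0, 0.0), (0.0, 1.0))
h = identity                               # an ideal, perfectly separable channel
d_source = ((1.0, 0.0), (0.0, 0.5))        # second source path attenuated by half
effective = matmul2(matmul2(identity, h), d_source)  # assumed D' * H * D

print(cond2(h), cond2(effective))
```

Even with an ideal H, unequal path gains in D raise κ of the effective channel. A κ near 1 means the spatial streams separate cleanly, while a large κ means small errors are amplified when the streams are unmixed.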