Why we picked QUIC
The boring, technical reason we built a remote-desktop protocol on QUIC instead of WebRTC.
The single most-asked question I get when demoing Qubox is "why not WebRTC?" The honest answer is: WebRTC is great, and we almost used it. Here's why we didn't.
What WebRTC gives you
- Audio/video codecs standardised across browsers
- A media stack with built-in jitter buffers, FEC, and congestion control
- ICE for NAT traversal
- Mandatory encryption (DTLS-SRTP)
- Hardware codec access on every major browser
That's a lot. If you need a browser client, you basically have to use WebRTC, or a WebTransport shim that emulates enough of WebRTC to make the browser happy.
What WebRTC costs you
- Two stacks. A signalling stack (your choice) and a media stack (SRTP, RTP, RTCP, ICE, DTLS, SDES, SDP, BUNDLE, mDNS, STUN, TURN, …). Taken together, the specs run to roughly 600 pages.
- Mandatory TURN for hostile NATs. WebRTC's ICE connects directly on ~80% of networks and falls back to TURN (relayed TCP/UDP) for the other 20%. TURN adds a measurable latency cost.
- Browser-shape assumptions. WebRTC assumes peers can run JavaScript. The Qubox CLI is a Rust binary; there is no browser in the loop.
- SDP. The Session Description Protocol is a serialised blob of codec choices, ICE candidates, and DTLS fingerprints. It's parsed by an ever-growing pile of code that breaks on every Chrome release.
What QUIC gives us
QUIC's core is a single RFC (9000), with companion RFCs and extensions (9001, 9002, 9221, 9297, 9312, …). It provides:
- Stream multiplexing over a single UDP socket
- Datagrams (unreliable, unordered) for media
- Built-in TLS 1.3
- 0-RTT connection resumption
- Connection migration (rebind sockets, change networks)
- Congestion control that's competitive with TCP Cubic
The first three are all we needed. The rest is gravy.
What we built on top
qubox-transport is a thin wrapper over quinn, the Rust QUIC implementation. It adds:
- A session abstraction that maps QUIC streams + datagrams to logical channels (control, video, audio, input, clipboard, pen)
- Reconnection with exponential backoff
- 0-RTT resumption
- Path probing — try direct UDP first, fall back to a relay if needed
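As a rough illustration of the reconnection item above, here is a minimal sketch of a capped exponential backoff schedule. The `Backoff` type, its field names, and the example values are hypothetical and not taken from qubox-transport; a real client would likely also add random jitter, which is left out here to keep the sketch dependency-free.

```rust
use std::time::Duration;

/// Hypothetical reconnect policy. The name and fields are illustrative
/// assumptions, not qubox-transport's actual API.
#[derive(Debug, Clone, Copy)]
pub struct Backoff {
    /// Delay before the first retry.
    pub base: Duration,
    /// Upper bound on any single delay.
    pub max: Duration,
}

impl Backoff {
    /// Delay before reconnect attempt `attempt` (0-based):
    /// `base * 2^attempt`, capped at `max`.
    pub fn delay(&self, attempt: u32) -> Duration {
        // checked_shl returns None once the shift reaches the bit width,
        // so very large attempt counts saturate instead of overflowing.
        let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
        self.base
            .checked_mul(factor)
            .unwrap_or(self.max)
            .min(self.max)
    }
}
```

With `base` set to 250 ms and `max` set to 8 s, the delays double from 250 ms and stay at 8 s from the sixth attempt onward.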
Total LOC: under 2,000. The media plane is just datagrams with a sequence number; we don't bother with RTP headers or NACK because QUIC already handles loss detection and congestion control at the transport.
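To make "datagrams with a sequence number" concrete, here is a minimal sketch of what such a frame could look like. The 5-byte layout (one channel byte plus a big-endian `u32` sequence number), the numeric channel ids, and all function names are assumptions for illustration, not the actual Qubox wire format. QUIC datagrams carry no stream id and are never retransmitted, so the sketch tags each datagram with its channel and lets the receiver drop frames that arrive late or out of order.

```rust
/// Logical channels named in the post. The numeric ids are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum Channel {
    Control = 0,
    Video = 1,
    Audio = 2,
    Input = 3,
    Clipboard = 4,
    Pen = 5,
}

impl Channel {
    fn from_u8(b: u8) -> Option<Self> {
        Some(match b {
            0 => Channel::Control,
            1 => Channel::Video,
            2 => Channel::Audio,
            3 => Channel::Input,
            4 => Channel::Clipboard,
            5 => Channel::Pen,
            _ => return None,
        })
    }
}

/// Hypothetical header: 1 byte channel id + 4 bytes sequence number.
pub const HEADER_LEN: usize = 5;

/// Prefix `payload` with the channel id and sequence number.
pub fn encode(channel: Channel, seq: u32, payload: &[u8]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(HEADER_LEN + payload.len());
    buf.push(channel as u8);
    buf.extend_from_slice(&seq.to_be_bytes());
    buf.extend_from_slice(payload);
    buf
}

/// Split a received datagram into (channel, seq, payload).
/// Returns None for short datagrams or unknown channel ids.
pub fn decode(datagram: &[u8]) -> Option<(Channel, u32, &[u8])> {
    if datagram.len() < HEADER_LEN {
        return None;
    }
    let channel = Channel::from_u8(datagram[0])?;
    let seq = u32::from_be_bytes([datagram[1], datagram[2], datagram[3], datagram[4]]);
    Some((channel, seq, &datagram[HEADER_LEN..]))
}

/// True when `seq` is strictly newer than `last`, tolerating u32 wraparound.
/// A receiver can use this to discard stale or reordered media frames.
pub fn is_newer(seq: u32, last: u32) -> bool {
    let diff = seq.wrapping_sub(last);
    diff != 0 && diff < 0x8000_0000
}
```

With quinn, bytes like these would be handed to the connection's datagram send call; that wiring is omitted so the sketch stays standard-library only.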
The trade-offs
- No browser client. If your viewer is a browser, this won't work yet. We're prototyping a WebTransport gateway, but it's not shipping in 0.1.
- A TURN-style relay fallback still exists. For networks that block UDP outright, we relay through TCP. The latency cost is real.
- Hardware decode is per-OS. We don't get the cross-platform codec APIs that WebRTC does. On Linux we use VAAPI; on Windows DXVA; on macOS VideoToolbox. This is a one-time cost per backend, not an ongoing one.
The bench
Same hardware, same network, 4K at 60 Hz, Opus audio, clipboard sync:
| Stack | RTT (LAN) | RTT (transcontinental) | CPU (host) | CPU (client) |
|---|---|---|---|---|
| Qubox (QUIC) | 6 ms | 92 ms | 4% | 6% |
| WebRTC (webrtc.org) | 11 ms | 117 ms | 7% | 12% |
| Parsec | 18 ms | 142 ms | 9% | 15% |
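We attribute the QUIC-vs-WebRTC gap mostly to head-of-line blocking on relayed TCP paths and to SDP overhead at session setup. WebRTC media sent directly over UDP does not suffer TCP head-of-line blocking, so read that attribution as a working hypothesis rather than a measured breakdown. The Parsec gap is mostly the extra custom-UDP layer and a different congestion-control choice.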
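To read the RTT columns in relative terms, the small helper below computes the percentage reduction against a baseline. The function name is hypothetical, and the inputs in the note that follows are the RTT figures from the table above.

```rust
/// Percentage by which `ours` undercuts `baseline`
/// (both in the same unit, e.g. milliseconds).
pub fn reduction_pct(baseline: f64, ours: f64) -> f64 {
    (baseline - ours) / baseline * 100.0
}
```

Applied to the table, `reduction_pct(11.0, 6.0)` gives roughly 45% lower LAN RTT than WebRTC, and `reduction_pct(117.0, 92.0)` gives roughly 21% on the transcontinental path. Against Parsec the same calculation yields roughly 67% and 35%.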
TL;DR
QUIC is enough for what we need. WebRTC is more than we need. We trade browser compatibility for a smaller surface area and a faster path from network to pixels.
If you need a browser client, wait for the WebTransport gateway. If you need a CLI/GUI client, Qubox is shipping today.