MMP Dev · engineering · protocol · quic

Why we picked QUIC

The boring, technical reason we built a remote-desktop protocol on QUIC instead of WebRTC.

The question I get most often when demoing Qubox is "why not WebRTC?" The honest answer: WebRTC is great, and we almost used it. Here's why we didn't.

What WebRTC gives you

  • Audio/video codecs standardised across browsers
  • A media stack with built-in jitter buffers, FEC, and congestion control
  • ICE for NAT traversal
  • Mandatory encryption (DTLS-SRTP)
  • Hardware codec access on every major browser

That's a lot. If you need a browser client, you basically have to use WebRTC, or a WebTransport shim that emulates enough of WebRTC to make the browser happy.

What WebRTC costs you

  • Two stacks. A signalling stack (your choice) and a media stack (SRTP, RTP, RTCP, ICE, DTLS, SDES, SDP, BUNDLE, mDNS, STUN, TURN, …). Taken together, the specs run to roughly 600 pages.
  • Mandatory TURN for hostile NATs. WebRTC's ICE connects directly on ~80% of networks and falls back to TURN (a relay over UDP or TCP) for the other 20%. TURN adds a measurable latency cost.
  • Browser-shaped assumptions. WebRTC's API is designed around a browser running JavaScript. Our client is a Rust CLI binary. There is no browser.
  • SDP. The Session Description Protocol is a serialised blob of codec choices, ICE candidates, and DTLS fingerprints. It's parsed by an ever-growing pile of code that breaks on every Chrome release.

What QUIC gives us

QUIC's core is a single RFC (9000), plus companion documents and extensions (9001, 9002, 9221, 9297, 9312, …). It provides:

  • Stream multiplexing over a single UDP socket
  • Datagrams (unreliable, unordered) for media
  • Built-in TLS 1.3
  • 0-RTT connection resumption
  • Connection migration (rebind sockets, change networks)
  • Congestion control that's competitive with TCP Cubic

The first three are all we needed. The rest is gravy.

What we built on top

qubox-transport is a thin wrapper over quinn, the Rust QUIC implementation. It adds:

  • A session abstraction that maps QUIC streams + datagrams to logical channels (control, video, audio, input, clipboard, pen)
  • Reconnection with exponential backoff
  • 0-RTT resumption
  • Path probing — try direct UDP first, fall back to a relay if needed
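The first two bullets can be sketched in a few lines of std-only Rust. This is an illustration, not the qubox-transport API: the `Channel`, `Carrier`, and `reconnect_delay` names, the stream/datagram split for non-media channels, and the 250 ms base and 8 s cap are all assumptions made for the example.

```rust
use std::time::Duration;

/// The six logical channels named in the list above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Channel {
    Control,
    Video,
    Audio,
    Input,
    Clipboard,
    Pen,
}

/// How a channel is carried over the single QUIC connection.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Carrier {
    /// Reliable, ordered QUIC stream.
    Stream,
    /// Unreliable, unordered QUIC datagram.
    Datagram,
}

impl Channel {
    /// ASSUMPTION: the post only says media goes over datagrams.
    /// Putting every other channel on a stream is a guess for illustration.
    fn carrier(self) -> Carrier {
        match self {
            Channel::Video | Channel::Audio => Carrier::Datagram,
            Channel::Control | Channel::Input | Channel::Clipboard | Channel::Pen => {
                Carrier::Stream
            }
        }
    }
}

/// Exponential backoff: base * 2^attempt, capped.
/// ASSUMPTION: the base and cap values are made up for the example.
fn reconnect_delay(attempt: u32) -> Duration {
    const BASE_MS: u64 = 250;
    const CAP_MS: u64 = 8_000;
    // checked_shl returns None once the shift would overflow 64 bits;
    // treat that as "very large" and let the cap take over.
    let factor = 1u64.checked_shl(attempt).unwrap_or(u64::MAX);
    Duration::from_millis(BASE_MS.saturating_mul(factor).min(CAP_MS))
}
```

With these constants, attempts 0 through 4 wait 250 ms, 500 ms, 1 s, 2 s, and 4 s, and every later attempt waits the 8 s cap. A production version would likely add jitter so that many clients do not reconnect in lockstep.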

Total LOC: under 2,000. The media plane is just datagrams with a sequence number; we don't bother with RTP headers or NACK because QUIC already handles loss detection and congestion control at the transport layer.
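To make "datagrams with a sequence number" concrete, here is a minimal sketch of such a framing. The wire layout (an 8-byte big-endian sequence number followed by the payload) and the names `encode_datagram`, `decode_datagram`, and `LatestOnly` are hypothetical; the post does not describe Qubox's actual format. The receiver-side filter that drops late or duplicate frames is likewise an assumption about how a no-NACK media plane might behave.

```rust
/// ASSUMPTION: hypothetical header size, an 8-byte big-endian sequence number.
const HEADER_LEN: usize = 8;

/// Prefixes the payload with its sequence number.
fn encode_datagram(seq: u64, payload: &[u8]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(HEADER_LEN + payload.len());
    buf.extend_from_slice(&seq.to_be_bytes());
    buf.extend_from_slice(payload);
    buf
}

/// Splits a received datagram into (sequence number, payload).
/// Returns None if the buffer is too short to hold the header.
fn decode_datagram(buf: &[u8]) -> Option<(u64, &[u8])> {
    if buf.len() < HEADER_LEN {
        return None;
    }
    let (head, payload) = buf.split_at(HEADER_LEN);
    let mut seq_bytes = [0u8; HEADER_LEN];
    seq_bytes.copy_from_slice(head);
    Some((u64::from_be_bytes(seq_bytes), payload))
}

/// Receiver-side filter: accept a datagram only if it is newer than the
/// last one accepted. Late and duplicate datagrams are dropped, not repaired.
struct LatestOnly {
    last: Option<u64>,
}

impl LatestOnly {
    fn new() -> Self {
        LatestOnly { last: None }
    }

    fn accept(&mut self, seq: u64) -> bool {
        match self.last {
            Some(last) if seq <= last => false,
            _ => {
                self.last = Some(seq);
                true
            }
        }
    }
}
```

The sender would call `encode_datagram` once per frame with an incrementing counter and hand the bytes to the QUIC datagram API; the receiver decodes, asks `LatestOnly::accept`, and discards anything stale.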

The trade-offs

  • No browser client. If your viewer is a browser, this won't work yet. We're prototyping a WebTransport gateway, but it's not shipping in 0.1.
  • A TURN-style relay fallback still exists. For networks that block UDP outright, we relay through TCP. The latency cost is real.
  • Hardware decode is per-OS. We don't get the cross-platform codec APIs that WebRTC provides. On Linux we use VAAPI; on Windows, DXVA; on macOS, VideoToolbox. This is a one-time cost per backend, not an ongoing one.
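The relay fallback, combined with the "try direct UDP first" path probing described earlier, boils down to a small decision. The sketch below uses a raw UDP echo probe from the standard library purely for illustration: the `Path` enum, `udp_probe`, `choose_path`, and the probe payload are hypothetical, and the real transport presumably probes through QUIC itself rather than with a bare socket.

```rust
use std::net::{SocketAddr, UdpSocket};
use std::time::Duration;

/// The two outcomes the post describes: a direct UDP path, or a TCP relay.
#[derive(Debug, PartialEq, Eq)]
enum Path {
    DirectUdp,
    TcpRelay,
}

/// ASSUMPTION: hypothetical probe payload; any short datagram would do.
const PROBE: &[u8] = b"qubox-probe";

/// Sends one probe datagram to `peer` and waits up to `timeout` for a reply
/// from that same address. Any I/O error or timeout counts as "UDP blocked".
/// Binds an IPv4 socket, so an IPv6 peer will also report false here.
fn udp_probe(peer: SocketAddr, timeout: Duration) -> bool {
    let attempt = || -> std::io::Result<bool> {
        let socket = UdpSocket::bind("0.0.0.0:0")?;
        socket.set_read_timeout(Some(timeout))?;
        socket.send_to(PROBE, peer)?;
        let mut buf = [0u8; 64];
        let (_, from) = socket.recv_from(&mut buf)?;
        Ok(from == peer)
    };
    attempt().unwrap_or(false)
}

/// Direct UDP if the probe gets through, otherwise fall back to the relay.
fn choose_path(peer: SocketAddr, timeout: Duration) -> Path {
    if udp_probe(peer, timeout) {
        Path::DirectUdp
    } else {
        Path::TcpRelay
    }
}
```

A single probe is fragile on a lossy link, so a real implementation would probably retry a few times before paying the relay's latency cost.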

The bench

Same hardware, same network, 4K at 60 Hz, Opus audio, clipboard sync:

Stack                 RTT (LAN)   RTT (transcontinental)   CPU (host)   CPU (client)
Qubox (QUIC)          6 ms        92 ms                    4%           6%
WebRTC (webrtc.org)   11 ms       117 ms                   7%           12%
Parsec                18 ms       142 ms                   9%           15%

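We attribute the QUIC-vs-WebRTC gap mostly to WebRTC's extra protocol layers. TCP head-of-line blocking only contributes when a TURN relay forces the session onto TCP, since WebRTC media otherwise runs over UDP, and SDP is a setup-time cost rather than a per-frame one. We attribute the Parsec gap mostly to its extra custom-UDP layer and a different congestion-control choice.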

TL;DR

QUIC is enough for what we need. WebRTC is more than we need. We trade browser compatibility for a smaller surface area and a faster path from network to pixels.

If you need a browser client, wait for the WebTransport gateway. If you need a CLI/GUI client, Qubox is shipping today.
