Livestreaming Trilemma: HLS, WebRTC, MoQ

Live video has always been a trilemma: pick two of scale, latency, and cost. Does the new IETF protocol called Media over QUIC finally let you have all three?

A new livestreaming protocol? Really?

There's a new multimedia protocol for livestreaming under development. It's called Media over QUIC (MoQ for short), and it's quietly attracting the kind of attention that makes engineers at Meta, Cisco, Google, and Akamai show up to IETF working group meetings. Streaming startups are racing to ship the first production deployments. People are genuinely excited.

Your reasonable first reaction is "...another one?" We already have HLS, RTMP, WebRTC, SRT, and a dozen lesser variants. Every one of them is battle-tested, in production at major streaming platforms, carrying real video to real users right this second. So why are people suddenly happy about yet another livestreaming protocol that, at first glance, looks like it's solving a problem already solved?

Because it isn't solving a problem already solved. MoQ isn't a slightly faster HLS or a slightly more scalable WebRTC. It's a protocol that learns from both and tries to give you the best of each at once, and as we'll see, that turns out to matter quite a lot. In this article I'll walk you through why, and at the end I'll show you how to try it yourself in about ten minutes.

A short history of "can I watch this online?"

The story starts in the late 2000s. People wanted to watch live events on the internet (football games, news broadcasts, their favourite streamer going live) and someone had to invent the plumbing. The first answer was RTMP, Adobe's Flash-era protocol. It worked, but it leaned on a browser plugin that was already on its way out and didn't scale gracefully past a certain point. So the industry went looking for something else.
The idea that eventually won was almost embarrassingly simple: instead of inventing some exotic real-time protocol, just chop the live stream into tiny files and serve them over normal HTTP, the same way every web page on Earth is delivered.

That idea became HLS (from Apple), with DASH as the open-standard cousin. It worked beautifully. CDNs already knew how to deliver files to millions of users at once, and HLS got to ride that infrastructure for free. Suddenly anyone with a laptop could watch a live football game from anywhere in the world. Everyone was happy.

For a while. Then people started noticing the cracks.

The first crack was latency. The stream was late: ten to thirty seconds behind the camera, depending on the player's buffer. During the World Cup final, your neighbour upstairs would scream "GOAAAAL!" through the ceiling, and your TV would catch up twenty seconds later. The moment had been spoiled. The second crack was related: two viewers on different networks would drift several seconds apart from each other, so reacting in real time with a friend was a coin-flip.

So people optimised. Low-Latency HLS and CMAF chunked transfer squeezed the same machinery harder and got latency down to a theoretical two seconds; in practice, expect around five. Synchronisation got tighter but never reliable. Everyone was happy again. For a while.

Then the use cases evolved. People started wanting things HLS had never been designed for: online betting, where a five-second lag means the line has already moved; live shopping, where the bidder ahead of you sees the item first; watch-parties, where everyone reacts in sync; concerts and live sports, where the audience needs to feel like a single room. All of these need real sub-second latency and real synchronisation, well under what LL-HLS could ever deliver.

Luckily, a standard that fit the bill already existed: WebRTC.
It hadn't been designed for one-to-many livestreaming; it was built for video calls. But it had the latency (often under 200 milliseconds) and the synchronisation everyone wanted. So people pressed it into service. And it worked.

Sort of. Because WebRTC's low latency came at a steep price: it lost the thing that made HLS scale so well. To understand what that price actually is, we need to look at both a little more carefully.

Two protocols, two economies

It is no accident that HLS and WebRTC sit on opposite ends of the latency–scale axis. Each one's strengths come from what it was designed for, and so do its limits.

HLS scales cheaply because it delivers video the same way CDNs deliver everything else on the web: as static files. This is the architecture under Twitch, YouTube Live, and most live news broadcasts. Once a chunk is written, the CDN caches it at the edge and serves it to every viewer who asks for it, just like a logo or a JavaScript bundle, so the marginal cost of one more viewer is essentially zero. But the same simplicity and scalability come at a cost: latency. Production HLS deployments typically run at 10–30 seconds end-to-end, with LL-HLS bringing that down to roughly 3–5 seconds.

WebRTC goes the other...
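Returning to the HLS side for a moment: the "marginal cost of one more viewer is essentially zero" claim can be sketched in a few lines of Python. This is a toy model, not a real CDN or HLS API. The `EdgeCache` class, the `fake_origin` function, and the `segment_00000.ts` naming are all hypothetical names invented for illustration, and the model assumes a single edge node that never evicts anything.

```python
from collections import Counter


class EdgeCache:
    """Toy stand-in for one CDN edge node (hypothetical, not a real CDN API)."""

    def __init__(self, origin):
        self.origin = origin      # callable: segment name -> bytes
        self.store = {}           # cached segments, keyed by name
        self.stats = Counter()    # tracks "origin_fetches" and "edge_hits"

    def get(self, name):
        if name not in self.store:
            # First request for this segment: fetch from the origin once.
            self.store[name] = self.origin(name)
            self.stats["origin_fetches"] += 1
        else:
            # Every later request is served straight from the edge.
            self.stats["edge_hits"] += 1
        return self.store[name]


def fake_origin(name):
    # Pretend each segment is a small blob of encoded video.
    return f"<video bytes for {name}>".encode()


def simulate(viewers, segments):
    """Have `viewers` clients each request `segments` consecutive segments."""
    edge = EdgeCache(fake_origin)
    for seg in range(segments):
        name = f"segment_{seg:05d}.ts"  # assumed naming scheme
        for _ in range(viewers):
            edge.get(name)
    return edge.stats


if __name__ == "__main__":
    for viewers in (1, 100, 10_000):
        stats = simulate(viewers, segments=5)
        print(viewers, "viewers ->", stats["origin_fetches"], "origin fetches,",
              stats["edge_hits"], "edge hits")
```

In this model the number of origin fetches depends only on how many segments exist, not on how many viewers ask for them, which is the property the article attributes to file-based delivery. A real deployment has many edge nodes, cache expiry, and playlist requests on top, so treat this as an intuition aid rather than a cost model.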

