System Design Nuggets

When to Use WebRTC vs. WebSockets: A Guide to Low Latency

Choose the right real-time protocol. Master the trade-offs between WebSockets (TCP/Reliability) and WebRTC (UDP/Speed). Learn how STUN/TURN servers work and when to use P2P vs. Client-Server.

Arslan Ahmad
Jan 06, 2026

The traditional architecture of the web was designed for a simple purpose: retrieval.

In the early days of the internet, a client would request a document, the server would locate that document, and the data would be sent back.

Once the transfer was complete, the connection was closed. This request-response cycle is efficient for browsing static pages, but it presents a fundamental barrier for modern application design.

Today, users expect applications to be alive.

When a stock price changes on the exchange, it must update on the user’s dashboard immediately.

When a collaborator types a sentence in a shared document, that sentence must appear on every other screen instantly. When two people initiate a video call, the stream must flow without buffering or delay.

To achieve this level of interactivity, system designers must move beyond the limitations of standard HTTP. They must utilize protocols designed for continuous, bi-directional data flow.

The two most prominent technologies in this domain are WebSockets and WebRTC.

In this guide, we will deconstruct the mechanics of both systems. We will explore how they establish connections, how they transport data, and the specific architectural trade-offs you must consider when designing a large-scale system.

The Limitation of Standard HTTP

To understand the value of these protocols, we must first understand the inefficiency they replace.

Standard HTTP is strictly request-response: data travels in both directions, but only the client can start an exchange.

The client controls the conversation. The server cannot speak unless spoken to.

If you are building a chat application using standard HTTP, the client browser has no way of knowing when a new message has arrived on the server.

To get around this, developers historically used a technique called Polling.

The client would send a request to the server every few seconds to ask, “Is there new data?”
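The polling pattern can be sketched in a few lines. This is a minimal in-memory illustration, not a real HTTP client: `InboxServer`, `fetch_after`, and `poll_once` are hypothetical names, and the class stands in for an endpoint that returns any messages newer than a cursor. A real client would also sleep for the polling interval between calls.

```python
class InboxServer:
    """Hypothetical stand-in for an HTTP endpoint that returns new messages."""

    def __init__(self):
        self._messages = []

    def post(self, text):
        """A message arrives on the server (for example, from another user)."""
        self._messages.append(text)

    def fetch_after(self, cursor):
        """Return the messages the client has not seen yet, plus the new cursor."""
        return self._messages[cursor:], len(self._messages)


def poll_once(server, cursor):
    """One polling round trip: ask "Is there new data?" and advance the cursor."""
    return server.fetch_after(cursor)


server = InboxServer()
cursor = 0
seen = []

new, cursor = poll_once(server, cursor)   # nothing has arrived yet
seen.append(new)
server.post("hello")                      # arrives between two polls
new, cursor = poll_once(server, cursor)   # this poll finally picks it up
seen.append(new)
new, cursor = poll_once(server, cursor)   # another empty round trip
seen.append(new)

print(seen)  # [[], ['hello'], []]
```

Two of the three round trips return nothing, which is the typical shape of polling traffic when updates are infrequent.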

This approach is flawed for two reasons:

  1. Latency: If the client polls every 5 seconds, and a message arrives 1 second after the last poll, the user waits 4 seconds to see it.

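  2. Overhead: Every HTTP request carries headers and metadata, even when the answer is “nothing new.” If connections are not reused between checks, each poll also pays for setting up and tearing down a TCP connection. Across many clients, this consumes significant bandwidth and server CPU.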
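Both costs are easy to put numbers on. The sketch below reproduces the 5-second example from the list and then estimates the request volume for a fleet of polling clients. The function names and the figure of 10,000 clients are assumptions chosen for illustration. If messages arrive at uniformly random times, the average wait works out to half the polling interval.

```python
def polling_delay(poll_interval, arrival_offset):
    """Seconds a message sits on the server before the next poll picks it up.

    arrival_offset is how long after the most recent poll the message arrived.
    """
    if not 0 <= arrival_offset <= poll_interval:
        raise ValueError("arrival_offset must fall within one poll interval")
    return poll_interval - arrival_offset


def requests_per_hour(poll_interval, clients):
    """Total polls reaching the server per hour, whether or not data changed."""
    return int(3600 / poll_interval) * clients


print(polling_delay(5, 1))            # 4
print(requests_per_hour(5, 10_000))   # 7200000
```

Shortening the interval reduces the delay but raises the request count proportionally, so polling forces a direct trade between latency and load.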

Real-time protocols solve this by keeping the line open.

WebSockets

WebSockets, standardized as RFC 6455, are the most widely used option for bidirectional communication on the web. A WebSocket keeps a single persistent connection open between the client and the server, so either side can send a message at any time.

The Architecture: Client-Server

WebSockets operate on a strict Client-Server model.

In this architecture, the server acts as the central authority.

Every client maintains a dedicated connection to the server.

If Client A wants to send a message to Client B, the two cannot talk to each other directly. Client A must send the data to the server.

The server processes the data, perhaps stores it in a database, and then forwards it to Client B.
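The relay flow can be sketched without any networking. In this minimal in-memory model, `RelayServer` and its methods are hypothetical names: a list per client stands in for that client's open connection, and another list stands in for the database.

```python
class RelayServer:
    """In-memory stand-in for a WebSocket server (all names are hypothetical)."""

    def __init__(self):
        self.inboxes = {}   # client id -> delivered messages (one "connection" each)
        self.history = []   # stands in for the database

    def connect(self, client_id):
        """Register a client, as if it had just opened its dedicated connection."""
        self.inboxes[client_id] = []

    def send(self, sender, recipient, text):
        """Receive a message from the sender, store it, then forward it."""
        if sender not in self.inboxes or recipient not in self.inboxes:
            raise KeyError("both clients must be connected to the server")
        message = {"from": sender, "to": recipient, "text": text}
        self.history.append(message)             # optional persistence step
        self.inboxes[recipient].append(message)  # forward to the recipient


server = RelayServer()
server.connect("A")
server.connect("B")
server.send("A", "B", "hello")

print(server.inboxes["B"])  # [{'from': 'A', 'to': 'B', 'text': 'hello'}]
print(server.inboxes["A"])  # []
```

Client A never holds a reference to Client B. Every message passes through `send`, which is what gives the server its chance to inspect or store it.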

This centralized approach offers significant advantages for data integrity and control.

The server can validate every message. It can enforce rate limiting. It can log every interaction for auditing purposes.
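Because every message passes through one place, these three checks can live in a single function. The sketch below is one possible shape, under stated assumptions: the `Gatekeeper` class, the 500-character limit, and the budget of 3 messages per second are all invented for illustration, and the clock is injectable so the example runs deterministically.

```python
import time
from collections import defaultdict, deque

MAX_LENGTH = 500       # assumed validation rule
MAX_MESSAGES = 3       # assumed per-sender budget...
WINDOW_SECONDS = 1.0   # ...within this sliding window


class Gatekeeper:
    """Validates, rate-limits, and logs every message (hypothetical sketch)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._recent = defaultdict(deque)  # sender -> timestamps of accepted messages
        self.audit_log = []                # (timestamp, sender, verdict)

    def accept(self, sender, text):
        now = self._clock()
        window = self._recent[sender]
        # Drop timestamps that have aged out of the sliding window.
        while window and now - window[0] >= WINDOW_SECONDS:
            window.popleft()
        if not text or len(text) > MAX_LENGTH:
            verdict = "rejected: invalid"
        elif len(window) >= MAX_MESSAGES:
            verdict = "rejected: rate limited"
        else:
            window.append(now)
            verdict = "accepted"
        self.audit_log.append((now, sender, verdict))  # log every decision
        return verdict == "accepted"


fake_now = [0.0]  # controllable clock for the demonstration
gate = Gatekeeper(clock=lambda: fake_now[0])

burst = [gate.accept("alice", "hi") for _ in range(4)]
print(burst)                      # [True, True, True, False]

fake_now[0] = 1.5                 # the window has passed
print(gate.accept("alice", "hi")) # True
print(gate.accept("alice", ""))   # False
```

Rejected messages are logged as well, so the audit trail records what the server refused, not only what it delivered.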

The server is the single source of truth for the state of the application.
