System Design Nuggets

The Complete Guide to APIs for System Design Interviews [2026]

REST, gRPC, GraphQL, WebSockets. JWT, OAuth, rate limiting, pagination, idempotency. 22 concepts explained from scratch with the 4-step interview approach.

Arslan Ahmad
May 09, 2026

What This Guide Covers

  • What APIs are and how they work under the hood, from HTTP to response

  • REST, gRPC, GraphQL, and WebSockets: what each protocol does and when to use it

  • API gateway architecture: why every production system needs one

  • Authentication and security: OAuth 2.0, JWT, HTTPS, and the attacks that exploit weak APIs

  • API design patterns: pagination, versioning, idempotency, rate limiting, and error handling

  • The API design mistakes that interviewers catch immediately

  • How to design APIs in system design interviews: the exact approach


APIs are the connective tissue of every system design.

When you design Instagram, the mobile app communicates with your backend through APIs.

When your backend services talk to each other, they use APIs.

When your system integrates with third-party services (Stripe for payments, Twilio for SMS), it uses APIs.

A system design answer without well-defined APIs is like an architecture without doors.

The rooms exist, but nothing can move between them.

Despite this, most system design candidates treat APIs as an afterthought. They draw boxes and arrows, mention “REST API” once, and move on.

The interviewer notices. Strong candidates define their API endpoints early, explain their protocol choice, discuss authentication, and address idempotency. These signals demonstrate that you have built real systems, not just studied diagrams.

This guide covers everything you need to know about APIs for system design interviews, starting from the fundamentals and building toward the advanced patterns that score points in interviews.

Subscribe to my newsletter to receive all system design guides and resources in the future.


Part 1: How APIs Work Under the Hood

The Request-Response Cycle

An API (Application Programming Interface) is a contract between two pieces of software. One piece (the client) sends a request.

The other piece (the server) processes the request and sends a response.

The contract defines: what requests are valid, what data each request requires, and what the response looks like.

When you open Instagram and scroll your feed, the app sends a request like GET /api/v1/feed?page=1 to Instagram’s servers.

The server authenticates you (checks your login token), fetches your personalized feed from the database, and returns a JSON response containing the posts, images, and metadata your app needs to render the feed.

Every request has four parts.

  1. The method (GET, POST, PUT, DELETE) tells the server what operation to perform.

  2. The URL (/api/v1/users/123) identifies the resource.

  3. The headers carry metadata (authentication token, content type, request ID).

  4. The body (for POST and PUT) carries the data you are sending.

Every response has three parts.

  1. The status code (200 OK, 201 Created, 404 Not Found, 500 Internal Server Error) tells the client what happened.

  2. The headers carry metadata (content type, rate limit remaining, pagination cursors).

  3. The body carries the data.
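Putting those pieces together, here is a minimal sketch using only Python's standard library: a throwaway local server plays the backend, and the client sends a GET with an auth header, then reads back the status code, a header, and the body. The route and payload are invented for illustration.

```python
import http.client
import http.server
import json
import threading

# A throwaway local server standing in for the backend. The /api/v1/users/123
# route and its JSON payload are invented for illustration.
class Handler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/api/v1/users/123":
            body = json.dumps({"id": 123, "name": "Alice"}).encode()
            self.send_response(200)                               # status code
            self.send_header("Content-Type", "application/json")  # header
            self.end_headers()
            self.wfile.write(body)                                # body
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):
        pass  # silence per-request logging

server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The client side of the contract: method + URL + headers (GET has no body).
conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("GET", "/api/v1/users/123",
             headers={"Authorization": "Bearer <token>"})
resp = conn.getresponse()
data = json.loads(resp.read())

print(resp.status, resp.getheader("Content-Type"), data)
# 200 application/json {'id': 123, 'name': 'Alice'}
server.shutdown()
```

The same four request parts and three response parts appear no matter which framework or language you use; only the plumbing changes.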

For the complete trace of what happens from the moment you click a button to the moment the response arrives, a separate post follows a request through DNS, load balancer, application server, cache, database, and back.

For how REST APIs actually work under the hood, a separate post covers the protocol internals.

HTTP Versions

The HTTP protocol has evolved significantly, and understanding the differences matters for system design.

HTTP/1.1 handles one request at a time per TCP connection. Keep-alive lets the connection be reused across requests, but they are still processed serially.

If your page needs 50 resources (images, scripts, stylesheets), the browser opens multiple connections and fetches them in parallel, but each connection handles one request at a time.

HTTP/2 multiplexes multiple requests over a single TCP connection. All 50 resources can be requested simultaneously over one connection, dramatically reducing latency. It also compresses headers (which are repetitive across requests) and supports server push (the server proactively sends resources the client will need).

HTTP/3 replaces TCP with QUIC (a UDP-based protocol). The key benefit: connection migration.

If you switch from WiFi to cellular, HTTP/2 drops the TCP connection and re-establishes it (adding latency).

HTTP/3 migrates the connection seamlessly because QUIC connections are identified by connection IDs, not IP addresses.

In system design interviews, mentioning HTTP/2 for internal service-to-service communication (multiplexing reduces latency between microservices) and HTTP/3 for mobile clients (connection migration handles network switching) demonstrates depth.

For the detailed comparison of HTTP/1.1 vs HTTP/2 vs HTTP/3 and what changed at each step, a separate post covers the evolution.

For the security layer, how HTTPS and TLS encrypt every API call, a separate post covers encryption, certificates, and the TLS handshake.


Part 2: API Protocols

REST (Representational State Transfer)

REST is the most common API protocol. It models your system as a collection of resources (users, orders, products) and defines operations on those resources using HTTP methods.

  • GET /users/123 retrieves user 123.

  • POST /users creates a new user.

  • PUT /users/123 updates user 123 (replaces the entire resource).

  • PATCH /users/123 partially updates user 123 (changes specific fields).

  • DELETE /users/123 deletes user 123.
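As a hedged sketch of that mapping, here is a tiny framework-free dispatcher over an in-memory users store. The resource data and routes are invented for illustration; a real service would use a proper router (Flask, FastAPI, and so on) and a database.

```python
# In-memory "users" resource; a stand-in for a real data store.
users = {123: {"id": 123, "name": "Alice"}}

def handle(method, path, body=None):
    """Dispatch (method, path) to a CRUD operation, REST-style."""
    parts = path.strip("/").split("/")        # "/users/123" -> ["users", "123"]
    if parts[0] != "users":
        return 404, None
    if method == "POST" and len(parts) == 1:  # POST /users -> create
        new_id = max(users) + 1
        users[new_id] = {"id": new_id, **body}
        return 201, users[new_id]
    user_id = int(parts[1])
    if user_id not in users:
        return 404, None
    if method == "GET":                       # GET /users/123 -> read
        return 200, users[user_id]
    if method == "PUT":                       # PUT replaces the entire resource
        users[user_id] = {"id": user_id, **body}
        return 200, users[user_id]
    if method == "PATCH":                     # PATCH changes specific fields
        users[user_id].update(body)
        return 200, users[user_id]
    if method == "DELETE":                    # DELETE /users/123 -> remove
        del users[user_id]
        return 204, None
    return 405, None                          # method not allowed

status, user = handle("GET", "/users/123")
print(status, user)  # 200 {'id': 123, 'name': 'Alice'}
```

Note how the HTTP method carries the verb and the URL carries the noun; the handler itself never needs action names like "createUser" in the path.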

REST is simple, widely understood, and supported by every programming language and framework.

The trade-off is that REST can be chatty.

If you need a user’s profile, their recent orders, and their payment methods, that is three separate API calls. Each call adds a network round trip.

When to use REST: Public APIs (third-party developers are familiar with REST), CRUD applications (the resource model maps naturally), and any system where simplicity and broad compatibility matter more than performance.

For the anatomy of a REST API and the client-server architecture behind it, that post covers the design principles.

gRPC (Google Remote Procedure Call)

gRPC uses HTTP/2 for transport and Protocol Buffers for serialization.

Instead of sending JSON over HTTP, you define your API in a .proto file (a schema that both client and server share), and gRPC generates client and server code automatically.

The advantages over REST are significant.

Performance: Protocol Buffers are binary, producing payloads 3-10x smaller than JSON. HTTP/2 multiplexing allows hundreds of concurrent requests over a single connection.

Type safety: The .proto file is a strict contract.

If the server changes the API, the client’s generated code breaks at compile time, not at runtime.

Streaming: gRPC natively supports server streaming (server sends multiple responses), client streaming (client sends multiple requests), and bidirectional streaming (both sides send and receive simultaneously).

When to use gRPC: Internal service-to-service communication (where performance matters and both sides are under your control), real-time data streaming, and any system where the overhead of JSON parsing and HTTP/1.1 connection management is a bottleneck.
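To make the contract concrete, a .proto file for a user service might look like this sketch. The service and message names are hypothetical, not from any real codebase; gRPC tooling would generate typed client and server code from this schema.

```protobuf
// Hypothetical schema for illustration; names are invented.
syntax = "proto3";

package users.v1;

service UserService {
  // Unary RPC: one request, one response (the REST-like case).
  rpc GetUser (GetUserRequest) returns (User);
  // Server streaming: the server pushes many responses for one request.
  rpc WatchUser (GetUserRequest) returns (stream User);
}

message GetUserRequest {
  int64 id = 1;
}

message User {
  int64 id = 1;
  string name = 2;
}
```

Because both sides compile against this file, a field rename or type change surfaces as a build error rather than a production incident.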
