System Design Interview Guide: Designing Scalable APIs (REST, Idempotency, & More)

Stop treating APIs like random URLs. Master 7 senior-level API design patterns, including Resource-Oriented Design, Cursor Pagination, Idempotency, and Rate Limiting, to build scalable systems.

Dec 23, 2025


There is a massive difference between writing code that works on your laptop and designing a system that works for millions of users.

As a junior developer, your focus is usually on logic.

You spend your days writing loops, fixing bugs, or centering elements on a screen. You write functions that run locally, and they work perfectly because you are the only person using them.

The environment is controlled.

The data is predictable.

But as you move toward senior roles or prepare for System Design Interviews (SDI), the focus shifts. You stop asking “How do I write this function?” and start asking “How do these systems talk to each other safely and efficiently?”

The Application Programming Interface (API) is that bridge.

It is the most critical component of distributed systems.

If you design it poorly, you create a bottleneck that frustrates other developers, breaks mobile applications, and makes the system impossible to scale.

If you design it well, you create a stable foundation that allows the business to grow without technical debt.

Many candidates fail system design interviews not because they do not know how to code. They fail because they treat an API as just a collection of random URLs. They do not think about what happens when the network fails, when a user clicks a button twice, or when the database gets overwhelmed.

Let me break down seven architectural decisions for designing APIs.

1. Resource-Oriented Design (Nouns over Verbs)

When you are just starting out, your instinct is to write API endpoints that look like function names.

You think in terms of actions.

You have a user, and you want to update their email, so you create a specific URL for that action.

The “Action-Based” Trap

You might design endpoints that look like this:

POST /addNewUser
POST /updateUserEmail
POST /deleteUser
GET /returnAllProducts

This style is often called Remote Procedure Call (RPC). While it seems logical at first, it creates a massive maintenance headache.

As your application grows, you end up with thousands of unique URLs.

A frontend developer trying to integrate with your backend has to memorize every single unique function name. It is chaotic, hard to predict, and difficult to document.

The Better Way: RESTful Design

In a well-designed system, we shift our mindset. We stop thinking about the actions (verbs) and start thinking about the resources (nouns).

A resource is just a concept in your system. It could be a User, a Product, an Order, or a Comment.

Instead of creating a new URL for every action, you use the same URL (like /users) and change the HTTP Method (the verb) to describe what you want to do. The internet has a universal language for these actions, so you do not need to invent your own.

GET /users: Retrieve a list of users.
GET /users/123: Retrieve the specific user with ID 123.
POST /users: Create a new user.
PUT /users/123: Update user 123 (replace the whole record).
PATCH /users/123: Update user 123 (change just one field, like the email).
DELETE /users/123: Remove user 123.
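To make the mapping concrete, here is a minimal Python sketch of a single /users resource served by one dispatcher. The handle function, the in-memory users store, and the choice of 204 and 405 responses are illustrative assumptions, not the API of any particular framework.

```python
from typing import Any, Optional

# Hypothetical in-memory store; a real service would use a database.
users: dict[int, dict[str, Any]] = {}
next_id = 1


def handle(method: str, path: str, body: Optional[dict] = None) -> tuple[int, Any]:
    """Dispatch on (HTTP method, resource path) and return (status, payload)."""
    global next_id
    parts = path.strip("/").split("/")
    if parts[0] != "users" or len(parts) > 2:
        return 404, {"error": "Not found"}

    if len(parts) == 1:  # the collection: /users
        if method == "GET":
            return 200, list(users.values())
        if method == "POST":
            user = {**(body or {}), "id": next_id}
            users[next_id] = user
            next_id += 1
            return 201, user
        return 405, {"error": "Method not allowed"}

    # a single item: /users/{id}
    if not parts[1].isdigit() or int(parts[1]) not in users:
        return 404, {"error": "User not found"}
    user_id = int(parts[1])
    if method == "GET":
        return 200, users[user_id]
    if method == "PUT":  # replace the whole record
        users[user_id] = {**(body or {}), "id": user_id}
        return 200, users[user_id]
    if method == "PATCH":  # change only the supplied fields
        users[user_id].update({**(body or {}), "id": user_id})
        return 200, users[user_id]
    if method == "DELETE":
        del users[user_id]
        return 204, None
    return 405, {"error": "Method not allowed"}
```

Adding a /products resource would reuse exactly the same method-to-action table, which is where the predictability comes from.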

Why this matters for your career

Predictability is the most valuable trait of a senior engineer.

When you use this standard “Noun-based” structure, a developer joining your team does not need to read your documentation to guess how to delete a product.

If they know how to delete a user (DELETE /users/{id}), they can guess how to delete a product (DELETE /products/{id}). You have reduced the cognitive load for everyone.

2. Respect the HTTP Status Code Contract

Communication is the most important part of any relationship, including the one between a client (like a mobile app) and a server.

When a client sends a request, the server must reply with the result.

A common anti-pattern in junior developer portfolios is the “Always 200” API. This happens when an API catches an error on the server but still sends back a success indicator to the browser, hiding the error inside the data payload.

The Anti-Pattern

HTTP Status: 200 OK
Response Body (JSON):

{
  "success": false,
  "error": "User not found"
}

To a human reading this, it makes sense. But software is not human.

Monitoring tools, load balancers, and caching layers look at the HTTP Status Code to determine the health of a request.

Let me tell you why this is dangerous.

If you return 200 OK for an error, your monitoring dashboard will show 100% success rates while your users are unable to log in. You are effectively lying to your infrastructure.
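A small sketch of why this happens, assuming a monitor that only sees status codes (the success_rate function is a hypothetical stand-in for whatever your dashboard computes):

```python
def success_rate(status_codes: list[int]) -> float:
    """Share of responses a status-code-based monitor counts as healthy (below 400)."""
    if not status_codes:
        return 1.0
    healthy = sum(1 for code in status_codes if code < 400)
    return healthy / len(status_codes)


# Ten "User not found" failures, reported two different ways.
always_200 = [200] * 10  # error hidden inside the JSON body
honest = [404] * 10      # error expressed in the status code

print(success_rate(always_200))  # 1.0, so the dashboard looks perfect
print(success_rate(honest))      # 0.0, so the failure is visible
```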

The Status Code Guide

You do not need to memorize all 60+ codes, but you must know the categories for an interview:

2xx (Success):
- 200 OK: Standard success.
- 201 Created: Use this specifically for POST requests when a record is created. It tells the frontend that the data was saved successfully.
4xx (Client Error): This implies the user or the frontend dev made a mistake.
- 400 Bad Request: The input data was wrong (e.g., missing password field).
- 401 Unauthorized: “Who are you?” (User needs to log in).
- 403 Forbidden: “I know who you are, but you can’t touch this.” (User is logged in but lacks admin rights).
- 404 Not Found: The ID does not exist.
- 429 Too Many Requests: You are clicking too fast (we will cover this in Rate Limiting).
5xx (Server Error): This implies you (the backend dev) messed up.
- 500 Internal Server Error: The code crashed or the database is unreachable.

Here is the tricky part.

Distinguishing between 4xx and 5xx is critical for debugging.

If your alerts show a spike in 4xx errors, you might need to fix the documentation or the frontend form validation.

If you see a spike in 5xx errors, you need to wake up the backend team immediately because the server is burning.
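One way to keep that distinction honest is to map each kind of failure to a status code in one place. The sketch below is an assumption-laden illustration: the exception classes, status_for, and alert_target are hypothetical names, and the alert labels are placeholders.

```python
class ValidationError(Exception):
    """The client sent bad input."""

class NotAuthenticated(Exception):
    """The client has not logged in."""

class PermissionDenied(Exception):
    """The client is logged in but lacks the required rights."""

class NotFound(Exception):
    """The requested ID does not exist."""


STATUS_BY_ERROR = {
    ValidationError: 400,
    NotAuthenticated: 401,
    PermissionDenied: 403,
    NotFound: 404,
}


def status_for(error: Exception) -> int:
    """Known client mistakes map to 4xx; anything unexpected is a 500."""
    for error_type, status in STATUS_BY_ERROR.items():
        if isinstance(error, error_type):
            return status
    return 500


def alert_target(status: int) -> str:
    """Route a status code to the team that should look at it (placeholder labels)."""
    if 500 <= status <= 599:
        return "page-backend-oncall"
    if 400 <= status <= 499:
        return "review-client-or-docs"
    return "none"
```

For example, alert_target(status_for(NotFound())) yields "review-client-or-docs", while an unhandled RuntimeError is routed to "page-backend-oncall".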

3. Pagination: Protecting the Database

Scalability is often about protecting the system from its own data.

Imagine you are building an e-commerce system with 10 million orders in the database.

A developer writes a dashboard and calls your API: GET /orders.

If you have not designed your API correctly, your server attempts to query all 10 million rows, serialize them into JSON, and send them over the network.

The result?

Database thrashing: The database CPU spikes to 100% trying to read all rows.
Memory exhaustion: Your API server runs out of RAM holding the data before sending it.
Latency: The request takes 30 seconds, eventually timing out.

You must implement Pagination. This means breaking the data into “pages” or chunks.

There are two main strategies you need to explain in a system design interview.

Strategy A: Offset Pagination (The Beginner Approach)

This is the most common method. You use limit (how many items) and offset (how many to skip).

GET /orders?limit=20&offset=0 (Page 1)
GET /orders?limit=20&offset=20 (Page 2)

Why it fails at scale

Offset pagination has a hidden performance cost.

If you ask the database to SKIP 1,000,000 rows and TAKE 20, the database actually has to read and count through those first one million rows just to throw them away. As the offset gets bigger, the query gets slower.
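Here is what the offset query looks like in practice, as a minimal sketch using Python's built-in sqlite3 module. The orders table, its columns, and the get_orders_page helper are assumptions made for illustration, and the table holds only 100 rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
conn.executemany(
    "INSERT INTO orders (id, total) VALUES (?, ?)",
    [(i, i * 1.5) for i in range(1, 101)],
)


def get_orders_page(limit: int, offset: int) -> list[tuple]:
    """Return one page; the database still walks past `offset` rows first."""
    limit = min(limit, 100)  # never let a client request an unbounded page
    return conn.execute(
        "SELECT id, total FROM orders ORDER BY id LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()


page_1 = get_orders_page(limit=20, offset=0)   # ids 1 through 20
page_2 = get_orders_page(limit=20, offset=20)  # ids 21 through 40
```

The query text is identical for every page. Only the amount of skipped work changes, which is why deep pages get slower.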

Strategy B: Cursor Pagination (The Scalable Approach)

This is what giants like Twitter or Facebook use for “infinite scroll.”

Instead of saying “skip 1 million,” you say “give me 20 items that come after this specific ID.”

GET /orders?limit=20&after_cursor=order_id_8923

Because the database uses an index on the ID, it can seek directly to that record and read the next 20. With a typical B-tree index the seek costs roughly O(log n), so fetching the 50,000th page is about as fast as fetching the first. Compare that with offset pagination, where the cost grows linearly with the number of rows skipped.
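A minimal sketch of the cursor approach, again with Python's built-in sqlite3 module. The orders table, the get_orders_after helper, and the use of the raw integer ID as the cursor are illustrative assumptions. Production APIs often return an opaque, encoded cursor instead.

```python
import sqlite3
from typing import Optional

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
db.executemany(
    "INSERT INTO orders (id, total) VALUES (?, ?)",
    [(i, i * 1.5) for i in range(1, 51)],
)


def get_orders_after(limit: int, after_id: Optional[int] = None):
    """Return (rows, next_cursor); next_cursor is None on the last page."""
    # Fetch one extra row so we know whether another page exists.
    rows = db.execute(
        "SELECT id, total FROM orders WHERE id > ? ORDER BY id LIMIT ?",
        (after_id or 0, limit + 1),
    ).fetchall()
    page = rows[:limit]
    next_cursor = page[-1][0] if len(rows) > limit else None
    return page, next_cursor


# Walk every page by feeding each cursor back into the next call.
cursor, seen = None, []
while True:
    page, cursor = get_orders_after(limit=20, after_id=cursor)
    seen.extend(row[0] for row in page)
    if cursor is None:
        break
```

The WHERE id > ? clause is what lets the index do the work. This only holds when the sort column is indexed and unique (or paired with a tie-breaker).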

4. Idempotency (Handling Network Failures)

This concept is the difference between a junior and a senior engineer.

Think of it this way.

A network connection is unreliable.

Sometimes, a request reaches the server, the server processes it, but the response gets lost on the way back to the client.

Imagine a user is transferring $100 to a friend.

User clicks “Send.”
Request hits server. Server deducts $100.
Server sends “Success” response.
Network cuts out. The user never sees the success message.
The user thinks it failed, so they click “Send” again.

If your API is not idempotent (meaning a repeated identical request has the same effect as sending it once), the server will process the request again and deduct another $100.

The user has been double-charged. This creates angry customers and legal headaches.

How to solve it

You need to implement an Idempotency Key.
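One common shape for this, sketched under stated assumptions: the client generates a unique key once per user action and sends it with every retry (often in a header such as Idempotency-Key), and the server stores the response it gave for that key. The transfer function, the in-memory balances, and the completed store below are all hypothetical. A real system would also need an atomic check-and-store (for example a unique constraint inside a transaction) and key expiry.

```python
import uuid

balances = {"alice": 500, "bob": 0}

# Hypothetical store of finished requests, keyed by idempotency key.
completed: dict[str, dict] = {}


def transfer(idempotency_key: str, sender: str, receiver: str, amount: int) -> dict:
    """Move money once per key; replays return the stored response unchanged."""
    if idempotency_key in completed:
        return completed[idempotency_key]  # retry: do not touch balances again

    balances[sender] -= amount
    balances[receiver] += amount
    response = {"status": "success", "sender_balance": balances[sender]}
    completed[idempotency_key] = response
    return response


key = str(uuid.uuid4())                     # generated once, when the user clicks "Send"
first = transfer(key, "alice", "bob", 100)  # response is lost on the way back
retry = transfer(key, "alice", "bob", 100)  # user clicks "Send" again with the same key
```

After both calls the sender has been charged only once, and the retry receives the same "success" response the first call produced.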


System Design Nuggets