Scaling Writes: The Hardest Problem in System Design Interviews
Writes breaking your system? Master four proven patterns—vertical scaling, sharding, queuing, and batching—to scale database writes efficiently.
Imagine you’ve built a wildly successful app. Users are signing up and using it nonstop.
In a system design interview, you might start with a single database – until the interviewer asks: “How will your design handle 10× or 100× more traffic?”
Many candidates immediately add caches and read replicas (great for reads), but what about writes?
Scaling reads is straightforward, but scaling writes really puts your system to the test.
Let’s look at why, and at the patterns that address it.
Why Scaling Writes Is Hard (Compared to Reads)
When you read data, you’re not changing anything. It’s easy to make read operations fast and distributed using replicas and caches.
Writes are a tougher challenge because a write changes the single source of truth.
If multiple clients update the same data at once, the system must coordinate them to maintain consistency.
Typically, one primary database handles all writes, and it becomes a bottleneck under high load.
Major write bottlenecks include:
Lock contention: When multiple writers try to modify the same record concurrently, they must wait on each other’s locks. For example, if thousands of users like the same post at once, they’re all contending for that single database row.
Disk I/O limits: Writing to disk is slower than reading from memory. Each database write involves disk operations (for durability) and index updates. Even a fast SSD can handle only so many writes per second before it becomes a choke point.
Hot keys (hotspots): A hot key is data that gets a disproportionate amount of traffic. If one item (say a viral tweet) is extremely popular and updated constantly, the server or shard for that item can get overloaded while others sit idle. A single hotspot can bottleneck the whole system.
Because of these challenges, scaling write throughput often requires multiple approaches in combination.
Let’s look at four major patterns to handle high write loads and their pros and cons.
1. Vertical Scaling and Write Optimization
The simplest approach is vertical scaling – make the database server more powerful. This means scaling up instead of out.
Upgrading hardware (more CPU, more RAM, faster disks) gives the database more headroom: additional memory for caching and buffering writes, and faster storage for quicker commits.
You can also optimize the database for writes.
For example, reducing the number of indexes or constraints on write-heavy tables means each insert or update does less work. Tuning database settings (or using a storage engine optimized for writes) can further improve performance.
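As a toy illustration of write-oriented tuning, here is what these knobs look like in SQLite (the specific settings differ per database, and the table and values below are made up for the example):

```python
import sqlite3

# Toy illustration: trading some durability guarantees for write speed.
# Every database exposes different knobs; these are SQLite's equivalents.
conn = sqlite3.connect("app.db")

# WAL mode appends writes to a log instead of rewriting pages in place.
conn.execute("PRAGMA journal_mode=WAL")
# NORMAL fsyncs less often than FULL: faster commits, slightly weaker durability.
conn.execute("PRAGMA synchronous=NORMAL")

# Fewer indexes on a write-heavy table means less work per insert:
# every extra index would have to be updated on each write.
conn.execute("CREATE TABLE IF NOT EXISTS events (user_id INTEGER, payload TEXT)")

conn.execute("INSERT INTO events VALUES (?, ?)", (42, "clicked"))
conn.commit()
conn.close()
```

The same trade-off shows up everywhere: relaxed fsync settings, write-optimized storage engines (e.g. LSM-tree based), and leaner schemas all buy write throughput at some cost in durability, read speed, or query flexibility.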
Pros:
Simple to implement.
No complex system overhaul – often it’s just a bigger machine or some configuration tweaks.
It provides an immediate boost in write capacity.
Cons:
Limited ceiling – there’s only so much one server can handle, and high-end hardware gets expensive.
It’s also a single point of failure. Eventually, you hit a point where scaling up further isn’t feasible, and you’ll need to scale out with other techniques.
2. Sharding and Partitioning
When one machine isn’t enough, you turn to horizontal scaling.
Sharding means splitting your database into multiple pieces (partitions) and distributing them across different servers. Each shard handles a subset of the data (and its writes), so the overall load is spread out.
For example, shard by user ID or region: one shard for users A–M, another for N–Z, etc.
Instead of one database handling everything, each shard handles only its portion.
By sharding the database, you massively increase write capacity because multiple servers handle writes in parallel.
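A minimal sketch of the routing layer, assuming four hypothetical shards (the shard names and hash choice are illustrative, not a specific product’s API):

```python
import hashlib

# Hypothetical shard registry: in practice these would be connection
# pools pointing at separate database servers.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]

def shard_for(user_id: str) -> str:
    """Route a key to a shard via a stable hash.

    Hashing spreads keys evenly across shards; a naive scheme
    (e.g. routing by first letter) can leave one shard hot
    while the others sit idle.
    """
    digest = hashlib.md5(user_id.encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# All writes for the same user always land on the same shard,
# so per-user operations never cross shard boundaries.
assert shard_for("alice") == shard_for("alice")
```

One caveat with simple modulo hashing: adding a shard changes where almost every key routes, which is why production systems often use consistent hashing or a lookup table instead.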
Pros:
Huge scalability gains.
You can add shards to increase total throughput almost linearly.
It also isolates hotspots – a very active subset of data is confined to its shard and doesn’t overwhelm others.
Cons:
Complexity increases.
The application must route queries to the right shard. A bad shard key can cause one shard to become a hotspot while others are idle.
Cross-shard operations (queries or transactions) are hard, and maintaining many databases is a lot of overhead.
3. Queuing and Load Shedding
Sometimes traffic comes in bursts that even a scaled-out database can’t handle in real time.
Queuing helps smooth out spikes by decoupling the incoming write rate from the actual processing rate.
Instead of writing directly to the database, the application enqueues each incoming write in a queuing system.
Background workers then pull from the queue and write to the database at a steady rate. This buffers the spike and prevents the database from being overwhelmed.
If the flood is too great and the queue keeps growing, the system may resort to load shedding – dropping or rejecting some writes to protect itself.
For example, it might skip non-critical writes during a surge to protect the system.
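The flow above can be sketched with a bounded in-process queue (a stand-in for a real broker like Kafka or RabbitMQ; the queue size, worker count, and simulated write delay are all illustrative):

```python
import queue
import threading
import time

# Bounded queue: when it fills up, we shed load instead of letting
# the backlog grow without bound.
write_queue = queue.Queue(maxsize=1000)
dropped = 0

def enqueue_write(record) -> bool:
    """Accept a write if there's room; otherwise shed it."""
    global dropped
    try:
        write_queue.put_nowait(record)
        return True
    except queue.Full:
        dropped += 1  # load shedding: reject rather than melt down
        return False

def save_to_database(record):
    time.sleep(0.001)  # simulates a database write at a bounded rate

def worker():
    """Drain the queue at a steady pace the database can sustain."""
    while True:
        record = write_queue.get()
        if record is None:        # sentinel: shut down
            break
        save_to_database(record)
        write_queue.task_done()

threading.Thread(target=worker, daemon=True).start()
for i in range(100):
    enqueue_write({"event": i})   # returns immediately; DB writes happen later
write_queue.join()                # wait for the backlog to drain
write_queue.put(None)
```

The key property is that `enqueue_write` returns immediately regardless of how fast the database is going, which is exactly what keeps the user-facing path snappy during a spike.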
Pros:
Queues help your system handle sudden surges gracefully.
The user experience stays snappy (the app isn’t waiting on the database for each request).
Load shedding, while not ideal, prevents a total meltdown by sacrificing less important data during extreme overloads.
Cons:
Using a queue adds delay – writes become asynchronous, so results aren’t visible instantly.
It also adds complexity to maintain the queue and ensure no data is lost. And if you shed load, some actions are never recorded, which is only acceptable for non-essential data.
4. Batching and Aggregation
Another way to improve write performance is to do more work per operation.
Batching means grouping multiple write actions together and performing them as one larger operation.
Aggregation means combining or summarizing data so you write less frequently.
For example, instead of making 100 separate inserts, send all 100 in one batch so the overhead (transaction setup, network latency) is paid once instead of 100 times.
Similarly, rather than updating a database row for every single small event, you could keep a running tally in memory and update the database once every minute with the summed change.
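Both ideas can be sketched together with a like-counter: tally events in memory (aggregation), then flush all tallies in a single statement (batching). This assumes a reasonably recent SQLite with upsert support; the table and event names are made up for the example:

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE like_counts (post_id TEXT PRIMARY KEY, likes INTEGER)")

# Aggregation: keep a running tally in memory instead of writing each event.
pending = Counter()
for post_id in ["post-1", "post-2", "post-1", "post-1"]:  # simulated like events
    pending[post_id] += 1

# Batching: flush every tally in one executemany call, paying the
# transaction and network overhead once instead of once per event.
conn.executemany(
    """INSERT INTO like_counts (post_id, likes) VALUES (?, ?)
       ON CONFLICT(post_id) DO UPDATE SET likes = likes + excluded.likes""",
    list(pending.items()),
)
conn.commit()
pending.clear()
```

Here four individual events became a single batched write of two rows. In a real system the flush would run on a timer (say, once a minute), which is where the freshness trade-off comes from.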
Pros:
Drastically increases throughput.
Batching amortizes the fixed cost of each write across many operations, so the system can handle more writes with the same resources.
It also makes better use of disk and network (fewer, larger writes are more efficient than many small ones).
Aggregation reduces the total number of writes and the amount of data stored (since you’re storing one summary instead of many individual events).
Cons:
Data isn’t as real-time or detailed.
Batching introduces a delay. With aggregation, you lose detail – you can’t get back the individual data points later if you only saved periodic summaries.
These techniques work well when slight delays and reduced detail are acceptable trade-offs for higher performance.
Conclusion: Mastering Write Scalability
In system design interviews and real-world projects, scaling writes is often the toughest challenge. It’s easy to add caches or read replicas, but a flood of writes requires careful architecture.
These patterns are your key tools for tackling write-heavy scenarios. Each comes with benefits and trade-offs, and they’re often combined in practice.
Mastering these strategies not only helps you explain confidently how to maintain high write throughput in an interview, but also equips you to build systems that stay reliable as they grow.