#Database Sharding and Partitioning

Elliot Forbes · Jun 20, 2026 · 6 min read

Vertical scaling - adding more CPU, RAM, or faster disks to one machine - has hard limits. Once you hit them, no single node can hold all your data or serve all your writes.

Sharding solves this by splitting one dataset across many nodes. Each node holds a subset of the rows, so the write load and storage are distributed rather than concentrated.

The terms are easy to confuse. Partitioning means dividing a table into logical segments; sharding means spreading those partitions across separate physical machines or nodes. Sharding is partitioning taken to a distributed level.

The Core Concept

A router (also called a coordinator or proxy) sits in front of the database tier. When a query arrives, the router applies a function to the shard key to determine which shard holds the relevant rows, then forwards the query there.

Sharding: a router hashes the shard key to route rows across shards A, B and C

Each shard is an independent database node. It has no knowledge of the other shards. This means the data for a single row always lives on exactly one shard, and queries that touch that row go to exactly one node.

Choosing a Shard Key

The shard key is the most consequential design decision in a sharded system. A poor choice causes hotspots and makes the system impossible to scale evenly.

Range-based partitioning divides the key space into contiguous ranges. For example, user IDs 1-1000000 go to Shard A and 1000001-2000000 go to Shard B.

Range queries are efficient - you only touch one or two shards. The risk is that a hot range (for example, the most recent time bucket) receives all the write traffic.

Hash-based partitioning applies a hash function to the key and maps the result to a shard. This spreads rows evenly across shards.

The tradeoff is that range queries now scatter across all shards, since adjacent keys hash to random buckets.

A good shard key has three properties. High cardinality - enough distinct values to distribute data evenly. Even distribution - no single value or small cluster of values accounts for a large fraction of the data.

A match with query patterns is also essential. If most queries filter by user ID, shard by user ID so queries hit one shard rather than all of them.

The hotspot problem occurs when one shard receives a disproportionate share of traffic. A classic example is sharding by user ID when a small number of celebrity accounts generate most of the reads.

If user 42 has 50 million followers, all their reads hit Shard A. Shard A becomes a bottleneck regardless of how many other shards exist.

One mitigation is to assign high-traffic keys a dedicated shard. Another is to add a random suffix to the key to spread rows across multiple shards.

Tradeoffs

Sharding buys you horizontal write scalability, but it comes with significant operational costs.

Cross-shard queries are expensive. A query that does not filter on the shard key must be sent to every shard, with results gathered and merged by the router. This is called a scatter-gather query.

Joins across shards are even worse. They either require fetching data from multiple shards and joining in application memory, or they require denormalizing data so related rows land on the same shard.

Rebalancing is painful. If a shard fills up or a new node is added, rows must be migrated to restore an even distribution. During migration, routing logic must handle rows that are mid-move. This is a significant operational burden.

Consistent hashing reduces the cost of adding or removing shards. Instead of mapping each key to a fixed shard, keys and shards are placed on a virtual ring.

Adding a shard only requires migrating keys that fall between the new node and its neighbor - not a full reshuffle of every row. This is how systems like Cassandra and DynamoDB manage their rings.

Sharding also interacts with the guarantees described by the CAP theorem.

Cross-shard transactions are hard to make atomic, and most sharded systems choose availability over strict consistency.

It also pairs with replication. Replication copies the same data to multiple nodes for redundancy; sharding splits different data across nodes for scale. Both are commonly used together.

How It Comes Up in Interviews

When should you shard a database?

Sharding should be a last resort. Exhaust read replicas, query caching, connection pooling, and vertical scaling first.

Sharding adds significant complexity: cross-shard queries, rebalancing, and the loss of cross-row transactions. Only reach for it when write volume or data size genuinely cannot fit on one node.

How do you choose a shard key?

Choose a key that is high-cardinality, distributes data evenly, and matches your most common query access pattern.

If most queries look up by user ID, shard by user ID. Avoid keys based on time alone - they create a write hotspot on the latest shard.

What is a hotspot and how do you avoid it?

A hotspot is a shard that receives disproportionately more traffic than the others. It happens when the shard key is skewed - a small set of key values accounts for most queries.

Mitigations include appending a random bucket suffix to spread load, assigning celebrity entities to dedicated shards, or adding a caching layer in front of the hot shard.

How do cross-shard queries work?

If a query does not include the shard key in its filter, the router must send the query to every shard (scatter), collect all the results, and merge them (gather). This is slow and grows linearly with shard count.

The practical fix is to design your shard key so that your most frequent query types always include it, making scatter-gather a rare exception rather than the common case.

What’s Next

Database Replication Explained - how read replicas and leader-follower setups work before you reach for sharding
Consistency and the CAP Theorem - the consistency tradeoffs that sharding forces you to make
Caching in System Design - reducing read pressure so sharding can be deferred longer

Frequently Asked Questions

What is database sharding?

Database sharding splits a single dataset across multiple independent database nodes. Each node (shard) holds a subset of the rows.

A router applies a function to a shard key to determine which shard to query, so write load and storage are distributed rather than concentrated on one machine.

What is the difference between sharding and partitioning?

Partitioning divides a table into logical segments within the same database engine - for example, PostgreSQL table partitioning by date range.

Sharding takes this further by placing those partitions on separate physical machines. All sharding involves partitioning, but not all partitioning is sharding.

What is a shard key?

The shard key is the column (or combination of columns) used to decide which shard a row belongs to.

A good shard key has high cardinality, distributes rows evenly, and matches the filters in your most frequent queries. A bad shard key creates hotspots where one shard receives most of the traffic.

What are the downsides of sharding?

The main downsides are: cross-shard queries require scatter-gather which is slow; joins across shards require application-side merging; distributed transactions spanning shards are complex and rarely fully atomic.

Rebalancing data when shards fill up is also operationally expensive. These costs mean sharding should only be introduced when simpler approaches have been exhausted.

Share this article

X LinkedIn Reddit Hacker News

Continue Learning

software-engineer

Consistency and the CAP Theorem: A System Design Guide

A clear, interview-focused guide to the CAP theorem and consistency models - what it means, the tradeoffs, and how it comes up in system design interviews.

Jun 2026 5 min read

software-engineer