Replication copies the same data to multiple nodes (better read scale and availability). Sharding splits data across nodes (better write/size scale), but makes queries and transactions more complex.
Advanced answer
Deep dive
Replication
Replication keeps **copies of the same dataset** on multiple nodes.
Primary/replica (leader/follower) is common.
Benefits: high availability, read scaling (read replicas), easier failover.
Trade-offs: replication lag (stale reads), failover complexity, write throughput still limited by the primary.
Sharding
Sharding splits the dataset into **partitions** (shards) across nodes.
Benefits: scale data size and write throughput.
Trade-offs: complex queries (cross-shard joins/aggregations), distributed transactions, resharding, choosing a good shard key.
Practical guidance
Start with replication for HA and read scale.
Consider sharding only when one node can’t handle data size/write throughput.
Many real deployments combine both: each shard is replicated.