Sharding splits a dataset in a [[Database]] into multiple partitions that are spread across multiple computers. There are two layers to this: the logical shard and the physical shard.

## Why Sharding?
By breaking the data into multiple independent pieces, you can operate on each shard independently of the others, which reduces the amount of processing each operation requires and the collisions between operations. This is called a [[Shared-Nothing Architecture]]. It allows for Horizontal Scaling of a Database.

## Types of Sharding
### Logical Sharding
The first step in sharding is logically sharding the data. This is the act of dividing the data into logical chunks.

### Physical Sharding
Physical shards are the physical computers on which the logical shard chunks are stored.

## Replication
Splitting up the data is all well and good, but sometimes the application needs frequent access to certain data, and close proximity to that data is important. Or you simply want more database servers able to handle the entire dataset. Replication is the act of storing and acting upon the same logical shard on multiple physical shards.

## Drawbacks
#grow Research drawbacks of Database Sharding...

## When to Shard
As a last resort. Try implementing these first, in this order:
* Separate DB server
* [[Caching]]
* [[Read Replication]]
* [[Vertical Scaling]] the server

---
# References
https://www.digitalocean.com/community/tutorials/understanding-database-sharding
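
# Example: Mapping Keys to Shards
The logical and physical layers described in this note, plus replication, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: hash-based sharding is one possible strategy and is not prescribed by the note, and the shard count, the `db-N` server names, and the function names are all hypothetical.

```python
import hashlib

# Assumption: a fixed number of logical shards chosen up front.
NUM_LOGICAL_SHARDS = 8

# Assumption: four hypothetical physical servers named db-0 .. db-3.
# Each logical shard is stored on two of them (replication).
PHYSICAL_SHARDS = {
    shard_id: [f"db-{shard_id % 4}", f"db-{(shard_id + 1) % 4}"]
    for shard_id in range(NUM_LOGICAL_SHARDS)
}


def logical_shard_for(key: str) -> int:
    """Map a record key to a logical shard using a stable hash."""
    # hashlib is used instead of the built-in hash(), which is
    # randomized per process for strings.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_LOGICAL_SHARDS


def physical_shards_for(key: str) -> list[str]:
    """Return the physical servers that hold the key's logical shard."""
    return PHYSICAL_SHARDS[logical_shard_for(key)]
```

Calling `physical_shards_for("user:42")` returns the two hypothetical servers that hold that key's logical shard. Keeping the logical shard count separate from the server list means a logical shard can be moved to another server by editing the mapping, without rehashing every key.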