Database Sharding at eBay

Database Sharding at eBay

Randy Shoup has written a great article in InfoQ about Scalability Best Practices: Lessons from eBay that explains eBay’s general scalability strategy and how eBay uses database sharding to achieve database scalability.Randy Shoup is a Distinguished Architect at eBay.

Randy Shoup’s six best practices are:

Best Practice #1: Partition by Function

There is no single monolithic database at eBay. Instead there is a set of database hosts for user data, a set for item data, a set for purchase data, etc. – 1000 logical databases in all, on 400 physical hosts. Again, this approach allows us to scale the database infrastructure for each type of data independently of the others.

Best Practice #2: Split Horizontally

The more challenging problem arises at the database tier, since data is stateful by definition. Here we split (or “shard”) the data horizontally along its primary access path. User data, for example, is currently divided over 20 hosts, with each host containing 1/20 of the users. As our numbers of users grow, and as the data we store for each user grows, we add more hosts, and subdivide the users further. Again, we use the same approach for items, for purchases, for accounts, etc.

Best Practice #3: Avoid Distributed Transactions

It turns out that you can’t have everything. In particular, guaranteeing immediate consistency across multiple systems or partitions is typically neither required nor possible. The CAP theorem, postulated almost 10 years ago by Inktomi’s Eric Brewer, states that of three highly desirable properties of distributed systems – consistency (C), availability (A), and partition-tolerance (P) – you can only choose two at any one time. For a high-traffic web site, we have to choose partition-tolerance, since it is fundamental to scaling. For a 24×7 web site, we typically choose availability. So immediate consistency has to give way.

Best Practice #4: Decouple Functions Asynchronously

The next key element to scaling is the aggressive use of asynchrony. If component A calls component B synchronously, A and B are tightly coupled, and that coupled system has a single scalability characteristic — to scale A, you must also scale B. Equally problematic is its effect on availability. Going back to Logic 101, if A implies B, then not-B implies not-A. In other words, if B is down then A is down. By contrast, if A and B integrate asynchronously, whether through a queue, multicast messaging, a batch process, or some other means, each can be scaled independently of the other. Moreover, A and B now have independent availability characteristics – A can continue to move forward even if B is down or distressed.

Best Practice #5: Move Processing To Asynchronous Flows

Moving expensive processing to asynchronous flows, though, allows you to scale your infrastructure for the average load instead of the peak. Instead of needing to process all requests immediately, the queue spreads the processing over time, and thereby dampens the peaks. The more spiky or variable the load on your system, the greater this advantage becomes.

Best Practice #6: Virtualize At All Levels

At eBay, for example, we virtualize the database. Applications interact with a logical representation of a database, which is then mapped onto a particular physical machine and instance through configuration. Applications are similarly abstracted from the split routing logic, which assigns a particular record (say, that of user XYZ) to a particular partition.

Best Practice #7: Cache Appropriately

The most obvious opportunities for caching come with slow-changing, read-mostly data – metadata, configuration, and static data, for example. At eBay, we cache this type of data aggressively, and use a combination of pull and push approaches to keep the system reasonably in sync in the face of updates. Reducing repeated requests for the same data can and does make a substantial impact. More challenging is rapidly-changing, read-write data. For the most part, we intentionally sidestep these challenges at eBay. We have traditionally not done any caching of transient session data between requests. We similarly do not cache shared business objects, like item or user data, in the application layer. We are explicitly trading off the potential benefits of caching this data against availability and correctness.

It is a great public service for eBay to allow a senior engineer make its scalability strategy public.

Other Posts