Cross shard transactions at 10 million requests per second

Cross shard transactions at 10 million requests per second

  • November 9, 2018
Table of Contents

Cross shard transactions at 10 million requests per second

Dropbox stores petabytes of metadata to support user-facing features and to power our production infrastructure. The primary system we use to store this metadata is named Edgestore and is described in a previous blog post, (Re)Introducing Edgestore. In simple terms, Edgestore is a service and abstraction over thousands of MySQL nodes that provides users with strongly consistent, transactional reads and writes at low latency.

Edgestore hides details of physical sharding from the application layer to allow developers to scale out their metadata storage needs without thinking about complexities of data placement and distribution. Central to building a distributed database on top of individual MySQL shards in Edgestore is the ability to collocate related data items together on the same shard. Developers express logical collocation of data via the concept of a colo, indicating that two pieces of data are typically accessed together.

In turn, Edgestore provides low-latency, transactional guarantees for reads and writes within a given colo (by placing them on the same physical MySQL shard), but only best-effort support across colos. While the product use-cases at Dropbox are usually a good fit for collocation, over time we found that certain ones just aren’t easily partitionable. As a simple example, an association between a user and the content they share with another user is unlikely to be collocated, since the users likely live on different shards.

Even if we were to attempt to reorganize physical storage such that related colos land on the same physical shards, we would never get a perfect cut of data.

Source: dropbox.com

Share :
comments powered by Disqus

Related Posts

Peloton: Uber’s Unified Resource Scheduler for Diverse Cluster Workloads

Peloton: Uber’s Unified Resource Scheduler for Diverse Cluster Workloads

Cluster management, a common software infrastructure among technology companies, aggregates compute resources from a collection of physical hosts into a shared resource pool, amplifying compute power and allowing for the flexible use of data center hardware. At Uber, cluster management provides an abstraction layer for various workloads. With the increasing scale of our business, the efficient use of cluster resources becomes very important.

Read More
Horizon: An open-source reinforcement learning platform

Horizon: An open-source reinforcement learning platform

Horizon is the first open source end-to-end platform that uses applied reinforcement learning (RL) to optimize systems in large-scale production environments. The workflows and algorithms included in this release were built on open frameworks — PyTorch 1.0, Caffe2, and Spark — making Horizon accessible to anyone using RL at scale. We’ve put Horizon to work internally over the past year in a wide range of applications, including helping to personalize M suggestions, delivering more meaningful notifications, and optimizing streaming video quality.

Read More
Why React’s new Hooks API is a game changer

Why React’s new Hooks API is a game changer

I have been developing with React since it’s early days and during that time there have been many attempts by both influencers, as well as the core team to improve the API and patterns developers are using to creating software. One of the biggest challenges we have had was how to share behaviour neatly between components to enable reuse or even just separation of concerns. Every single solution proposed up until this point had some problems associated with it.

Read More