Implementing the Netflix Media Database

Implementing the Netflix Media Database

  • December 15, 2018
Table of Contents

Implementing the Netflix Media Database

In the previous blog posts in this series, we introduced the Netflix Media DataBase (NMDB) and its salient “Media Document” data model. In this post we will provide details of the NMDB system architecture beginning with the system requirements—these will serve as the necessary motivation for the architectural choices we made. A fundamental requirement for any lasting data system is that it should scale along with the growth of the business applications it wishes to serve.

NMDB is built to be a highly scalable, multi-tenant, media metadata system that can serve a high volume of write/read throughput as well as support near real-time queries. At any given time there could be several applications that are trying to persist data about a media asset (e.g., image, video, audio, subtitles) and/or trying to harness that data to solve a business problem. Some of the essential elements of such a data system are (a) reliability and availability—under varying load conditions as well as a wide variety of access patterns; (b) scalability—persisting and serving large volumes of media metadata and scaling in the face of bursty requests to serve critical backend systems like media encoding, (c) extensibility—supporting a demanding list of features with a growing list of Netflix business use cases, and (d) consistency—data access semantics that guarantee repeatable data read behavior for client applications.

The following section enumerates the key traits of NMDB and how the design aims to address them.

Source: medium.com

Share :
comments powered by Disqus

Related Posts

Cross shard transactions at 10 million requests per second

Cross shard transactions at 10 million requests per second

Dropbox stores petabytes of metadata to support user-facing features and to power our production infrastructure. The primary system we use to store this metadata is named Edgestore and is described in a previous blog post, (Re)Introducing Edgestore. In simple terms, Edgestore is a service and abstraction over thousands of MySQL nodes that provides users with strongly consistent, transactional reads and writes at low latency.

Read More
Optimal Shard Placement in a Petabyte Scale Elasticsearch Cluster

Optimal Shard Placement in a Petabyte Scale Elasticsearch Cluster

The number of shards on each node, and tries to balance the number of shards per node evenly across the clusterThe high and low disk watermarks. Elasticsearch considers the available disk space on a node before deciding whether to allocate new shards to that node or to actively relocate shards away from that node. A nodes that has reached the low watermark (i.e 80% disk used) is not allowed receive any more shards.

Read More
Scaling Cash Payments in Uber Eats

Scaling Cash Payments in Uber Eats

This article is the fourth in a series covering how Uber’s mobile engineering team developed the newest version of our driver app, codenamed Carbon, a core component of our ridesharing business. Among other new features, the app lets our population of over three million driver-partners find fares, get directions, and track their earnings. We began designing the new app in conjunction with feedback from our driver-partners in 2017 and began rolling it out for production in September 2018.

Read More