Tips for High Availability

  • April 30, 2018

Over the past four years, Netflix has grown from fewer than 50 million subscribers to 125 million. While this growth has brought no shortage of scaling challenges, we managed to improve the overall availability of our service over the same period. Along the way, we have learned a lot and now have a much better understanding of what it takes to make our system more highly available.

But the news is not all good. The truth is that we learned many of our lessons the hard way: through heroics, through mad scrambles when things went wrong, and sometimes, unfortunately, through customer-facing incidents. We haven't figured everything out and still have many opportunities to improve our systems, but we want to share some of the experience we have gained and the tips and best practices we derived from it.

Hopefully some of you will take something away that saves you a 3 a.m. wake-up call for a customer-facing incident.

Source: medium.com

