Reimagining Experimentation Analysis at Netflix

  • September 14, 2019

Another day, another custom script to analyze an A/B test. Maybe you’ve done this before and have an old script lying around. If it’s new, it’s probably going to take some time to set up, right?

Not at Netflix. Suppose you’re running a new video encoding test and hypothesize that the two new encodes should reduce play delay, a metric describing how long it takes for a video to start playing after you press the start button. You can open ABlaze (our centralized A/B testing platform) and quickly check how the test is performing.

You notice that the first new encode (Cell 2 — Encode 1) increased the mean of the play delay but decreased the median! After recreating the dataset, you can plot the raw numbers and perform custom analyses to understand the distribution of the data across test cells. With our new platform for experimentation analysis, it’s easy for scientists to perfectly recreate analyses on their laptops in a notebook.
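To see how a mean and a median can move in opposite directions, here is a minimal Python sketch. The play delay numbers are made up for illustration and the `summarize` helper is hypothetical; neither comes from ABlaze or from the experimentation platform. It assumes most sessions in the new cell start faster while a single session hits a very long delay.

```python
import statistics


def summarize(delays_ms):
    """Return the mean, median, and maximum of a list of play delays (ms)."""
    return {
        "mean": statistics.mean(delays_ms),
        "median": statistics.median(delays_ms),
        "max": max(delays_ms),
    }


# Illustrative, made-up play delays in milliseconds (not real Netflix data).
control = [900, 950, 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350]
# Hypothetical "Encode 1" cell: most sessions are faster, one is very slow.
encode_1 = [700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 6000]

control_summary = summarize(control)
encode_summary = summarize(encode_1)

for name, summary in [("control", control_summary), ("encode_1", encode_summary)]:
    print(name, summary)
```

In this toy data the single slow session pulls the mean of the new cell above the control's mean, while its median is lower. That is why looking at the full distribution, and not only one summary statistic, matters here.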

Scientists can then choose from a library of statistics and visualizations, or contribute their own, to get a deeper understanding of the metrics.

Netflix runs on an A/B testing culture: nearly every decision we make about our product and business is guided by member behavior observed in test. At any point, a Netflix user is in many different A/B tests orchestrated through ABlaze. This enables us to optimize their experience at speed.

Our A/B tests range across UI, algorithms, messaging, marketing, operations, and infrastructure changes. A user might be in a title artwork test, a personalization algorithm test, or a video encoding test, or all three at the same time.

Source: medium.com
