NTP: Building a more accurate time service at Facebook scale

Almost all of the billions of devices connected to the internet have onboard clocks, which need to be accurate to properly perform their functions. Many clocks contain inaccurate internal oscillators, which can cause seconds of inaccuracy per day and need to be periodically corrected. Incorrect time can lead to issues, such as missing an important reminder or failing a spacecraft launch. Devices all over the world rely on Network Time Protocol (NTP) to stay synchronized to a more accurate clock over packet-switched, variable-latency data networks. As Facebook’s infrastructure has grown, time precision in our systems has become more and more important. We need to know the accurate time difference between two random servers in a data center so that datastore writes don’t mix up the order of transactions. We need to sync all the servers across many data centers with sub-millisecond precision. For that we tested chrony, a modern NTP server implementation with interesting features. During testing, we found that chrony is significantly more accurate and scalable than the previously used service, ntpd, which made it an easy decision for us to replace ntpd in our infrastructure. Chrony also forms the foundation of our Facebook public NTP service, available from time.facebook.com. In this post, we will share our work to improve accuracy from 10 milliseconds to 100 microseconds and how we verified these results in our timing laboratory.

Source: fb