Observability at Scale: Building Uber’s Alerting Ecosystem
Uber’s software architectures consists of thousands of microservices that empower teams to iterate quickly and support our company’s global growth. These microservices support a variety of solutions, such as mobile applications, internal and infrastructure services, and products along with complex configurations that affect these products at city and sub-city levels. To maintain our growth and architecture, Uber’s Observability team built a robust, scalable metrics and alerting pipeline responsible for detecting, mitigating, and notifying engineers of issues with their services as soon as they occur.
Specifically, we built two in-data center alerting systems, called uMonitor and Neris, that flow into the same notification and alerting pipeline. uMonitor is our metrics-based alerting system that runs checks against our metrics database M3, while Neris primarily looks for alerts in host-level infrastructure.