Observability at Scale: Building Uber’s Alerting Ecosystem

December 22, 2018

Table of Contents

Uber’s software architectures consists of thousands of microservices that empower teams to iterate quickly and support our company’s global growth. These microservices support a variety of solutions, such as mobile applications, internal and infrastructure services, and products along with complex configurations that affect these products at city and sub-city levels. To maintain our growth and architecture, Uber’s Observability team built a robust, scalable metrics and alerting pipeline responsible for detecting, mitigating, and notifying engineers of issues with their services as soon as they occur.

Specifically, we built two in-data center alerting systems, called uMonitor and Neris, that flow into the same notification and alerting pipeline. uMonitor is our metrics-based alerting system that runs checks against our metrics database M3, while Neris primarily looks for alerts in host-level infrastructure.

Source: uber.com

Tags :

comments powered by Disqus

Implementing the Netflix Media Database

In the previous blog posts in this series, we introduced the Netflix Media DataBase (NMDB) and its salient “Media Document” data model. In this post we will provide details of the NMDB system architecture beginning with the system requirements—these will serve as the necessary motivation for the architectural choices we made. A fundamental requirement for any lasting data system is that it should scale along with the growth of the business applications it wishes to serve.

Istio Multicluster

Istio Multicluster is a feature of Istio–the basis of Red Hat OpenShift Service Mesh–that allows for the extension of the service mesh across multiple Kubernetes or Red Hat OpenShift clusters. The primary goal of this feature is to enable control of services deployed across multiple clusters with a single control plane. The main requirement for Istio multicluster to work is that the pods in the mesh and the Istio control plane can talk to each other.

Cape Technical Deep Dive

In this post, we’ll take a deep dive into the design of the Cape framework. First, we’ll discuss Cape’s architecture. Then we’ll look at the core scheduling component of the system.

Observability at Scale: Building Uber’s Alerting Ecosystem

Tags :

Share :

Related Posts

Implementing the Netflix Media Database

Istio Multicluster

Cape Technical Deep Dive