Cilium 1.5: Scaling to 5k Nodes and 100k Pods, BPF-Based SNAT, and Rolling Key Updates for Transparent Encryption

Cilium 1.5: Scaling to 5k nodes and 100k pods, BPF-based SNAT, and Rolling Key Updates for Transparent Encryption

We are excited to announce the Cilium 1.5 release. Cilium 1.5 is the first release where we primarily focused on scalability with respect to number of nodes, pods and services.

Our goal was to scale to 5k nodes, 20k pods and 10k services. We went well past that goal with the 1.5 release and are now officially supporting 5k nodes, 100k pods and 20k services. Along the way, we learned a lot, some expected, some unexpected, this blog post will dive into what we learned and how we improved.

Besides scalability, several significant features made its way into the release including: BPF templating, rolling updates for transparent encryption keys, transparent encryption for direct-routing, a new improved BPF based service load-balancer with improved fairness, BPF based masquerading/SNAT support, Istio 1.1.3 integration, policy calculation optimizations as well as several new Prometheus metrics to assist in operations and monitoring. For the full list of changes, see the 1.5 Release Notes. At the foundation of Cilium is a new Linux kernel technology called BPF, which enables the dynamic insertion of powerful security, visibility, and networking control logic within Linux itself.

BPF is utilized to provide functionality such as multi-cluster routing, load balancing to replace kube-proxy, transparent encryption using X.509 certificates as well as network and service security. Besides providing traditional network level security, the flexibility of BPF enables security with the context of application protocols and DNS requests/responses. Cilium is tightly integrated with Envoy and provides an extension framework based on Go.

Because BPF runs inside the Linux kernel, all Cilium functionality can be applied without any changes to the application code or container configuration.