Why Cilium Is Crushing the Competition as the Go-To CNI for Kubernetes

January 20, 2025

This article explores why Cilium stands out and has become a preferred CNI choice in Kubernetes environments. First, I will explain what a CNI is and how it is used in Kubernetes. Next, we’ll look at the advantages Cilium has over other CNIs. Finally, I’ll share my personal experience working with Cilium.

What is a CNI

Let’s first define what a CNI is and what part it plays in Kubernetes:

CNI (Container Network Interface) is a standard for configuring networking in containerized environments. It defines how containers interact with the network, including connecting to other containers and the external world.

A CNI plugin is a software component implementing this standard, responsible for setting up the network and assigning IP addresses to containers.

Every pod needs an IP address to communicate, and the CNI plugin decides which IP the pod receives. The CNI plugin also sets up the network interfaces and routing rules the pod needs to communicate.
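To make the IP assignment step more concrete, here is a minimal Python sketch of what an address allocator could look like. It is only an illustration under my own assumptions: the class name `PodIpAllocator`, the pod names, and the `10.0.1.0/29` range are all hypothetical, and a real CNI plugin also creates interfaces and routes, which this toy skips.

```python
import ipaddress


class PodIpAllocator:
    """Toy IPAM: hands out pod IPs from one node's pod CIDR (hypothetical)."""

    def __init__(self, pod_cidr: str) -> None:
        self._network = ipaddress.ip_network(pod_cidr)
        # hosts() skips the network and broadcast addresses of the range.
        self._free = list(self._network.hosts())
        self._assigned = {}

    def add(self, pod_name: str) -> str:
        """Called when a pod is created: reserve the next free IP."""
        if pod_name in self._assigned:
            return str(self._assigned[pod_name])
        if not self._free:
            raise RuntimeError(f"no free addresses left in {self._network}")
        ip = self._free.pop(0)
        self._assigned[pod_name] = ip
        return str(ip)

    def delete(self, pod_name: str) -> None:
        """Called when a pod is removed: return its IP to the pool."""
        ip = self._assigned.pop(pod_name, None)
        if ip is not None:
            self._free.append(ip)


allocator = PodIpAllocator("10.0.1.0/29")
print(allocator.add("web-1"))  # 10.0.1.1
print(allocator.add("web-2"))  # 10.0.1.2
```

The add and delete pair mirrors the idea that the plugin is invoked when a pod is created and again when it is removed, so addresses can be reused.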

In Kubernetes, a CNI plugin is critical for enabling communication between pods, services, and external endpoints.

Examples of CNI Plugins

  • Calico: Focused on networking and security with support for network policies.
  • Cilium: Uses eBPF for high-performance networking, observability, and security.
  • Flannel: Simple overlay network for Kubernetes.
  • Weave: Easy-to-use networking with encryption options.
  • Amazon VPC CNI: Optimized for running Kubernetes on AWS.

While working with Cilium, I discovered the advantages it can offer.
We will explore why Cilium is the leading CNI plugin and what its advantages over the competition are in the following areas:

  1. Latency and Performance
  2. Observability
  3. Security

1. Latency and Performance

First, let’s explore what makes Cilium different and how that gives it an edge.
The technology behind Cilium’s advantages is eBPF. Cilium dynamically inserts eBPF bytecode into the Linux kernel at various points, such as network IO, application sockets, and tracepoints, to implement security, networking, and visibility logic. eBPF is highly efficient and flexible.

By default, Kubernetes clusters, including those running in the cloud, rely on kube-proxy to route Service traffic. Let’s examine why eBPF is better.

Why is eBPF Better than kube-proxy?
In Kubernetes, eBPF-based tools such as Cilium can replace kube-proxy for handling Service networking. Cilium’s eBPF datapath stores Services and their backends in hash tables, so a lookup costs roughly the same no matter how many Services exist.

  1. Performance:
    • eBPF eliminates the need for iptables rules, which kube-proxy uses to manage traffic.
    • Traffic routing is handled directly in the kernel, reducing context switches and overhead.
  2. Scalability:
    • eBPF scales better in large clusters as it avoids managing long chains of iptables rules, which can slow down as the number of pods and services increases.
  3. Improved Observability:
    • eBPF enables real-time visibility into network traffic and performance metrics without extra agents or sidecars.
  4. Dynamic Updates:
    • eBPF programs can dynamically adjust traffic flows without restarting services or reloading iptables rules.
  5. Fine-Grained Control:
    • eBPF allows advanced features like enforcing policies, load balancing, and service discovery with finer granularity compared to kube-proxy.
  6. Reduced Latency:
    • By bypassing userspace for packet processing, eBPF reduces latency in service-to-service communication.
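The scalability point can be illustrated with a small Python sketch that contrasts walking an ordered rule list with a single keyed lookup. This is a deliberately simplified model, not kube-proxy’s real rule layout or Cilium’s real map format; the `SERVICES` table and both function names are hypothetical.

```python
# Hypothetical service table: "ClusterIP:port" -> backend pod address.
SERVICES = {
    f"10.96.{i // 256}.{i % 256}:80": f"10.0.{i // 256}.{i % 256}:8080"
    for i in range(5000)
}


def lookup_chain(rules, dst):
    """iptables-style: walk rules in order; return (backend, rules_checked)."""
    for checked, (match, backend) in enumerate(rules, start=1):
        if match == dst:
            return backend, checked
    return None, len(rules)


def lookup_hash(table, dst):
    """Hash-table-style: one keyed lookup, regardless of table size."""
    return table.get(dst)


rules = list(SERVICES.items())
last_service = rules[-1][0]

backend, checked = lookup_chain(rules, last_service)
print(checked)                              # 5000 rules inspected
print(lookup_hash(SERVICES, last_service))  # same backend, one lookup
```

In the list-based version the cost of reaching the last service grows with the number of services, while the keyed lookup does not, which is the intuition behind the points above.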

Use Case in Kubernetes:
Tools like Cilium replace kube-proxy by using eBPF for service routing, load balancing, and network policies. This improves efficiency and provides additional observability and security features.

As seen in the next image, eBPF removes the overhead of managing iptables.

Source: Cilium.io

2. Observability

Cilium ships with a built-in tool called Hubble that gives us observability into the cluster’s network traffic. This lets us provide developers with information without giving them direct access to our production environments.

Hubble provides a UI service map of communication between pods:

Source: https://cilium.io/static/4a49f7548d70ce31d97da4e28f2a3566/dbbcb/service-map.png

Hubble also shows protocol-level communication between pods with a simple command, so we no longer need to use tcpdump to examine traffic.

Source: https://cilium.io/static/1fa37e5125289dc3f5328b728183271c/94426/protocol-visibility.png
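As a rough idea of what such a command looks like, the Python sketch below assembles a `hubble observe` invocation without running it. The helper name `hubble_observe_command` and the `default` namespace are my own placeholders, and the flags are the ones I recall from the Hubble CLI, so check them against `hubble observe --help` for your version.

```python
import shlex


def hubble_observe_command(namespace, protocol=None, last=20):
    """Build (but do not run) a `hubble observe` command line."""
    args = ["hubble", "observe", "--namespace", namespace, "--last", str(last)]
    if protocol:
        # Filter on an L4/L7 protocol such as "http" or "dns".
        args += ["--protocol", protocol]
    return args


print(shlex.join(hubble_observe_command("default", protocol="http")))
# hubble observe --namespace default --last 20 --protocol http
```

Returning the arguments as a list keeps the sketch safe to pass to `subprocess.run` later without shell quoting issues.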

  • Observability Features:
    • Real-time flow visibility: Monitor network flows, including source, destination, protocol, and response times.
    • L7 Protocol Insights: Understand application-layer traffic for protocols like HTTP, DNS, Kafka, and gRPC.
    • Distributed Tracing: Allows tracking of request flows across services and nodes, providing insights into latency and potential bottlenecks.
    • Integration with Prometheus and Grafana: Expose metrics that can be visualized for detailed monitoring.
    • Service Map: Visual representation of service-to-service communication.
    • Policy Auditing: Analyze and troubleshoot network policy enforcement.
    • Flow Logs: Export enriched flow logs to external systems like ELK or SIEM platforms for further analysis.
  • Comparison to Other CNIs:
    • Calico: Supports basic flow logging and integrates with external tools like Prometheus and ELK for monitoring. However, it lacks L7 visibility and real-time flow visualization like Hubble.
    • Flannel: Does not include observability features out of the box. Users must rely on external monitoring tools for basic network monitoring.
    • Weave: Offers basic flow logs and integrates with Weave Scope, which provides some visualization of container communication. However, it lacks support for L7 insights or distributed tracing.
    • Amazon VPC CNI: Provides flow logs (via AWS VPC Flow Logs) but only at the L3/L4 level. It lacks built-in visualization, detailed flow metrics, or observability at the application layer.
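To show how flow data turns into something like a service map, here is a small Python sketch that aggregates flow records into edges. The records are hypothetical and heavily simplified; real Hubble flows carry many more fields, and the function names are only for illustration.

```python
from collections import Counter

# Hypothetical, simplified flow records (not Hubble's actual schema).
flows = [
    {"source": "frontend", "destination": "backend", "verdict": "FORWARDED"},
    {"source": "frontend", "destination": "backend", "verdict": "FORWARDED"},
    {"source": "backend", "destination": "db", "verdict": "FORWARDED"},
    {"source": "frontend", "destination": "db", "verdict": "DROPPED"},
]


def service_map(records):
    """Count forwarded flows per (source, destination) edge."""
    edges = Counter()
    for record in records:
        if record["verdict"] == "FORWARDED":
            edges[(record["source"], record["destination"])] += 1
    return edges


def dropped_flows(records):
    """List (source, destination) pairs that a policy dropped."""
    return [
        (record["source"], record["destination"])
        for record in records
        if record["verdict"] == "DROPPED"
    ]


for (src, dst), count in sorted(service_map(flows).items()):
    print(f"{src} -> {dst}: {count} flows")
print("dropped:", dropped_flows(flows))
```

The forwarded edges correspond to the lines drawn in a service map, and the dropped list is the kind of information that helps when auditing policy enforcement.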

3. Security

Layer 7 Security Policies

Cilium provides Layer 7 (L7) application-aware security policies, allowing fine-grained control over application protocols like HTTP, DNS, Kafka, and gRPC.

  • Policies can be based on high-level application identity rather than low-level IP or port information, which makes it easier to define and manage in dynamic environments.
  • Policies support workload-based identities (via Kubernetes labels), ensuring seamless security for ephemeral workloads.

As an example, a policy can limit traffic based on L7 rules, and can even restrict access to specific paths in our application.
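A policy of this kind could look roughly like the following sketch, which builds a CiliumNetworkPolicy manifest as a Python dictionary and prints it as JSON. The policy name, the `app` label values, the port, and the path are all hypothetical; only the overall manifest structure follows the CiliumNetworkPolicy format as I understand it, so verify it against the documentation for your Cilium version.

```python
import json


def l7_http_policy(name, app, client, port, method, path):
    """Build a CiliumNetworkPolicy allowing one HTTP method/path (sketch)."""
    return {
        "apiVersion": "cilium.io/v2",
        "kind": "CiliumNetworkPolicy",
        "metadata": {"name": name},
        "spec": {
            # Pods this policy protects, selected by label (identity).
            "endpointSelector": {"matchLabels": {"app": app}},
            "ingress": [
                {
                    # Only pods carrying this label may connect.
                    "fromEndpoints": [{"matchLabels": {"app": client}}],
                    "toPorts": [
                        {
                            "ports": [{"port": str(port), "protocol": "TCP"}],
                            # L7 rule: only this method and path are allowed.
                            "rules": {"http": [{"method": method, "path": path}]},
                        }
                    ],
                }
            ],
        },
    }


policy = l7_http_policy(
    "allow-get-public", "backend", "frontend", 8080, "GET", "/public/.*"
)
print(json.dumps(policy, indent=2))
```

Note that both the protected pods and the allowed clients are selected by labels rather than IP addresses, which is what keeps the policy valid as pods come and go.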

Mitigating Malicious Attacks

XDP: The XDP BPF hook is at the earliest point possible in the networking driver and triggers a run of the BPF program upon packet reception. This achieves the best possible packet processing performance since the program runs directly on the packet data before any other processing can happen. This hook is ideal for running filtering programs that drop malicious or unexpected traffic, and other common DDoS protection mechanisms.

Encryption Capabilities
  • IPSec and WireGuard Integration: Cilium supports IPSec and WireGuard encryption for data in transit between pods and nodes. This allows for the encryption of traffic within the Kubernetes cluster, providing an additional layer of security for sensitive communications.
    • WireGuard is particularly appealing because it offers a low-overhead, fast encryption mechanism that is simpler to configure and maintain than traditional VPNs.
    • IPSec provides compatibility with a wide range of network environments and works well for clusters where regulatory compliance requires it.

 

How good is encrypted throughput, though?

(Taken from: https://cilium.io/static/7a43e1b4da5fe164429b4083ae6f841e/186b0/bench_wireguard_tcp_1_stream.png)

Personal Experience

Why we wanted to switch to Cilium:

In our use case, we used the Amazon VPC CNI before we switched.
The Amazon VPC CNI did not provide the node-to-node encryption and security policies we wanted. These requirements were mandatory for our customers, so we decided to switch.

How hard was it:

The switch itself was not complicated. Cilium supports ENI mode for AWS, so the switch was almost seamless. AWS EKS also integrates with Cilium, and Amazon has multiple articles on it, for example: https://aws.amazon.com/blogs/containers/transparent-encryption-of-node-to-node-traffic-on-amazon-eks-using-wireguard-and-cilium/
We did not need to change machine types, nor did we incur additional costs, as the Cilium DaemonSet requires the same amount of resources as the previous Amazon VPC CNI.

Migration:

  1. First, deploy the Cilium resources (Helm chart…).
  2. Then delete the Amazon VPC CNI DaemonSet or Helm chart (depending on your installation).
  3. Step 2 does not cause a network outage, as pods that were set up by the Amazon VPC CNI keep running and continue to communicate until their instance is rolled.
  4. In our case downtime was acceptable, so we rolled all the instances in our cluster. It took 15-30 minutes for all instances to become ready and work with Cilium and the new encryption enabled (we used Cilium 1.13.10 with WireGuard encryption).
    4.5. It may be possible to roll the cluster with no downtime, but because two network setups would coexist during the rollout, it could be problematic to set up correctly.
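The first two steps can be sketched as a dry-run plan in Python that only prints the commands instead of executing them. Everything here is an assumption to verify before use: the Helm values are the ones I recall from the Cilium 1.13 documentation for ENI mode and WireGuard, the `cilium/cilium` chart alias assumes the Cilium Helm repository was already added, and `aws-node` is the DaemonSet name used by the Amazon VPC CNI on EKS.

```python
import shlex

CILIUM_VERSION = "1.13.10"  # the version mentioned in this article

# Assumed Helm values for ENI mode plus WireGuard; check the chart for your version.
HELM_VALUES = {
    "eni.enabled": "true",
    "ipam.mode": "eni",
    "egressMasqueradeInterfaces": "eth0",
    "tunnel": "disabled",
    "encryption.enabled": "true",
    "encryption.type": "wireguard",
}


def migration_plan(namespace="kube-system"):
    """Return the commands for steps 1 and 2 as argument lists (dry run)."""
    install = [
        "helm", "install", "cilium", "cilium/cilium",
        "--version", CILIUM_VERSION,
        "--namespace", namespace,
    ]
    for key, value in HELM_VALUES.items():
        install += ["--set", f"{key}={value}"]
    remove_vpc_cni = [
        "kubectl", "--namespace", namespace, "delete", "daemonset", "aws-node",
    ]
    return [install, remove_vpc_cni]


for step in migration_plan():
    print(shlex.join(step))
```

Rolling the instances (step 4) depends on how your node groups are managed, so it is left out of the sketch on purpose.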

Cilium Management:

Cilium does not require much maintenance, and version updates should be seamless.
Cilium is managed by Operator pods, so updating the DaemonSet is relatively easy.
The main thing you need to know is the configuration you wish to apply; then you can enjoy the great features Cilium provides.

Summary

In summary, Cilium is often preferred for several reasons:

  • Advanced eBPF-based architecture offering high performance and low latency.
  • Identity-based security policies for dynamic environments.
  • L7 protocol visibility and enforcement for granular traffic control.
  • Service mesh capabilities without sidecars, reducing resource overhead.
  • Hubble observability platform for in-depth monitoring and protocol visibility.
  • Efficient built-in load balancing.
  • High flexibility and compatibility across cloud and on-prem environments.
  • Superior scalability and performance for large clusters.

Cilium’s focus on eBPF, security, scalability, and observability makes it an ideal choice for modern Kubernetes environments, particularly those with complex networking requirements, high-security demands, and large-scale workloads.

Cilium is also compatible with AWS ENI, as it has a special mode that leverages AWS networking.
