Cilium and the Future of Container Networking with eBPF

[Image: visualization of a data network between distributed nodes]

Cilium is the project redefining how Kubernetes networking works. Unlike traditional CNIs such as Calico or Flannel, which rely on iptables, Cilium replaces much of the network stack with eBPF programs loaded in the kernel. The result: significantly better performance, deeper visibility, and more expressive security policies.

The Problem With iptables

iptables has been the standard Linux packet-filtering tool since the early 2000s (its netfilter framework dates back to 1998). It works fine in small K8s clusters. In large clusters, it’s a measurable bottleneck:

  • O(n) scaling. Each new rule appends an entry to the chain. With 5,000 services, a packet can traverse thousands of rule comparisons before finding a match.
  • Expensive updates. Changing a single rule requires rewriting the entire chain, degrading performance for the duration of the update.
  • Opaque debugging. Tracking down why a specific packet was accepted or dropped requires deep expertise.

kube-proxy in iptables mode reflects these problems: with thousands of services, node CPU saturates just managing rules.

What Makes Cilium Different

Cilium loads eBPF programs at kernel hook points (XDP, tc, socket, cgroup). When a packet arrives:

  • The eBPF program processes it directly in the kernel, bypassing the conventional networking stack.
  • O(1) hash-table lookups replace iptables’ linear chains.
  • No data copy between user and kernel space for network operations.

Concretely, benchmarks show:

  • Latency reduction: 30-50% lower p95 latency vs iptables under service mesh loads.
  • Higher throughput: 2-3x packets per second on large nodes.
  • Lower CPU consumption: up to 70% less kernel CPU in clusters with thousands of services.
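
As a sketch, the eBPF-based kube-proxy replacement behind these numbers is typically enabled through the Cilium Helm chart. The value names below follow recent chart versions and the API server address is a placeholder; verify both against your release:

# values.yaml for the Cilium Helm chart (names may vary by chart version)
kubeProxyReplacement: true        # eBPF handles Service load-balancing instead of kube-proxy
k8sServiceHost: API_SERVER_IP     # kube-apiserver address (placeholder)
k8sServicePort: 6443
bpf:
  masquerade: true                # eBPF-based masquerading instead of iptables

With these values, kube-proxy can be removed from the nodes entirely; Cilium programs Service translation directly into eBPF hash maps.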

Capabilities Beyond Basic CNI

Cilium goes far beyond “pod networking”. Relevant additional capabilities:

L7 Network Policies

Cilium can apply policies not only by IP/port (L3/L4) but by application content (L7). For example:

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: "allow-frontend-api"
spec:
  endpointSelector:
    matchLabels:
      app: api
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: frontend
    toPorts:
    - ports:
      - port: "80"
        protocol: TCP
      rules:
        http:
        - method: "GET"
          path: "/api/v1/.*"

With this policy, only GET requests from frontend pods to paths matching /api/v1/.* are allowed; any POST, or any path outside /api/v1/, is denied by default. With iptables this granularity isn’t possible — it would require an additional L7 proxy.

Integrated Observability (Hubble)

Hubble is Cilium’s observability layer. It shows in real time which packets are processed, which policies apply, which connections are established. A dashboard that replaces much of what previously required tcpdump + conntrack.
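
As an illustrative sketch (value names per recent Cilium Helm charts; verify for your version), Hubble with its relay and UI is enabled roughly like this:

hubble:
  enabled: true
  relay:
    enabled: true     # aggregates flow data across all nodes
  ui:
    enabled: true     # web dashboard with service maps and live flows

Once deployed, the hubble CLI can also stream flows from the terminal (e.g. hubble observe --follow).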

Transparent Encryption

Cilium can encrypt all pod-to-pod traffic with WireGuard or IPsec, enabled with a flag. Without an additional service mesh.
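
A minimal sketch of that flag, expressed as Helm values (names per recent Cilium charts; WireGuard mode requires a kernel with WireGuard support):

encryption:
  enabled: true
  type: wireguard     # alternatively "ipsec"

WireGuard manages node keys automatically; the IPsec mode additionally requires provisioning a pre-shared key as a Kubernetes Secret.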

Ambient Mesh Mode (Istio Integration)

In 2022, Istio announced ambient mesh, which removes sidecars using ztunnel + waypoint proxies. Cilium integrates as the CNI that makes parts of that design possible, with performance traditional sidecars couldn’t match.

When to Adopt Cilium

Cilium makes clear sense when:

  • Large K8s clusters (>50 nodes, >1000 services). Scale benefits become noticeable.
  • Service mesh without sidecars. If you were considering traditional Istio, Cilium with Hubble + L7 policies can replace much of it.
  • Strict security requirements. L7 policies, transparent encryption, strong service identity.
  • Deep observability required. Hubble gives visibility other CNIs don’t natively provide.

Cases where it may not pay:

  • Small clusters (<20 nodes). Cilium’s operational overhead outweighs benefits.
  • Old kernel (<4.19). Without a modern kernel, eBPF features are limited.
  • Teams without eBPF experience. Cilium debugging requires eBPF understanding — real learning curve.

Migration From Another CNI

Migrating from Calico or Flannel to Cilium isn’t trivial, but the process is well-documented. Typical steps:

  1. Test in staging with a dedicated node pool.
  2. Rolling deployment replacing Calico node by node (the app must tolerate the brief per-pod reconnection).
  3. Validation of existing network policies — syntax is similar but there are subtle differences.
  4. Gradual activation of advanced features (Hubble, encryption, L7 policies) after stabilisation.
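
For step 2, Cilium’s migration guide drives the per-node cutover with a CiliumNodeConfig object, so that only labelled nodes switch CNIs. The label and path below follow that guide, but the API group/version and defaults vary by Cilium release — check them against your version’s documentation:

apiVersion: cilium.io/v2
kind: CiliumNodeConfig
metadata:
  namespace: kube-system
  name: cilium-default
spec:
  nodeSelector:
    matchLabels:
      io.cilium.migration/cilium-default: "true"   # applied node by node during the rollout
  defaults:
    write-cni-conf-when-ready: /host/etc/cni/net.d/05-cilium.conflist

Labelling a node (and restarting its Cilium agent pod) flips that node to Cilium while the rest of the cluster keeps running the old CNI.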

Organisations like Datadog, Adobe, and Bell Canada have documented successful migrations, and AWS ships Cilium as the default CNI in EKS Anywhere.

Related, see our coverage of Pixie for K8s observability and eBPF as monitoring technology — both rely on the same kernel primitives.

Conclusion

Cilium isn’t just another CNI: it’s an architectural shift putting eBPF at the center of Kubernetes networking. For large clusters, or clusters with advanced security and observability requirements, it’s hard to justify not adopting it. For small clusters, the recommendation is to get to know the project and be ready for the transition when growth justifies it.

Follow us on jacar.es for more on eBPF, Kubernetes, and cloud-native platform architecture.
