Cilium Service Mesh: When You Don’t Need Sidecars

Cilium started as an eBPF-based CNI and has evolved into a complete alternative to traditional service meshes, without sidecars. Its architecture uses eBPF to handle policy, observability, and encryption in the kernel, with no per-pod proxy. For large clusters, the resource savings are significant. Istio responded with Ambient Mode, which follows a similar philosophy. This article compares the sidecarless approaches and explains when to pick each.

The Sidecar Problem

The sidecar model (Linkerd, classic Istio):

  • One Envoy/linkerd-proxy per pod.
  • Resource overhead: 50-200MB RAM + CPU per pod.
  • Additional latency: 2-5ms round-trip.
  • Operational complexity: many processes, lifecycle management.

In clusters with thousands of pods, that per-pod cost is paid once for every sidecar, and the total becomes significant.
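To make that multiplication concrete, here is a minimal sketch that turns the per-pod RAM range quoted above into a cluster-wide total. The function name and the pod counts (1,000 and 5,000) are illustrative assumptions, not measurements.

```python
def sidecar_ram_gb(pods: int, mb_per_sidecar: float) -> float:
    """Total sidecar RAM in GB, assuming one sidecar per pod (1 GB = 1000 MB)."""
    return pods * mb_per_sidecar / 1000


# Per-pod range quoted in the list above: 50-200 MB of RAM per sidecar.
LOW_MB, HIGH_MB = 50, 200

# Hypothetical cluster sizes, chosen only for illustration.
for pods in (1000, 5000):
    low = sidecar_ram_gb(pods, LOW_MB)
    high = sidecar_ram_gb(pods, HIGH_MB)
    print(f"{pods} pods: {low:.0f}-{high:.0f} GB of RAM spent on sidecars")
```

The estimate ignores CPU and the added latency, so it understates the real cost of the sidecar model.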

Cilium’s Approach

Cilium replaces sidecar proxies with:

  • eBPF in the kernel for simple policy and encryption.
  • A shared Envoy (per node rather than per pod) for complex L7 features, started only where they are used.
  • Hubble for native observability.
  • CNI integration: Cilium is both the CNI and the service mesh in one component.

Result: comparable features with less overhead.

Main Features

eBPF-based Transparent Encryption

Cilium can encrypt inter-node traffic with WireGuard (simple and fast) or IPsec (more compatible). No sidecar injection needed.

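# CiliumClusterwideNetworkPolicy: only accept ingress from endpoints inside the cluster.
# This policy restricts traffic; it does not switch encryption on by itself.
# WireGuard or IPsec is enabled in the Cilium installation settings
# (for example the Helm values encryption.enabled=true and encryption.type=wireguard).
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: enforce-encryption
spec:
  endpointSelector: {}   # applies to every endpoint
  ingress:
    - fromEntities:
        - cluster

WireGuard runs per node, not per pod. That is less granular than per-service mTLS, but more efficient.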

L7 Policies

Cilium supports application-level policy:

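# Allow frontend pods to call only GET /api/v1/users on the api pods (port 80).
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: api-allow-frontend-get   # illustrative name; the original snippet omitted metadata
spec:
  endpointSelector:
    matchLabels:
      app: api
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend
      toPorts:
        - ports:
            - port: "80"
              protocol: TCP
          rules:
            http:
              - method: "GET"
                path: "/api/v1/users"   # evaluated as a regular expression

For traffic that needs L7 handling (HTTP verbs, paths), Cilium starts one Envoy per node rather than one per pod, which keeps the overhead lower.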
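The allow-list behaviour of the HTTP rule above can be sketched in a few lines. This is only an illustration of the matching logic, not Cilium’s or Envoy’s implementation: the `HttpRule` class and `is_allowed` function are hypothetical names, and the sketch assumes that method and path are regular expressions that must match the whole value.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class HttpRule:
    method: str  # regular expression for the HTTP method
    path: str    # regular expression for the request path


def is_allowed(rules: list[HttpRule], method: str, path: str) -> bool:
    """Allow a request only if some rule matches both its method and its path."""
    return any(
        re.fullmatch(rule.method, method) and re.fullmatch(rule.path, path)
        for rule in rules
    )


# The single rule from the policy above.
rules = [HttpRule(method="GET", path="/api/v1/users")]

print(is_allowed(rules, "GET", "/api/v1/users"))     # method and path both match
print(is_allowed(rules, "POST", "/api/v1/users"))    # method differs
print(is_allowed(rules, "GET", "/api/v1/users/42"))  # path is not a full match
```

Anything that no rule matches is rejected, which is why a single narrow rule like this one is enough to lock an endpoint down to one verb and one path.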

Hubble: Observability

Hubble is Cilium’s observability layer:

  • Detailed flow logs.
  • Service dependency maps (who talks to whom).
  • Policy verdicts (why a request was allowed/denied).
  • Prometheus and Grafana integration.

Functionally equivalent to Kiali plus linkerd-viz, but built in.

Load Balancing

Cilium includes a load balancer that replaces kube-proxy:

  • XDP for ingress with massive throughput.
  • Session affinity, health checks.
  • BGP integration to announce LoadBalancer IPs.

For large clusters, it is a direct replacement for MetalLB plus kube-proxy.

Multicluster and Distributed Service Mesh

Cilium Cluster Mesh connects multiple clusters:

  • Services accessible by DNS name across clusters.
  • Automatic failover between clusters.
  • Cross-cluster consistent policy.

In practice this is operationally simpler than Istio federation.

Cilium vs Istio Ambient

Istio responded to the sidecar critique with Ambient Mode (GA in 2024):

  • ztunnel per node (L4 + mTLS).
  • Optional Waypoint proxy per namespace/cluster for L7.
  • Advantage: sidecar-less, similar to Cilium.

| Aspect | Cilium | Istio Ambient |
| --- | --- | --- |
| Kernel layer | Native eBPF | iptables + ztunnel |
| L4 encryption | Per-node WireGuard | Per-identity mTLS in ztunnel |
| L7 features | Per-node Envoy on-demand | Per-namespace Waypoint |
| CNI integration | Native | Separate |
| Policy API | CiliumNetworkPolicy | Istio AuthorizationPolicy |
| Observability | Hubble | Kiali + istioctl |
| Sidecarless maturity | GA 2023 | GA 2024 |

Cilium has a longer track record in sidecarless operation. Istio Ambient is newer but inherits Istio’s mature ecosystem.

Cilium vs Linkerd

Linkerd continues with sidecars (Rust linkerd2-proxy):

  • Linkerd is simpler to operate but has sidecar overhead (though very light).
  • Cilium has more features but more complexity.
  • Cilium is also CNI; Linkerd complements your CNI.

For clusters that already run another CNI (Calico, Flannel), migrating to Cilium is a big step. For greenfield clusters, a unified Cilium stack is attractive.

Resource Comparisons

Rough benchmark for orientation (100-node cluster, 1,000 pods):

| Stack | Total overhead |
| --- | --- |
| Classic Istio (sidecars) | ~100 GB RAM |
| Linkerd | ~10 GB RAM |
| Cilium + CNI | ~5 GB RAM |
| Istio Ambient | ~15 GB RAM |

These numbers are approximate and depend heavily on configuration.
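As a quick sanity check, the totals can be converted into an implied per-pod cost. The sketch below uses only the approximate figures from the table; the helper name `per_pod_mb` is an assumption made for illustration.

```python
PODS = 1000  # cluster size used in the table above

# Approximate totals from the table, in GB of RAM.
total_overhead_gb = {
    "Classic Istio (sidecars)": 100,
    "Istio Ambient": 15,
    "Linkerd": 10,
    "Cilium + CNI": 5,
}


def per_pod_mb(total_gb: float, pods: int = PODS) -> float:
    """Implied overhead per pod in MB (1 GB = 1000 MB)."""
    return total_gb * 1000 / pods


baseline_gb = total_overhead_gb["Classic Istio (sidecars)"]
for stack, gb in total_overhead_gb.items():
    share = gb / baseline_gb
    print(f"{stack}: ~{per_pod_mb(gb):.0f} MB per pod ({share:.0%} of classic Istio)")
```

The classic Istio figure works out to roughly 100 MB per pod, which sits inside the 50-200 MB per-sidecar range quoted earlier, so the table is at least internally consistent with that estimate.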

Real Cases

Cilium in production:

  • Datadog: replaced iptables-based networking.
  • Bell Canada: standard CNI.
  • Sky UK: multi-cluster service mesh.
  • Lyft: considering migration.
  • Google GKE integrates Cilium as Dataplane V2.

Adoption is strongest among teams with eBPF expertise.

Limitations

An honest look at Cilium’s drawbacks:

  • Steep learning curve: eBPF, CRDs, specific tooling.
  • Kernel compatibility: the best features require recent kernels.
  • Less granular identity: encryption is per node rather than per service. For strict multi-tenancy, Istio Ambient with per-pod-identity mTLS is the better fit.
  • Disruptive migration: changing the CNI in an existing cluster is a project in its own right.
  • Smaller community than Istio.

When to Choose Cilium

Good fits:

  • Large clusters (>500 pods) where sidecar overhead matters.
  • Teams with eBPF experience or willing to invest.
  • Greenfield Kubernetes without legacy CNI.
  • Need for L7 policy with high throughput.
  • Multi-cluster with advanced connectivity requirements.

When NOT Cilium

  • Small cluster where sidecars aren’t a problem.
  • Already running Istio with complex features: the migration doesn’t pay off.
  • Team without low-level networking experience.
  • Fine per-pod identity requirements (prefer Istio Ambient).

Commercial Ecosystem

Isovalent, the company behind Cilium, was acquired by Cisco in 2024. That signals enterprise validation, but also a potential vendor push. The open-source project still works without any commercial dependency.

Conclusion

Cilium represents a genuine evolution of the service mesh: sidecarless, eBPF-native, and CNI-integrated. For large clusters and technically capable teams, it offers real resource and feature advantages. It is not the right choice for everyone: Linkerd remains valid for simplicity, classic Istio for feature-completeness, and Istio Ambient as a sidecarless alternative with different trade-offs. The service-mesh landscape in 2024 has more mature options than ever, so the decision should be based on your technical context and your team, not on trends.

Follow us on jacar.es for more on Kubernetes, eBPF, and service mesh.
