Container Monitoring: Beyond cAdvisor
Updated: 2026-05-03
cAdvisor democratised container monitoring and remains relevant, since the kubelet embeds it. But in 2024, observing containers well requires more layers: cluster state metrics, eBPF for deep visibility, and APM for application context. This article maps what to combine and how to avoid the over-engineering that turns the observability stack into the most expensive thing to maintain.
Key takeaways
- The modern minimum stack is cAdvisor + kube-state-metrics + node-exporter + Prometheus + Grafana, which covers roughly 80% of what’s needed.
- cAdvisor gives surface metrics (CPU, RAM, network, disk); eBPF adds syscall latency, per-pod network latencies, and CPU profiling.
- The reference eBPF tools are Pixie (CNCF sandbox), Grafana Beyla, and Parca for continuous profiling.
- OpenTelemetry is the APM standard: instrument critical apps, not all of them.
- The trap is over-engineering: more tools mean more maintenance and more alert noise.
The modern minimum stack
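For serious Kubernetes in 2024, the OSS baseline that covers most of the work is kubelet/cAdvisor (per-container resource usage), kube-state-metrics (Kubernetes object state), node-exporter (host metrics), Prometheus (collection and storage), and Grafana (dashboards). Its overhead is approximately 5% of cluster CPU/RAM, which is acceptable for any production installation.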
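As a rough illustration of how the baseline can be sanity-checked, the sketch below builds a URL for Prometheus's instant-query endpoint (`/api/v1/query`) and inspects the JSON returned for the `up` metric to see which scrape jobs have no healthy target. The job names in `EXPECTED_JOBS` and the `prometheus:9090` address are assumptions for illustration; real names depend on your scrape configuration.

```python
import json
from urllib.parse import urlencode

# Hypothetical job names: adjust to match your own scrape configuration.
EXPECTED_JOBS = {"kubelet-cadvisor", "kube-state-metrics", "node-exporter"}


def instant_query_url(base_url: str, promql: str) -> str:
    """Build a URL for the Prometheus instant-query endpoint."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def missing_jobs(response_body: str, expected=EXPECTED_JOBS) -> set:
    """Return the expected jobs that have no healthy target in an `up` response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    # Each sample looks like {"metric": {...labels...}, "value": [timestamp, "1"]}.
    healthy = {
        sample["metric"].get("job")
        for sample in payload["data"]["result"]
        if sample["value"][1] == "1"
    }
    return set(expected) - healthy
```

In practice the body would come from an HTTP GET against `instant_query_url("http://prometheus:9090", "up")`; the fetch is left out so the sketch stays independent of any particular cluster.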
What’s missing without eBPF
cAdvisor gives surface metrics but can’t answer: is the pod blocked on filesystem I/O? What’s the network latency between pods A and B? Which specific function consumes 80% of CPU? For those questions, eBPF is the correct tool.
eBPF: the layer that changes diagnostics
Pixie[1]: auto-instrumentation of HTTP, gRPC, and DNS without sidecar or code changes. Live flame graphs, automatic service map. The most accessible entry point for eBPF newcomers.
Grafana Beyla[2]: auto-instrumentation for Go, Java, and Node.js apps that generates OpenTelemetry traces without code modification. Simpler than Pixie, focused on traces and metrics.
Parca[3]: continuous CPU profiling via eBPF for identifying hot paths.
APM: the application layer
eBPF gives infrastructure visibility; APM gives business logic visibility. OpenTelemetry is the modern standard. Beyla can auto-generate spans for many cases; for business-specific metrics, the OpenTelemetry SDK is needed. Trace backends: Jaeger and Grafana Tempo.
Essential per-container metrics
Always monitor, without exception: CPU throttling, memory working set, OOM kills, network errors, restart count. For Kubernetes additionally: pod phase, readiness probe failures, HPA desired vs current, PVC usage.
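Two of these metrics are ratios rather than raw values, which the following minimal sketch tries to make concrete. It assumes the inputs come from the counters and gauges cAdvisor commonly exposes (`container_cpu_cfs_throttled_periods_total`, `container_cpu_cfs_periods_total`, `container_memory_working_set_bytes`, `container_spec_memory_limit_bytes`); the function names are hypothetical and the exact metric names can vary by version.

```python
from typing import Optional


def throttling_ratio(throttled_periods_delta: float, periods_delta: float) -> float:
    """Fraction of CFS scheduling periods in which the container was throttled.

    Both arguments are counter increases over the same time window.
    """
    if periods_delta <= 0:
        return 0.0  # no periods elapsed, so nothing could be throttled
    return throttled_periods_delta / periods_delta


def memory_usage_ratio(working_set_bytes: float, limit_bytes: float) -> Optional[float]:
    """Working set as a fraction of the memory limit, or None when no limit is set."""
    if limit_bytes <= 0:
        return None  # a limit of 0 is treated here as "no limit configured"
    return working_set_bytes / limit_bytes
```

The equivalent throttling expression in PromQL is typically written as `rate(container_cpu_cfs_throttled_periods_total[5m]) / rate(container_cpu_cfs_periods_total[5m])`.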
Useful alerts
Few but effective: more than N pod restarts in Y minutes, sustained CPU throttling > 50%, OOM kills (always), memory sustained above 90% of limit, node not ready, HPA at max replicas.
A few useful alerts always beat many ignored ones.
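The per-pod conditions above could be evaluated along the following lines. This is a sketch under stated assumptions, not a replacement for Prometheus alerting rules: the `PodWindow` shape, the alert names, and the default restart threshold of 3 are all hypothetical, and "sustained" is interpreted here as every sample in the window exceeding the threshold.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PodWindow:
    """Samples for one pod over an evaluation window (hypothetical shape)."""
    restarts: int = 0
    oom_kills: int = 0
    throttling_ratios: List[float] = field(default_factory=list)
    memory_ratios: List[float] = field(default_factory=list)


def sustained_above(samples: List[float], threshold: float) -> bool:
    """True only if there is at least one sample and all samples exceed the threshold."""
    return bool(samples) and all(s > threshold for s in samples)


def firing_alerts(pod: PodWindow, max_restarts: int = 3) -> List[str]:
    """Return the names of the alerts that would fire for this pod."""
    alerts = []
    if pod.restarts > max_restarts:
        alerts.append("PodRestartingTooOften")
    if sustained_above(pod.throttling_ratios, 0.5):
        alerts.append("CpuThrottlingSustained")
    if pod.oom_kills > 0:  # OOM kills always alert
        alerts.append("OomKilled")
    if sustained_above(pod.memory_ratios, 0.9):
        alerts.append("MemoryNearLimit")
    return alerts
```

Requiring every sample to exceed the threshold is one simple way to keep a single spike from paging anyone, which is in the spirit of keeping alerts few.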
Conclusion
Monitoring containers well in 2024 requires more than cAdvisor, but not everything at once. The OSS base (Prometheus + kube-state-metrics + node-exporter + Grafana) is solid and sufficient for most teams. eBPF adds deep visibility when the problem justifies it. APM with OpenTelemetry completes the picture for critical business apps. The discipline is in adding layers only when the use case demands it, and keeping alerts few and well thought out.