
Pixie: Native Kubernetes Observability Powered by eBPF

Updated: 2026-05-03

Instrumenting a distributed application for useful metrics, traces, and logs has always been expensive: changing code, agreeing on labelling conventions across teams, and re-validating deployments every time a new library shows up. Pixie[1], a CNCF[2] project, proposes a radical alternative: use eBPF[3] to auto-instrument the whole cluster without modifying a single line of the application.

Key takeaways

  • Pixie loads eBPF programs into the kernel to capture HTTP/gRPC/SQL/Redis traffic without touching code.
  • Installs as a DaemonSet; each node consumes roughly 1 extra vCPU and 1.5 GB of RAM.
  • Complements Prometheus (explicit metrics for SLOs) with implicit telemetry for reactive diagnosis.
  • Default retention ~24 hours; for history, export to an external backend.
  • Covers the grey zone of reactive diagnosis that classic tools handle poorly.

What Pixie actually does

Pixie installs a DaemonSet on every cluster node. Each agent pod loads eBPF programs into the kernel that capture — at the syscall and network-stack level — traffic from the most common protocols:

  • HTTP/HTTPS.
  • gRPC.
  • DNS.
  • MySQL.
  • PostgreSQL.
  • Kafka.
  • Redis.

Data is processed locally, enriched with Kubernetes control-plane metadata (pod, namespace, service), and made available via PxL[4], a DataFrame-style query language built for this telemetry.
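As an illustration, a PxL script that summarizes HTTP traffic per service over the last five minutes might look roughly like this. This is a sketch following Pixie's documented DataFrame patterns; table and column names such as `http_events` and `latency` may vary between Pixie versions:

```python
# PxL sketch: request counts and latency quantiles per service (last 5 minutes).
# Table/column names ('http_events', 'latency') follow Pixie's docs but may differ.
import px

df = px.DataFrame(table='http_events', start_time='-5m')
df.service = df.ctx['service']           # enrich with Kubernetes metadata
per_svc = df.groupby('service').agg(
    requests=('latency', px.count),
    latency=('latency', px.quantiles),   # latency distribution per service
)
px.display(per_svc)
```

The DataFrame style means the same mental model (filter, group, aggregate) applies to every protocol Pixie captures.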

Minutes after installing Pixie you get automatic visibility into:

  • Service map: communication graph between pods with p50/p95/p99 latencies.
  • Flame graphs: continuous CPU profile per pod, no prior instrumentation.
  • HTTP request bodies: even HTTPS (via eBPF hooks on OpenSSL’s SSL_read/SSL_write).
  • Slow SQL queries: full query text + execution time.

All of this without annotations, sidecars, or redeploys.

Pixie vs. Prometheus + Grafana

The Prometheus[5] + Grafana[6] duo remains the de-facto Kubernetes-metrics standard for good reasons: it is mature, it scales well, and its cardinality model is well understood. But it covers a different dimension:

  • Prometheus collects explicit metrics: time series the application or exporters expose on /metrics. Requires intentional instrumentation or a suitable exporter.
  • Pixie collects implicit telemetry: what already flows through the network and syscalls. It doesn’t need anyone to export anything.
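To make the distinction concrete: an explicit metric only exists because the application renders it, typically in Prometheus's text exposition format on /metrics. A minimal sketch of that format (the metric name `orders_processed_total` is illustrative, not from any real service):

```python
# Sketch: render counters in Prometheus's text exposition format,
# the kind of output a service would serve on /metrics for scraping.

def render_metrics(counters):
    """counters: dict of metric name -> (help text, current value)."""
    lines = []
    for name, (help_text, value) in counters.items():
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} counter")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"

counters = {
    "orders_processed_total": ("Orders processed since start.", 1042),
}
print(render_metrics(counters))
```

None of this exists unless someone writes it; Pixie's implicit telemetry, by contrast, is derived from traffic that is already there.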

In practice, they complement each other:

  • For business SLOs (orders processed, account balances, conversions), Prometheus with explicit metrics is the right call — that data doesn’t live in network traffic.
  • For reactive diagnosis (“why is service X slow?”), Pixie answers immediately without requiring you to have instrumented the right cause in advance.

A common pattern: Prometheus for SLO dashboards and alerts — see our guide to Prometheus alerts that actually work — and Pixie as the “zoom” tool when something fails and you need detail.

Requirements and limitations

For Pixie to work you need a few things:

  1. Kernel 4.14+ with CONFIG_BPF_JIT. Most modern distros (Ubuntu 20.04+, Debian 11+, Amazon Linux 2023) ship with this.
  2. Kubernetes 1.18 or higher, with permissions to run privileged DaemonSets on nodes. Recent K8s versions — see Kubernetes 1.27 highlights and later — keep supporting it without surprises.
  3. Resources: each node consumes roughly 1 extra vCPU and 1.5 GB of RAM. Not negligible in very dense clusters.
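The kernel requirement in step 1 is easy to check up front. A small helper that compares a `uname -r` release string against Pixie's 4.14 minimum might look like this; the parsing is a simplification that only considers major.minor and drops distro suffixes:

```python
# Sketch: check a kernel release string (as from `uname -r`) against a minimum.
# Simplified: compares major.minor only, ignoring suffixes like '-91-generic'.

def kernel_at_least(release: str, minimum=(4, 14)) -> bool:
    numeric = release.split("-")[0]                      # '5.15.0-91-generic' -> '5.15.0'
    major, minor = (int(p) for p in numeric.split(".")[:2])
    return (major, minor) >= minimum

print(kernel_at_least("5.15.0-91-generic"))  # True: meets the 4.14+ floor
print(kernel_at_least("4.9.0"))              # False: too old for Pixie
```

Note this says nothing about CONFIG_BPF_JIT, which still needs to be verified in the kernel config of each node image.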

Real limitations worth knowing:

  • Short retention window: Pixie stores ~24 hours by default. For long-term historical analysis, export to a backend (New Relic is the official cloud, or DataDog via plugins).
  • Kubernetes only: no version for traditional VMs or bare-metal servers without Kubernetes.
  • Not a full APM: no user-session tracing or distributed sampling like OpenTelemetry[7]. For end-to-end cross-service traces, a dedicated OTel + backend still wins.

When it’s worth it

Pixie shines in teams that meet several of these criteria:

  • Kubernetes cluster with multiple services talking via HTTP/gRPC.
  • Little time or incentive to instrument legacy applications.
  • Frequent need for reactive diagnosis (“something’s slow”).
  • Tolerance for the per-node resource overhead.

Where it does not shine:

  • Clusters with serverless functions (Knative, OpenFaaS) where pods live for only a few seconds.
  • Applications using proprietary binary protocols its parsers don’t cover.

For more fragmented architectures, review the general pattern first — we cover the traps and wins in from monolith to microservices. Related, see our coverage of eBPF as a monitoring tool — the substrate Pixie shares with other modern low-level observability tools.

Conclusion

Pixie rewrites the economics of Kubernetes observability: it cuts upfront instrumentation cost to zero and puts useful data in teams’ hands in minutes. It doesn’t replace Prometheus for SLOs or an APM for cross-service tracing, but it covers a grey zone — reactive diagnosis — that classic tools handle poorly.

  1. Pixie
  2. CNCF
  3. eBPF
  4. PxL
  5. Prometheus
  6. Grafana
  7. OpenTelemetry

Written by

CEO - Jacar Systems

Passionate about technology, cloud infrastructure and artificial intelligence. Writes about DevOps, AI, platforms and software from Madrid.