eBPF for Continuous Profiling: Parca and Beyla

eBPF-based continuous profiling turns debugging from “run pprof ad hoc when something fails” into “a profile that is always on.” A typical open-source stack in 2024 combines Parca, Grafana Beyla, and Pyroscope (now part of Grafana). This article covers their complementary roles and the patterns for integrating them.

The Stack

Three complementary tools:

Parca: 24/7 CPU profiling via eBPF.
Beyla: HTTP auto-instrumentation + metrics + traces.
Pyroscope: language-specific deep profiling.

Combined, they give cluster-wide coverage plus targeted depth.

Parca: What It Covers

Parca installs as a DaemonSet, with one eBPF agent per node. It provides:

Cluster-wide CPU flame graphs.
Compiled languages: Go, Rust, C++ (deep stack unwinding).
Interpreted: Python, Node (interpreter frames).
Prometheus-compatible: metrics + labels.
Minimal overhead: 0.5-1% CPU.

Beyla: Auto-Instrumentation

Covers:

HTTP/gRPC traces without SDK changes.
RED metrics (rate, errors, duration).
Automatic service graphs.
Standard OTel output.

Beyla is not profiling per se; it is instrumentation without code changes.
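To make the RED terms concrete, here is a minimal sketch of how rate, errors, and duration can be derived from a window of request records. It is not how Beyla computes them internally; the `Request` type, the `red_metrics` helper, and the choice to count only 5xx responses as errors are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:
    status: int          # HTTP status code
    duration_ms: float   # server-side latency in milliseconds


def red_metrics(requests, window_seconds):
    """Return rate (req/s), error ratio, and mean duration for one window."""
    total = len(requests)
    if total == 0:
        return {"rate": 0.0, "error_ratio": 0.0, "mean_duration_ms": 0.0}
    # Assumption: only 5xx responses count as errors.
    errors = sum(1 for r in requests if r.status >= 500)
    mean = sum(r.duration_ms for r in requests) / total
    return {
        "rate": total / window_seconds,
        "error_ratio": errors / total,
        "mean_duration_ms": mean,
    }


sample = [Request(200, 12.0), Request(200, 18.0), Request(500, 90.0), Request(404, 8.0)]
metrics = red_metrics(sample, window_seconds=2)
print(metrics)
```

In practice duration is usually reported as a histogram or percentiles rather than a mean; the mean is used here only to keep the sketch short.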

Pyroscope

Originally independent, now part of Grafana:

Language agents: native libraries that expose deeper, runtime-level information.
Compiled and interpreted languages: richer data for both.
Storage/querying backend: can ingest from Parca.

Use case: deeper language-specific profiling for specific critical apps.

Install

Parca

helm install parca parca/parca
helm install parca-agent parca/parca-agent --namespace monitoring

This gives one agent per node reporting to a centralised Parca server. The agent auto-discovers processes.

Beyla

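# Add the Grafana chart repository first (only needed once)
helm repo add grafana https://grafana.github.io/helm-charts
helm install beyla grafana/beyla \
  --set env.BEYLA_AUTO_INSTRUMENT_TARGET=my-app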

Pyroscope

helm install pyroscope grafana/pyroscope

Grafana Integration

Unified stack:

Grafana Tempo: traces.
Grafana Mimir: metrics.
Grafana Loki: logs.
Grafana Pyroscope: profiles (+ Parca ingestion).
Grafana: single dashboard.

Click a trace to see the profile for the same period, which gives strong correlation between signals.

Real Overhead

Measured in production:

Parca: ~0.5-1% CPU per node.
Beyla: ~1-2% CPU, 50MB memory.
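Pyroscope language agent: 1-5% CPU, depending on the language and configuration.

Combined overhead typically stays under 5%, which is acceptable. Note that the worst-case figures above sum to 8% if all three tools run at their upper bound on the same node.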
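The per-tool figures above can be added up to bound the combined cost. The sketch below assumes the overheads are simply additive, which is a simplification; the dictionary keys and the `combined_range` helper are illustrative names.

```python
# CPU overhead ranges in percent (low, high), taken from the figures above.
overhead = {
    "parca": (0.5, 1.0),
    "beyla": (1.0, 2.0),
    "pyroscope_agent": (1.0, 5.0),
}


def combined_range(ranges):
    """Sum the lower and upper bounds, assuming overheads are additive."""
    low = sum(lo for lo, _ in ranges.values())
    high = sum(hi for _, hi in ranges.values())
    return low, high


low, high = combined_range(overhead)
print(f"combined CPU overhead: {low:.1f}% to {high:.1f}%")
```

A sub-5% total therefore corresponds to running the language agent near the lower end of its range, or only on a subset of workloads.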

Use Cases

Detect Regressions

Deploy a new version → Parca shows increased CPU in a specific function → roll back before users are affected.

Optimise Hotspots

Weekly flame graph review → identify the top 10 CPU consumers → optimise them.

Compare Periods

“Why is latency up this week?” → compare last week's profiles with this week's.
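The comparison can be sketched outside any particular UI. The example below assumes both profiles have been exported as folded stacks (one `frame;frame;frame count` line per unique stack, the text format used by Brendan Gregg's flame graph tools). The function names, sample data, and 5% threshold are hypothetical.

```python
from collections import Counter


def self_time_shares(folded_lines):
    """Map leaf function -> share of all samples, from folded stack lines."""
    counts = Counter()
    for line in folded_lines:
        stack, _, samples = line.rpartition(" ")
        leaf = stack.split(";")[-1]  # the function that was on-CPU
        counts[leaf] += int(samples)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {fn: n / total for fn, n in counts.items()}


def diff_profiles(before, after, threshold=0.05):
    """Return (function, share delta) pairs above the threshold, largest first."""
    a, b = self_time_shares(before), self_time_shares(after)
    deltas = {fn: b.get(fn, 0.0) - a.get(fn, 0.0) for fn in set(a) | set(b)}
    flagged = [(fn, d) for fn, d in deltas.items() if abs(d) > threshold]
    return sorted(flagged, key=lambda item: abs(item[1]), reverse=True)


# Made-up sample data for two periods.
last_week = ["main;handler;parse 50", "main;handler;render 30", "main;gc 20"]
this_week = ["main;handler;parse 80", "main;handler;render 10", "main;gc 10"]

result = diff_profiles(last_week, this_week)
for fn, delta in result:
    print(f"{fn}: {delta:+.0%}")
```

Comparing shares rather than raw sample counts keeps the result meaningful when the two periods contain different numbers of samples.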

Root Cause Incidents

Post-mortem: comparing the profile taken during the incident against a baseline highlights the specific function responsible.

Flame Graph Interpretation

Basics:

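Width = share of CPU time (the horizontal axis is not a timeline).
Stacks read from the bottom up: callers below, callees above.
Hot paths = wide boxes at the top of a stack, i.e. functions that were directly on-CPU.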

Reading flame graphs is a skill; Brendan Gregg's tutorials are a good place to learn it.
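The width rule can be made concrete with a small sketch that computes the width of every box from folded stack lines (`frame;frame;frame count`). This illustrates the arithmetic only and is not the rendering code of any particular tool; `frame_widths` and the sample profile are made up.

```python
from collections import defaultdict


def frame_widths(folded_lines):
    """Return {stack-prefix tuple: fraction of all samples} for every box."""
    inclusive = defaultdict(int)
    total = 0
    for line in folded_lines:
        stack, _, samples = line.rpartition(" ")
        n = int(samples)
        total += n
        frames = stack.split(";")
        # Every prefix of a stack is one box in the flame graph, and a
        # sample counts toward each box it passes through.
        for depth in range(1, len(frames) + 1):
            inclusive[tuple(frames[:depth])] += n
    return {prefix: n / total for prefix, n in inclusive.items()}


# Made-up profile with 100 samples in total.
profile = ["main;serve;parse 60", "main;serve;render 20", "main;gc 20"]
widths = frame_widths(profile)
for prefix, width in sorted(widths.items()):
    indent = "  " * (len(prefix) - 1)
    print(f"{indent}{prefix[-1]}: {width:.0%}")
```

The root box always spans the full width, and each child can be no wider than its parent. In this sample, `parse` is the widest leaf and so the first candidate for optimisation.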

Language Specifics

Go

Parca works very well here: with DWARF information it produces full stacks. It is the preferred option for Go.

Rust

Similar to Go: compiled, with rich stacks.

Python/Node

Interpreter frames are visible. For Python-level detail, the Pyroscope language agent is the better choice.

Java

The JVM is complex to profile; use the specialised Pyroscope Java agent.

C/C++

Native code, where Parca is a very good fit.

Cost

Storage: compressed profile samples take ~1 GB/day for a 50-node cluster.
S3 or similar object storage for long-term retention.
Retention: typically 15-30 days.

At this scale the cost is manageable.
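A quick back-of-the-envelope calculation shows what those figures imply. The sketch assumes the ~1 GB/day figure stays constant over the retention window and uses decimal units (1 GB = 1000 MB); `storage_estimate` is an illustrative helper.

```python
def storage_estimate(gb_per_day, nodes, retention_days):
    """Return (total GB retained, MB per node per day)."""
    total_gb = gb_per_day * retention_days
    mb_per_node_day = gb_per_day * 1000 / nodes
    return total_gb, mb_per_node_day


for days in (15, 30):
    total, per_node = storage_estimate(gb_per_day=1, nodes=50, retention_days=days)
    print(f"{days} days: ~{total} GB retained, ~{per_node:.0f} MB per node per day")
```

With the numbers above, that works out to roughly 15 to 30 GB retained at any time, or about 20 MB per node per day.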

Alternatives

Datadog Continuous Profiler: commercial, mature, multi-language.
New Relic: similar.
Elastic Profiling: newer.
Cloudflare AMP: specific context.

The open-source Parca/Beyla/Pyroscope stack offers comparable features and carries no licence cost.

Security

eBPF-based tools require:

Privileged container.
HostPID / HostNetwork.
Sufficiently recent kernel (5.10+).

These privileges widen the attack surface, so restrict RBAC accordingly.
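The kernel requirement is easy to check before rolling out agents. The sketch below compares a kernel release string against the 5.10 minimum listed above. It is only a first approximation, since some distribution kernels backport eBPF features to older version numbers; `parse_kernel` and `kernel_ok` are hypothetical helper names.

```python
import platform
import re

MIN_KERNEL = (5, 10)  # minimum taken from the requirement list above


def parse_kernel(release):
    """Extract (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if not match:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_ok(release, minimum=MIN_KERNEL):
    """True if the release is at or above the minimum (major, minor)."""
    return parse_kernel(release) >= minimum


# Fixed examples, followed by the machine this script runs on.
for release in ("5.15.0-91-generic", "5.4.0-150-generic"):
    print(release, "ok" if kernel_ok(release) else "too old")

try:
    current = platform.release()
    print(current, "ok" if kernel_ok(current) else "too old")
except ValueError as err:
    # Non-Linux platforms may report a release string in another format.
    print(f"skipping local check: {err}")
```

Run on each node (or against the kernel versions reported by the cluster), this gives an early warning before a privileged DaemonSet is deployed.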

For New Adopters

A suggested starting path:

Install Parca in staging.
Review flame graphs for a few weeks.
Build skill in interpreting profiles.
Deploy to production carefully.
Set up correlation with traces and logs.

Skill and value accumulate over time.

When Profiling Is Not Useful

No performance issues: the effort is wasted.
Simple workloads: htop suffices.
No expertise in interpreting profiles: data without action.

Most production systems benefit; some don’t.

Conclusion

Continuous profiling via eBPF is a significant improvement over ad-hoc pprof. The Parca + Beyla + Pyroscope stack covers the spectrum without a large commercial cost. For performance-sensitive operations, it is worth the investment. Skill in interpreting flame graphs pays off, and integration with the Grafana stack is smooth. Modern production observability increasingly includes profiling as a fourth pillar (after metrics, logs, and traces). Now is a good time to adopt it.
