eBPF-based continuous profiling turns debugging from “run pprof ad hoc when something fails” into “profiles are always on.” The modern open-source stack as of 2024: Parca, Grafana Beyla, and Pyroscope (now part of Grafana). This article covers their complementary roles and integration patterns.
The Stack
Three complementary tools:
- Parca: 24/7 CPU profiling via eBPF.
- Beyla: HTTP auto-instrumentation + metrics + traces.
- Pyroscope: language-specific deep profiling.
Combined: cluster-wide coverage + targeted depth.
Parca: What It Covers
Installs as a DaemonSet, with one eBPF agent per node:
- Cluster-wide CPU flame graphs.
- Compiled languages: Go, Rust, C++ (deep stack unwinding).
- Interpreted: Python, Node (interpreter frames).
- Prometheus-compatible: metrics + labels.
- Minimal overhead: 0.5-1% CPU.
Beyla: Auto-Instrumentation
Covers:
- HTTP/gRPC traces without SDK changes.
- RED metrics (rate, errors, duration).
- Automatic service graphs.
- Standard OTel output.
Beyla is not a profiler per se: it provides instrumentation without code changes.
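To make the RED metrics concrete, here is a minimal sketch of how rate, error ratio, and duration percentiles can be derived from a batch of request records. It is not Beyla's implementation: the `Request` type, the `red_metrics` helper, and the choice to count only 5xx responses as errors are assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    duration_ms: float  # time spent serving the request
    status: int         # HTTP status code


def red_metrics(requests: list[Request], window_s: float) -> dict[str, float]:
    """Compute Rate, Errors, Duration for one observation window."""
    if not requests or window_s <= 0:
        return {"rate_rps": 0.0, "error_ratio": 0.0, "p50_ms": 0.0, "p95_ms": 0.0}

    durations = sorted(r.duration_ms for r in requests)
    # Assumption: only server-side failures (5xx) count as errors.
    errors = sum(1 for r in requests if r.status >= 500)

    def percentile(p: float) -> float:
        # Nearest-rank percentile over the sorted durations.
        rank = max(1, math.ceil(p / 100 * len(durations)))
        return durations[rank - 1]

    return {
        "rate_rps": len(requests) / window_s,
        "error_ratio": errors / len(requests),
        "p50_ms": percentile(50),
        "p95_ms": percentile(95),
    }
```

In a real deployment these values arrive as OTel metrics; the sketch only shows what the three letters measure.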
Pyroscope
Originally independent, now part of Grafana:
- Language agents: native libraries, deeper info.
- Compiled + interpreted: richer data.
- Storage/querying backend: can ingest from Parca.
Use case: deeper language-specific profiling for specific critical apps.
Install
Parca
helm install parca parca/parca --namespace monitoring --create-namespace
helm install parca-agent parca/parca-agent --namespace monitoring
Per-node agent, centralised Parca server. Auto-discovers processes.
Beyla
helm install beyla grafana/beyla \
--set env.BEYLA_AUTO_INSTRUMENT_TARGET=my-app
Pyroscope
helm install pyroscope grafana/pyroscope
Grafana Integration
Unified stack:
- Grafana Tempo: traces.
- Grafana Mimir: metrics.
- Grafana Loki: logs.
- Grafana Pyroscope: profiles (+ Parca ingestion).
- Grafana: single dashboard.
Click trace → see profile of that period. Strong correlation.
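Conceptually, jumping from a trace to a profile means restricting the profile to the trace's time window. The sketch below shows that idea only; it is not how Grafana implements the link, and the `Sample` type and `profile_for_span` helper are hypothetical names.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Sample:
    timestamp_s: float  # when the stack sample was taken
    function: str       # function that was on-CPU at that moment


def profile_for_span(samples: list[Sample], start_s: float, end_s: float) -> Counter:
    """Count on-CPU samples per function inside the half-open window [start_s, end_s)."""
    return Counter(
        s.function for s in samples if start_s <= s.timestamp_s < end_s
    )
```

Passing a span's start and end timestamps returns the functions that were on-CPU while that span was running.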
Real Overhead
Measured in production:
- Parca: ~0.5-1% CPU per node.
- Beyla: ~1-2% CPU, 50MB memory.
- Pyroscope language agent: 1-5% depending.
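Combined: the ranges sum to 2.5-8% when all three run against the same workload. Staying under 5% means keeping the Pyroscope language agent at the low end of its range, or enabling it only on selected apps. Acceptable.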
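As a sanity check on the budget, the per-tool ranges listed above can be summed. This is a naive worst-case sketch that assumes all three tools are active against the same workload at once; the dictionary keys and the `combined_overhead` helper are illustrative names.

```python
# CPU overhead ranges in percent, taken from the list above: (low, high).
OVERHEAD_PCT = {
    "parca_agent": (0.5, 1.0),
    "beyla": (1.0, 2.0),
    "pyroscope_language_agent": (1.0, 5.0),
}


def combined_overhead(ranges: dict[str, tuple[float, float]]) -> tuple[float, float]:
    """Sum the low and high ends of each range (assumes overheads simply add up)."""
    low = sum(lo for lo, _ in ranges.values())
    high = sum(hi for _, hi in ranges.values())
    return low, high


low, high = combined_overhead(OVERHEAD_PCT)
print(f"combined overhead: {low}% to {high}%")
```

The upper bound is dominated by the language agent, which is why limiting it to a few critical apps keeps the total down.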
Use Cases
Detect Regressions
Deploy new version → Parca shows increased CPU in a specific function → roll back before user impact.
Optimise Hotspots
Weekly flame graph review → identify top-10 CPU consumers → optimise.
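A top-N list can also be computed outside the UI. The sketch below assumes the profile has been exported in the folded-stack text format used by Brendan Gregg's FlameGraph tools (`frame;frame;frame count`, one stack per line); `top_self_time` is a hypothetical helper name.

```python
from collections import Counter


def top_self_time(folded_lines: list[str], n: int = 10) -> list[tuple[str, int, float]]:
    """Rank functions by self samples: samples where the function is the leaf frame."""
    self_samples: Counter = Counter()
    for line in folded_lines:
        line = line.strip()
        if not line:
            continue
        # The sample count is the last space-separated field.
        stack, _, count = line.rpartition(" ")
        leaf = stack.split(";")[-1]
        self_samples[leaf] += int(count)

    total = sum(self_samples.values())
    if total == 0:
        return []
    # Each row: (function, self samples, share of all samples in percent).
    return [(fn, c, 100 * c / total) for fn, c in self_samples.most_common(n)]
```

Ranking by self time highlights functions that burn CPU themselves, rather than callers that are only wide because of what they call.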
Compare Periods
“Why is latency up this week?” → compare profiles last week vs this week.
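A period comparison boils down to diffing each function's share of samples between two profiles. Here is a minimal sketch, assuming each profile has already been reduced to a mapping from function name to sample count; `profile_diff` and its 1-point default threshold are illustrative choices.

```python
def profile_diff(
    baseline: dict[str, int],
    current: dict[str, int],
    min_delta_pct: float = 1.0,
) -> list[tuple[str, float, float, float]]:
    """Return (function, baseline %, current %, delta) rows, largest increase first."""
    # Normalise to shares so periods with different sample totals are comparable.
    base_total = sum(baseline.values()) or 1
    cur_total = sum(current.values()) or 1

    rows = []
    for fn in set(baseline) | set(current):
        base_pct = 100 * baseline.get(fn, 0) / base_total
        cur_pct = 100 * current.get(fn, 0) / cur_total
        delta = cur_pct - base_pct
        if abs(delta) >= min_delta_pct:
            rows.append((fn, round(base_pct, 1), round(cur_pct, 1), round(delta, 1)))
    return sorted(rows, key=lambda r: r[3], reverse=True)
```

Functions at the top of the result grew their share of CPU and are the first candidates to inspect.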
Root Cause Incidents
Post-mortem: profile during incident vs baseline highlights specific function.
Flame Graph Interpretation
Basics:
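- Width = share of CPU samples (CPU time); the horizontal axis is not a timeline.
- Stacks read from the bottom up: callers below, callees above.
- Hot paths = wide boxes along the top edge (plateaus), i.e. functions that are themselves on-CPU.
Reading flame graphs is a skill; Brendan Gregg's tutorials are a good place to learn it.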
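The relationship between box width and on-CPU time can be sketched in a few lines. The example assumes input in the folded-stack format (`frame;frame count`) and uses a hypothetical `frame_widths` helper: a box's width is the total samples passing through that stack position, while its exposed top edge is the samples where it was the leaf.

```python
from collections import defaultdict


def frame_widths(folded_lines: list[str]) -> dict[tuple[str, ...], tuple[int, int]]:
    """Map each box (stack prefix, root first) to (total samples, self samples)."""
    total: dict[tuple[str, ...], int] = defaultdict(int)
    self_only: dict[tuple[str, ...], int] = defaultdict(int)

    for line in folded_lines:
        line = line.strip()
        if not line:
            continue
        stack, _, count = line.rpartition(" ")
        frames = stack.split(";")
        samples = int(count)
        # Every prefix of the stack is a box that this sample widens.
        for depth in range(1, len(frames) + 1):
            total[tuple(frames[:depth])] += samples
        # Only the full stack's last frame was actually on-CPU.
        self_only[tuple(frames)] += samples

    return {path: (total[path], self_only.get(path, 0)) for path in total}
```

A box with a large total but zero self samples is wide only because of its children; the plateaus are the boxes where self samples are high.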
Language Specifics
Go
Parca excellent: DWARF info, full stacks. Preferred.
Rust
Similar to Go — compiled, rich stacks.
Python/Node
Interpreter frames visible. For Python-level detail, Pyroscope language agent better.
Java
Complex JVM. Specialised Pyroscope Java agent.
C/C++
Native — Parca perfect.
Cost
- Storage: compressed profile samples ~1GB/day for 50-node cluster.
- S3 or similar for long-term.
- Retention: typically 15-30 days.
Manageable.
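The storage figures above translate into a simple estimate. The sketch assumes profile volume scales linearly with node count, which is an assumption rather than a measured property; both helper names are illustrative.

```python
def daily_volume_gb(ref_gb_per_day: float, ref_nodes: int, nodes: int) -> float:
    """Scale a reference daily volume to another cluster size (assumes linear scaling)."""
    return ref_gb_per_day * nodes / ref_nodes


def retained_storage_gb(gb_per_day: float, retention_days: int) -> float:
    """Steady-state storage once the retention window is full."""
    return gb_per_day * retention_days


# Reference point from the list above: ~1 GB/day for a 50-node cluster.
per_day = daily_volume_gb(1.0, ref_nodes=50, nodes=50)
print(retained_storage_gb(per_day, 15), retained_storage_gb(per_day, 30))
```

For the 50-node reference cluster this gives roughly 15 to 30 GB retained, depending on the retention window.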
Alternatives
- Datadog Continuous Profiler: commercial, mature, multi-language.
- New Relic: similar.
- Elastic Profiling: newer.
- Cloudflare AMP: specific context.
The open-source Parca/Beyla/Pyroscope stack offers comparable features and is free to use.
Security
eBPF-based tools require:
- Privileged container.
- HostPID / HostNetwork.
- Sufficiently recent kernel (5.10+).
These privileges widen the attack surface. Restrict RBAC for the agents accordingly.
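The kernel requirement is easy to check before rollout. The sketch below parses a kernel release string, such as the output of `uname -r`, and compares it with the 5.10 minimum stated above; the helper names are illustrative, and distribution kernels with backported eBPF features are not accounted for.

```python
import re

MIN_KERNEL = (5, 10)  # minimum version stated in the list above


def parse_kernel_release(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_supported(release: str, minimum: tuple[int, int] = MIN_KERNEL) -> bool:
    """True if the kernel's major.minor version is at least the required minimum."""
    return parse_kernel_release(release) >= minimum
```

On a Linux node, `kernel_supported(platform.release())` (after `import platform`) checks the running kernel.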
For New Adopters
Start path:
- Install Parca in staging.
- Review flame graphs for a few weeks.
- Build skill in interpreting profiles.
- Roll out to production carefully.
- Set up correlation with traces and logs.
Skill + value accumulate.
When Profiling Is Not Useful
- No performance issues — wasted effort.
- Simple workloads — htop suffices.
- No expertise interpreting — data without action.
Most production systems benefit; some don’t.
Conclusion
Continuous profiling via eBPF is a significant improvement over ad-hoc pprof. The Parca + Beyla + Pyroscope stack covers the spectrum without large commercial cost. For performance-sensitive operations, it is worth the investment, and skill in reading flame graphs pays off. Integration with the Grafana stack is straightforward. Modern production observability increasingly includes profiling as a fourth pillar (after metrics, logs, and traces). Now is a good time to adopt it.