gma1k/Podtrace: eBPF-driven diagnostic tool for Kubernetes applications 🐝


Podtrace logo

A lightweight yet powerful eBPF-driven diagnostic tool for Kubernetes applications. Podtrace delivers full-stack observability from kernel events to application-layer behavior, all activated on demand, with no prior configuration or instrumentation. With a single command, it uncovers insights across the entire lifecycle of a pod, including network flows, TCP/UDP performance, file system activity, memory behavior, latency patterns, system calls, and high-level application events such as HTTP, DNS, and database queries.

Overview

Podtrace attaches eBPF programs directly to the container, allowing it to observe real behavior as it happens at runtime. It automatically correlates low-level kernel activity with high-level application operations, surfacing clear, human-readable diagnostic events that reveal what the pod is experiencing internally.
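
For intuition about the kind of kernel-level measurement this automates, the sketch below shows a hand-rolled bpftrace one-liner that times TCP connect calls with a kprobe/kretprobe pair. This is purely illustrative and is not Podtrace's own eBPF code; Podtrace attaches its programs automatically and correlates the results with application events.

# Illustrative only: measure tcp_v4_connect() latency by hand with bpftrace
sudo bpftrace -e '
kprobe:tcp_v4_connect { @start[tid] = nsecs; }
kretprobe:tcp_v4_connect /@start[tid]/ {
  @connect_latency_us = hist((nsecs - @start[tid]) / 1000);
  delete(@start[tid]);
}'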

Instead of assembling data from multiple systems or modifying application code, Podtrace provides deep operational visibility in one place, enabling you to understand:

  • Why a service is slow
  • Where latency originates
  • How network and I/O resources are being used
  • Which operations block or fail
  • How requests flow through the application
  • What happens inside the pod during incidents

By combining system-level details, application-layer insights, and real-time event correlation, Podtrace acts as a single on-demand observability lens. This makes it uniquely effective for debugging, performance analysis, and production incident response in Kubernetes environments, especially when time, context, or access is limited.

Documentation

Podtrace documentation is available in the doc/ directory.

Features

Network Tracing

  • TCP Connection Monitoring: Tracks TCP IPv4/IPv6 connection latency and errors
  • TCP RTT Analysis: Detects RTT spikes and retry patterns
  • TCP State Tracking: Monitors TCP connection state transitions (SYN, ESTABLISHED, FIN, etc.)
  • TCP Retransmission Tracking: Detects TCP retransmissions for network quality diagnostics
  • Network Device Errors: Monitors network interface errors and packet drops
  • UDP Network Tracing: Tracks UDP send/receive operations with latency and bandwidth metrics
  • I/O Bandwidth Tracking: Monitors bytes transferred for TCP/UDP send/receive operations

File System Monitoring

  • File Operations: Tracks read, write, and fsync operations with latency analysis
  • File Path Tracking: Captures full file paths
  • I/O Bandwidth: Monitors bytes transferred for file read/write operations
  • Throughput Analysis: Calculates average throughput and peak transfer rates

Memory & System Events

  • Page Fault Tracking: Monitors page faults with error code analysis
  • OOM Kill Detection: Tracks out-of-memory kills with memory usage details

Application Layer

  • HTTP Tracing: HTTP request/response tracking via uprobes
  • DNS Tracking: Monitors DNS lookups with latency and error tracking
  • Database Query Tracing: Tracks PostgreSQL and MySQL query execution with pattern extraction and latency analysis
  • TLS/SSL Handshake Tracking: Tracks TLS handshake latency, errors, and failures
  • Connection Pool Monitoring: Tracks connection pool usage, monitors pool exhaustion, and tracks connection reuse patterns

System Monitoring

  • CPU/Scheduling Tracking: Monitors thread blocking and CPU scheduling events
  • CPU Usage per Process: Shows CPU consumption by process
  • Process Activity Analysis: Shows which processes are generating events
  • Stack Traces for Slow Operations: Captures user-space stack traces for slow I/O, DNS, CPU blocks, memory faults, and other operations exceeding thresholds
  • Lock Contention Tracking: Monitors futex and pthread mutex waits with timing and hot lock identification
  • Syscall Tracing: Tracks process lifecycle via execve, fork/clone, open/openat, and close syscalls with file descriptor leak detection
  • Network Reliability: Monitors TCP retransmissions and network device errors for network quality diagnostics
  • Database Query Tracing: Tracks PostgreSQL and MySQL query execution patterns and latency
  • Resource Limit Monitoring: Monitors resource usage against configured limits
  • Error Correlation with Root Cause Analysis: Correlates errors with operations and Kubernetes context

Distributed Tracing

  • Trace Context Extraction: Automatically extracts trace context from HTTP/HTTP2 headers and gRPC metadata
  • Event Correlation: Groups events by trace ID to build complete request flows across services
  • Request Flow Graphs: Builds directed graphs showing service interactions with latency and error metrics
  • Multiple Exporters: Supports OpenTelemetry (OTLP), Jaeger, and Splunk HEC
  • Sampling Support: Configurable sampling rates to control export volume

Diagnostics

  • Diagnose Mode: Collects events for a specified duration and generates a comprehensive summary report

Alerting

  • Real-time Alerts: Sends immediate notifications when fatal, critical, or warning-level issues are detected
  • Multiple Channels: Supports webhooks, Slack, and Splunk HEC for alert delivery
  • Smart Deduplication: Prevents alert storms with configurable deduplication windows
  • Rate Limiting: Configurable rate limits to prevent overwhelming notification systems

Prerequisites

  • Linux kernel 5.8+ with BTF support
  • Go 1.24+
  • Kubernetes cluster access
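
Before building, the kernel prerequisites can be sanity-checked with a couple of standard commands; this is a small sketch, not a Podtrace feature:

# Kernel version (needs 5.8+)
uname -r

# BTF support: this file exists when the kernel was built with CONFIG_DEBUG_INFO_BTF=y
ls /sys/kernel/btf/vmlinux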

Building

# Install dependencies
make deps

# Build eBPF program and Go binary
make build

# Build and set capabilities
make build-setup

Usage

Basic Usage

# Trace a pod in real-time
./bin/podtrace -n production my-pod

# Run in diagnostic mode
./bin/podtrace -n production my-pod --diagnose 20s

Diagnose Report

The diagnose mode generates a comprehensive report including:

  • Summary Statistics: Total events, events per second, collection period
  • DNS Statistics: DNS lookup latency, errors, top targets
  • TCP Statistics: RTT analysis, spikes detection, send/receive operations, bandwidth metrics (total bytes, average bytes, peak bytes, throughput)
  • UDP Statistics: Send/receive operations, latency analysis, bandwidth metrics, error tracking
  • Connection Statistics: IPv4/IPv6 connection latency, failures, error breakdown, top targets
  • TCP Connection State Tracking: State transition analysis, state distribution, connection lifecycle monitoring
  • File System Statistics: Read, write, and fsync operation latency, slow operations, bandwidth metrics (total bytes, average bytes, throughput)
  • HTTP Statistics: Request/response counts, latency analysis, bandwidth metrics, top requested URLs
  • Memory Statistics: Page fault counts and error codes, OOM kill tracking with memory usage details
  • CPU Statistics: Thread blocking times and scheduling events
  • CPU Usage by Process: CPU percentage per process
  • Process Activity: Top active processes by event count
  • Activity Timeline: Event distribution over time
  • Activity Bursts: Detection of burst periods
  • Connection Patterns: Analysis of connection behavior
  • Network I/O Patterns: Send/receive ratios and throughput analysis
  • Process and Syscall Activity: Process execution, fork/clone, file operations, and file descriptor leak detection
  • Stack Traces for Slow Operations: User-space stack traces for operations exceeding thresholds with symbol resolution
  • Lock Contention Analysis: Futex and pthread mutex wait times and hot lock identification
  • Network Reliability: TCP retransmission tracking and network device error monitoring
  • Database Query Performance: Query pattern analysis and execution latency (PostgreSQL, MySQL)
  • Connection Pool Statistics: Connection pool usage, acquire/release rates, reuse patterns, and exhaustion events
  • Potential Issues: Automatic detection of high error rates and performance problems
  • Resource Limit Monitoring: Resource usage measured against configured limits
  • Error Correlation with Root Cause Analysis: Correlates errors with operations and Kubernetes context
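
A practical pattern is to capture the diagnose report to a file for later review or for attaching to an incident ticket. The sketch below uses only the flags shown above and assumes the report is written to standard output:

# Run a 60-second diagnosis and save the report (assumes the report prints to stdout)
./bin/podtrace -n production my-pod --diagnose 60s > podtrace-report.txt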

Running without sudo

After building, set capabilities to run without sudo:

sudo ./scripts/setup-capabilities.sh
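
The script above grants the required capabilities. As a rough, hypothetical illustration of what such a setup typically involves on a 5.8+ kernel (the real script's contents may differ):

# Hypothetical example: grant eBPF-related capabilities to the binary
sudo setcap 'cap_bpf,cap_perfmon,cap_net_admin,cap_sys_resource+ep' ./bin/podtrace

# Verify the capabilities were applied
getcap ./bin/podtrace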

Distributed Tracing

Podtrace supports distributed tracing to correlate events across services in your Kubernetes cluster. Traces are automatically extracted from HTTP headers and exported to popular observability backends.
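
For reference, the W3C Trace Context header that can be extracted from HTTP traffic has the form traceparent: 00-<trace-id>-<parent-span-id>-<flags>. A request carrying one might look like this (hypothetical service URL; the trace ID is the example from the W3C specification):

# Hypothetical request: the trace ID in the traceparent header can be extracted and correlated
curl -H 'traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01' \
  http://my-service:8080/api/orders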

Quick Start

# Enable tracing with OpenTelemetry
./bin/podtrace -n production my-pod \
  --tracing \
  --tracing-otlp-endpoint http://otel-collector:4318

# Enable tracing with Jaeger
./bin/podtrace -n production my-pod \
  --tracing \
  --tracing-jaeger-endpoint http://jaeger:14268/api/traces

# Enable tracing with Splunk
./bin/podtrace -n production my-pod \
  --tracing \
  --tracing-splunk-endpoint https://splunk:8088/services/collector \
  --tracing-splunk-token YOUR_TOKEN
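
On the receiving side, a minimal OpenTelemetry Collector configuration that accepts OTLP over HTTP on port 4318 (matching the endpoint above) might look like the following. This is an assumed collector setup for recent collector releases, not part of Podtrace:

receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318

exporters:
  # 'debug' exporter in recent collector releases (older builds use 'logging')
  debug: {}

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [debug]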

Features

  • Automatic Trace Extraction: Extracts W3C Trace Context, B3, and Splunk headers
  • Service Correlation: Groups events by trace ID across multiple services
  • Request Flow Graphs: Visualizes service interactions
  • Multiple Exporters: OTLP, Jaeger, and Splunk HEC support
  • Sampling: Configurable sampling rates (0.0-1.0)

Alerting

Podtrace can send real-time alerts when critical issues are detected, including resource limit violations, high error rates, exporter failures, and fatal errors. Alerts are sent via webhooks, Slack, or Splunk HEC.

Quick Start

# Enable alerting with webhook
export PODTRACE_ALERTING_ENABLED=true
export PODTRACE_ALERT_WEBHOOK_URL=https://alerts.example.com/webhook
./bin/podtrace -n production my-pod

# Enable alerting with Slack
export PODTRACE_ALERTING_ENABLED=true
export PODTRACE_ALERT_SLACK_WEBHOOK_URL=https://hooks.slack.com/services/YOUR/WEBHOOK/URL
export PODTRACE_ALERT_SLACK_CHANNEL="#alerts"
./bin/podtrace -n production my-pod

# Enable alerting with Splunk
export PODTRACE_ALERTING_ENABLED=true
export PODTRACE_ALERT_SPLUNK_ENABLED=true
./bin/podtrace -n production my-pod
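
Before pointing Podtrace at a Slack webhook, it can help to verify the webhook itself with a plain curl test against Slack's standard incoming-webhook API (unrelated to Podtrace):

# Verify the Slack incoming webhook works before enabling alerting
curl -X POST -H 'Content-type: application/json' \
  --data '{"text": "Podtrace alerting test"}' \
  https://hooks.slack.com/services/YOUR/WEBHOOK/URL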

Features

  • Automatic Detection: Monitors resource limits, error rates, and system failures
  • Multiple Severity Levels: Fatal, Critical, Warning, and Error
  • Alert Deduplication: Prevents duplicate alerts within a time window
  • Rate Limiting: Configurable limits to prevent alert storms
  • Retry Logic: Automatic retries with exponential backoff

See the Alerting Guide for detailed configuration and usage.


Prometheus & Grafana Integration

Podtrace exposes runtime metrics for Kubernetes pods using a built-in Prometheus endpoint. These metrics cover networking (TCP/UDP), DNS, CPU scheduling, file system operations, memory events, and HTTP tracing, all labeled per process and event type.


Running:

./bin/podtrace -n production my-pod --metrics

launches an HTTP server accessible at:

http://localhost:3000/metrics
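
A quick check that the endpoint is up and serving Podtrace metrics:

# List exported Podtrace metric lines from the built-in endpoint
curl -s http://localhost:3000/metrics | grep '^podtrace_' | head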

Prometheus Scrape Configuration

In your Prometheus scrape job, set <PODTRACE_HOST> to the address of the pod or host running Podtrace.

scrape_configs:
  - job_name: 'Podtrace'
    static_configs:
      - targets: ['<PODTRACE_HOST>:3000']

Available Metrics

All metrics are exported per process and per event type:

  • podtrace_rtt_seconds: Histogram of TCP RTTs
  • podtrace_rtt_latest_seconds: Most recent TCP RTT
  • podtrace_latency_seconds: Histogram of TCP send/receive latency
  • podtrace_latency_latest_seconds: Most recent TCP latency
  • podtrace_dns_latency_seconds_gauge: Latest DNS query latency
  • podtrace_dns_latency_seconds_histogram: Distribution of DNS query latencies
  • podtrace_fs_latency_seconds_gauge: Latest file system operation latency
  • podtrace_fs_latency_seconds_histogram: Distribution of file system operation latencies
  • podtrace_network_bytes_total: Total bytes transferred over the network (TCP/UDP)
  • podtrace_filesystem_bytes_total: Total bytes transferred via file system operations
  • podtrace_cpu_block_seconds_gauge: Latest CPU block time
  • podtrace_cpu_block_seconds_histogram: Distribution of CPU block times
  • podtrace_resource_limit_bytes: Resource limit in bytes (CPU/memory/I/O)
  • podtrace_resource_usage_bytes: Current resource usage in bytes
  • podtrace_resource_utilization_percent: Resource utilization percentage
  • podtrace_resource_alert_level: Resource alert level (0-3: none/warning/critical/emergency)
  • podtrace_pool_acquires_total: Total connection pool acquires
  • podtrace_pool_releases_total: Total connection pool releases
  • podtrace_pool_exhausted_total: Total pool exhaustion events
  • podtrace_pool_wait_time_seconds: Histogram of pool wait times
  • podtrace_pool_connections: Current number of connections in the pool
  • podtrace_pool_utilization: Pool utilization percentage
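
Once Prometheus is scraping these metrics, percentiles can be derived from the histograms with standard PromQL. For example, a recording rule for the p95 TCP RTT, assuming the usual _bucket series suffix that Prometheus client libraries emit for histograms:

groups:
  - name: podtrace
    rules:
      # 95th-percentile TCP RTT over the last 5 minutes, across all processes
      - record: podtrace:rtt_seconds:p95
        expr: histogram_quantile(0.95, sum by (le) (rate(podtrace_rtt_seconds_bucket[5m])))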

Grafana Dashboard

A ready-to-use Grafana dashboard JSON is included in the repository at podtrace/internal/metricsexporter/dashboard/Podtrace-Dashboard.json

Steps to use:

  • Open Grafana and go to Dashboards → New → Import.
  • Paste the JSON or upload the .json file.
  • Select your Prometheus data source.
  • Click Import. The dashboard displays per-process and per-event-type metrics for RTT, latency, DNS, file system, and CPU block time.