eBPF

In simple terms

Traditionally, if you wanted to add custom behaviour to the Linux kernel (a new packet filter, a performance tracer, a security policy), you had two options: write a kernel module (dangerous: a bug can crash the whole kernel, and the module has to be rebuilt and kept in step with each kernel version) or do the work in userspace and accept the cost of system calls and data copies for every event. eBPF is a third option: you write a small program in a restricted subset of C; a verifier in the kernel checks it for safety before it is loaded; and the kernel JIT-compiles it and runs it at hook points inside the kernel at near-native speed. No kernel module, no reboot, and a far lower risk of crashing the kernel. Loading a program still normally requires root or an equivalent capability.

More detail

Origins: BPF (the Berkeley Packet Filter) was introduced in BSD in 1992 for efficient packet filtering (it is the mechanism behind tcpdump). eBPF (extended BPF), exposed to userspace through the bpf() system call in Linux 3.18 (2014), dramatically extended it with a richer instruction set, maps (key-value storage shared with userspace), and hook points throughout the kernel, far beyond packet filtering.

The eBPF architecture:

Write: eBPF programs are usually written in a restricted subset of C (loops must be provably bounded; memory access is limited to what the verifier can prove safe) and compiled with clang/LLVM to eBPF bytecode.
Verify: at load time the kernel’s eBPF verifier statically analyses the bytecode. It checks that every execution path terminates, that no memory access is out of bounds, that pointers are checked before they are dereferenced, and that only approved helper functions are called. Because these checks run before the program does, they add no per-event runtime cost.
JIT compile: the eBPF bytecode is JIT-compiled to native instructions (x86-64, ARM64, and other architectures), so performance is close to that of natively compiled kernel code.
Attach: the program is attached to a kernel hook point.
Run: every time the hook fires, the eBPF program executes in kernel context.
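
The verify-then-attach flow above can be sketched as a toy model in plain Python. This is not the kernel’s verifier or the eBPF instruction set: the Insn type, the four opcodes, and the verify and attach helpers are all hypothetical, and the rules are much cruder than the real ones (for example, this sketch rejects every backward jump, whereas the real verifier accepts loops it can prove are bounded). It only illustrates that a program is checked statically, before it ever runs, and is refused at load time if a check fails.

```python
from dataclasses import dataclass

STACK_SIZE = 512  # eBPF programs get a 512-byte stack


@dataclass(frozen=True)
class Insn:
    """One instruction of a made-up, four-opcode toy language."""
    op: str        # "load", "store", "jump", or "exit"
    arg: int = 0   # stack offset for load/store, target index for jump


def verify(prog):
    """Statically check a toy program. Return None if OK, else a reason."""
    if not prog or prog[-1].op != "exit":
        return "program must end with exit"
    for i, insn in enumerate(prog):
        if insn.op in ("load", "store"):
            if not 0 <= insn.arg < STACK_SIZE:
                return f"insn {i}: stack access out of bounds"
        elif insn.op == "jump":
            if insn.arg <= i:
                return f"insn {i}: backward jump (possible unbounded loop)"
            if insn.arg >= len(prog):
                return f"insn {i}: jump target outside program"
        elif insn.op != "exit":
            return f"insn {i}: unknown opcode"
    return None


def attach(prog, hooks, hook_name):
    """Attach a program to a named hook only if it passes verification."""
    reason = verify(prog)
    if reason is not None:
        raise ValueError(f"rejected by verifier: {reason}")
    hooks.setdefault(hook_name, []).append(prog)
```

In this sketch, a program such as [Insn("load", 600), Insn("exit")] is rejected by attach with a ValueError and never reaches the hook table, which mirrors the property that unsafe programs fail at load time rather than at run time.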

Key hook points:

Network stack:

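XDP (eXpress Data Path): the earliest hook, run in the driver as soon as a packet is received and before the kernel allocates its own per-packet metadata (the sk_buff). Can drop, redirect, or modify packets at or near line rate on fast NICs (up to 100 Gbit/s in reported deployments). Used for DDoS mitigation and load balancing.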
TC (Traffic Control): hook in the kernel’s traffic control layer; both ingress and egress.
Socket filters (classic BPF): filter packets on a socket — used by tcpdump, Wireshark.
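
To make the XDP idea concrete, here is a userspace simulation in Python of the decision an XDP program makes for each frame: check that the headers are actually present, read the IPv4 source address, and return a verdict. The verdict values XDP_DROP = 1 and XDP_PASS = 2 match the kernel’s constants; everything else (the xdp_filter and make_frame names, the blocklist, and the addresses, which come from documentation ranges) is assumed for illustration. A real XDP program would be written in restricted C and would not handle IP options or VLAN tags this naively.

```python
import ipaddress
import struct

XDP_DROP, XDP_PASS = 1, 2      # verdict codes, as in the kernel's enum xdp_action
ETH_HLEN = 14                  # Ethernet header length in bytes
ETH_P_IP = 0x0800              # EtherType for IPv4
IPV4_MIN_HLEN = 20             # IPv4 header without options

# Hypothetical blocklist; in a real program this would live in an eBPF map.
BLOCKED_SOURCES = {int(ipaddress.IPv4Address("192.0.2.1"))}


def xdp_filter(frame: bytes) -> int:
    """Return a verdict for one raw Ethernet frame."""
    # Bounds check first: the verifier insists on this before any header read.
    if len(frame) < ETH_HLEN + IPV4_MIN_HLEN:
        return XDP_PASS
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETH_P_IP:
        return XDP_PASS
    # The source address sits 12 bytes into the IPv4 header.
    (src,) = struct.unpack_from("!I", frame, ETH_HLEN + 12)
    return XDP_DROP if src in BLOCKED_SOURCES else XDP_PASS


def make_frame(src: str) -> bytes:
    """Build a minimal Ethernet + IPv4 frame for testing (zeroed payload)."""
    eth = bytes(12) + struct.pack("!H", ETH_P_IP)
    ip = bytearray(IPV4_MIN_HLEN)
    ip[0] = 0x45  # version 4, header length 5 words
    ip[12:16] = ipaddress.IPv4Address(src).packed
    return eth + bytes(ip)
```

Calling xdp_filter(make_frame("192.0.2.1")) yields XDP_DROP, while a frame from any other source, or one too short to contain the headers, is passed on to the normal stack.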

System calls and tracing:

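kprobes / kretprobes: dynamic probes on almost any kernel function, at entry or return.
uprobes: dynamic probes on functions in userspace programs and libraries.
tracepoints: static probe points with a stable interface, compiled into the kernel (preferable to kprobes in production, because kernel function names and signatures can change between versions).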
perf events: CPU performance counter sampling for profiling.

Security:

LSM BPF: hook into Linux Security Module framework — implement custom MAC policies.
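
As a rough picture of what a custom policy at an LSM hook does, the following Python sketch models a single "file open" decision. LSM hooks return 0 to allow an operation and a negative errno value to deny it; that convention is real. The file_open_hook name, its arguments, and the protected path list are hypothetical, and a real LSM BPF program receives kernel structures rather than a ready-made path string.

```python
import errno

# Hypothetical policy: only root may open these paths.
PROTECTED_PREFIXES = ("/etc/shadow", "/root/.ssh/")


def file_open_hook(uid: int, path: str) -> int:
    """Return 0 to allow the open, or a negative errno to deny it."""
    if uid != 0 and path.startswith(PROTECTED_PREFIXES):
        return -errno.EPERM
    return 0
```

With this policy, file_open_hook(1000, "/etc/shadow") returns -errno.EPERM, while the same call with uid 0, or with a path outside the protected list, returns 0.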

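eBPF maps: key-value stores shared between eBPF programs running in the kernel and userspace. They persist across program invocations (a program’s local variables do not), so they are how programs keep state and pass data to userspace for aggregation and display. Types include hash map, array, LRU hash, ring buffer, per-CPU hash (each CPU updates its own copy, so no locking is needed), and stack trace maps.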
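
The per-CPU idea can be illustrated with a small Python model. It assumes a machine with four CPUs and uses hypothetical names (NUM_CPUS, on_event, read_totals); it is a sketch of the pattern, not of the kernel’s map implementation. Each "CPU" increments only its own copy of the counters, and the userspace side sums the copies when it reads the map.

```python
from collections import Counter

NUM_CPUS = 4  # assumed CPU count for this sketch

# One independent counter table per CPU, so updates never contend.
percpu_counts = [Counter() for _ in range(NUM_CPUS)]


def on_event(cpu: int, comm: str) -> None:
    """Kernel side: the program running on `cpu` bumps its own slot only."""
    percpu_counts[cpu][comm] += 1


def read_totals() -> Counter:
    """Userspace side: merge the per-CPU copies into one view."""
    total = Counter()
    for slot in percpu_counts:
        total.update(slot)
    return total
```

The cost of this design is on the read path, which has to visit every CPU’s copy; the benefit is that the hot path (on_event here) touches no shared state.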

Key tools and frameworks:

bcc (BPF Compiler Collection): Python + C framework for writing BPF tools. Scripts like execsnoop (trace exec calls), tcplife (TCP connection lifetimes), runqlat (run queue latency histogram) are written in bcc. Brendan Gregg’s tools use bcc.

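bpftrace: high-level tracing language for one-liners and scripts. For example, bpftrace -e 'tracepoint:syscalls:sys_enter_read { @[comm] = count(); }' counts read() calls by process name. (The older form kprobe:sys_read fails on recent kernels, where the syscall entry function carries an architecture-specific prefix such as __x64_sys_read.) bpftrace fills the role on Linux that DTrace fills on other systems.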

libbpf + CO-RE: the modern low-level API for portable eBPF programs. CO-RE (Compile Once, Run Everywhere) uses BTF (BPF Type Format) to write programs that work across kernel versions.
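
The relocation idea behind CO-RE can be sketched in Python. The two "kernel" struct layouts below are invented for illustration, as are the relocate and run helpers; real CO-RE works on BTF type information and patches bytecode, which this toy does not attempt. What it shows is the principle: the program names the field it wants, and the offset is resolved at load time against the layout of the kernel it is actually running on.

```python
import struct

# Hypothetical layouts of the same struct on two kernel builds:
# field name -> (byte offset, struct format code).
LAYOUT_KERNEL_A = {"pid": (0, "<i"), "prio": (4, "<i")}
LAYOUT_KERNEL_B = {"prio": (0, "<i"), "flags": (4, "<I"), "pid": (8, "<i")}

# The "compiled" program records field accesses by name, not by offset.
PROGRAM = ["pid"]


def relocate(field_accesses, layout):
    """Load time: turn field names into concrete (offset, format) pairs."""
    return [layout[name] for name in field_accesses]


def run(relocated, raw: bytes):
    """Run time: read each field at its resolved offset."""
    return [struct.unpack_from(fmt, raw, off)[0] for off, fmt in relocated]


# The same logical struct, laid out differently in memory on each kernel.
task_a = struct.pack("<ii", 1234, 20)       # pid, prio
task_b = struct.pack("<iIi", 20, 0, 1234)   # prio, flags, pid
```

The unchanged PROGRAM reads pid correctly from both task_a and task_b once it has been relocated against the matching layout, which is the "compile once" half of the name.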

Cilium: Kubernetes CNI plugin using eBPF for pod networking, network policy, and load balancing. It replaces iptables rules, which are evaluated one after another (O(n) in the number of rules), with eBPF hash map lookups (O(1) on average). Used by Google GKE, Azure AKS, and many large clusters.
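
The O(n) versus O(1) difference can be shown with a small Python comparison. The rule set, addresses, and function names are made up for the example, and neither function reflects how iptables or Cilium is actually implemented; the sketch only contrasts walking an ordered rule list with a single keyed lookup.

```python
def match_linear(rules, dst):
    """Walk an ordered rule list; return (backend, rules_examined)."""
    for examined, (rule_dst, backend) in enumerate(rules, start=1):
        if rule_dst == dst:
            return backend, examined
    return None, len(rules)


def match_hashed(table, dst):
    """Single keyed lookup, independent of how many services exist."""
    return table.get(dst)


# 1000 hypothetical services, keyed by (virtual IP, port).
rules = [((f"10.0.{i // 256}.{i % 256}", 80), f"backend-{i}") for i in range(1000)]
table = dict(rules)
```

Looking up the last service, ("10.0.3.231", 80), makes match_linear examine all 1000 rules before it finds the match, whereas match_hashed does the same amount of work for the first service as for the last.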

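Falco: runtime security monitoring; it collects system call events through an eBPF probe (a kernel module is the alternative driver) and alerts on suspicious syscall and network activity according to rules.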

Katran: Facebook’s eBPF-based L4 load balancer; replaced IPVS; runs at XDP for 100G line rate.

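Performance: XDP can process on the order of 25 Mpps (million packets per second) on a single core for simple operations such as dropping packets. That is far more than the regular kernel network stack manages, because XDP runs before the stack does any per-packet work. Kernel-bypass frameworks such as DPDK are generally at least as fast in raw throughput, since they skip the kernel entirely; XDP’s advantage is that it needs no dedicated polling cores or separate drivers and can still hand packets to the normal stack. For observability (bpftrace), overhead is typically low (often quoted as under 1% CPU) but grows with the rate of the traced events.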
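
To relate the per-core figure to link speed, the short calculation below works out the worst-case packet rate of a 100 Gbit/s Ethernet link and how many cores would be needed at 25 Mpps each. The 25 Mpps value is the figure quoted in the text, taken as given; the frame sizes are standard Ethernet constants. This is back-of-the-envelope arithmetic under the assumption that load spreads evenly across cores, not a benchmark.

```python
import math

LINK_BITS_PER_SEC = 100e9   # 100 Gbit/s link
MIN_FRAME_BYTES = 64        # smallest Ethernet frame
WIRE_OVERHEAD_BYTES = 20    # preamble 7 + start delimiter 1 + inter-frame gap 12
PER_CORE_PPS = 25e6         # per-core XDP rate quoted in the text (assumed)

# Bits each minimum-size frame occupies on the wire.
wire_bits = (MIN_FRAME_BYTES + WIRE_OVERHEAD_BYTES) * 8

# Worst case: the link is saturated with minimum-size frames.
max_pps = LINK_BITS_PER_SEC / wire_bits

# Cores needed if every core sustains PER_CORE_PPS.
cores_needed = math.ceil(max_pps / PER_CORE_PPS)

print(f"{max_pps / 1e6:.1f} Mpps worst case, {cores_needed} cores")
```

Under these assumptions the worst case is about 148.8 Mpps, so roughly six cores at 25 Mpps each would be needed to keep up with a link saturated by minimum-size packets. Real traffic has larger packets on average, which lowers the packet rate considerably.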

Why it matters

eBPF has changed how functionality is added to the Linux kernel. Work that previously required a kernel module or an upstream kernel patch (slow to ship, and a risk to kernel stability) can often be prototyped in eBPF in hours, with no change to the kernel itself. Cilium replaces iptables-based Kubernetes networking in large clusters, including those on Google’s GKE and Microsoft’s AKS. Cloudflare uses eBPF/XDP for DDoS mitigation at 100 Gbit/s. Major observability platforms (Datadog, Dynatrace, New Relic) ship eBPF-based agents. Meta’s Katran load balancer (its replacement for IPVS) and countless performance tools run on eBPF. It is arguably the most significant Linux kernel technology of the last decade.

Real-world examples

Cloudflare: eBPF/XDP drops malicious packets at line rate (~100 Gbit/s) before the kernel network stack processes them; their DDoS mitigation runs at ~26 Mpps per core.
Google GKE (Dataplane V2): uses Cilium with eBPF instead of kube-proxy (iptables) for service routing; O(1) instead of O(n) rule lookup.
Meta Katran: eBPF/XDP-based L4 load balancer that sits in front of Facebook services; replaced an earlier IPVS-based solution.
Datadog: eBPF-based network performance monitoring traces all TCP connections and DNS queries with under 1% overhead.

Common misconceptions

“eBPF is only for networking.” eBPF is a general kernel programmability mechanism — used for networking, observability (tracing), security policy (LSM BPF), and performance profiling. The “Berkeley Packet Filter” name is historical.
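“eBPF programs can crash the kernel.” The verifier rejects unsafe programs before they load, so a verified eBPF program should not be able to crash the kernel, leak memory, or access arbitrary memory. This is the key safety property that distinguishes eBPF from kernel modules. The guarantee is only as strong as the verifier itself: verifier bugs have occasionally led to kernel vulnerabilities, and they are treated and fixed as kernel security bugs.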

Learn next

eBPF sits between full kernel bypass (a userspace driver that takes over the NIC) and standard Linux kernel processing, and it is the basis for kernel-level observability. The verifier’s sandbox model is worth comparing with WebAssembly’s sandbox and with capability-based security models.