eBPF, which stands for extended Berkeley Packet Filter, is a technology that makes it possible to run special programs deep inside the Linux operating system in an isolated way. Although it grew out of a mechanism for filtering network packets inside the kernel, eBPF now lets sandboxed programs attach to many kinds of kernel events, with safety checks that keep those programs from destabilizing the system. Using this approach, teams can more easily and efficiently collect crucial observability data from Linux applications and network resources.
In a Linux system, most applications that users interact with directly run in what is known as “userland.” Because userland applications don’t have to integrate deeply with low-level system processes, they are easy to install and run.
However, a major limitation of userland applications is that, because they don’t have low-level access to the Linux kernel, it is difficult for them to collect data directly from the operating system. Only processes that run in so-called “kernel land” have that level of access. Kernel land processes are typically limited to services that are critical to the operating system, such as those that manage hardware.
Traditionally, developers bridged the gap between userland and kernel land by building features into the kernel that exposed certain types of data to userland applications in a secure way. The problem with this approach is that it requires modification of the kernel itself, which is a hugely complex platform that is difficult to change in a secure way.
An alternative solution was to create kernel modules that could collect data from the kernel. System administrators could load those modules into the kernel after the kernel was already running. This made it possible to gain low-level access without directly modifying the kernel. However, the drawback with this strategy is that loading kernel modules typically requires root-level access. Kernel modules may also introduce security and stability issues if they are not properly designed, and they increase the resource utilization of the operating system.
eBPF takes a different approach to gaining low-level access to the kernel. It provides a sandboxed environment in which admins can load programs that run directly in the kernel. Sandboxed means the programs are mostly isolated from the rest of the system, even though they can interact with certain components based on the settings admins configure.
Using this approach, teams that want to know what is happening deep inside the Linux operating system gain several benefits:
Simplicity: Unlike directly modifying the kernel, loading eBPF programs is fast and easy.
Security: Because eBPF programs are sandboxed and checked before they run, they avoid many of the security and stability risks that kernel modules may pose.
Efficiency: eBPF programs use minimal system resources, which means they have little performance impact on the operating system.
eBPF has many potential use cases related to observability, such as monitoring the performance of hardware devices or helping to detect security issues.
However, one of the areas where eBPF offers the most value is network observability. Modern applications are often deployed across a cluster of servers. They are also frequently hosted inside containers, serverless functions or similar types of infrastructure, which abstract applications from the host operating system.
Under these conditions, observing the network has conventionally required collecting and correlating data about network operations from a variety of servers. What’s more, it has required getting network data from individual containers that, in most cases, don’t log their networking operations to their host operating system or store network-related data persistently. To address these challenges, teams had to deploy a complex array of userland applications: often a network monitoring agent on each server, plus an agent that could collect networking data from each container through a service mesh, sidecar architecture or similar approach.
eBPF offers a much simpler, more elegant solution to network observability. With eBPF, teams can run kernel-level programs that observe network operations for all containers running on a server, which eliminates the need to deploy agents for each container separately. eBPF also provides access to low-level networking data that may not be available from within a container, whose access to kernel-level resources is usually restricted (unless the container runs in privileged mode, which is not a recommended approach).
In this way, eBPF helps address one of the core challenges of network observability in modern applications. It provides a secure, simple, and efficient means of understanding what is happening within all Linux-based endpoints.
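Because a single kernel-level program sees traffic from every container on the host, the userland side of an eBPF tool usually has to work out which container each event belongs to. The sketch below shows one way that step might look. It assumes events carry a process ID and that the container runtime places a 64-character hexadecimal container ID in the process's cgroup path, as Docker commonly does. The function names are illustrative and not taken from any particular tool.

```python
import re
from typing import Optional

# Assumption: the runtime embeds a 64-hex-character container ID in the cgroup path.
_CONTAINER_ID = re.compile(r"(?<![0-9a-f])([0-9a-f]{64})(?![0-9a-f])")


def container_id_from_cgroup(cgroup_text: str) -> Optional[str]:
    """Return the first container ID found in /proc/<pid>/cgroup content, or None."""
    for line in cgroup_text.splitlines():
        match = _CONTAINER_ID.search(line)
        if match:
            return match.group(1)
    return None


def container_id_for_pid(pid: int) -> Optional[str]:
    """Look up the container ID for a live process on the local host."""
    try:
        with open(f"/proc/{pid}/cgroup", encoding="utf-8", errors="replace") as handle:
            return container_id_from_cgroup(handle.read())
    except OSError:
        # The process has exited, or /proc is not available on this system.
        return None
```

A collector could call container_id_for_pid() for each process ID reported by the kernel-side program and treat a result of None as a process running directly on the host.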
If you want to create and run eBPF programs yourself, you can do so using open source tools like bpftrace and bcc, which make it easy to create eBPF programs and load them into a Linux kernel. Just make sure you have a Linux-based operating system that is running kernel version 4.4 or later. Older kernels have limited or no eBPF support, and some tools and features need a newer kernel still. You may also need to disable “lockdown mode” on your kernel by running these commands as root:
# echo 1 > /proc/sys/kernel/sysrq
# echo x > /proc/sysrq-trigger
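As a rough illustration of what creating and deploying a program with bcc involves, the sketch below counts outbound IPv4 TCP connection attempts per process by attaching a kprobe to the kernel's tcp_v4_connect function. It assumes bcc's Python bindings and matching kernel headers are installed and that the script runs as root. The names BPF_PROGRAM, trace_connect, connects, and count_connects are made up for this example.

```python
import time

# Restricted C that bcc compiles to eBPF bytecode and loads into the kernel.
BPF_PROGRAM = r"""
#include <uapi/linux/ptrace.h>

BPF_HASH(connects, u32, u64);   // process ID -> number of connect attempts

int trace_connect(struct pt_regs *ctx) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;   // upper 32 bits hold the process ID
    connects.increment(pid);
    return 0;
}
"""


def count_connects(seconds: float = 10.0) -> dict:
    """Load the program, wait, and return {pid: count}. Needs root and bcc."""
    from bcc import BPF  # imported here so this module loads even without bcc

    bpf = BPF(text=BPF_PROGRAM)
    bpf.attach_kprobe(event="tcp_v4_connect", fn_name="trace_connect")
    time.sleep(seconds)
    return {key.value: value.value for key, value in bpf["connects"].items()}
```

Calling count_connects() as root while some traffic is generated on the host should return a dictionary keyed by process ID. Nothing is loaded into the kernel until the function is called.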
However, a simpler way to take advantage of eBPF is to deploy an observability tool that uses eBPF under the hood to collect the data you need to understand what is happening in your systems. Increasingly, modern observability suites are incorporating eBPF to help monitor complex systems.
Using a tool that leverages eBPF, you gain the benefits of eBPF without having to create and deploy eBPF programs yourself. You can also analyze the data automatically. eBPF itself does not help with that, because it simply collects data and leaves it up to the user to shape and interpret it.
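To make the "shaping" step concrete, here is a minimal sketch of the kind of post-processing that has to happen in userland once raw data leaves the kernel. It assumes events arrive as simple records with a container name, a destination address, and a byte count. The FlowEvent type and its fields are hypothetical and do not reflect the output format of any specific tool.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class FlowEvent:
    """Hypothetical raw record emitted by a kernel-side eBPF program."""
    container: str
    dst_addr: str
    bytes_sent: int


def bytes_by_destination(events: Iterable[FlowEvent]) -> Dict[Tuple[str, str], int]:
    """Sum bytes sent for each (container, destination) pair."""
    totals: Dict[Tuple[str, str], int] = defaultdict(int)
    for event in events:
        totals[(event.container, event.dst_addr)] += event.bytes_sent
    return dict(totals)


def top_talkers(events: Iterable[FlowEvent], limit: int = 3) -> List[Tuple[Tuple[str, str], int]]:
    """Return the heaviest (container, destination) pairs, largest first."""
    totals = bytes_by_destination(events)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:limit]
```

An observability suite does far more than this (storage, correlation across hosts, alerting), but even this small aggregation is work that eBPF itself leaves to the user.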
A practical example of using the extended Berkeley Packet Filter can be found at Kentik Labs, Kentik’s open source hub for the developer, DevOps and site reliability engineering (SRE) community. The Convis (“container visibility”) project demonstrates use of the Linux extended BPF facility to attribute process and container information to network traffic. You can read more about Convis in this Kentik Labs blog post, “Convis - Open Source Container Visibility”.