eBPF – The Future of Networking and Security (cilium.io)
The future ought to be capabilities. All this policy scripting stuff is just drudgery make-work that gets us nowhere.
Can you provide more context on why you feel that's true (or even possible)?
For the last few years, I managed the Container Runtime group at Facebook. My experience has been:
1. `if (has_capability(..., X)) { ... }` gets put into code pretty haphazardly in a way that's not necessarily super well structured. Once it's there, it's ABI, and you're screwed if you want to iterate on it. That's why cap_sys_admin is /almost/ root.
2. If you wanted to do the right thing from the jump (e.g. for bpf itself), you'd have to add a new capability. This is a heavy lift for something that might not actually get any traction. It requires changing a bunch of common tools, and you likely end up breaking a bunch of applications.
3. Debugging capability failures is a pain in the ass. We ended up building and deploying capability tracing infrastructure just to figure out what people are actually using.
4. For gradual rollouts of enforcement/changes, you need the flexibility to warn first and enforce second. We did large-scale monitoring of all such changes to make sure we didn't break the workloads.
5. Even if you nail all of the above, the ability to make finer-than-capability-grained decisions (e.g. binding to port 20 or 80 is okay but not port 22) is really valuable (see the sketch right after this list).
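Here's a minimal sketch (mine, not from the comment above) of what point 5 can look like with eBPF: a cgroup/bind4 hook that allows binding to ports 20 and 80 but rejects port 22. The program name and the exact port policy are illustrative only.

```c
// Hypothetical eBPF cgroup/bind4 program: allow bind() to ports 20 and 80,
// reject port 22, default-allow the rest. Compiled with clang -target bpf
// and attached to a cgroup (e.g. with bpftool or libbpf).
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

SEC("cgroup/bind4")
int restrict_bind4(struct bpf_sock_addr *ctx)
{
    /* user_port is stored in network byte order */
    if (ctx->user_port == bpf_htons(20) || ctx->user_port == bpf_htons(80))
        return 1;                  /* allow the bind() */
    if (ctx->user_port == bpf_htons(22))
        return 0;                  /* deny: the caller's bind() fails with EPERM */
    return 1;                      /* default-allow everything else */
}

char LICENSE[] SEC("license") = "GPL";
```

Every bind() issued by a process in the attached cgroup passes through this check, which is exactly the kind of finer-than-capability-grained decision a single capability bit can't express.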
I'm all for kernel abstractions that just work and solve all problems for all people, but I think the overwhelming trend has been towards kernel interfaces that provide a lot of flexibility and then more opinionated libraries/tools that kind of let us have our cake and eat it too (io_uring => liburing, bpf => libbpf, btrfs => btrfs-progs).
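To illustrate the bpf => libbpf pairing, here is a rough sketch of the opinionated userspace side: the kernel exposes the very flexible bpf() syscall, and libbpf wraps it in an open/load/attach workflow. The object and program names ("my_prog.bpf.o", "my_prog") are placeholders, and the error handling follows libbpf >= 1.0 conventions.

```c
// Minimal libbpf loader sketch: open a compiled BPF object file, load it
// into the kernel, and attach its program. Placeholder names throughout.
#include <stdio.h>
#include <bpf/libbpf.h>

int main(void)
{
    struct bpf_object *obj = bpf_object__open_file("my_prog.bpf.o", NULL);
    if (!obj || bpf_object__load(obj)) {
        fprintf(stderr, "failed to open/load BPF object\n");
        return 1;
    }

    struct bpf_program *prog = bpf_object__find_program_by_name(obj, "my_prog");
    struct bpf_link *link = prog ? bpf_program__attach(prog) : NULL;
    if (!link) {
        fprintf(stderr, "failed to attach program\n");
        bpf_object__close(obj);
        return 1;
    }

    /* the program stays attached while this process holds the link */
    getchar();

    bpf_link__destroy(link);
    bpf_object__close(obj);
    return 0;
}
```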
Are we talking about POSIX capabilities or object capabilities?
What POSIX and the Linux kernel call "capabilities" unfortunately results in quite a bit of confusion, which I believe is the cause of your post. POSIX capabilities bear little resemblance to actual capability-based security (where a capability is a send/recv-able token that references an object and a set of rights for interacting with that object).
I was not aware of object capabilities -- TIL.
That said, looking at the (apparently) leading implementation, Capsicum:
> Capsicum also introduces capability mode, which disables (with ECAPMODE) all syscalls that access any kind of global namespace; this is mostly (but not completely) implemented in userspace as a seccomp-bpf filter.
So I do feel that bpf ultimately enables building the kinds of abstractions that people want.
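For anyone who hasn't seen what such a seccomp-bpf filter looks like, here's a tiny sketch of the mechanism (my own illustration, not Capsicum's actual filter): a classic-BPF program that makes one namespace-touching syscall, openat(), fail with EPERM and allows everything else.

```c
// Hedged sketch of a seccomp-bpf filter: openat() returns EPERM, all other
// syscalls are allowed. A real filter would also check seccomp_data.arch
// and cover far more syscalls; Capsicum's capability mode is much broader.
#include <errno.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

int install_filter(void)
{
    struct sock_filter filter[] = {
        /* A = syscall number */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        /* if (A == __NR_openat) return ERRNO(EPERM) */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_openat, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | EPERM),
        /* otherwise allow the syscall */
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = sizeof(filter) / sizeof(filter[0]),
        .filter = filter,
    };

    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0))
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}
```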
BPF wasn't originally conceived of as a reference monitor or ACL system; in fact, originally, it was believed that operating systems would use BPF-style packet filters to do pretty much all their demuxing.
That's all true. I'm worried about what people will do with this stuff in practice (more rope to hang themselves with), not about what eBPF fundamentally is.
Are you referring to STREAMS? https://en.m.wikipedia.org/wiki/STREAMS
Not really, though BPF was sort of a design alternative to Solaris/SVR4 STREAMS. If you follow the cites, BPF is really an evolution of CSPF, the CMU-Stanford Packet Filter. (As I understand it! McCanne could show up at any moment and correct me.)
AFAIK BPF wasn’t conceived as anything security-related, it was just an optimization.
From the article:
Buggy kernel code will crash your machine. The kernel is not protected from a buggy kernel module. I think people assumed that this is just how things are; that's the price to do kernel programming. eBPF changed this dogma. It brought safety to kernel programming.
"It brought safety to kernel programming" , if you use eBPF and don't expose bugs in the parser, or checking or validation systems. (These have already happened).
Extending a bit on what Alexei is talking about (Full eBPF summit talk: https://youtu.be/jw8tEPP6jwQ?t=639)
Many people seem to assume that kernel code is perfect and that when code is merged into the Linux kernel, it is automatically secure. That is definitely not the case. Kernel developers make mistakes as well, and those mistakes can have devastating consequences.
Right now, the security of the Linux kernel code depends on a combination of code review, fuzzing, controlling the pace of code changes, and running LTS releases to increase the chance others found the bugs already.
eBPF strengthens the security model of kernel development by adding a verification step to it. It means that there is an additional layer of protection in case of code imperfections.
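To make the verification step concrete, here is a small example (mine, not from the talk) of the kind of rule the verifier enforces: bpf_map_lookup_elem() may return NULL, and a program that dereferences the result without checking it is rejected at load time, before it can ever run.

```c
// Count execve() calls in a one-element array map. The NULL check on the
// map lookup is mandatory: remove it and the verifier refuses to load the
// program. Names are illustrative.
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
    __uint(type, BPF_MAP_TYPE_ARRAY);
    __uint(max_entries, 1);
    __type(key, __u32);
    __type(value, __u64);
} exec_count SEC(".maps");

SEC("tracepoint/syscalls/sys_enter_execve")
int count_execve(void *ctx)
{
    __u32 key = 0;
    __u64 *val = bpf_map_lookup_elem(&exec_count, &key);

    if (!val)                       /* verifier-required NULL check */
        return 0;
    __sync_fetch_and_add(val, 1);   /* atomic increment */
    return 0;
}

char LICENSE[] SEC("license") = "GPL";
```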
The focus on eBPF safety is awesome. eBPF is software, software will have bugs, and eBPF is no exception. The best way to improve the security of software is to question it. Given the widespread use of eBPF in highly critical and exposed scenarios, the pressure to make it as bug-free as possible is very high, so it's probably fair to assume that the scrutiny it receives will lead to a high-quality implementation of the verifier.
BPF has always been statically verified, back to 1991 or whenever.
If anything, eBPF is less sound than classic BPF, because the verifier is dramatically more complicated, as is the execution environment.
He means security in the sense of adding new security controls to Linux. Yes, the core idea of BPF (e- or otherwise) is that the code can be verified not to harm the kernel.
It was, but it was an optimization over earlier VM-based packet filters, which were definitely not optimizations; they were pursued as elegant system design, not high-performance networking.
I'm not an expert in BPF by any means. My gut tells me that the hype around eBPF is an example of Hyrum's law: eBPF will be leveraged beyond its design intent, as an in-kernel JIT engine. This is more a comment on human nature than on the technology itself.
Even looking at the original BPF, which focused on filtering packets as they are forwarded to userspace (think tcpdump)[1], and at the extensions that eBPF provides on top to hook into various subsystems[2,3], it's clear that this is going far beyond the use cases originally envisioned. (A minimal example of that classic, socket-attached style of filter follows the links below.) I'd love to see an eBPF paper to follow up on / contrast with the '93 USENIX BPF paper.
[1]: https://www.tcpdump.org/papers/bpf-usenix93.pdf
[2]: https://ebpf.io/what-is-ebpf#hook-overview
[3]: http://www.brendangregg.com/BPF/bpf_performance_tools_book.p...
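For contrast, here is roughly what that classic, socket-attached style of filter looks like in C (my sketch, using the SO_ATTACH_FILTER mechanism that libpcap/tcpdump builds on, not code from the papers): the kernel runs the four-instruction filter and only forwards ARP frames to userspace.

```c
// Classic BPF: attach a hand-written sock_filter program to a packet
// socket so only ARP frames reach userspace. Needs CAP_NET_RAW/root.
#include <stdio.h>
#include <arpa/inet.h>
#include <linux/filter.h>
#include <linux/if_ether.h>
#include <sys/socket.h>

int main(void)
{
    struct sock_filter code[] = {
        BPF_STMT(BPF_LD | BPF_H | BPF_ABS, 12),               /* A = EtherType */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, ETH_P_ARP, 0, 1), /* ARP? */
        BPF_STMT(BPF_RET | BPF_K, 0xFFFF),                    /* yes: pass up to 64 KiB */
        BPF_STMT(BPF_RET | BPF_K, 0),                         /* no: drop */
    };
    struct sock_fprog prog = { .len = 4, .filter = code };

    int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0 || setsockopt(fd, SOL_SOCKET, SO_ATTACH_FILTER,
                             &prog, sizeof(prog)) < 0) {
        perror("socket/setsockopt");
        return 1;
    }
    /* recvfrom(fd, ...) now only ever sees ARP frames */
    return 0;
}
```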
FWIW: I just wrote a long-ish post on the history from BPF (and before BPF) to eBPF and XDP:
https://fly.io/blog/bpf-xdp-packet-filters-and-udp/
An interesting fact is that packet filtering as a problem domain has been dominated by in-kernel virtual machines going back into the 1980s; it's an idea that comes all the way from Xerox.
I need to know what type of water the people at Xerox Palo Alto were drinking.
They pioneered many groundbreaking and game-changing advances in computing, including (but not limited to) the windowing desktop environment, the integrated programming/structural editor with Cedar/Tioga, SQL (the team moved to Oracle), Ethernet networking, the laser printer, VLSI, and the Jupiter operational transform for distributed computing (a precursor to CRDTs). Each of these technologies is now an industry of its own.
I kinda feel like Dealers of Lightning should be required reading at this point[1], both for the breadth of invention and how they squandered it.
[1] https://www.amazon.com/Dealers-Lightning-Xerox-PARC-Computer...
Unfortunately we are still quite far from the safe computing platforms they were using at Xerox (Interlisp-D, Smalltalk, Mesa and Mesa/Cedar).
The best we have gotten so far are the hybrids .NET/Windows, JME, Android Java/Linux, Chrome/Linux, Swift/iOS/macOS.
The shift from BPF to eBPF was less of an evolutionary step than the name might indicate. The overlap with the name BPF is primarily due to the requirement for eBPF to be a superset of BPF in order to avoid having to maintain two virtual machines long-term. This was one of the conditions for eBPF to be merged, and in that context, the name eBPF made sense.
Disagree (see sibling post). Classic BPF could have been translated into any virtual machine design they came up with (because classic BPF is incredibly simple). When McCanne came up with the same design in 1998, his team called it "BPF+", for the same reason eBPF is called eBPF --- because it is pretty much an evolution of the earlier idea.
I'm not going to argue with you. You can read up on initial naming and framing in slides of netconf and plumbers conferences as well as LKML archives.
Remember when Microsoft claimed to invent various computing technologies, even though they had been around since the 70s or earlier?
That’s the type of history you’re articulating here.
To be clear: the dispute over the history of BPF/eBPF is not interesting, and I don't want to litigate it anymore than they do.
I'm just here to say that eBPF and BPF are in fact pretty closely related. The eBPF design is uncannily similar to Begel, McCanne, and Graham's BPF+ design[1]; in particular, the BPF+ paper spends a fair amount of time describing an SSA-based compiler for a RISC-y register ISA, and eBPF... just uses (at this point) LLVM for a RISC-y register ISA.
Most notably, the fundamental execution integrity model has, until pretty recently, remained the same --- forward jumps only, limited program size. And that's to me the defining feature of the architecture.
The lineage isn't important to me, so much as the sort of continuous unbroken line from BPF to eBPF, regardless of what LKML says.
[1]: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.597...
Easy to predict something that's already happening. :)
https://github.com/xdp-project/xdp-tutorial
It's a good thing, I think! Compared to loading new unmanaged C code into the kernel, BPF is a really nice way to add functionality to Linux.
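To give a flavor of what the tutorial above walks you through, here is a minimal XDP sketch (my own; the policy is arbitrary): drop IPv4 ICMP packets at the driver hook and pass everything else.

```c
// Minimal XDP program: drop IPv4 ICMP, pass the rest. Note the explicit
// bounds checks against data_end; the verifier rejects the program
// without them.
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <linux/in.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

SEC("xdp")
int drop_icmp(struct xdp_md *ctx)
{
    void *data     = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;

    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end)          /* verifier-required bounds check */
        return XDP_PASS;
    if (eth->h_proto != bpf_htons(ETH_P_IP))
        return XDP_PASS;

    struct iphdr *ip = (void *)(eth + 1);
    if ((void *)(ip + 1) > data_end)
        return XDP_PASS;

    return ip->protocol == IPPROTO_ICMP ? XDP_DROP : XDP_PASS;
}

char LICENSE[] SEC("license") = "GPL";
```

Attach it with something like `ip link set dev eth0 xdpgeneric obj drop_icmp.o sec xdp` (interface and file name are placeholders) and the check runs before the rest of the kernel networking stack ever sees the packet.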
Disclaimer: I wrote the post.
Happy to answer any questions.
First of all, congrats. The tech is great and I hope you'll be able to make a company around it.
As for the question: How are you looking to make money?
I'm not going to spam this forum with a marketing pitch so I'll just refer to https://www.isovalent.com/product and add that you can buy a Cilium Enterprise distribution with enterprise specific add-ons from us.
First of all, two annoying lies in the title alone.
The Future of Networking? Networking is not only Linux, and eBPF is Linux-only. Everyone else uses the more secure variant, dTrace, which even has widespread user-space support, so you can trace across the kernel, processes, and their extensions/scripts. And has been able to for decades.
The Future of Security? eBPF is insecure. User-accessible arrays in the kernel can never be secure. dTrace did not do that for a reason; eBPF was already compromised by the Spectre-like attacks, and the fixes were laughable at best, done to save face.
Linux might be advised to do better (or is it just NIH?), but advertising Worse as Better was fashionable only in the '80s.
I personally think that networking will be almost exclusively based on Linux in some form. If you want to interpret it as "eBPF – The Future of Linux Networking", then that is totally fine as well. That said, eBPF-based networking can already be offloaded to SmartNICs, so it may be less Linux-specific than you seem to assume right now.
Comparing dTrace and eBPF is definitely a very interesting question. I've actually asked Brendan Gregg in the Q&A of his keynote at eBPF summit this year how he compares dTrace and eBPF these days. Here is his answer (jumps right to the specific question): https://youtu.be/jw8tEPP6jwQ?t=4618
I doubt that eBPF will remain a Linux-only technology. Ports to FreeBSD already seem to be underway [0], and Microsoft has declared intent to invest in eBPF [1]. I'm not sure what that means timeline-wise for eBPF availability on Windows, though. There are also several user-space implementations of eBPF, which could become interesting as a way to provide a universal programmability approach across traditional kernels like Linux, microkernels like Snap, and application kernels like gVisor.
[0] https://papers.freebsd.org/2018/bsdcan/hayakawa-ebpf_impleme... [1] https://twitter.com/markrussinovich/status/12830391539203686...
eBPF is also used by a high-throughput blockchain, Solana:
https://github.com/solana-labs/rbpf
Unlike the more common Rust + LLVM + WASM toolchain, Solana smart contracts use Rust + LLVM + eBPF.
Solana uses a custom Rust re-implementation of a custom C re-implementation of the Linux BPF VM, for what appear to be licensing reasons. Notably, it JITs all bytecode without a verifier and without emitting runtime bounds checks[0]. I suspect you can pop a shell on every single computer on their testnet somewhere between "trivially" and "extremely trivially".
They appear to be running some kind of "open security test"[1] but are only paying out their own imaginary funny money. I'd suggest you run for the hills as fast as you can instead of considering Solana.
0: https://github.com/solana-labs/rbpf/blob/f7007d6ae8728e61401... 1: https://forums.solana.com/t/tour-de-sol-stage-1-details/317
Interesting. I am not sure your point about "without a verifier" makes sense, because AFAIK you need to verify the contract only once, when it is deployed, not every time it is invoked. Verifying a contract should be super cheap compared to executing it, unless eBPF verification is somehow super expensive.
I was under the impression that Cilium was one of the more common choices for Kubernetes CNI but judging by the other comments... maybe not?
We’re currently moving to Kubernetes for our infrastructure at the Berkeley OCF (https://ocf.berkeley.edu/), and picked Cilium for all the networking things.
It’s good to see that there’s a company backing it now!