eBPF for tracing how Firefox uses page faults to load libraries
taras.glek.net
I went to learn more about eBPF, but the ebpf.io site reads like a sales pitch. "Revolutionary technology", "The possibilities are endless, and the innovation that eBPF is unlocking has only just begun", "revolutionary new approaches", "unprecedented visibility".
I know I may be a curmudgeonly old fart, but to echo clktmr's comment, in this case, it seems like a glorified strace(). It also seems like a lot of hype for something that could have lots of unintended consequences.
eBPF is much more than a glorified strace(): with eBPF you can basically insert your own code in a lot of places in the kernel.
This can be packet-processing code to modify the way packets are routed, filtered, or altered, or it can be used to instrument kernel code paths to monitor or debug issues.
The latter is what can be compared to strace(), but with strace() you only see what is happening at the userspace/kernel boundary. With eBPF you can actually look at what is going on in the kernel itself, which is really powerful.
The downside is that eBPF can be a pain to use, with all those obscure tools tightly dependent on your kernel release and options... But things are improving quickly, and if you want to give it a try, I would recommend starting with bpftrace: https://bpftrace.org/
I would be curious to know what malicious implications eBPF could have. related discussion [1] [2] For example, could a file-less trojan be injected via eBPF to reroute specific data payloads or copy specific payloads to different destinations? Or silently censor specific destinations?
Are there ways that eBPF could be abused and if so what mitigations, limitations and logging can one implement so that eBPF can remain enabled in a hardened and sensitive environment?
[1] - https://utcc.utoronto.ca/~cks/space/blog/linux/DisablingUser...
eBPF requires root privileges to run. If you're root, there is a lot of harm you can do to a system without any eBPF scripts or commands.
Agreed, however people can be easily tricked into running scripts as root hence my question about mitigations and logging and file-less trojans. I assume I can get half the internet to run my script as root. The remaining challenge is how does one work backwards and see what occurred? I can see some pieces with auditd logging. I can disable user-space eBPF. What additional logging and mitigations can be enabled?
Some additional discussion points [1]
> The remaining challenge is how does one work backwards and see what occurred?
How would you work backwards to see what occurred if you'd run a malicious script/binary as root? The launching of an eBPF thing would leave the same traces and non-traces, right? And if there's a way to introspect all running eBPF things, it might be harder for an eBPF thing to hide itself, due to my assumed limitations of the eBPF runtime/VM/world-view-thing, the only problem then would be forgetting to look for it, but eBPF isn't unique in being potentially forgotten.
For other things such as a malicious script, I would use SELinux, the iptables owner module and auditd to see what is going on and to limit what can be done. This assumes one removes the unconfined_t types and assumes a file is running as root. None of those things dynamically execute code by design. That said, my question is around file-less behavior and monitoring. As far as I can tell there is zero monitoring unless, to your point, you build it yourself and have custom eBPF code running all the time. I would not expect this to be a common pattern.
A vulnerability in this space is entirely different in my view. If a Linux workstation is browsing a watering hole that tries to exploit eBPF, the code is injected directly in the network stream with root permissions and never touches the storage unless it wants to. This could theoretically be a wonderful way to chain exploits and hand them over to undocumented CPU instructions, or to monitor a victim's traffic or block their access to a site, and they would be none the wiser, with no audit trail and no need to elevate privileges. This is always running in the background as root, monitoring all the traffic, and can dynamically execute instructions on the fly based on network input.
Outside of eBPF this would require exploiting the person's web browser, then elevating privileges and making changes to the system with calls that could be monitored or even blocked with existing tools such as SELinux, Firejail, auditd and so on.
So I guess ultimately my questions are: Where are the monitoring tools and mandatory access controls for eBPF? Or if there is no answer for that then my question would be: What is the kernel boot option to entirely disable eBPF? It appears I can only change the JIT settings.
To answer my own question it appears the only option is to recompile the kernel to disable BPF.
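For anyone who wants to check the runtime knobs mentioned in this sub-thread before going as far as a rebuild, here is a rough Python sketch. It only reads two sysctl files that mainline Linux exposes when BPF support is built in (kernel.unprivileged_bpf_disabled and net.core.bpf_jit_enable). The helper names are made up for illustration, and the value meanings in the table are my reading of the kernel sysctl documentation, so treat them as an assumption to verify on your kernel.

```python
from pathlib import Path
from typing import Optional

# Sysctl files as exposed under /proc/sys on Linux (absent if BPF is compiled out).
UNPRIV_BPF = Path("/proc/sys/kernel/unprivileged_bpf_disabled")
BPF_JIT = Path("/proc/sys/net/core/bpf_jit_enable")


def read_int_sysctl(path: Path) -> Optional[int]:
    """Return the integer stored in a sysctl file, or None if missing/unreadable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def describe_unprivileged_bpf(value: Optional[int]) -> str:
    """Put kernel.unprivileged_bpf_disabled into words (hypothetical helper)."""
    meanings = {
        0: "unprivileged bpf() calls are allowed",
        1: "unprivileged bpf() calls are disabled until reboot",
        2: "unprivileged bpf() calls are disabled, but an admin can re-enable them",
    }
    if value is None:
        return "sysctl not present or not readable"
    return meanings.get(value, f"unrecognized value {value}")


if __name__ == "__main__":
    print("unprivileged_bpf_disabled:",
          describe_unprivileged_bpf(read_int_sysctl(UNPRIV_BPF)))
    print("bpf_jit_enable:", read_int_sysctl(BPF_JIT))
```

This only covers unprivileged use and the JIT; it says nothing about what root can load.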
I wrote my BSc thesis on Kubernetes bandwidth management with eBPF a year ago. This is exactly what I felt trying to research this technology. Countless blog posts about how great eBPF is, close to no useful resources... And from what I've seen since, it's only gotten marginally better.
Brendan Gregg's writings are pretty good.
https://www.brendangregg.com/blog/2019-01-01/learn-ebpf-trac...
This presentation should give some insight into the possibilities of eBPF. strace is passive; eBPF can be active and make decisions in kernel space.
https://www.usenix.org/conference/nsdi21/presentation/ghigof...
Another example is controlling the scheduler with eBPF.
It's a glorified strace in the same way that gcc is a glorified hello world
strace is a Linux-specific utility; ptrace() is the syscall. If you're going to claim that you can achieve what TFA describes without eBPF, it would help if you at least got the notation right.
I haven't tried my hand at eBPF yet. But wouldn't it be easier to just use strace() as long as you are only interested in syscalls?
Author here. strace wouldn't work for this. Need to track individual page faults + their addresses. Problem with memory-mapped IO is that it's all done by memory-access "side-effects".
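Once you have fault addresses, the remaining step is attributing each one to a library. A minimal sketch of that step in Python, assuming you also have the text of /proc/<pid>/maps for the process: the function names and the sample path in the usage lines are hypothetical, and this is not taken from the article's own tooling.

```python
from typing import List, NamedTuple, Optional, Tuple


class Mapping(NamedTuple):
    start: int   # first virtual address of the mapping
    end: int     # one past the last address
    offset: int  # file offset that `start` corresponds to
    path: str    # backing file


def parse_maps(text: str) -> List[Mapping]:
    """Parse /proc/<pid>/maps text, keeping only file-backed mappings."""
    mappings = []
    for line in text.splitlines():
        # Fields: address range, perms, offset, dev, inode, [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6 or not fields[5].startswith("/"):
            continue  # anonymous memory, [heap], [stack], ...
        start, end = (int(x, 16) for x in fields[0].split("-"))
        mappings.append(Mapping(start, end, int(fields[2], 16), fields[5]))
    return mappings


def resolve(addr: int, mappings: List[Mapping]) -> Optional[Tuple[str, int]]:
    """Return (path, file offset) for a faulting address, or None if unmapped."""
    for m in mappings:
        if m.start <= addr < m.end:
            return m.path, m.offset + (addr - m.start)
    return None
```

Usage, with a made-up maps line:

```python
maps = parse_maps("7f0000000000-7f0000200000 r-xp 00000000 08:01 1234 /usr/lib/firefox/libxul.so")
print(resolve(0x7f0000001234, maps))
```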
The strace-like functionality is supposedly more efficient and is more convenient.
If you're only interested in syscalls, then yes. But a library's memory is mmapped (a syscall), which just establishes a virtual address mapping for the library file. When the library is accessed, that mmapped region is faulted in (not a syscall). This is something where you need eBPF (or DTrace, etc.) to see what's happening.
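To make the mmap-then-fault split concrete, here is a small Python sketch (Unix only) that maps a file, reads one byte per page, and reports how the process fault counters from getrusage() moved. The function name is made up. The counters cover the whole process, so the delta is only a rough signal, and the kernel may map several neighbouring pages per fault, which means the count can be lower than the number of pages touched.

```python
import mmap
import os
import resource
import tempfile
from typing import Tuple


def touch_mapped_pages(path: str) -> Tuple[int, int]:
    """mmap a file read-only, read one byte per page.

    Returns (pages touched, change in minor+major fault counters).
    """
    with open(path, "rb") as f:
        # mmap() only sets up the address range; no file data is read here.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            before = resource.getrusage(resource.RUSAGE_SELF)
            pages = 0
            for off in range(0, len(m), mmap.PAGESIZE):
                m[off]  # plain memory read, no syscall; may fault the page in
                pages += 1
            after = resource.getrusage(resource.RUSAGE_SELF)
    faults = (after.ru_minflt - before.ru_minflt) + (after.ru_majflt - before.ru_majflt)
    return pages, faults


if __name__ == "__main__":
    fd, tmp = tempfile.mkstemp()
    try:
        os.write(fd, b"\0" * (mmap.PAGESIZE * 256))
        os.close(fd)
        print(touch_mapped_pages(tmp))
    finally:
        os.unlink(tmp)
```

Running this under strace would show the mmap call but nothing for the reads in the loop, which is the gap being described.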
Not Linux, but in FreeBSD page fault tracing is provided by ktrace(1) ('ktrace -t +f').
I use "ktrace -t f" once in a while for debugging and it's really handy. Output looks like
78436 cat PFLT 0x6c71f99cda8 0x2<VM_PROT_WRITE>
78436 cat PRET KERN_SUCCESS
78436 cat PFLT 0x3c6efd36c280 0x2<VM_PROT_WRITE>
78436 cat PRET KERN_SUCCESS
78436 cat PFLT 0x3c6efd36e158 0x2<VM_PROT_WRITE>
78436 cat PRET KERN_SUCCESS
...
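If you want to summarize output like this rather than eyeball it, a rough Python sketch follows. The regex is inferred purely from the PFLT sample lines above, not from any documented format, and the 4 KiB page size and the helper names are assumptions.

```python
import re
from collections import Counter
from typing import Iterator, NamedTuple

PAGE_SIZE = 4096  # assumption: 4 KiB pages

# pid, command, "PFLT", fault address, protection bits with a symbolic name
PFLT_RE = re.compile(
    r"\b(\d+)\s+(\S+)\s+PFLT\s+(0x[0-9a-fA-F]+)\s+0x[0-9a-fA-F]+<([^>]*)>"
)


class Fault(NamedTuple):
    pid: int
    comm: str
    addr: int
    prot: str


def parse_faults(text: str) -> Iterator[Fault]:
    """Yield one Fault per PFLT record; PRET records are ignored."""
    for m in PFLT_RE.finditer(text):
        yield Fault(int(m.group(1)), m.group(2), int(m.group(3), 16), m.group(4))


def faults_per_page(text: str) -> Counter:
    """Count faults by page-aligned address."""
    return Counter(f.addr // PAGE_SIZE * PAGE_SIZE for f in parse_faults(text))
```

Feeding it the three PFLT records above gives three distinct pages with one fault each.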
Obviously not nearly as flexible as eBPF though. For instance, it'll log all page faults happening in the context of the process, and so includes page faults that happen in the kernel due to copyin()/copyout() etc. Sometimes that's helpful and other times confusing.