Bytecode Alliance: One year update
bytecodealliance.org

The focus on building a capabilities-based API is so good.
Most platform APIs let you just do stuff. Open /home/user/my-image.png. Ok.
The capabilities model re-orients operations like this. The app starts with some kind of handle to a starting directory. Maybe it's to /, the root; maybe it's to /home/user, the user's home directory. Whoever has a reference to that directory handle can use it to open files and other directories inside that directory, but cannot go upwards!
So if you have a file-saving middleware, you can be sure that, whatever libraries that middleware uses, it will never write to anything other than the directory you give it.
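To make the idea concrete, here's a minimal Python sketch of a capability-style directory handle (the class name `Dir` and its methods are made up for illustration, not any real library's API). Paths are resolved relative to the handle, and anything that would escape it is rejected:

```python
import os
import os.path

class Dir:
    """An illustrative capability-style directory handle: you can only
    reach this directory and its children, never anything above it.
    Note this is a design-level check in userspace, not kernel-enforced
    sandboxing (and it is subject to races a real system must handle)."""

    def __init__(self, path):
        self._root = os.path.realpath(path)

    def _resolve(self, relpath):
        full = os.path.realpath(os.path.join(self._root, relpath))
        # Reject any path that resolves outside the directory we hold.
        if full != self._root and not full.startswith(self._root + os.sep):
            raise PermissionError(f"{relpath!r} escapes the directory capability")
        return full

    def open(self, relpath, mode="r"):
        return open(self._resolve(relpath), mode)

    def subdir(self, relpath):
        # Winnow the capability down to a child directory.
        return Dir(self._resolve(relpath))
```

A caller would hand the middleware only `Dir("/home/user/uploads")` (or even a `subdir` of it); every `open` the middleware or its dependencies perform then lands inside that directory or fails.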
This is a subtle & small change, with massive impact. Most OSes have a system call that works similarly, openat(2), which is built around this same idea of directory handles.
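Python's os module exposes openat(2) through the `dir_fd` parameter, so the pattern is easy to see in a few lines. One caveat: plain openat still follows ".." and absolute paths, so by itself it's the building block, not the whole sandbox; a capability runtime also has to constrain path resolution (on Linux, openat2(2) with RESOLVE_BENEATH does this in the kernel).

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "note.txt"), "w") as f:
        f.write("hello")

    # Take a handle to the directory itself: this fd is what you'd
    # hand to a library instead of an ambient filesystem path.
    dirfd = os.open(d, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # openat(2) under the hood: "note.txt" is resolved relative
        # to dirfd, not relative to the current working directory.
        fd = os.open("note.txt", os.O_RDONLY, dir_fd=dirfd)
        data = os.read(fd, 100)
        os.close(fd)
    finally:
        os.close(dirfd)
```

(`dir_fd` support is POSIX-only; it's not available on Windows.)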
Why is it called “capability based” and not just what it is: filesystem sandboxing?
Great question.
Capabilities are a more generic, core idea.
In this particular example, the capability to see a directory & its children is what's getting passed around & refined (starting with the top-level directory the app has, then perhaps winnowing the capability down before passing it to the upload middleware).
But capabilities can represent other things too. The program might get capabilities to open a socket, to listen on a socket, to send or receive data on a socket. Those capabilities might similarly be refined & passed around to libraries.
The broader model is that you can only do what you are passed, only exercise the capabilities you get. You have no other way to talk to the platform, no ambient platform API, beyond the set of capabilities your caller gives you. So capabilities keep getting winnowed down, shrunk to the right size: the file-upload middleware can read some bytes off the sockets it is passed, and it can write files into a certain directory, but having no other capabilities, it cannot do any more than that.
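A tiny Python sketch of the socket side of this (the function name `upload_middleware` is hypothetical): the caller holds the broader authority, creates the connection, and passes the middleware only the one socket it needs. Python can't actually enforce that the callee does nothing else; a Wasm nanoprocess runtime can, but the discipline is the same.

```python
import socket

def upload_middleware(conn):
    # This "middleware" holds exactly one capability: an already-open
    # socket. It can read and write on it, but it was never given the
    # ability to open new connections or touch the filesystem.
    data = conn.recv(1024)
    conn.sendall(b"got " + data)

# The caller winnows its authority down: it creates the socket pair,
# keeps one end, and hands only the other end to the middleware.
a, b = socket.socketpair()
a.sendall(b"bytes")
upload_middleware(b)
reply = a.recv(1024)
a.close()
b.close()
```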
While filesystem sandboxing was core to my example, it's just one demonstration of what a capability might be. The underlying model for how that sandboxing is implemented is "capability-based". There's a pretty long history of capability-based systems. Not a simple example, but the "E language" is a pretty well-known one from 1997 that tried to push capabilities into an interesting distributed frontier: http://www.erights.org/
You can have sandboxes that are not capability-based; "capability-based" refers to a specific style of modeling the problem. (EDIT: and e.g. the Rust crate referenced doesn't actually sandbox the process, but uses API design and kernel APIs to make it harder to mistakenly access the wrong files. It's still a capability system.)
It's not just filesystem sandboxing. One capability could be access to part of the filesystem, sure, but another might be networking, access to camera, microphone, USB devices, playing audio, Bluetooth, maybe you wanna give raw access to a block device rather than just access to a filesystem, ...
The nanoprocess concept seems like a big deal.
I remember reading about a vulnerability in the unix utility "strings". The code is incredibly simple at first glance, but it had a dependency for detecting the file type, and that dependency was not safe on untrusted input.
At that moment I realized that unix security was fundamentally flawed. A utility that does nothing but read its input and write the output shouldn't have permission to do anything else.
This seems like a positive development. I have two items on my wish list:
- tail recursion elimination. There's a draft proposal, but only one engine implements it at the moment, which is blocking further progress.
- RISC-V backend for Cranelift (TBH, I'm sure someone will do it eventually).
See here for discussion on Wasmtime team moving from Mozilla to Fastly: https://news.ycombinator.com/item?id=24897641