Making Libcurl Work in WebAssembly

jeroen.github.io

43 points by tambourine_man a day ago


vk6 - 16 hours ago

I did a similar project recently, although it was more focused on getting a good JavaScript API out of libcurl, rather than integrating with a different language like R: https://github.com/ading2210/libcurl.js

My first approach for networking was also to use SOCKS5 through a WebSocket. However, this turns out to be really slow. Each new connection created by Emscripten requires waiting for: the TLS handshake from the browser to your proxy, the WebSocket handshake which takes place over HTTP/1.1, the SOCKS5 handshake on the WebSocket, and the TLS handshake from libcurl to the destination server.

That's many, many round trips required just for a single request! In practice, if the proxy server isn't physically close to you, the latency can be multiple seconds. This is partially mitigated by the fact that libcurl can use HTTP/2 to reuse that socket, but if you're placing requests to different hosts, or to hosts that don't support HTTP/2, this is a huge problem.
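To put rough numbers on that, here is a back-of-the-envelope sketch. The round-trip counts are assumptions, not measurements: a fresh TCP connection at each hop, TLS 1.3 without resumption, SOCKS5 with no authentication. The function and parameter names are made up for illustration.

```python
def naive_setup_latency_ms(rtt_client_proxy_ms, rtt_proxy_dest_ms):
    """Estimate the setup time of one fresh SOCKS5-over-WebSocket connection.

    All round-trip counts below are assumptions for illustration.
    """
    # Steps that only cross the browser <-> proxy link.
    client_proxy_round_trips = {
        "tcp_handshake": 1,
        "tls_handshake_to_proxy": 1,  # assumes TLS 1.3
        "websocket_upgrade": 1,       # HTTP/1.1 Upgrade request and response
        "socks5_greeting": 1,         # method negotiation, no authentication
        "socks5_connect": 1,          # connect request and reply
    }
    # The proxy opens its own TCP connection to the destination.
    proxy_dest_round_trips = {"tcp_handshake": 1}
    # libcurl's TLS handshake with the destination crosses both links.
    end_to_end_round_trips = {"tls_handshake_to_destination": 1}

    return (
        sum(client_proxy_round_trips.values()) * rtt_client_proxy_ms
        + sum(proxy_dest_round_trips.values()) * rtt_proxy_dest_ms
        + sum(end_to_end_round_trips.values())
        * (rtt_client_proxy_ms + rtt_proxy_dest_ms)
    )


def reused_websocket_latency_ms(rtt_client_proxy_ms, rtt_proxy_dest_ms):
    """Same estimate when the WebSocket is already open and the proxy
    protocol adds no handshake round trips of its own (an assumption)."""
    return rtt_proxy_dest_ms + (rtt_client_proxy_ms + rtt_proxy_dest_ms)


# Example inputs (hypothetical): 150 ms to the proxy, 30 ms proxy to destination.
print(naive_setup_latency_ms(150, 30), reused_websocket_latency_ms(150, 30))
```

Under these assumptions, most of the cost sits on the browser-to-proxy link, which is why a distant proxy hurts so much.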

The solution is to let multiple TCP sockets share the same WebSocket, and then minimize round trips in the proxy protocol. I wrote a new protocol for this purpose here: https://github.com/MercuryWorkshop/wisp-protocol

It basically acts like multiplexed SOCKS5 over a WebSocket. One trick that it uses to reduce latency further is for the client to simply assume creating a new socket succeeded, and to start sending data immediately, which eliminates another round trip. So apart from the very first connection, which establishes the WebSocket, there is zero added latency for new sockets.
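A minimal sketch of that idea, in Python for brevity. The frame layout, the constants, and the `MuxClient` class are hypothetical and only illustrate multiplexing with an optimistic open; they are not taken from the Wisp specification, which is the place to look for the real wire format.

```python
import struct

# Hypothetical frame layout, for illustration only:
#   1 byte frame type | 4 byte little-endian stream id | payload
FRAME_CONNECT = 0x01
FRAME_DATA = 0x02
HEADER = struct.Struct("<BI")


def encode_frame(frame_type, stream_id, payload=b""):
    """Prefix the payload with the frame type and stream id."""
    return HEADER.pack(frame_type, stream_id) + payload


def decode_frame(frame):
    """Split a frame back into (frame_type, stream_id, payload)."""
    frame_type, stream_id = HEADER.unpack_from(frame)
    return frame_type, stream_id, frame[HEADER.size:]


class MuxClient:
    """Multiplexes many logical TCP streams over one already-open WebSocket."""

    def __init__(self, send_message):
        # send_message: any callable that sends one binary WebSocket message.
        self._send = send_message
        self._next_id = 1

    def open_stream(self, host, port, first_bytes=b""):
        stream_id = self._next_id
        self._next_id += 1
        self._send(encode_frame(FRAME_CONNECT, stream_id, f"{host}:{port}".encode()))
        # Optimistic open: assume the connect will succeed and send data right
        # away instead of waiting a round trip for the proxy to confirm.
        if first_bytes:
            self._send(encode_frame(FRAME_DATA, stream_id, first_bytes))
        return stream_id
```

In this sketch a failed connect would have to be reported later by the proxy on the same stream id, which is the trade-off for skipping the confirmation round trip.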

Actually getting Emscripten to use this is slightly cursed, and you need to patch the generated JavaScript using some regex. I could probably get this upstreamed in Emscripten someday though.
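For a sense of what that kind of patching looks like, here is a hedged sketch as a Python build step. The pattern is a stand-in: it assumes the generated glue code contains a literal `new WebSocket(...)` call, which may not match real Emscripten output, and `MuxSocket` and `patch_glue` are hypothetical names, not what libcurl.js does.

```python
import re

# Hypothetical pattern: assumes the generated glue code creates sockets with a
# literal `new WebSocket(...)` call. Real Emscripten output may differ by version.
WEBSOCKET_CTOR = re.compile(r"\bnew\s+WebSocket\s*\(")


def patch_glue(js_source, replacement_class="MuxSocket"):
    """Swap the WebSocket constructor for a custom class in generated JS."""
    patched, count = WEBSOCKET_CTOR.subn(f"new {replacement_class}(", js_source)
    if count == 0:
        # Fail loudly so a compiler upgrade that changes the output is noticed.
        raise ValueError("pattern not found; the generated code may have changed")
    return patched
```

Failing when nothing matches matters here, because a regex patch against generated code breaks silently otherwise.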

Also, it turns out that when writing this sort of network proxy, it doesn't really matter what language you use. The bottleneck ends up being the Linux TCP stack. You might think that a hyper-optimized Rust or Go based WebSocket proxy would be faster, but I found that the Wisp proxy server I wrote in Python was on par with the one written in Rust during synthetic tests. Even the slowest implementations get upwards of 2 Gbit/s of throughput (on slow CPUs), which can saturate the NICs of almost all VPS providers.

kamranjon - 21 hours ago

Sorry if this is obvious, but I read the article and am still a bit unsure. If you use libcurl on the front end to download a file using this method, where does the file end up? Is it in the browser's memory? Is it piped through WebSockets to some backend service? Is it written to local disk using the newish file system API?

immibis - 18 hours ago

Why do you need libcurl to work in WebAssembly... when you're already running in a browser?

(The answer: to run third-party code that uses libcurl because it isn't designed to run in web browsers)