Unsafe at Any Speed: Tradeoffs and Values in the Rust Ecosystem
bitbashing.io
I want to make a sign on the wall that denotes $x days since someone complained about Tokio + Reqwest being bloated and/or making their own design decisions.
I also want to make a campaign to direct people to ureq first so we stop hearing about this every other month. Reqwest is very good and worth just using IMO, but if you really care about this, ureq is pretty battle tested and a fine option for "just give me a fucking HTTP client".
(A selfish third point: ureq needs a Happy Eyeballs implementation, but that's neither here nor there, I guess.)
Edit: I did read the article and yes, I am commenting something slightly tangential
Yeah, it seems to me like the author just wants a simple HTTP client for Rust that prioritizes simplicity over raw performance and is unhappy with the fact that reqwest isn't that. Though as developers we often wish it were otherwise, given a choice between implementation simplicity and performance most people will choose the latter (as long as correctness isn't sacrificed). So that's why reqwest is more popular and is the first choice that appears when searching for [rust http client] on Google.
Thankfully, ureq exists for those who prefer implementation simplicity, and that's what the author should probably choose.
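For reference, the whole "just give me an HTTP client" program with ureq is roughly this -- a minimal sketch assuming the 2.x blocking API (`ureq = "2"` in Cargo.toml), so no async runtime comes along for the ride:

```rust
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let body = ureq::get("https://example.com")
        .call()?         // performs the blocking request
        .into_string()?; // reads the response body into a String
    println!("fetched {} bytes", body.len());
    Ok(())
}
```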
I read it slightly differently.
> the author just wants a simple HTTP client for Rust that prioritizes simplicity over raw performance
I saw it as:
> the author just wants a safe HTTP client for Rust that prioritizes safety over raw performance
That resonates with me as well.
That's my main goal - and Rust seems like the language where we can have our cake and eat it too! The standard library has some amazingly performant data structures and algorithms.
Talking to one of the hyper developers, I learned that a lot of this machinery was written before the equivalents arrived in the stdlib. My response was, "neat, so we can delete the custom stuff now, right? :P" but I suppose any change carries some risk with it.
It is great when there is a standard that works well for everyone.
It is great when there is competition to that bloated monopoly everyone currently uses.
I don't understand my own cognitive dissonance on this topic.
I think the competition can drive the standard to improve.
People can see improvements in alternatives that still align with their principles and adopt them.
So in that regard, I don't think they are disjunct ideas.
> or even just skipping blank lines, hyper does things its own way.
Ugh. IMO a big benefit of Rust is that you can’t do wild-west-YOLO buffer twiddling without “unsafe,” so people will write better code.
But it really looks like httparse missed the memo.
https://github.com/seanmonstar/httparse/blob/v1.8.0/src/iter...
iter::Bytes looks like an awkward wrapper around slices with all the safety removed. So you can port nasty C-style code right over.
Seriously, it should not be hard to efficiently strip a prefix off a u8 slice in safe Rust. For example, split_first.
These days, stripping fixed-length prefixes and suffixes off of slices is pretty easy with slice patterns. But that code was originally written around the time of Rust 1.0.0, when the language and standard library had far fewer features to help with this. For instance, it uses slice::from_raw_parts() rather than slice.get_unchecked(pos..), since the latter wasn't stable until Rust 1.15.0. Similarly, split_first() wasn't stable until Rust 1.5.0, and slice patterns weren't stable until Rust 1.26.0.
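For comparison, here's roughly what the safe versions look like today -- a minimal sketch, not httparse's actual code, assuming a reasonably recent compiler:

```rust
// A few safe ways to peel bytes off a &[u8]; none of this existed at Rust 1.0.
fn skip_crlf(buf: &[u8]) -> Option<&[u8]> {
    // Subslice patterns: bind the remainder after a fixed prefix.
    match buf {
        [b'\r', b'\n', rest @ ..] => Some(rest),
        _ => None,
    }
}

fn next_byte(buf: &[u8]) -> Option<(u8, &[u8])> {
    // split_first: head + tail, no unsafe and no index arithmetic.
    buf.split_first().map(|(&b, rest)| (b, rest))
}

fn skip_crlf_alt(buf: &[u8]) -> Option<&[u8]> {
    // strip_prefix on slices (stable since ~1.51) does it in one call.
    buf.strip_prefix(b"\r\n".as_slice())
}

fn main() {
    let buf: &[u8] = b"\r\nGET /";
    assert_eq!(skip_crlf(buf), Some(&b"GET /"[..]));
    assert_eq!(next_byte(buf).map(|(b, _)| b), Some(b'\r'));
    assert_eq!(skip_crlf_alt(buf), Some(&b"GET /"[..]));
}
```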
> Seriously, it should not be hard to efficiently strip a prefix off a u8 slice in safe Rust. For example, split_first.
That is a good idea for a contribution, no?
You mean like this one?
https://github.com/seanmonstar/httparse/pull/86/files#diff-4...
Follow-up: the benchmark results on that PR look poor, but the benchmarks don't actually measure what they're expected to measure -- they're not black-boxing inputs, so the compiler has an opportunity to constant-fold in some cases.
I raised a PR to fix the benchmarks: https://github.com/seanmonstar/httparse/pull/151
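(For anyone unfamiliar with the issue, here's a hedged sketch -- not the benchmark code from either PR -- of what black-boxing the input means. std::hint::black_box, stable since Rust 1.66, hides a value from the optimizer so the measured work can't be constant-folded away.)

```rust
use std::hint::black_box;
use std::time::Instant;

// Stand-in for the parser under test.
fn token_len(buf: &[u8]) -> usize {
    buf.iter().take_while(|&&b| b != b' ').count()
}

fn main() {
    const REQ: &[u8] = b"GET /index.html HTTP/1.1\r\n";

    let start = Instant::now();
    let mut total = 0usize;
    for _ in 0..10_000_000 {
        // Without black_box, REQ is a compile-time constant and the call
        // can be folded away, so the loop measures almost nothing.
        total += token_len(black_box(REQ));
    }
    // Consume the result so the loop itself isn't optimized out either.
    black_box(total);
    println!("{:?}", start.elapsed());
}
```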
Rust does indeed have an embarrassing excess-of-diversity problem, made worse by the fact that async isn't in the standard library (beyond bare-minimum pieces like Future) and async libraries can't really be runtime-neutral. (This is going to improve with async traits.)
It's probably an inevitable result of Rust having a thinner standard library than Go, Java, C#, or other heavier languages. It's supposed to be a systems language, after all.
Another reason though is that Rust is a rich language with a very powerful type system, and rich powerful languages invite programmers to show off. One of the greatest things about Go is how boring it is. It doesn't give you a lot of room to show off by writing clever code, so instead you have to show off by making a great application. But some of this is unavoidable if you want what Rust delivers: a near-zero-overhead systems language with hard safety and automatic memory management without GC. That's dumping a heavy load on the type system, and Rust does deliver pretty well.
I try to exercise discipline when writing Rust and not show off by being more clever than I need to be. I do write some hand-whittled performance code here and there in high-performance applications, but I try really hard to avoid using unsafe. I've found that you usually can.
> It doesn't give you a lot of room to show off by writing clever code, so instead you have to show off by making a great application.
I agree with your comment as a whole, but I have also seen plenty of dumb Go code. Hey, let's toss out all compile-time type checking:
https://johnstarich.com/go/pipe/pkg/github.com/johnstarich/g...
> by the fact that async isn't in the standard library
I wish async/await were not in the language
So just ignore the async use cases? Or force tokio or similar to support it?
My biggest pain point so far is channels. I like them in Go because they're straightforward to use, allow concurrency, aren't a minefield of problems, and are easy to reason about. I wanted to scan N directories (in parallel) -> encrypt and checksum (in parallel) -> queue files for upload. With Go channels it's straightforward.
I was looking for MPMC (like Go channels) to allow multiple producers to enqueue and multiple consumers to dequeue. The standard library doesn't offer it and neither does tokio. I could use crossbeam[0], flume[1], or async-channel[2]. I'm not sure what limitations those have, or what compatibility issues, if any, I'll run into with the other crates I need.
[0] https://docs.rs/crossbeam/latest/crossbeam/channel/index.htm...
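Something like this seems to be the shape with crossbeam[0], if I'm reading the docs right -- a minimal sketch with made-up worker counts and paths, and the real scan/encrypt/checksum steps elided. Both Sender and Receiver are Clone, which gives the Go-style MPMC setup; flume's sync API looks much the same.

```rust
// Cargo.toml assumption: crossbeam-channel = "0.5"
// (also available re-exported as crossbeam::channel).
use crossbeam_channel::bounded;
use std::thread;

fn main() {
    let (path_tx, path_rx) = bounded::<String>(128);
    let (upload_tx, upload_rx) = bounded::<String>(128);

    // Stage 1: multiple producers scanning directories in parallel.
    let scanners: Vec<_> = (0..4)
        .map(|i| {
            let tx = path_tx.clone();
            thread::spawn(move || {
                // Stand-in for a real directory walk.
                tx.send(format!("dir{i}/file.bin")).unwrap();
            })
        })
        .collect();
    drop(path_tx); // stage 1's channel closes once every scanner is done

    // Stage 2: multiple consumers "encrypting and checksumming" in parallel.
    let workers: Vec<_> = (0..4)
        .map(|_| {
            let rx = path_rx.clone();
            let tx = upload_tx.clone();
            thread::spawn(move || {
                for path in rx.iter() {
                    // encrypt + checksum would happen here
                    tx.send(path).unwrap();
                }
            })
        })
        .collect();
    drop(upload_tx);

    // Stage 3: drain the upload queue on the main thread.
    for path in upload_rx.iter() {
        println!("queued for upload: {path}");
    }
    for handle in scanners.into_iter().chain(workers) {
        handle.join().unwrap();
    }
}
```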
FWIW, either async-channel's types or flume's async APIs should work for your use case. Both are completely agnostic to the async runtime, since the futures are triggered by internal events (recv() waits for send() and vice versa), rather than external events like I/O which are generally coordinated through the runtime.
> So just ignore the async use cases? Or force tokio or similar to support it?
Hell no!
Do it differently
Differently how?
And perhaps more importantly, what tradeoff(s) would those differences entail?
Green threads/virtual threads are one solution, but then you need to ship a runtime for them, and then you pay for what you don't use.
Or don't standardize on them and get a split ecosystem, with half the libs supporting `green-thread.rs` and the other half supporting `rust-gthreads`.
I wonder if you could have hidden green threads from the application entirely. Have a "full" standard library where std::thread is M:N green threads and I/O is implemented behind the scenes that way, and a "thin" version where std::thread is OS threads and I/O is just passed through to the OS. (Then of course no_std for embedded.)
Of course, the problem then would be that Rust makes it trivially easy to call C code. As soon as someone in a green-thread runtime called libc I/O functions directly, they'd be in for a weird surprise when the call froze every green thread on that worker.
Of course the "full" M:N runtime could be instrumented to print a warning when this occurs unless explicitly told not to, and std::thread in the "full" stdlib could allow explicit creation of full OS threads outside the runtime.
There is an attempt at doing this out there: the `may` coroutine crate.
> Have a "full" standard library where std::thread is M:N green threads and I/O is implemented behind the scenes that way, and a "thin" version where std::thread is OS threads and I/O is just passed through to the OS. (Then of course no_std for embedded.)
That sounds vaguely like what Rust used to have with its I/O functions before its green thread implementation was removed [0]? Not sure it's quite the same thing, though:
> In today's [2014] Rust, there is a single I/O API -- std::io -- that provides blocking operations only and works with both threading models. Rust is somewhat unusual in allowing programs to mix native and green threading, and furthermore allowing some degree of interoperation between the two.
> [snip]
> In this setup, libstd works directly against the runtime interface. When invoking an I/O or scheduling operation, it first finds the current Task, and then extracts the Runtime trait object to actually perform the operation.
> On native tasks, blocking operations simply block. On green tasks, blocking operations are routed through the green scheduler and/or underlying event loop and nonblocking I/O.
[0]: https://github.com/rust-lang/rfcs/blob/master/text/0230-remo...
May has soundness issues around TLS (thread-local storage). You can cause undefined behavior in safe code. They do not plan on changing that (and I'm not sure it's even possible), but there's no way it will become super popular due to this issue.
(And yes, as my sibling commenter mentions, Rust tried this before. That doesn't mean nobody can do it, but I suspect that it is not possible.)
> Green threads/Virtual threads are one solution, but then you need to ship a runtime for them, then you pay for what you don't use.
Indeed, and IIRC that's one of the reasons Rust removed its green thread implementation in the first place.
> Or don't standardize on them and have a split ecosystem of half libs supporting use `green-thread.rs` and the other half supporting `rust-gthreads`.
Don't forget the third lib for abstracting over the two!
However, other people wished it was. So here we are.
There’s a small message contained in that unnecessarily long post - that maybe a popular library used for HTTP in Rust could use less unsafe.
There was no need for this half mocking, half condescending tone. If the author wanted an explanation for a technical decision, they could open an issue and have a conversation like an adult. Instead we’re left with their speculation that leads nowhere. They cry about the existence of some unsafe code, but don’t actually put in the effort to figure out if it can lead to a real problem like unsoundness. And somehow the title implies that they’re saying something profound about the entire rust ecosystem when they just looked at one library. It’s just innuendo, as far as I could tell.
Here’s a post by the maintainer of hyper about what they accomplished in 2023 and what they hope to accomplish in 2024 - https://seanmonstar.com/blog/2023-in-review/
If this work interests you, or you depend on hyper in production like many companies do, then consider sponsoring them! Or maybe you could give back to the commons by submitting PRs that fix issues you’ve found. Or even a good bug report would be appreciated.
But not this kind of article. This article helps no one and does nothing constructive. We all benefit from the work that open source maintainers put in. They have it hard enough without having to read low effort posts trashing their work. Be better.
> If the author wanted an explanation for a technical decision, they could open an issue and have a conversation like an adult.
Opening an issue comes across as a whole lot more aggressive to me. That implies that they owe you an explanation or you want them to change their code.
> They cry about the existence of some unsafe code, but don’t actually put in the effort to figure out if it can lead to a real problem like unsoundness.
unsafe might as well be unsound. The key benefit of Rust is supposed to be better memory safety than C; if everything is using unsafe, why bother?
> Here’s a post by the maintainer of hyper about what they accomplished in 2023 and what they hope to accomplish in 2024 - https://seanmonstar.com/blog/2023-in-review/
Ok, and that doesn't mention "unsafe" once - no justification for why it's used, much less a plan to reduce or eliminate it.
> The key benefit of Rust is supposed to be better memory safety than C; if everything is using unsafe, why bother?
Is Hyper all contained inside an unsafe block? As far as I know, it is not. So even if there is some unsafe code, it is not the same as writing it in C.
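unsafe is opt-in per block, not per crate. A contrived illustration (nothing to do with hyper's actual code):

```rust
// The unsafe block covers only the one unchecked access; everything else in
// the function (and the crate) still gets the borrow checker and bounds checks.
fn first_byte(buf: &[u8]) -> Option<u8> {
    if buf.is_empty() {
        return None;
    }
    // SAFETY: the emptiness check above guarantees index 0 is in bounds.
    Some(unsafe { *buf.get_unchecked(0) })
}

fn main() {
    assert_eq!(first_byte(b"GET"), Some(b'G'));
    assert_eq!(first_byte(b""), None);
}
```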
Apparently even quite basic things like splitting a string are done in unsafe code. So it sounds like while it may not be 100% unsafe, most of it is.
The article got me in touch with one of the hyper devs, and we had a very friendly conversation that taught me a lot.
If questioning some design decisions and calling a marketing blurb a little disingenuous is mocking and condescending, they didn't seem to think so.
> Simpler is better: Complex problems require complex solutions, but we should strive for simplicity in our software. Code is a liability—less code is less that could go wrong, and less to debug when things do go wrong!
This seems to confuse complexity with features. If a library has 100k lines of code but the parts you use are only 1k lines of code then why is that worse than a library with 1k lines of code?
> I want a single HTTPS connection. I don’t need persistent sessions with connection pools and cookies. I don’t need an async runtime. I need a glass of scotch, a socket, and a few syscalls.
That's true right until the moment you do need them and then you need to rewrite a lot of code. Especially great fun when the thing you're connecting to assumes everyone has this feature and you spend hours or days debugging things.
Modular, popular and well structured libraries with all the features one might reasonably need are my preference. Minimalism in lines of code is as much a trap as a minimalism in benchmark performance.
> If a library has 100k lines of code but the parts you use are only 1k lines of code then why is that worse than a library with 1k lines of code?
There's nothing wrong with not using the whole feature set of some library. But if library A does the same things as library B with a third of the code, isn't that better? (All other things - e.g. perf - being equal)
> That's true right until the moment you do need them and then you need to rewrite a lot of code.
There's plenty of applications that will only ever need a handful of connections. Probably most applications.
> Do huge swaths of Rust users value vanishingly small performance gains over memory safety, in a language that prides itself on being able to provide speed and safety?
I mean, yeah, probably? It was meant to appeal to people who were still writing things in C/C++, who are by definition people with those values. For general-purpose use the language was always "OCaml but with vanishingly small performance gains".