Why Rust for Low-Level Linux Programming?
As a long-time developer marketing person, I must say Rust is kicking ass, not just as a language but as a community. They are deeply strategic.
1. Clear audience target: They aren't going after C++ gurus or C magicians but people who are new to systems programming. From Klabnik to Katz to literally everyone in the community, they are consistent with this messaging.
2. As part of 1, they have invested a lot in teaching systems programming 101 (heap v. stack, etc.), i.e., stuff that you learn in the first course in systems programming in college, but many self-taught, higher-level programmers might not know. This is a great example of authentic content marketing based on a clear strategy working out.
3. Their community is very inclusive. My experience (as a marketing guy who barely remembers how to code) is that people are very helpful when you ask questions, submit a patch, etc. This has been the case for me not just with Rust itself but a couple of Rust projects that I've interacted with.
As a C magician, I haven't written a new C project since the Rust 0.8 era. The only reason you would is ease of updating dependencies through distro package managers (because Rust has no stable ABI and performs extensive cross-library inlining). There's no need to market to C people because those who understand the language well will immediately get why Rust is better.
For C++ people, Rust's generics remain less powerful than template metaprogramming (which is Turing-complete, with people building real programs in the tarpit), so there are reasons you might not switch.
Meanwhile, Rust does make it a lot easier to get started with systems programming, which is good! Every tool should help both empower beginners and extend the reach of experts. For example, writing zero-copy parsers in C is fairly hard to get right, and might not be worth the debugging or validation time that even an expert might have to put in. C string manipulation works, but it's verbose and fiddly. In Rust, it's trivial to use the lifetime system to make sure you keep all the input data around long enough and don't read outside the buffer. You could even use #[must_use] and affine types to check that every character of input data ends up attributed to exactly one terminal.
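To make that concrete, here's a minimal sketch (illustrative code, not from any real parser) of the zero-copy idea: the returned pieces borrow from the input, so the compiler proves the input outlives them, and out-of-range access fails a bounds check instead of reading past the buffer.

    // The output slices borrow from `input`; the elided lifetime in the
    // signature ties them together, so `input` can't be freed too early.
    fn split_fields(input: &str) -> Vec<&str> {
        input.split(',').map(str::trim).collect()
    }

    fn main() {
        let line = String::from("alpha, beta, gamma");
        let fields = split_fields(&line);
        // Dropping `line` here while `fields` is still in use would be
        // a compile-time error, not a use-after-free.
        println!("{:?}", fields);
    }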
> The only reason you would is ease of updating dependencies
Or if Rust doesn't support your OS yet.
I am working on porting LLVM and writing a MIR to C++ translator in parallel. We'll see which one I get further on.
Because I'd love nothing more than to use Rust.
Is the problem the OS, or the architecture/ABI? I wouldn't expect porting to a new OS to be terribly difficult, though it is a time sink since there's a fair bit of API surface to cover to get Rust's libstd ported, and you'd want to work with upstream so they know your platform matters.
If your OS doesn't look at all like UNIX, then you'll have to give up on libstd (which talks about "file"s and "processes" and such nonsense), but libcore (which provides data structures and other non-IO logic) should be fine.
It's a platform with a file format and linker sections that are currently unsupported by LLVM, among other issues.
Windows isn't like UNIX, yet still supports libstd fully. That said, yes, if your OS is so different that it doesn't support files or processes or something, you'd have to stick with libcore. The standard library is not so much UNIX-centric as it is "common OS features"-centric. We even went so far as to not pick the UNIX names for common functionality, which can often happen with languages that start on a UNIX and move out.
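A small illustration of that naming choice (a sketch using only plain libstd calls):

    use std::fs::{self, File};
    use std::io::Write;

    fn main() {
        // Descriptive names rather than UNIX-historical ones:
        fs::create_dir("demo").unwrap();             // mkdir(2) on UNIX, CreateDirectory on Windows
        let mut f = File::create("demo/hello.txt").unwrap();
        f.write_all(b"hi").unwrap();
        fs::remove_file("demo/hello.txt").unwrap();  // unlink(2), but not named after it
        fs::remove_dir("demo").unwrap();             // rmdir(2)
    }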
Similarity isn't a clear-cut yes/no question, but the NT kernel is so similar to UNIX that it was able to adopt a second syscall ABI compatible with Linux. Present operating systems are basically a monoculture in terms of high-level design decisions; Rob Pike complained eloquently about this in 2000, and outside of possibly unikernel development (which still usually has a libc/unix-like layer implementing at least a filesystem) little has changed since: http://herpolhode.com/rob/utah2000.pdf
Yeah, I'm very intrigued by unikernels myself; as far as I know, I'm the first person to have gotten an Iron web app running on Rumprun. I figured I'm the first because I had to send patches to make it work :)
I guess my point is that even if vague designs are relatively a monoculture, that still means that anything that fits inside that box is going to be reasonable to port over. And I'm glad that it's so easy to not use the standard library for other cases; my little kernel doesn't yet have processes or files, and Rust works for it just fine. Well, it doesn't have them yet, anyway... hopefully soon.
Can someone explain why Rust doesn't have an ABI? I understand that it's still a newish language, but hasn't it been around long enough to want to define one? Is the idea that it won't have one, just like C doesn't have one, and you'll end up with a few, like how C has stdcall, cdecl and fastcall?
C does have a standard ABI on each platform (where "platform" is slightly vague, ranging from "a place where people agree to use the SYSV ABI" to "Windows+MSVC"), so you can generally call into C libraries from the same "ecosystem" and not have to notice if they get recompiled between runs; library maintainers can put in some work and make promises about ABI stability.
The reason Rust doesn't have a defined ABI is basically that it wouldn't buy the same benefits it does in C. Specifying an ABI requires a lot of per-platform work (which the C community has already done), and, because of the importance of cross-crate inlining (all generic functions get inlined into call-sites by default), would not be sufficient to provide the benefit of in-place library updates. If you rewrite generic code in libfoo, and libbar depends on it, you can't get around recompiling libbar.
This is basically because Rust is a higher-level language where you use iterators, iterator adaptors, and higher-order functions in the course of writing libraries and applications. In C, you would manually inline things like iteration, writing for loops and populating intermediate data structures yourself. In Rust, this is something that can be factored out into libraries, but that means your code's meaning depends more deeply on the meaning of library code. To optimize away these abstractions and provide good performance, the compiler needs to inspect and make decisions based on library source code when compiling code that calls it. To permit efficiency, Rust basically has to be compiled from leaf dependencies upward.
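A sketch of what that looks like in practice: the adaptors below are defined in libstd, but the compiler monomorphizes and inlines them into the caller, so the caller's binary depends on the library's source, not just a fixed ABI.

    // Each adaptor is generic library code; by the time LLVM is done,
    // this typically compiles down to a single loop in the calling crate.
    fn sum_of_even_squares(xs: &[i64]) -> i64 {
        xs.iter()
            .filter(|&&x| x % 2 == 0) // library-defined adaptor
            .map(|&x| x * x)          // another one
            .sum()                    // all inlined at this call site
    }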
In general, this is probably worth it, but it means we do need to rethink the C/UNIX style of packaging, which doesn't work very well when a libstd update implies every other package must also update. Some form of compiler middle/backend in the package manager (think Android's ART compiler), or a specialized form of binary or IR diffs (like Chrome uses) would probably go a long way. If we want to solve UNIX's problems, there will be a need for some cascading changes across the OS ecosystem.
Or how about we avoid the whole thing and allow multiple versions of a lib to exist, and then prune the branches as they are no longer needed? Something akin to Nix/guix, Gobolinux, or even GNU Stow.
C and C++ libraries can use ELF symbol versioning.
Say you have a function like

    int foo_do(struct foo *, int);

but to fix a bug and/or tweak the API you change it to

    long foo_do(struct foo *, int, int);

then ELF symbol versioning allows you to do

    __asm__(".symver foo_do_v1_1,foo_do@@v1.1");
    long foo_do_v1_1(struct foo *F, int arg1, int arg2) { ... }

    __asm__(".symver foo_do_v1_0,foo_do@v1.0");
    int foo_do_v1_0(struct foo *F, int arg1) {
        long rv = foo_do_v1_1(F, arg1, 0);
        assert(rv >= INT_MIN && rv <= INT_MAX);
        return rv;
    }

where the runtime linker will link foo_do_v1_0 as foo_do for programs originally compiled against the v1.0 release, while programs built against v1.1 will be linked to foo_do_v1_1. You can do this as often as you want, though you can't generally go back further than when you first began using ELF symbol versioning to compile and release libfoo. You only need to add an ELF .symver alias for functions that have multiple versions, but you do need to at least enable versioning (usually by specifying a version script with a catch-all "*" entry, which tags functions not explicitly aliased) at the point you begin maintaining a stable ABI.

glibc is pretty much the only major library that makes use of this capability, despite the fact that it's been around for well over a decade. Most developers simply don't have the foresight or interest to provide rigorous forward and backward ABI and API compatibility. Partly that's because in the open source world, recompiling packages is much easier than in the proprietary world. And especially in the Windows world (where the CRT was never forward or backward compatible) you often packaged dependencies with your software, even if dynamically linked. And so newer languages like Go and Rust are being built with the presumption that both recompiling and bundled dependencies are the norm--it's what people are doing anyhow, and it simplifies the compiler and its runtime. That it's sad that this is the norm is beside the matter.
Interesting, I had no idea the C runtime on Windows is not forward or backward compatible.
I'm curious about your thoughts on why the "recompiling and bundling dependencies" approach might not be the best way compared to ELF versioning facilities? Do you just feel like it's a less elegant solution?
Thanks
Embedded software is a huge security problem on the internet precisely because it's difficult to update. Once the vendor loses interest in maintaining it, it'll never be updated. With shared library systems like RedHat and Debian, you can at least upgrade shared components for a substantial period as long as the developer cooperates reasonably well.
With the movement to statically compiled apps, we're just going to see more and more ancient code running in the wild.
It's the same thing with containers like Docker. Even assuming a container is using something like RedHat or Debian, the very reason it's a container is because it's customized somehow. However it's done, the result is that maintenance and ownership of the basic software stack becomes increasingly fractured, and it will be more difficult to benefit from the work of the thousands of distribution contributors.
Static compilation and container approaches have much to recommend them. When you cut a release it's arguably better that you control all the dependencies. But what happens when development slows down, you lose interest, or you move on, as vendors of embedded software inevitably do? The Googles and Amazons of this world have armies of developers to fill the gap. Statically compiled Go apps have almost no downsides for Google, given how the company is built around its server infrastructure technology and devops army. But for everybody else who is an end user of software and incapable of taking ownership (which can apply to software companies, too), we're just going to see the same problems that have plagued, for example, router software and blogging software, expand.
In the ideal world, developers would pay attention to ABI and API stability, particularly developers of core components. And they would make it easy to design systems so that these core components could be updated without having to rebuild or reinstall the dependent software.
But we don't live in that ideal world (witness OpenSSL, which has horrible API stability[1]), so _sadly_ the path of least resistance is static compilation and, more recently, containers. And so new and very actively maintained software will see quicker releases, but the long tail of less actively developed software will grow increasingly insecure. And all the while developers will shift the blame onto system administrators and everybody else so that they won't have to be burdened by careful and conservative interface design.
[1] While OpenSSL has been moving toward improving their API and ABI stability, interestingly Google's BoringSSL has completely eschewed such stability. Why? Because they don't need that stability, as I explained above. But the vast majority of direct and indirect users of OpenSSL would benefit tremendously from improved stability, because it makes it easier to upgrade dependent software.
Eventually we will want to do updates, unless you prefer to keep the bugs. When we do, we'll need strategies to tame the quadratic space blowup caused by lack of sharing across the dependency tree relative to the current model.
Yes, but it means they can be done in the background without disrupting the workflow as much.
Install/build the new version while keeping the old in place, then flip over to the new when ready, and then start taking out the old one.
Thanks for the detailed response and insight.
Don't prematurely publish your interface.
I suppose I'm in their target audience, then. I've not done any systems programming, but do have a curiosity about it, and Rust has caught my eye.
But my main problem is a lack of a project -- a ThingIWantToDo that would be well suited to a systems programming language like Rust. And I don't even know what kinds of problems or projects are well suited to systems programming -- so far, when I've had an itch to scratch and gone to scratch it, I've found Python able to do what I want.
Now I realize that Python is in no way appropriate for all classes of problems, and that there are problems for which it is not fast enough. But thus far, the only project I'd like to tackle that I know Python will be too slow for is doing real-time audio processing on Linux with lv2 plugins and JACK. But lv2 and JACK are C APIs, so that's incentive for me to learn C, not Rust.
Understand, this isn't a knock against Rust. As I said, it's caught my eye. I just haven't found a compelling reason to actually get involved yet. I am hoping I eventually will.
> But lv2 and JACK are C APIs, so that's incentive for me to learn C, not Rust.
Maybe, but it might be an incentive to learn just enough C that you can wrap the C interface in Rust. The point of having an API is to have well-defined behavior at a particular boundary, which can reduce (though not eliminate) a lot of the reasons to use the language it was implemented in on the caller side.
I suspect learning rust will probably make you familiar enough with the basics of C that you won't have to do much specific C learning to use most libraries.
Here's an example of a lv2 plugin written in Rust: https://github.com/poidl/eg-amp_rust and here's a WIP rust wrapper for JACK: https://github.com/nicklan/rust-jack
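For a flavor of what such a wrapper looks like, here's a hedged sketch against a made-up C function (`thing_process` is hypothetical, not the real JACK or LV2 API); the unsafe surface shrinks to one small function:

    // Mirrors a C declaration like: int thing_process(const float *buf, int n);
    extern "C" {
        fn thing_process(buf: *const f32, n: i32) -> i32;
    }

    // Safe wrapper: a slice carries its own length, so the pointer and
    // length handed to C can never disagree.
    pub fn process(buf: &[f32]) -> Result<(), i32> {
        let rc = unsafe { thing_process(buf.as_ptr(), buf.len() as i32) };
        if rc == 0 { Ok(()) } else { Err(rc) }
    }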
Magicians, schmagicians. I say that as part of (possibly) that group. We've just learned to "cover our father's nakedness" so to speak.
I just hope Rust practitioners can do a few things where they have to use 'C' ( properly ), much as I think assembly is a good thing for 'C' programmers to do.
I hope the relationship between 'C' and Rust is collegial - ideally, it would approach being the same people over time because legacy code. Nothing divides like language, and flexibility is a great way to harden your skillset.
> They aren't going after C++ gurus or C magicians but people who are new to systems programming.
If this actually is their strategy, is it being done voluntarily or out of necessity? I ask, because I've witnessed enough scepticism about Rust from C and C++ programmers. Rightly or wrongly, there are enough of them who don't appear to be receptive to Rust, and likely never will be. So the Rust community may never be able to appeal to these C and C++ programmers, even if they wanted to. The only option may be to appeal to the non-C and non-C++ programmers.
> I ask, because I've witnessed enough scepticism about Rust from C and C++ programmers.
I don't think there's a single language on the planet that hasn't had skeptics, especially at the beginning. Remember how skeptical everyone was of Python at first due to its significant whitespace?
Programmers are very tribal about their tools.
> Remember how skeptical everyone was of Python at first due to its significant whitespace?
A large group of programmers still hates Python for this reason, to this day. They've just moved on and are probably completely ignoring Python these days.
Python has come a long way, but for domain reasons, the significant whitespace isn't my primary beef. Packaging is, because I haven't captured all the gnosis yet.
Thanks! To be clear, I do very much want C and C++ programmers to be using Rust as well, I just think there's a lot of opportunity in the "new to systems" group. They're also just my group of people, so I find it easier to pitch things to them. I can talk about anything Ruby at any level with a Ruby person, but I first picked up C++ in the late 90s, and hadn't been active in systems-level stuff for a long time when I came to Rust, so I'm a bit removed from feeling their pains directly.
C gets out of the way and lets you do useful things that are "undefined behavior". How convenient is it in Rust to, say, use the unused bits in a pointer (due to alignment) and put a type tag in them?
Like in C you can cast the pointer to an integer and back. Rust allows such hacks if you mark them with a "hold my beer" keyword:
    let the_bits: usize = unsafe { std::mem::transmute(pointer) };

You can also use `std::mem::forget(*pointer)` to avoid fighting with Rust about who manages the memory.

You can actually turn a pointer into a usize without using an unsafe block. For example, here's how you'd do it with a reference:
    let the_bits = pointer as *const _ as usize;
The ISA is generally pretty well-defined; a lot of the undefined behavior is introduced by C.
So I don't think it's fair to say that C "gets out of the way". It won't let you get the overflow flag, or alias arbitrary pointers, for instance.
Okay, now you've piqued my curiosity. Is that something to do because it's really smart and clever and fun, or is there a certain problem or class of problems where doing that is unambiguously the best or least-worst solution?
Haskell does it automatically as an optimisation: if an algebraic type has fewer than 2-3 cases, then it inlines the tag bits directly into the pointer, thus saving an indirection on pattern match.
Some C data structures also make use of low-level bit tricks like this to save space and reduce indirections. For instance, the hash array mapped trie uses a 32-bit mask both to track which indices of the current node are actually populated and, incidentally, how large the node currently is. It's quite clever.
These are always my go-to examples to evaluate any alleged systems programming language. No language less powerful than a theorem prover is currently capable of expressing these idioms safely.
Rust actually does a bit of this internally in the form of the null pointer optimization. If you have an Option<&T> (or Option<*T>) value, it's actually stored as a single pointer-sized value, with None being represented by a null pointer on the assumption that null is not a valid pointer.
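You can check that claim directly; a tiny sketch (the sizes hold on any platform where a reference is one machine word):

    use std::mem::size_of;

    fn main() {
        // A reference is never null, so None can reuse the null bit pattern:
        assert_eq!(size_of::<Option<&u64>>(), size_of::<&u64>());
        assert_eq!(size_of::<&u64>(), size_of::<usize>()); // one word each
    }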
As to3m says, this is often done in programming language interpreters.
If most objects in your language are heap-allocated and you want to store a small integer, you would allocate a new object on the heap with space for one integer, set up its headers, etc. You could instead set one of the unused bits in the pointer to indicate that it's an integer and not a pointer, then store the integer in the remaining bits, avoiding the heap allocation altogether. The Lisp world calls this a fixnum.
The more pointer bits you can steal, the more kinds of data you can store directly in the "pointer" itself. It's also possible to store the type of objects that are actually allocated on the heap in tags on the pointers to them, but I don't know if that's done any more.
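A minimal sketch of the fixnum trick in Rust (it assumes heap objects are at least 2-byte aligned, so bit 0 of a real pointer is always 0):

    #[derive(Clone, Copy)]
    struct Tagged(usize); // a real pointer (bit 0 clear) or a fixnum (bit 0 set)

    impl Tagged {
        fn from_fixnum(n: isize) -> Tagged {
            Tagged(((n as usize) << 1) | 1) // payload in the high bits, tag in bit 0
        }
        fn as_fixnum(self) -> Option<isize> {
            if self.0 & 1 == 1 {
                Some((self.0 as isize) >> 1) // arithmetic shift restores the sign
            } else {
                None // it's a real pointer, not a fixnum
            }
        }
    }

    fn main() {
        let t = Tagged::from_fixnum(-21);
        assert_eq!(t.as_fixnum(), Some(-21)); // round-trips with no heap allocation
    }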
You might do it if you were writing an implementation of a language such as Scheme.
> developer marketing person
What is that?
> They aren't going after C++ gurus or C magicians
Don't they have anything to gain from using Rust?
Sure they do. But when you say "gurus" and "magicians" these are people who have spent decades with the language and are awesome at it. It's possible that Rust may improve their lives, but it would take years before they have that same level of proficiency. Not because Rust is hard, but because they are so good at C++ and getting that good in anything is hard.
Rust does market to C/++ people, a lot. We just don't market to the super-awesome C++ folks. It's the same reason I preferred Word 2003 over the new stuff for half a decade. I had years of memorized shortcuts, custom macros, and general UI familiarity. I could eventually learn the shiny new Word and become as good, but the activation energy for that was too much and I was happy with 2003.
Rust isn't only going after non-systems folks. The community is roughly half systemsy. But it may seem this way because Rust tries very hard to not alienate non-systems people with jargon and unexplained systems concepts.
A developer evangelist. Think someone doing a talk about new Java 9 features at a Java conference.
C++ gurus and C magicians already have invested too deep into their languages to throw everything away and start from zero.
For example I love Rust and play occasionally with it, but for the time being C++ is my native language on the job when I need to use a native language outside .NET or JVM.
I have known it since the C++ ARM "standard", and we depend on standard OS tooling that Rust is still catching up with.
The day will come when our customers will be able to do mixed debugging between JVM/.NET and Rust. Or produce COM as easy as C++ compilers do.
But these are things that beginners in systems programming aren't usually doing.
As a C magician, rust provides too many clear improvements over C to ignore it. I certainly don't feel like I am 'throwing everything away and starting from zero,' as much of my C (and other language) knowledge transfers over to rust.
I'm not a C++ guru, but I think modern C++ is powerful enough that it doesn't feel lacking in features compared to rust, like C does. There is less of a draw for seasoned C++ programmers.
Rust seems to be gaining a lot of momentum and I am becoming more and more confident that it will be regarded as a major language for embedded and general systems programming and possibly even a successor to C.
For me C was already lacking when I got to learn it in 1992, because by then I was quite comfortable with Turbo Pascal 6.0.
Just check the feature list and type safety differences. The only advantage from C was being less fragmented than Pascal dialects.
So I became a C++ hipster (if that had been a thing in the '90s).
We used to have the same heat from C guys that C# and other language users nowadays have from systems languages.
Hence why I am always supportive of new programming languages that target the same use cases.
Developer marketing: marketing to developers. Marketing gets a bad reputation among developers, but it's vital for any technology to be marketed: keeping documentation up to date, answering questions for newcomers, explaining strengths/weaknesses, etc.
>Don't they have anything to gain from using Rust?
Who's "they"?
The magicians.
They might have something to gain, but they're probably deeply invested in the tech already.
People are very hard to dislodge from such positions unless the new position has overwhelming benefits: a new environment that doesn't support the old tools/programming language, a radical paradigm shift that risks making their previous knowledge completely obsolete, etc.
Rust is more of an incremental improvement over C++, than a radical leap forward. So C++ magicians are less likely to want to switch over.
I still use 'C'. I do this primarily because the legacy code base is staggeringly large. I'd jump on a gig doing Rust in a heartbeat, all other factors to the good.
> Don't they have anything to gain from using Rust?
Sure they do, but they're not throwing away a decade of hard-won experience in their specialties just to tinker; at least not with production code bases. The barrier of entry for beginner systems programmers is lower.
As a C++ enthusiast, I wouldn't use the words "throw knowledge away". Almost all the concepts you find in Rust map one-to-one to a C++14 equivalent (apart from pattern matching, for instance). The difference is the enforcement of these good practices by the compiler. It would most likely take me 2 days to read the latest Rust docs and 1 month of practice to be proficient.
Thing is, it's a bit like switching from Python 2 to Python 3. Why would I move to this new environment where I would need to recode everything from scratch? Python 3 will take decades to overthrow its predecessor; how long will it be for Rust? Will it ever succeed? Will the C++ committee react and borrow some of Rust's awesomeness? Can I find co-workers willing to learn Rust?
Someone who markets to developers. The language and context should clear up the ambiguity: they don't mean they're merely a developer who works in marketing for something like, say, selling potato chips.
So like a lite developer advocate
It used to be called advertising, then marketing, then evangelizing, and now advocacy, yes.
This is something that should be done as well by the guys doing D.
Even if they can succeed by going after the C++ gurus as well.
Thanks for pointing out the community aspect - they're welcoming, pragmatic, and don't forget, the documentation and examples are rather good.
Rust is what it took C++ 20 years to become, except anew and reimagined.
It's ready and usable, today.
Can you give an example of 2? Where have they invested in teaching systems programming 101? Is there a specific blog? Thanks.
The Rust book has this chapter, for example: http://doc.rust-lang.org/stable/book/the-stack-and-the-heap....
Ah neat, I did not know about the book, thanks.
No worries. I'm currently in the process of re-writing it for the second edition...
Would it be worth including a chapter(s) on writing Linux system utilities using Rust?
I hope you announce it on HN when the rewrite is finished.
I don't want to put Linux above other platforms by only including it. And it's already a huge task on its own.
I will for sure. It's also going to end up getting published by No Starch.
I love C, but I think we really have to stop building all kinds of shared libraries in C. Important code which needs to be secure and solid can't be built on C anymore; it puts everybody at risk. Just look at the disaster OpenSSL has been.
I think Rust would be great for building common crypto infrastructure and things such as cryptocurrency. It seems risky to me to build something like Bitcoin in C++, where millions can easily be at stake if the system doesn't work.
I am an application programmer so I might not be the primary target, but I started programming with Swift, and although it isn't the same as Rust it has some similarities. It is a lot stricter language than C++, C, Lua, Python and Objective-C, which I have used most in the past. So many bugs are caught at compile time. I used to be skeptical towards static typing, primarily because languages like C++ and Java made types so awful to work with. But with the newer OOP languages with more functional inspiration, it is getting easier to deal with strict typing.
You don't have to choose between productivity and safety so much anymore.
> You don't have to choose between productivity and safety so much anymore.
Exactly. If I had to sum up Rust's philosophy in one sentence, this would basically be it. (Add "and performance" after "safety" too.) :)
Safe. Productive. Fast.
Choose any three.
(Taking a hint from SQLite.)
I'll raise one set of shared libraries in particular: graphics parsers. Whether you're writing in PHP or Ruby or whatever, odds are you'll end up manipulating graphics in some C library.
Imagemagick has a long history of issues (although I appreciate the most recent major issue could have been written in any language). Mozilla only just audited libjpeg-turbo and found a series of issues, and a quick Google will point to most of the options being terrible.
I'm sure someone will (if not already) write a decent Rust alternative - but what everyone is missing at the moment is bindings for their favourite high level language with comparable APIs to their existing tools.
You are missing one very important point: bindings. You can bind to C from basically every other language, which is quite important for shared libraries.
You can expose Rust code as 'C' libraries and bind to them.
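As a sketch of that direction (illustrative function; you'd build the crate as a cdylib or staticlib):

    // An unmangled symbol with the C calling convention; any language
    // with a C FFI can bind to this like an ordinary C function.
    #[no_mangle]
    pub extern "C" fn add_checked(a: i32, b: i32, out: *mut i32) -> bool {
        match a.checked_add(b) {
            Some(v) if !out.is_null() => {
                unsafe { *out = v }; // caller promises `out` points at an i32
                true
            }
            _ => false, // overflow (or a null out-pointer) is reported, not UB
        }
    }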
Rather than writing for Linux in Rust, we need a new kernel written in Rust. I'd like to see a replacement for the QNX microkernel written in Rust. It's about 60K bytes of code, yet you can run POSIX programs on it. (You need file system and networking, which are user processes.) The QNX kernel is stable - it changes very little from year to year. There's hope of catching all the bugs. This offers a way out of "patch and release" OS development.
Yes, you take a 20% or so performance hit for using a microkernel. Big deal.
At one time, you could download the QNX kernel sources and look at them.[1] This would be helpful in getting the microkernel architecture right. It's very hard to get that right. See Mach or Hurd.
[1] http://community.qnx.com/sf/sfmain/do/downloadAttachment/pro...
seL4 is already formally verified. So the microkernel is already done. You need the userland services to provide POSIX compatibility, and those you could plausibly do in Rust.
Looks cool, thanks. They should take the opportunity to fix a few Unix/ACL security problems though, instead of just reproducing the same old POSIX quagmire. Make chroot isolation complete with plan9-like private namespaces, don't implement the traditional broken user/group security model and instead learn from the Polaris virus safe computing prototype (they're already partway there by using the capability secure seL4 kernel).
I would personally also want to eliminate a lot of the duplication in the POSIX API, but that probably won't fly. Can't have your cake and eat it too.
Redox may be exactly what you're looking for: http://www.redox-os.org/
And there are others: http://wiki.osdev.org/Rust
Redox is close, but they have pipe-like, rather than call-like, interprocess communications semantics. That means another layer of overhead. More important, it breaks the tight integration between scheduling and interprocess calls required to make a message-passing OS work fast under load.
Here's the QNX architecture document on this, which discusses how message passing and CPU scheduling integrate.[1] Microkernel designers need to read this very carefully. A good test is to run a message-passing benchmark on an idle system, then run it again with a CPU-bound process of equal priority in round-robin mode also running. If the message-passing task starves, or the CPU-bound task starves, message passing was misdesigned. If, on a multiprocessor, a simple message pass causes a CPU switch, message passing was done wrong.
If message passing and scheduling do not play very well together, a service-oriented architecture (sorry, "microservices" architecture) will be sluggish. This is where most microkernels fail.
[1] http://www.qnx.com/developers/docs/6.4.1/neutrino/sys_arch/i...
QNX is proprietary. Also, the NetBSD anykernel is where I think the sweetspot is.
Or MINIX. You could rewrite MINIX's microkernel servers one by one. You have a working kernel and userspace on day one and the end result is a robust microkernel design written in a safe language.
Is there any reason why embedded software for autonomous vehicles is still being written in C/C++? This last week I was talking to a friend at a company that makes a small autonomous vehicle. During testing their prototype suddenly went off in a straight line. They had to pull a safety to halt the vehicle or it would have gone straight forever into the Pacific Ocean. Turns out there was an unsafe access to a variable in memory, which had not been caught with their software and hardware test platform, even with thousands of virtual sorties.
If their code was written in Rust, that sort of bug could not have occurred.
> any reason why embedded software for autonomous vehicles is still being written in C/C++
Nearly a half-century of momentum in the ecosystem: developers, mature tools, etc. Until Rust arrived, it was nearly the only game in town for predictable low-latency systems programming.
Yes, now that Rust's here there's a bit of an alternative. But if you've got a team of 30+ software devs who know C/C++ and an existing well-tested codebase of millions of lines, even if you had multiple Rust champions it would take a very long time to evolve towards Rust.
A more productive line of questioning, rather than prodding them to convert their probably-quite-substantial codebase into a language that, with all due respect to the Rust team, is probably still a wee bit cutting-edge to be putting in self-driving cars, is to ask them what static analysis tools they've been using and why those tools missed this one.
If the answer is "no static analysis tool", there's your problem.
But as my first paragraph implied, they can't catch everything, so "we installed many layers of protection and it still got through even so" is definitely a possibility. Rust may have a different set of such issues but they will of course always exist.
No, there's no reason, as it could have been written in Ada and gotten many of the same safety guarantees as Rust provides.
Given Ada's history in safety-critical systems (avionics), it's actually somewhat surprising more didn't use Ada. They could have just adopted the military standard (which is fairly stringent, as I understand it). The military is pretty averse to losing billion-dollar pieces of equipment, so they probably take quite a few precautions.
Didn't most of the military-industrial complex drop Ada in favor of C or C++ as soon as the DoD dropped the requirement that critical software must be done in Ada?
AFAIU the JSF software is done in C++; there is (or at least used to be) a JSF coding guidelines document on Bjarne Stroustrup's web page. Of course, blaming C++ for the JSF boondoggle is unfair, but still, one wonders whether it was wise of the DoD to allow C/C++...
You got me as to why this is. Dude, I tried back in the day - Ada, MODULA, all those.
Maybe the "badass 'C' hax0r" meme was stronger than I realized. But I think a lot of it was just switching cost.
To be fair, Ada has a lot of library cruft for dynamically sized structures that you don't have to deal with in C. It can be pretty annoying.
There's two reasons Rust might not be ready here yet. First, while LLVM supports a wide number of platforms, some embedded devices literally only support the exact version of the C compiler they ship to you, sometimes, it's even got its own custom patches. Second, we sort of assume 32 bits at the lowest, though we have a patch in the queue that starts some work on 8/16 bit support. This means some tiny micros are out of reach at the moment.
> we have a patch in the queue
If you mean this one[1], it's merged. Still lots of work to do, and even more corners where things will shake out[2], but there's definitely progress.
[1]: https://github.com/rust-lang/rust/pull/33460 [2]: https://github.com/rust-lang/rust/pull/34174
Ah nice! I was unsure if the first had gotten through bors yet or not, and I was pretty sure the second one hadn't.
A rust -> C compiler would be really nice for those custom/slow updating environments, but I can understand if that just too much of a distraction.
Since I got two replies with basically the same thing at the same time, I'll pick one at random and it'll serve as a reply to both. You won the coin flip :)
This is feasible in a sense, but C is a fairly tricky target to compile to: you have to make sure that you don't accidentally include UB in the code you generate. I know pcwalton has lots of feels here...
The easiest way to do it would be if LLVM had a C backend; I know that it did, but it was removed a few years back, and I haven't heard anything about it coming back into tree yet. MIR might also in theory enable new backends, but then you'd have to re-implement all of the optimizations that we currently rely on LLVM supplying.
> I know that it did, but it was removed a few years back, and I haven't heard anything about it coming back into tree yet.
(out-of-tree) fork(s) have been kept alive by several groups. The most current one I'm aware of is: https://github.com/JuliaComputing/llvm-cbe
I'll just say - I've generated A Great Deal of 'C' code. It's not hard to avoid UB at all. You only use a very concise subset of the language. YMMV.
This sounds more like generating 'C' is a distraction rather than a goal.
It is certainly not impossible. I think it's just presented as a bit more trivial than it is. You can generate C, but is it easy to generate good C? That's what I was trying to get at with the LLVM comments. Rust relies a lot on a good optimizer; a straightforward transformation might be significantly slower.
Agreed. I did not mean to minimize the effort involved.
Is there any possibility of compiling Rust into C and then use that particular C compiler that works for the processor in question?
Industry standards, certified compilers and cargo cult are usually the main reasons.
This is the main reason. Also legacy. I work on a lot of embedded code for $AUTO_MANUFACTURER; some of our code bases go back to the '90s.
The code base is so modified by macros/typedefs it's hardly even C anymore.
Not to mention that LLVM has to support the embedded device you are targeting. And Rust's support for legacy CPUs (8008, 8080, 80386, 68000) is lacking.
Oh, god, the macros! I also work in the auto industry, and the macros are frigging everywhere and make navigating the source a nightmare.
I would sum it up in a similar fashion. Another point that should be taken into consideration is that the industry is not really interested in software, and therefore most stuff will be done as it was always done. And now that everything should be based on Autosar (a standardized C-based OS and Giga-framework specification) chances are even lower that anybody looks at saner alternatives.
Rust won't change anything about this; if there were interest in changing the situation, alternatives like Ada have been available for years.
But presumably some sort of bug would have. Broken is broken. If Rust correctly deduces the intent leading to the bad dereference, then it's REALLY GOOD! :)
( no snark; I hope you get my point )
Ironically, reliability is actually a point of merit with 'C'/C++ - in some cases. It's just that the ways of achieving it seem rather inaccessible these days, or the flow of people getting exposed to them is not working out.
I don't think there will ever be a way around developing proper test vectors. It's quite interesting work but it tends to go unrewarded.
As a C++ developer for 13 years, I've got to say that the language just makes screwing up so much easier. The last year I've been coding Swift, and man, it is so much easier to avoid so many of C++'s pitfalls. I could write page upon page about all the problems with C++ and how those problems don't exist in modern languages like Swift and Rust.
Programming languages aren't just fashion; we invent them because we think we can solve old problems in better ways.
Most of my co-workers doing C++ never even wanted to look at the alternatives. C++ being the only thing they have ever done, they don't even realize how bad it is. They have just internalized it.
You'll get no argument from me. That RAII exists at all is the best evidence ever ( even though I've used a variation on RAII in assembly in the past ).
The interesting question is - are there actually fewer defects, objectively, or are they simply rendered .. something like latent?
One I fixed in... April - if the file system on an SD card was scrogged, writing to the file system crashes the box. So I moved the write of a configuration file from the event of a switch change ( because if the switch was never put in that position, then there was no reason to ever do that ) to the top of the program so it'd crash when you powered up.
It helps me personally to think that defects are just something I've chosen to do despite my best effort. Keeps me on my toes. I certainly understand people being fatigued by that.
I would guess maturity of the wider ecosystem around the language. Not necessarily the language itself or the immediate tooling & toolchain, but the whole stack of sensor support, hardware support, mature algorithm implementations, etc. Rust is maturing quickly but it would take a long time (years/decades!) to expand out to all the niches which have been built around C/C++ even if it were an absolutely perfect replacement.
There are standards (MISRA C) which are supposed to stop things like that happening. Perhaps they weren't being followed?
There are other safe languages they could have used which have a longer track record than Rust, e.g. Ada. It's used in avionics. Why shouldn't it be used here?
While a decent guideline, MISRA does not guarantee correctness. There are many ways you can twist code so that MISRA will not complain but the code will be wholly broken.
I do not know what kind of unsafe memory access happened in their systems, but you can do all sorts of memory operations, and as long as the explicit typecasts are a-ok, MISRA won't flinch.
Totally agree; you can write software that's perfectly MISRA-compliant and still contains lots of different bugs.
Truth be told, the question here is: "Is the class of errors that is prevented by Rust natively also prevented by MISRA?"
My take is that while there is some overlap, MISRA is unable to guarantee anything, while Rust is able to guarantee certain things that C can't. (that's from my limited understanding of Rust, I haven't futzed with it yet)
I don't think C (or C++) should be used for autonomous vehicles at all, as it is known to be unsafe, but if it is, the MISRA C guidelines or something similar should be used to help prevent certain kinds of bugs.
Almost any other statically typed language, along with similarly strict guidelines, would be preferable to C, but there is no ideal language. Rust still allows dynamic heap memory allocation and recursive functions. It is also new. Ada has been used for decades.
MISRA is already used extensively in the auto industry. But i guess what I was trying to say is that while it helps, it can easily be tricked while a compiler designed with the safety measures MISRA promotes already baked into it will not let you do certain things.
Agreed.
MISRA is okay. It's not a panacea.
Wikipedia says it is proprietary...
Yeah, although the principles/rules are in quite public places. It's rather like the ITU standards in that regard - you pay for the documents.
Applying them feels like buying indulgences. :)
A bunch, including:
1. Most of those platforms don't have compilers for any languages other than C(++). If the platform has a lot of history behind it, maybe you could write it in Ada, but that's pretty much it.
2. Development tools (debuggers, static analyzers, standards compliance verification tools and so on) for C and C++ are very hard to match, both in strength and in sheer availability. Meanwhile, Rust only relatively recently got decent GDB support.
3. A lot of Rust's features simply aren't needed when writing this kind of software (e.g. the breadth of features related to memory management is largely unneeded because everything is statically allocated).
4. For better or for worse, C is well-understood (C++ is... well, not that I haven't seen good safety-critical code written in C++, but in my experience, C++ code is a lot easier to get wrong, both by humans and compilers). Rust isn't, not yet in any case. There's no Rust equivalent for e.g. MISRA, and not because Rust doesn't need one.
5. To, uh, put it bluntly -- C and C++ are very well known in the far corners of the world where a lot of this software is outsourced. Rust -- not so much, because outsourcing companies don't really encourage their employees to learn this kind of stuff.
6. There's a lot of commercial risk involved. I'm not sure about autonomous vehicles, this is probably a more volatile field, but many safety-critical systems have to be maintained for a very long time (10 years is fairly common, and 15-20 isn't unheard of). Rust may well be dead and buried ten years from now, whereas language enthusiasts have been singing requiems to C (on roughly the same tune as Rust, no less) for almost thirty years now.
Rust is a great development in this field and I can't wait for the day when we'll finally put C (and especially frickin C++, Jesus, who writes that!) to sleep, but it's at least five years away from the point where I'd even half-heartedly consider it for a project with critical safety requirements.
> If their code was written in Rust, that sort of bug could not have occurred.
I don't know the specifics of the bugs you mentioned, so I can't really comment on this, but in my experience, most of the similar claims that float around the Interwebs are somewhat exaggerated when put in their proper context. E.g. Heartbleed, which wasn't because C something something PDP-11, but because someone decided to be smart about it and implement their own (buggy) memory management system so as to make the damn thing run decently on twenty-year-old operating systems.
I've seen people write that kind of code, for similar reasons, in Java and Go -- and, at least once, with Heartbleed-like results. The ways in which a language can be misused rarely reveal themselves before that language breaks out of its devoted community.
To be clear on it though -- I think Rust is a step in the right direction, and one that we should have taken a long, long time ago. If it can make it through its infancy, and if it can get enough commercial support, it will be a great alternative to C and C++.
> A lot of Rust's features simply aren't needed when writing this kind of software (e.g. the breadth of features related to memory management is largely unneeded because everything is statically allocated).
True, but I don't think that Rust's other features wouldn't be useful here. References which know about mutability/immutability, sum/enum types, "fat" pointers/slices w/ bounds checking, the ability to construct library APIs which enforce non-memory safety through session/affine/linear types, sane integer typing, etc, could all still be useful to a fully-statically-allocated program.
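To illustrate a couple of items from that list in a heap-free setting (a sketch, not real vehicle code):

    // A sum type makes invalid states unrepresentable, and a bounds-checked
    // slice view refuses to read past a statically allocated buffer.
    #[allow(dead_code)]
    #[derive(Clone, Copy, Debug)]
    enum Mode { Idle, Sampling, Fault(u8) }

    static SAMPLES: [u16; 64] = [0; 64];

    fn window(buf: &[u16], start: usize, len: usize) -> Option<&[u16]> {
        let end = start.checked_add(len)?; // no silent overflow
        buf.get(start..end)                // None instead of out-of-bounds UB
    }

    fn main() {
        let mode = Mode::Sampling;
        println!("{:?}, {:?}", mode, window(&SAMPLES, 60, 8)); // Sampling, None
    }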
> static analyzers
Rust doesn't exactly need these, no? Most static analysis in C/++ is safety/UB focused. Rust doesn't need this, unless you're going to spend a lot of time with `unsafe` code.
Rust does have clippy, a lint library with >150 lints which catch things ranging from correctness to style to safety issues. I'm one of the maintainers, so I'm biased, but I've personally found it to be much better than its equivalents in C++land. Perhaps not Javaland.
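For instance, a snippet like this (contrived, but both lints are real clippy lints) gets flagged:

    fn main() {
        let x = 1.0_f64;
        if x == x { // clippy: `eq_op` -- identical operands, always true (barring NaN)
            println!("always");
        }
        let c = 2.0 * 3.14 * x; // clippy: `approx_constant` -- use std::f64::consts::PI
        println!("{}", c);
    }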
Static analyzers are good for a lot more than just finding potentially unsafe memory access. In fact, memory access bugs are typically just the low-hanging fruits that static analyzers find (and which, most of the time, you can find by code review, assuming your team consists of more than two developers and that they actually get some sleep every once in a while).
It's issues related to timing constraints, incomplete branches, common but subtle mistakes (e.g. in C, suspicious memory allocations, like malloc-ing strlen(x) instead of strlen(x) + 1 bytes) and so on. E.g. http://www.viva64.com/en/examples/ . Many of these are, indeed, possible because unsafe memory access is allowed without restriction, but they're fewer than one might expect. Most of them are either language warts (which no language is devoid of, no matter what its fans would say) or programming blunders that occur because our brains work the way they do.
I know they are, I'm saying that the main attraction is memory access stuff (at least, for me it was when I used to use them).
For the rest, Rust does have static analysis tooling of the kind you describe in the form of clippy. There's still a lot that can be done, but it's already quite helpful and catches all kinds of things.
> I know they are, I'm saying that the main attraction is memory access stuff (at least, for me it was when I used to use them).
It depends on what you're working on. In the context of the original question of the thread (i.e. autonomous vehicles), I'd consider memory access to be the least difficult thing that static analysis can help me with. With code review, careful structuring of your data and, if it's available, hardware support for memory access models (e.g. ring buffers), memory access bugs can be reasonably avoided even without code analysis tools (not that they should be!). Things like timing analysis are a lot harder to do without proper tools.
> For the rest, Rust does have static analysis tooling of the kind you describe in the form of clippy. There's still a lot that can be done, but it's already quite helpful and catches all kinds of things.
I would, uh, rather not be put in a situation where I have to send documentation to an approval body, and have the documentation mention -- as the only static analysis tool that was used -- a community project that's at version 0.0.75.
For comparison, there was a thread around here a while ago, where I think Gerard Holzmann from JPL mentioned how they used several (something like the top 5) code analysis tools to check their code. A list of all available static analysis tools for Rust would be a lot shorter than that.
This kind of stuff is important for mission-critical applications. I like Rust and I think it's a step in the right direction (and would certainly love to see it go all the way in that direction!) but I'm not about to write code that could kill people in a language whose only viable compiler barely reached 1.0, barely has a working debugger and only lint-level static analysis. It's the right track, but we're not there yet.
Edit: oh -- and I would like to point out one thing that seems to be often lost in the HN bandwagon. If you look at the numbers, it turns out that programmers have been able to get C and C++ to perform reliably for quite some time now. Failures are high-profile, but by and large, the medical, space and automotive industries have been doing a pretty good job at delivering safe tools, considering how many cardiac pumps, cars, airplanes and space probes are around and how few of them fail. It goes without saying that we should aspire to better, but the status quo is really hard to beat.
> I would, uh, rather not be put in a situation where I have to send documentation to an approval body, and have the documentation mention -- as the only static analysis tool that was used -- a community project that's at version 0.0.75.
Heh. Yeah, clippy has a lot more to do, and it's not a product with official support, but so far it's been pretty useful :) The version number is just because I want to have an rfc about it before I release a 1.0.
But yeah, it's nowhere near the level of lint tooling C++ has. For most people, I believe it might be sufficient, but for mission-critical stuff I'm not so sure -- you're probably right. Though Rust's type system also might help in creating safety guarantees (non memory safety) for mission critical things.
> only lint-level static analysis
what do you mean? Clippy calls itself a linter because it uses the lint API, but it does all kinds of static analysis. The meaning of "lint" in the Rust community is slightly overloaded.
> if you look at the numbers, it turns out that programmers have been able to get C and C++ to perform reliably for quite some time now.
Of course :)
Rust is not stable. The language is not battle tested like C/C++.
I will certainly concede the second, but we put a _lot_ of effort into ensuring that Rust is stable. Things have changed a lot since the pre-1.0 days.
> Things have changed a lot since the pre-1.0 days.
The instability of the pre-1.0 days left a very bad impression on many people who tried Rust then. They came to know Rust as a compile-today-but-not-tomorrow kind of language. What, if anything, is being done to try to inform these people that the situation has changed, to encourage them to try Rust again? What's being done to restore Rust's reputation?
> What, if anything, is being done to try to inform these people that the situation has changed, to encourage them to try Rust again? What's being done to restore Rust's reputation?
A lot? For example, "Stability as a Deliverable", which was here on HN: http://blog.rust-lang.org/2014/10/30/Stability.html
I'm not sorry for opening up the language during its early development. The alternatives would have been to produce a deeply flawed language or to keep the language secret. Both of these are far worse than a few ignorant comments.
There are still plenty of crates that say "you need to be using nightly!".
That put me off starting to develop something in Rust right now, unfortunately, because I'm a huge fan of the way Rust was developed and its core ideals.
We can't force people to use the stable version of Rust. But it does exist. And we're working on bringing the most popular nightly features to stable as soon as possible.
In any case, even if you're using nightly your code won't break nearly as much as it did pre-1.0. We use nightly in Servo and we've been through dozens of Rust upgrades that sailed through without a hitch--and we have 150+ dependencies.
Oh absolutely - people are inevitably going to want to play with the new shiny (for various reasons, improved functionality and novelty being the two biggest), and Rust is still relatively young so the crate ecosystem is, while not small, not yet comprehensive.
This means that while Rust-the-language is stable, the ecosystem around it isn't quite yet. That's perfectly fine and it's nobody's fault, least of all the people developing the language. But it is one of the barriers to me picking it up right now, though.
That's more about encouraging crate developers to make their packages work on stable. There are crates that are made to take advantage of nightly features, but I believe many of those label themselves as "beta", "unstable", or "only works in nightly".
If somebody uses a piece of software or language that is pre-1.0 with "Beta" and "unstable" written all over it and then they get mad when it changes, that is purely their fault.
Among the programming community of people who actually write real things and create production software, I highly doubt any of them are the people you are referring to, and thus Rust should be doing nothing to restore its reputation with those people, as they are ignorant and/or highly incompetent, and I hope they never write a piece of code that makes it into production.
Well before 1.0 itself, we were already trying to telegraph our intentions here. October 2014! http://blog.rust-lang.org/2014/10/30/Stability.html
We try to be fairly vocal about the things we're doing here, but of course, it can be tough to get the word out. While some people may not know things have changed, a lot of people also do.
Some examples of things we do to ensure stability:
* The RFC process requires lots of discussion before major change happens, to ensure we can do things in a compatible way.
* We run a tool, "crater", both on PRs that we are worried might cause issues, and just in general. This tool compiles all of the open source Rust code on crates.io with the new revision, and reports problems. It's not perfect, but it helps a lot.
* For that matter, we don't merge any code ourselves; bors manages a fleet of 30ish machines that test every commit with our full test suite.
* We recently added three significant crates (and their dependencies, which last I checked was around 80ish crates in total?) to be part of our test suite, so we know that they build properly on every commit.
> What, if anything, is being done to try to inform these people that the situation has changed, to encourage them to try Rust again?
Releasing something called 1.0.
The Rust project has been vocal about its stability guarantees; all that's left is to correct people when they claim the language isn't stable.
http://blog.rust-lang.org/2014/10/30/Stability.html
EDIT: As you can see from the number of responses you got, this matters a lot to the Rust community :-)
If anything, we have better stability guarantees: we actually run the beta against the entire ecosystem before releasing.
Rust has been _extremely_ vocal about its stability guarantees. Not sure what else can be done here.
Yes, Rust is younger. There is less code out there running to root out undefined behavior.
Except Rust allows for less undefined behavior. I wouldn't be surprised if it improved at a faster rate than C. Or C++ (*)
(*) Please don't say C/C++. They are different beasts.
Addendum to your nitpick: Also don't even leave it at just "C++" unless you really are a master of everything in C++98 plus all the new stuff in C++14.
Even the name tells you that C++ was originally designed to be C with extra features. It can still be used that way and frequently is. These days, there are contexts in which the differences are very significant, and there are many others where any statement about either applies to both, and in those latter contexts "C/C++" is completely valid.
And C++ was originally just a C pre-processor. It's not anymore. So what it was originally is not that relevant nowadays. And that's the whole point. They have diverged too much.
These contexts you speak of are rare enough that the "no-C/C++" heuristic is useful.
Can you give specific examples of this alleged lack of stability?
I'd like to be able to use something like Rust, and maybe I will for smaller projects or for novelty's sake, but I chafe at how slow compilation is relative to C (not C++!) projects, last I checked.
On the hardware of yesteryear, a parallel compile could build Postgres in about 45 seconds (750-1305 KLOC, depending on measurement), and user-mode Linux (which doesn't compile so many drivers) in about a minute.
We've made steady improvements here, so depending on when you checked, it might be much better.
The real improvements will come when incremental compilation lands. The precursor requirements are just landing now, so it won't be here immediately, but it will be soonish.
Well, Rust is awesome, but there is a place for C too. I just don't understand the lack of life and improvements in C for ages: a better type system (for example, _Generic doesn't know the uint8_t etc. types, since they are just typedefs), a 'pure' keyword for functions without side effects, tuple support, deprecating a lot of things, and so on.
> I just don't understand the lack of life and improvements in C for ages.
Well, MSVC still hasn't even fully implemented C99. One of the big draws of C, as I see it, is its wide support on many operating systems and architectures. If you're going to abandon that by using new C features, you might as well use a language with less cruft.
>Well, MSVC still hasn't even fully implemented C99
Neither does GCC. Neither fully implements C11 either.
Most of these features are things C compilers don't need to support themselves: special integer types, for example, can live in libraries instead of compilers.
Also, the bounds-checking interfaces are a performance loss and not included in C compilers, despite being part of the C11 standard (well, they're optional).
It's harder than it looks. If you improve things willy-nilly, then you split the language - some will use the more modern version, some will stay behind.
IMO, a better evolution is to do what the Rust folks have done - define a new language. This way it has a new name and you don't have to qualify which version of 'C' you mean.
_Generic absolutely can handle uint8_t. In fact, the reason _Generic is problematic is precisely because on most implementations uint8_t is a typedef to unsigned char. But in a _Generic list you can't specify multiple compatible types. If you're unsure if uint8_t is compatible with unsigned char, you have to chain multiple _Generic expressions, nesting one inside the default: case like: `_Generic(x, uint8_t: foo_u8, default: _Generic(x, unsigned char: foo_uc, default: baz))`.
So if I specify both unsigned char and uint8_t as cases in the same _Generic expression, with GCC 6.1 I get:

    foo.c:6:2: error: ‘_Generic’ specifies two compatible types
      uint8_t: "uint8_t", \
      ^
    ...
    foo.c:5:2: note: compatible type is here
      unsigned char: "unsigned char", \
      ^

and with Apple clang-7001.81 I get:

    foo.c:12:22: error: type 'uint8_t' (aka 'unsigned char') in generic association compatible with previously specified type 'unsigned char'

Another issue with _Generic: you have to be careful with type promotion, especially because everything smaller than int is quickly promoted to int in most kinds of expressions. Yet another issue is type qualifiers: (int) is different from (const int) is different from (volatile int) is different from (const volatile int). _Atomic and restrict increase the permutations.
I have a fuzzy memory that early clang had a wrong implementation of _Generic that didn't obey the standard. But as far as I know, today both clang and GCC have identical behavior. Whether Microsoft implements it compatibly if they add it is another question. For example, Microsoft has an idiosyncratic interpretation of macro tokenization and evaluation that makes implementing certain C99 variable argument macro constructions difficult.
Performance. Rust is still twice as slow as C (http://benchmarksgame.alioth.debian.org/u64q/performance.php...) which is still a fair bit slower than if a skilled assembly programmer had taken on the task.
Rust aficionados will say that their compiler is getting better, but so is C. clang has gotten faster than gcc on some benchmarks, and on some others gcc has caught up and is now faster than clang again.
But what if you don't need optimal performance? Then you can use Rust. But then you can also use Go, Python, SBCL, Haskell, Java, C#...
This difference on this test is caused by Rust not having stabilized SIMD support. Also, Rust supports (on nightly) the hand-rolled assembly that C has.
On non-SIMD tasks Rust/C are neck and neck https://benchmarksgame.alioth.debian.org/u64q/rust.html
You're just cherry-picking benchmarks. In the cases where you care about raw number-crunching power, you'll likely be using a GPU, not SIMD instructions, as CPUs are roughly 3-4 orders of magnitude slower than GPUs at pure number-crunching tasks.
Not that SIMD isn't important, as its instructions also cover things like AES, SHA1/2, random numbers, cache pre-loading/eviction, memory fences, and fast loading paths. But so few programmers worry about these things that you are really hitting a niche market.
GPUs aren't a panacea. Such generalizations are wrong and will have you rearchitecting your approach once you hit GPU I/O bottlenecks.
>GPUs aren't a panacea. Such generalizations are wrong
This is true. But if you are doing hard number crunching, you are using a GPU once you exhaust what a CPU can do. And most of the time before you even touch SIMD, as it's only a 4-8x speedup, while a GPU is 1000-10,000x.
>will have you rearchitecting your approach once you hit GPU I/O bottlenecks.
90% of these are caused by bad software: either using legacy APIs, or writing code that forces the GPU to talk to the processor more often than necessary.
PCIe bandwidth is ~7.88GB/s [1] on modern Intel chips (post-3xxx series). Compute GPUs often offer >10GB of on-board RAM, with last generation's flagships at a staggering 32GB.
And if you are hitting a wall with GPU compute/IO limits, CPU SIMD instructions aren't going to help you in the slightest.
[1] http://www.tested.com/tech/457440-theoretical-vs-actual-band...
It's not the bandwidth that's the problem, it's the latency. Many problems need more control flow and branching, and that is better done on the CPU. If you need to make decisions and take the previous iteration's output as an input, then the overhead of transferring the data back and forth between CPU and GPU outweighs the benefit of the GPU's speed.
GPUs are good if you have an independent data parallel algorithm you are using to transform large blocks of floating point data. For other uses, CPU is better.
> This difference on this test is caused by Rust not having stabilized SIMD support. Also, Rust supports (on nightly) the hand-rolled assembly that C has.
Cool. Let's call that language with SIMD and inline assembly support FutureRust(tm) to differentiate it from the currently released and available Rust. We can have a discussion about how fast FutureRust will be vs C, but this discussion is about Rust vs C. Or rather clang 3.6.2/gcc 5.2.1 vs Rust 1.9.0 since language performance is very implementation dependent.
> On non-SIMD tasks Rust/C are neck and neck https://benchmarksgame.alioth.debian.org/u64q/rust.html
In 5 of 10 benchmarks, C is twice as fast as Rust. In one of the benchmarks where it is neck and neck, like pidigits (https://benchmarksgame.alioth.debian.org/u64q/performance.ph...) it appears to be so because both the C and the Rust variant are wrapping libgmp. GMP is written in C.
>In 5 of 10 benchmarks, C is twice as fast as Rust
fannkuch-redux why? SIMD
fasta-redux why? SIMD
spectral-norm why? SIMD
reverse-complement why? SIMD
N-Body why? Oh you guessed it SIMD
Seriously, read the source code. Remember how on HN a lot of people constantly say the benchmark game is really crappy? This is why. All 5 of these tests boil down to raw FLOPS, which C/C++, having access to SIMD instructions, win at.
The fact that the Rust/C performance difference works out to just the ability to emit vector instructions says a lot about everything else in Rust. The fact that Rust can dereference, pass variables on the stack, call functions, and make decisions as fast as C renders your core point completely moot.
You are just being incredibly pedantic for no reason, and your argument holds no water. Everything Rust does is identical to C except one barely used corner case. They use the exact same model of computation; they both live in the Cee-LangVM. Post-compilation they are functionally identical (except Rust makes stack manipulation easier).
Does any of that make sense to you?
Also, Rust and C both calling GMP with no time difference is a good thing. The Rust->C FFI overhead is literally non-existent in practice; dipping into C code from Rust (and vice versa) has no penalty. The same can't be said for HUNDREDS of languages.
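For the curious, here's roughly what "no penalty" looks like from the Rust side: a minimal sketch calling libc's strlen (declaring the function ourselves is the usual convention; usize standing in for size_t is an assumption that holds on mainstream platforms):

    use std::ffi::CString;
    use std::os::raw::c_char;

    extern "C" {
        // libc's strlen, declared directly; rustc links libc by default,
        // so no wrapper library or code generation step is involved.
        fn strlen(s: *const c_char) -> usize;
    }

    fn main() {
        let s = CString::new("hello").unwrap();
        // This compiles down to an ordinary C call instruction:
        // no marshalling layer, no runtime bridge.
        let n = unsafe { strlen(s.as_ptr()) };
        println!("strlen says {}", n); // prints 5
    }

CString exists only to guarantee the trailing NUL; the call itself costs what it would cost from C.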
Rust is also slower in binarytrees, regexdna and fasta. SSE is not one "barely used corner case", because huge amounts of performance-critical code take advantage of it.
Edit: The reason I don't believe you when you say that "post compilation they are functionally identical [in performance]" is that, if it were so, you could just transliterate the C solutions into their Rust equivalents and they would run as fast as C. Since that hasn't been done and is trivial to do, my conclusion is that it doesn't lead to the same performance.
Did you know Rust was quite a bit faster than C in regexdna merely a few months ago? It didn't get slower because of Rust. The algorithms employed are radically different. My hope is that the regex library has already regained performance, but until the benchmark game is updated (which is on us, not the benchmark game maintainer), I suppose we'll have to suffer the pedants!
Or perhaps, you might look at single threaded performance and wonder, maybe there is something more interesting going on than a naive surface analysis of C vs. Rust! :-) https://benchmarksgame.alioth.debian.org/u64/rust.php
And by the way, transliterating a regex library isn't trivial. I invite you to transliterate Tcl's regex library. Let me know how that goes. ;-) So I think your reasoning is specious at best.
> It didn't get slower because of Rust.
Do you mean the program became relatively slower because of changes you've made to the regex crate?
Wasn't the program relatively faster because you wrote the regex crate to use Aho-Corasick for the matches required by the regex-dna task?
> Do you mean the program became relatively slower because of changes you've made to the regex crate?
Yes. The underlying reasoning is complex. When the regex crate got a lazy DFA (similar to the one used by RE2), the vast majority of regexes got significantly faster. Some got slower. This one in particular from regex-dna:

    >[^\n]*\n|\n

Before the lazy DFA, compile-time analysis would notice that all matches either start with `>` or `\n` and do a fast prefix scan for them. Each match of `>` or `\n` represents a candidate for a match. Candidates were then verified using something similar to the Thompson NFA, which is generally pretty slow, but the prefix scanning reduced the amount of work required considerably. Once the lazy DFA was added, the prefix scanning was still used, but the lazy DFA was used to verify candidates. It's faster in general by a lot, but the lazy DFA requires two scans of the candidate: one to find the end position and another to find the start position. That extra scan made processing this regex (on the regex-dna input) slightly slower.
I've since fixed some of this by reducing a lot of the match overhead of the lazy DFA, so my hope is that it's back to par, but I haven't done any rigorous benchmarking to verify that.
> Wasn't the program relatively faster because you wrote the regex crate to use Aho-Corasick for the matches required by the regex-dna task?
Aho-Corasick is principally useful for the second phase of regex-dna, e.g., the regexes that look like `ag[act]gtaaa|tttac[agt]ct`. (In the last phase, all the regexes are just single byte literals, so neither Aho-Corasick nor the regex engine should ever be used.) Performance here should stay the same.
On that note, I have a new nightly-only algorithm called Teddy that uses SIMD[1] (which replaces the use of Aho-Corasick for those regexes) and is a bit faster. I got the algorithm from the Hyperscan[2] project, which also does extensive literal analysis to speed up regexes.
To clarify, this optimization is generally useful because a lot of regexes in the wild have prefix literals. Even something like `(?i:foo)\s+bar` can benefit from it, since `(?i:foo)` expands to FOO, FOo, FoO, Foo, fOO, fOo, foO, foo, which can of course be used with Aho-Corasick (and also my new SIMD algorithm).
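To make that concrete, here's a minimal sketch using the regex crate; nothing in it is specific to Teddy or nightly (the `regex = "0.1"` dependency line is the assumed setup of the era):

    extern crate regex; // assumes regex = "0.1" in Cargo.toml

    use regex::Regex;

    fn main() {
        // `(?i:foo)` expands to a small set of literal prefixes
        // (FOO, FOo, ..., foo), so the engine can scan for those
        // literals quickly before running the rest of the regex.
        let re = Regex::new(r"(?i:foo)\s+bar").unwrap();
        assert!(re.is_match("FOO\tbar"));
        assert!(!re.is_match("foobar")); // no whitespace, no match
    }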
One also must wonder how well a C program using PCRE2's JIT would fare on the benchmarks game. From my experience, it would probably be near the top. It's quite fast!
[1] - https://github.com/rust-lang-nursery/regex/blob/master/src/s...
> One also must wonder how well a C program using PCRE2's JIT would fare on the benchmarks game.
Let's hope some C and C++ programmers take up the challenge ;-)
Please don't point people to u64 -- it's no longer updated. (Note the rustc version.)
Slower by a tiny amount, and still faster than other C implementations. It's within the error box.
Also iirc there are improvements to those benchmarks in the pipeline, idk what happened to them (Veedrac and llogiq had something in mind).
Sure, you could hand-translate C in many cases (not regex), but that would be far from idiomatic. Most of the Rust solutions try to still look Rust-y.
Regarding SSE: if you care about performance and SSE, use a nightly compiler. That option exists. Rust nightly is still Rust.
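(For what it's worth, stable Rust isn't entirely shut out of vector code either: LLVM's autovectorizer can kick in on straightforward loops in --release builds. A hedged sketch; whether it actually vectorizes depends on the loop shape and the target CPU flags:)

    // Sum of squares over a slice. Integer reductions are associative,
    // so LLVM's autovectorizer is allowed to turn this loop into SSE/AVX
    // code in --release builds, even on stable Rust. Explicit, guaranteed
    // SIMD intrinsics are the part that still requires nightly.
    fn sum_squares(xs: &[i32]) -> i64 {
        xs.iter().map(|&x| x as i64 * x as i64).sum()
    }

    fn main() {
        let xs: Vec<i32> = (0..1024).collect();
        println!("{}", sum_squares(&xs));
    }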
You can also just bundle Rust with LLVM and have it JIT-compile your application on startup, which'll yield huge performance gains too.
But people may get salty about binary image size.
>> Remember how on HN a lot of people constantly say the benchmark game is really crappy? This is why.
Because the benchmarks game shows some programs to be faster, and you agree those programs actually would be faster? :-)
>> All 5 of these tests boil down to raw FLOPS.
Where exactly are the floating-point operations in the fannkuch-redux Rust #2 program?
Where exactly are the floating-point operations in reverse-complement?
("Seriously read the source code" ?)
>> fannkuch-redux why? SIMD
Look how many other programs, written in various languages, are shown ahead of the fannkuch-redux Rust #2 program.
Maybe you can write a better Rust fannkuch-redux program (even without SIMD).
Nightly Rust is also "currently released and available," and a significant part of the ecosystem can take advantage of it.
Besides, the vast majority of the work to close the gap between C and Rust in the benchmark game was from people optimizing the benchmarked programs, not from any language or compiler changes. There is no inherent 2x slowdown in any meaningful sense.
The inherent 2x slowdown just works out to `__m128` vs `f64`. C can double Rust's FLOP throughput.
The fact that function calls, if statements, and passing variables to functions are identical in speed to C is lost on the parent poster. These core points show that Rust and C are equal in speed.
Or it is faster than C (http://benchmarksgame.alioth.debian.org/u64q/performance.php...). Depends which link you click on.
From the looks of it, that Rust program spawns 20 threads and does the computations in parallel. The C program does it all in one thread and doesn't even utilize SSE intrinsics. I know full well that The Computer Language Benchmarks Game isn't a perfect source for programming language speed arguments, but what can you do.
So it's okay to claim Rust is slower than C by cherry picking SIMD benchmarks (rust can do simd btw, just not on stable), but not okay to claim c is slower than rust by cherry picking a parallelization benchmark? If you don't think that benchmark is fair, submit a parallel solution in C.
The benchmarks game is far from being even a useful source. It gives order of magnitude answers, and that's pretty much it. Using it (cherry-picked!) to back up a claim that Rust is 2x slower than C is disingenuous. "but what can you do" -- don't make absolute arguments about something using imprecise data.
Real-world Rust programs may actually end up being faster than C (see Yehuda Katz's talk on fast_blank in Rust). C often needs to be hand-optimized. Rust, with its zero-cost abstractions, often doesn't need to be; a naive program in Rust would probably be faster than the same in C.
Remember that fundamentally Rust compiles the same way C does, and your Rust code shouldn't have any more overhead (except drop flags -- a minor cost -- which are something you might hand-implement in C anyway). We also use LLVM, so we get mostly the same compiler optimizations.
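(For flavor, the heart of a fast_blank-style check in Rust is a one-liner over the string's characters. This is a sketch of the idea, not wycats' exact code:)

    // A naive "is this string blank?" in safe Rust: no allocation, and the
    // iterator chain compiles down to a tight loop over the characters.
    fn is_blank(s: &str) -> bool {
        s.chars().all(|c| c.is_whitespace())
    }

    fn main() {
        assert!(is_blank("  \t\n"));
        assert!(is_blank(""));
        assert!(!is_blank(" a "));
    }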
I wrote "what can you do" because you are supposed to use some modicum of common sense when reading the numbers on The Benchmark Game. E.g in one of the benchmarks PHP beats both C and Rust, so you need to apply common sense to understand that that result is an outlier.
I didn't cherry-pick; in 5/10 benchmarks, C is twice as fast as Rust.
> Rust code shouldn't have any more overhead.
But it appears that it does.
> We also use LLVM, so we get mostly the same compiler optimizations.
That is not a guarantee of efficient code. For example, in my testing, code compiled with g++ is over 50% faster than with clang++ in certain template-heavy scenarios.
>I didn't cherry-pick; in 5/10 benchmarks, C is twice as fast as Rust.
Making the claim that C is twice as fast as Rust because of 5/10 benchmarks in the "benchmark game" shows an incredible lack of common sense to me.
In 5/10 benchmarks, the benchmarks game claims Go has equal if not better performance than Rust. Am I supposed to believe, then, that a managed, garbage-collected language with a 6-year-old compiler is as fast as a language without a runtime running on LLVM?
Don't back up your claim with flawed benchmarks.
>> Don't back up your claim with flawed benchmarks.
:-)
" How fast is Rust? Fast! Rust is already competitive with idiomatic C and C++ in a number of benchmarks (like the Benchmarks Game and others)."
>> In 5/10 benchmarks, the benchmarks game claims Go has equal if not better performance than Rust. Am I supposed to believe…
Believe that those Rust programs gave those measurements, and those Go programs gave those measurements (when compiled and measured as described on the website in tedious detail).
It does matter how the programs are written!
Write better Rust implementations for those tasks and contribute them --
> Making the claim that C is twice as fast as Rust because of 5/10 benchmarks in the "benchmark game" shows an incredible lack of common sense to me.
Actually, 2x is likely the lower bound of how much faster well-written C is than Rust. Rust developers have an interest in promoting their language, so they will make sure their test programs run as fast as possible. C doesn't need that kind of marketing.
For example, the Rust solutions were all updated in 2015 while the C solutions haven't been touched since 2013.
> In 5/10 benchmarks, the benchmarks game claims Go has equal if not better performance than Rust. Am I supposed to believe, then, that a managed, garbage-collected language with a 6-year-old compiler is as fast as a language without a runtime running on LLVM?
It doesn't run on LLVM. It takes advantage of LLVM to compile ELF executables. I said that common sense should be used.
>> For example, the Rust solutions were all updated in 2015 while the C solutions haven't been touched since 2013.
Not true:
    Jun 02, 2016 revcomp.gcc-6.gcc
    Apr 13, 2016 fasta.gcc-7.gcc
    Sep 26, 2015 revcomp.gcc-5.gcc
    Oct 01, 2014 fannkuchredux.gcc-5.gcc
    Apr 27, 2014 fastaredux.gcc-5.gcc
    Apr 08, 2014 mandelbrot.gcc-9.gcc
    Jan 19, 2014 mandelbrot.gcc-7.gcc

(There may have been others that have subsequently been removed.)

Well, yeah, Rust hit stable in 2015. It's a new language; that's when those benchmarks were first written or fixed to work with the stable compiler.
Trust me, most of the C solutions there are very hand-optimized; the Rust ones less so, in some cases.
While Rust may need that marketing, C folks have had years of time to play the benchmarks game. And there are many more C programmers than Rust programmers. I think enough effort is going into those C programs.
Like I said, in practice Rust code sometimes is faster than (otherwise it is as fast as) C code because it is easy to write the fast version using zero cost abstractions.
(the benchmarks game is a different realm of optimization, in day to day usage you do not spend that much effort eking out every last cycle; you write code that doesn't have major perf issues and use it)
No, it does not appear that Rust has any overhead over C, except in cases that use SIMD. That is not a generally applicable result.
> some modicum of common sense when reading the numbers on The Benchmark Game
yes, this involves checking what the benchmarks are actually measuring. In this case, it is how much faster SIMD makes things. Factor that in, or rewrite the programs with SIMD in rust, and it should come out to be the same.
> But it appears that it does.
Have you not been listening? It doesn't. The speed differences you quote are due to simd. Rust has simd support, just not in a non-nightly compiler.
> Have you not been listening? It doesn't. The speed differences you quote are due to simd.
That should be easy to demonstrate!
Please quote the lines in the source-code of these fannkuch-redux and reverse-complement programs that show SIMD use --
http://benchmarksgame.alioth.debian.org/u64q/program.php?tes...
http://benchmarksgame.alioth.debian.org/u64q/program.php?tes...
Using the nightly builds of any programming language in production is insane. That's why we call it FutureRust to differentiate it from what is production ready Rust.
That Rust might have SIMD intrinsics in the future matters little to people trying to seriously use Rust today. And in several benchmarks C handily beats Rust even without intrinsics, such as the fannkuch-redux one, where it is about 2x faster.
Don't use the latest nightly then. Use the nightly from 4 months ago that corresponds to today's stable, ensuring that it doesn't have any extra soundness patches that need backporting. Usually the case.
I don't see "several", I just see one. Most of the non-simd benchmarks have almost exactly the same numbers for Rust and C. Perhaps the Rust test for fannkuch isn't optimized yet (looks like the C one has some extra logic about how to split up the chunks, whereas Rust blindly parallelizes)? I already optimized one Rust program, I'm not going to sit and optimize every benchmark out there -- we have tons of counterexamples of fast benchmarks already. It seems like you won't agree as long as one nonoptimized benchmark exists. In that case, good day to you. I'm done here.
Yeah, given that the CPU load for the C lines is listed as

    100% 95% 95% 95%

on a quad-core, I'm going to say "not one thread".

Edit: to be helpful, rather than just obnoxious (<3): the C version uses OpenMP pragmas, which is why it only looks single-threaded.
You're right - for microbenchmarks, the difference between Rust and C is going to be in how the solution is implemented, because the languages have such similar performance characteristics. This is why your original comment that Rust is not as performant as C is quite silly. You look rather hypocritical turning around and pointing it out when someone links to a microbenchmark on which Rust outperforms C.
Both examples are cherry-picked, is the point. Neither is particularly helpful in representing the whole story.
That said, Rust makes it ridiculously easy to use parallelism if a task can support it, which is a strong advantage in its favor.
> but what can you do.
Not make sweeping generalizations based on one benchmark you didn't write?
Well, if you are familiar enough with the languages in question, you can look for submissions that appear to use mostly idiomatic expressions of the language. Alternatively, if you are more interested in maximum possible performance, you can confirm both are making use of advanced techniques for quicker execution, but that's probably much harder to judge.
I'm not sure that's the best source you can use. C is listed 5 times on there, the slowest taking 15 seconds. Likewise, C #3 times similarly to Rust. So is it the case that C is actually faster, or did the guy who wrote C #5 optimize the code and neglect to give the same optimizations to Rust?
Rust can beat, to some extent already beats, and ultimately will beat C on performance, due to much better guarantees on pointer non-aliasing. See e.g. http://stackoverflow.com/questions/146159/is-fortran-faster-...
Other than that, Rust has much better zero-cost abstractions, so in practice it allows writing faster code; in C there are sanity limits after which you give up and write slower but easier-to-manage code, as the macro processor sucks and the type system is trying to stab you in the back at every step.
A good example is `qsort`: http://www.tutorialspoint.com/c_standard_library/c_function_...
What are we looking for in the qsort example? Is there a Rust equivalent we should compare to?
The C code is forced to use a function pointer and hence do dynamic virtual calls for each comparison, while Rust and C++ can use generics/templates to get static dispatch (and hence inlining, constant folding etc.). You can see C vs. C++ in http://www.martin-ueding.de/en/programming/qsort/index.html , and Rust is likely to be similar to C++.
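For comparison, the Rust side is just `sort_by` with a closure; a minimal sketch (whereas C's qsort receives an `int (*cmp)(const void *, const void *)` and calls through it for every comparison):

    fn main() {
        let mut v = vec![3, 1, 4, 1, 5, 9, 2, 6];
        // The closure has its own static type, so the sort is monomorphized
        // for it and the comparison is typically inlined: no per-comparison
        // indirect call through a function pointer, unlike C's qsort.
        v.sort_by(|a, b| a.cmp(b));
        assert_eq!(v, [1, 1, 2, 3, 4, 5, 6, 9]);
    }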
LTO will allow qsort to be inlined, etc.
Most of the slower benchmarks were fixed and just need to be merged iirc.
I don't have much faith in microbenchmarks. Usually all they measure is how much effort the author put into overoptimizing code.
There's a pretty big chasm between "only 2x as slow as C" and "not needing optimal performance". That's even assuming it is generally true and a constant 2x factor.
How many skilled assembly programmers do you know who are able to write better code than the collective intelligence embedded in current compilers? Even if you have a few of them handy, aren't their resources better spent hand-optimizing the compiler output for critical sections only?
Isn't Rust's biggest advantage safety? I make websites with Go, and I can't stand the nil-ness and all the mutation mess I easily get myself into; I could use a safer type system.
Sometimes you just need safety and correctness.
Since people were so angry that I used The Benchmark Game as a source for benchmarks, here (https://github.com/logicchains/LPATHBench/blob/master/writeu...) is another microbenchmark showing gcc & clang handily beating rustc. Though the timings are one and a half years old; their relative performance might have changed significantly.
That's pre-1.0. Rust has changed a lot.
Also, given how prone microbenchmarks are to depending on hand-optimization over the compiler quality, benchmarks should have been contributed to by the community -- I don't think anyone in rust has heard about this one.
Oh also, Rust is as fast as C/C++ there. It's just not faster than C++Cached, _which is a different algorithm_. That's the problem with microbenchmarks, you end up measuring differences in the algorithm used.
No. In the first table Rust has 1874 and C++/clang 1722. The latter number is lower. C++ with clang beats Rust.
In the second table all the C and C++ versions beat Rust: 618, 749, 755 and 735 vs 877. That is a very big difference.
You can also run the fucking benchmarks yourself and see for yourself. I have linked to lots of benchmarks showing C spanking Rust. No one has shown any fair benchmarks where Rust is as fast as C.
That's a ... very small difference. And again, probably due to implementation differences. I'm not claiming C doesn't beat Rust, I'm just saying by very little -- Rust is practically just as fast, within the margin of error that microbenchmarks have. You have been belting out claims that Rust is 2x slower -- clearly false. Rust may be 5% slower -- which ... doesn't really matter.
Look at wycats' talk on fast_blank. That's a real world example that's faster than C. Rust used to be faster than c on the regex benchmark at one point, as burntsushi pointed out.
> That's a ... very small difference. And again, probably due to implementation differences.
735 / 618 = 1.19, so Rust is at least 19% slower than C even without involving SIMD intrinsics. You wrote "your Rust code shouldn't have any more overhead", but in all the benchmarks it does!
> You have been belting out claims that Rust is 2x slower -- clearly false.
Clearly not, since it is on all the SIMD-using benchmarks.
> Look at wycats' talk on fast_blank. That's a real world example that's faster than C. Rust used to be faster than c on the regex benchmark at one point, as burntsushi pointed out.
Because it is comparing different regex engines, not language performance. I said you should apply "common sense" to The Benchmark Game's numbers.
Here's the thing: you guys can easily prove me wrong. Prove that Rust has zero-cost abstractions by taking any small C benchmark, transliterating it to Rust, and profiling it. If it is as fast, I'm proven wrong. If it is slower, you are proven wrong.
Rust can use simd too. It doesn't in those benchmarks. Please apply the common sense you keep harping about. Claiming rust is 2x slower because of that benchmark is a falsehood.
Re: regex: my point exactly. Most microbenchmarks are prone to slight differences in the implementation causing issues (and you can rarely translate code exactly, especially to something like Rust, which often requires a different structure of code from C; the same goes for any two other languages). 19% is well within this error box.
The fast_blank thing is this example. fast_blank is a carefully hand-optimized C extension whose main purpose is being super fast. A mostly naive Rust one-liner beat it (not by much IIRC, perhaps 10%, but that's within the error box I'm talking about). It didn't use parallelism or anything fancy. They weren't even trying to beat C. I provided this proof already.
I could try fixing that benchmark you linked to -- the rust version looks like it could be optimized further. Not sure if its worth it, really. I don't put much stock in microbenchmarks for anything other than order of magnitude comparisons.
> Rust can use simd too.
No it can't. Either accept that the nightly build of Rust is not the Rust we are talking about or stop discussing with me.
> Re: regex: my point exactly.
The problem with regex libraries is that they are too big, and therefore they don't reveal much about inherent language performance.
> Most microbenchmarks are prone to slight differences in the implementation causing issues (and you can rarely translate code exactly, especially to something like Rust, which often requires a different structure of code from C; the same goes for any two other languages).
Yes, obviously the implementation defines performance. That's what I wrote in the other thread part: "this discussion is about Rust vs C. Or rather clang 3.6.2/gcc 5.2.1 vs Rust 1.9.0 since language performance is very implementation dependent"
And fwiw, you can easily transliterate C code to C++ or to asm.
> 19% is well within this error box.
What error box? 19% is a huge difference.
> The fast_blank thing is this example. fast_blank is a carefully hand-optimized C extension whose main purpose is being super fast.
I don't know what fast_blank is. Is it this https://github.com/SamSaffron/fast_blank/blob/master/ext/fas... C code wycats managed to rewrite faster in Rust? That C code isn't well-optimized at all...
> I don't put much stock in microbenchmarks for anything other than order of magnitude comparisons.
Does that mean it is impossible to prove to you that C is at least 2x faster than Rust since twice is less than one order of magnitude?
What's nightly in Rust today will become stable soon enough (idk the timeline for SIMD). But OK. Stable only. In that case, Rust is 2x slower than C code that can be optimized with SIMD. Not much of an issue, really, and it proves nothing about Rust's overhead except that you can't rely on autovectorization. Not really a big deal.
> 19% is a huge difference.
IMO, in the realm of microbenchmarks, it really isn't. You clearly disagree; not much I can do about that.
> That C code isn't well-optimized at all...
Go ahead and fix it then. You've been telling me much the same. I already mentioned that the other benchmark you linked me to wasn't optimized.
> Does that mean it is impossible to prove to you that C is at least 2x faster than Rust since twice is less than one order of magnitude
I use the term loosely, 2x is certainly alarming. As long as you rely on simd benchmarks I will disagree though, since in most cases a lack of that optimization isn't the reason your program is slow. If you really really care about performance, use nightly rust; there's no cost to that. I have yet to see production C code that uses SIMD everywhere possible, just in some tight loops. That is not going to create a 2x difference in performance unless the tight loop dominates all else. That is not most use cases.
> Not much of an issue, really, and it proves nothing about Rust's overhead except that you can't rely on autovectorization.
Not much of an issue unless you actually need the performance, of course. IME, SIMD intrinsics are everywhere in code optimized to run as quickly as possible on x86. That about half of The Benchmark Game's benchmarks use SSE proves that point.
> Go ahead and fix it then. You've been telling me much the same. I already mentioned that the other benchmark you linked me to wasn't optimized.
That requires investing a lot of time in understanding how Ruby's internals, and especially its string objects, work. I don't have that time. LPATHBench, on the other hand, is self-contained, and updating it shouldn't be more than a few hours of work for a decent Rust programmer.
> Not much of an issue unless you actually need the performance, of course. IME, SIMD intrinsics are everywhere in code optimized to run as quickly as possible on x86. That about half of The Benchmark Game's benchmarks use SSE proves that point.
My point is that the Benchmark Game is not representative of real world code. The website says as much. Because the benchmarks use sse everywhere does not mean that most code, even perf-sensitive code will use simd everywhere.
Again, if you need simd, use a nightly. There's little to no drawback there.
I fixed it up to run on modern rust (https://gist.github.com/Manishearth/5fc73c405641162f0712951c..., compile with cargo build --release), and the numbers I get are:
(Ranges are just what I got from 5 runs, nothing scientific)
Rust: 610-630
c: 706-716
c_fast: 919?
cpp_clang: 669-694
cpp_plain: 717-728
I'm on a new (i7, 16GB) Mac so I don't yet have g++ around (nor do I know how to obtain it without messing things up; I'm used to Linux); everything here was done with clang.
Of course, this isn't an indication that Rust is faster than C. But it is an indication that it can be just as fast, and a reinforcement of my point about microbenchmarks having large error bars.
Edit:
On my older x86 linux laptop (with gcc):
Rust: 844-987
c_fast: 808-860 (perhaps clang somehow made c_fast slower than c on the mac? shrug)
c: 982-1025
cpp_plain: 977-1019
cpp_gcc: 925-947
I think I've proven my point.
> My point is that the Benchmark Game is not representative of real world code. The website says as much. Because the benchmarks use sse everywhere does not mean that most code, even perf-sensitive code will use simd everywhere.
Your point is incorrect. SIMD is everywhere in performance-sensitive code, like in memcpy, memset, strlen, strcmp, image & video decoding...
> I fixed it up to run on modern rust (https://gist.github.com/Manishearth/5fc73c405641162f0712951c..., compile with cargo build --release), and the numbers I get are:
Note that the C benchmarks are all compiled with `-g -O2`. I'm not the author of that benchmark suite and it appears whoever is has abandoned the project.
If I fix the compiler switches (-O3 obviously) and recompile, the numbers I get are:
    Rust: 705
    C_fast: 630

I'm using Rust Nightly because I can't be bothered to install more than one Rust compiler. That the numbers you are getting aren't stable suggests that you are using shoddy benchmarking techniques. Try and run them with as few applications open as possible.
Here are my updates to the c_fast benchmark:
https://gist.github.com/bjourne/4599a387d24c80906475b26b8ac9...
With this, c_fast's number is 532. That is a fair bit faster than Rust, and I'm sure someone who has more time than me and is more skilled at optimizing C code can improve it further.
I'm compiling with: `clang -O3 -march=native -mtune=native -fomit-frame-pointer c_fast.c -o c_fast` and my cpu is an "AMD Phenom(tm) II X6 1090T Processor"
That comparison is misleading for exactly the reasons others have said: the algorithms differ, as can be easily seen in their very different data structures.
A naive, line-by-line port of your fast variant to safe Rust (which I unfortunately am not allowed to share, but didn't require much thinking nor much time), without bothering with prefetching, gives me numbers more like:
    Rust-fast: 533
    C-fast: 685

I'm using --release for Rust (so no CPU-specific optimisation), and the same invocation as you for C. Everything except my editor is closed when benchmarking, and I'm on an Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz.

You really can't cite benchmark results when you don't show the source.
I'm really really sorry (I want to keep my job), but seriously, the code I benchmarked was a trivial reimplementation of your code. The get_max_cost_small2 function that is benchmarked is so small and simple that someone else doing it is likely to end up with something identical!
I'm not trying to act in bad faith: as a member of the Rust core team, that would be braindead and stupid on my part.
Feel free to use my email address (easily findable) and mail me the source. Otherwise, no deal.
I literally cannot share the source, I wish I could but the reality is my job does not let me. You're being unreasonable given how ridiculously simple the benchmarked section of the code is: it would not take long for even a Rust beginner to reimplement something equivalent, especially since it doesn't touch on any of the "hard" parts of Rust (no need for explicit lifetimes etc.).
As I said before, I have nothing to gain and everything to lose by lying to you.
Oh, and transliterating between C and C++ is an exception to the norm. C and C++ are historically linked and quite similar in many ways. Rust does not have this relationship with C. You could easily transliterate C code to unsafe Rust code, but that sort of misses the point, doesn't it? :)
> Because it is comparing different regex engines, not language performance. I said you should apply "common sense" to The Benchmark Game's numbers.
Then don't also use regex-dna as evidence that Rust is "slow":
> Rust is also slower in binarytrees, regexdna and fasta.
You can't have it both ways.
> Rust is still twice as slow as C which is still a fair bit slower than if a skilled assembly programmer had taken on the task.
Really? Are there really people who write (a lot of) assembly in order to get code "a fair bit" faster than C? What on earth are they working on?
VM implementations and garbage collectors.
Rust reduces the amount of state I need to keep track of in my brain.
I doubt it; the mental overhead of doing "safe memory programming" in Rust is very high.
Edit: all good replies, want to clarify and forgot to mention that I was comparing to languages with a GC, since I'm seeing Rust being used for lots of stuff, in a general purpose programming language sense (like creating web frameworks for example). Also, for non-very-low-level stuff I guess this cognitive load will be less if/when they introduce an optional GC.
Designing memory-safe programs in C requires a programmer to reason about the same domains as doing so in Rust, but C doesn't double-check you to make sure you get everything right. With no guard rails, C is a lot more stressful.
Re: reducing mental state for a programmer, algebraic datatypes in general decrease the size of the state space of your program by making many illegal states unrepresentable. Without advanced forms of dependent types (maybe quotients), you can't make all illegal states unrepresentable, but you shrink the size of the state space hugely compared to writing everything as product types (as you would in C). A programmer has to reason about all the possible values their variables can take on, so it pays to minimize the cardinality of that set.
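A tiny illustration of the point, with a hypothetical Connection type:

    // As a sum type, a connection is either disconnected or connected with
    // a peer; there is no C-style struct with a `connected` flag plus an
    // address field that may or may not be meaningful.
    enum Connection {
        Disconnected,
        Connected { peer: String },
    }

    fn describe(c: &Connection) -> String {
        // The compiler forces every case to be handled.
        match *c {
            Connection::Disconnected => "not connected".to_string(),
            Connection::Connected { ref peer } => format!("connected to {}", peer),
        }
    }

    fn main() {
        let conns = [
            Connection::Disconnected,
            Connection::Connected { peer: "10.0.0.1:80".into() },
        ];
        for c in &conns {
            println!("{}", describe(c));
        }
    }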
So what actually happens is that you develop habits to enforce invariants that lead to correct operation. This isn't nothing, but it's also good practice in other languages and the more you do it, the better you get at it and the less stressful it is.
Why are we throwing away all the work done on static & dynamic analysis tools for C programs in these kinds of discussions? Are programmers crippled just for picking C? Come on.
The benefits of advanced static and dynamic analysis tools for C shine through on questions of semantic correctness (look at Coverity, Frama-C and PVS Studio), not memory safety (though they do reason about memory safety). You can achieve perfect memory safety (no false negatives and arbitrarily few false positives if you write appropriate abstractions around unsafe) with comparatively simple static analysis built into your compiler... but only if your language is designed to permit it.
In C, perfect static analysis for memory safety is impractical, and dynamic analysis is time-consuming and cannot preclude false negatives. We should work on porting tools which heuristically warn about semantic correctness concerns from operating on C to checking Rust programs, and this is probably necessary in order for some C or C++ programmers/projects to switch, but it doesn't pertain to the question of how much mental overhead there is to writing memory-safe code in either language.
Because ultimately they're imperfect.
Sure, C+static analysis is good enough for many situations. But it can't compare with the guaranteed safety offered by Rust.
> Because ultimately they're imperfect.
Everything is imperfect, it's not a good reason to discount anything.
I don't remember the source.. but somewhere someone said that the borrow checker would get in the way until you've learned to a certain point, then after that point developers tend to think in terms of the borrow checker by default and it _works for them instead of against them_.
Besides, you're still going to be doing safe memory programming regardless of whatever language you use. (unless you're just saying "writing broken code is easier")
This is consistent with a lot of people's experiences with Rust. Some people haven't even noticed when the mental bit flipped for them, "Oh wait, I just realized I haven't fought the borrow checker in a while..."
Of course, some people still don't like it. Not every language can be to everyone's liking, a plurality of languages is a good thing. Plus, we do have some stuff in the pipeline to increase the number of programs the borrow checker will understand; some people can get frustrated when they want to write a valid program that gets rejected, but this is going to be the case with any kind of static analysis.
The mental overhead of doing "safe memory programming" is high already. The difference between Rust and, say, C, is that Rust forces you to do "safe memory programming". C lets you get away with unsafe memory programming.
C lets you deliberately do unsafe memory programming, which is maybe OK if you actually know what you're doing. But C also lets you think that you're doing safe memory programming, when that is not the case.
Not my experience, fwiw. Rust does a great job of handling the mental overhead (am I allowed to mutate this? who owns this? what is the contract for that?) for me.
Exactly. Compiler-enforced ownership and lifetimes _dramatically_ reduce the mental overhead of memory management compared to C. They also save me from having to run things under Valgrind and ASAN just to make sure I didn't mess up. The extra time spent getting Rust code to compile is considerably shorter than the time required to debug something Valgrind found.
I feel the same way about C++ smart pointers, which are remarkably simple to use and understand.
While definitely an improvement over raw pointers, ultimately C++'s smart pointers still fall short in several ways: they can be null, there's no lifetime checking for references (so you can still use after free), there's no object freezing (so you can still have data races), etc.
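A minimal sketch of the first two points: a Box is never null, "nullable" is opt-in via Option, and the borrow checker rejects use-after-free at compile time (the commented-out line below would not compile):

    fn main() {
        let b = Box::new(42); // a Box<i32> can never be null
        let r = &*b;          // borrow its contents
        // drop(b);           // error: cannot move out of `b` while it is
        //                    // borrowed, so `r` can never dangle
        println!("{}", r);

        // Nullability is explicit, and the None case must be handled:
        let maybe: Option<Box<i32>> = None;
        match maybe {
            Some(v) => println!("{}", v),
            None => println!("nothing here"),
        }
    }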
> Exactly. Compiler-enforced ownership and lifetimes _dramatically_ reduce the mental overhead of memory management compared to C. They also save me from having to run things under Valgrind and ASAN just to make sure I didn't mess up.
This is basically why I ditched Rust as soon as I found out about Nim: no mental gymnastics making me feel like I should be part of a group mind, super fast, super small binaries, optional GC, memory-unsafe stuff allowed with standard pointers, and alloc/free available whether or not the GC is on elsewhere. I suspect Rust would be better for avionics and other high-value systems though, replacing Ada...
I like Nim for the same reason (and use it for small programs), though I'm using D at $work because D is more "mainstream" and established at this time.
Can you give some examples?