Deconstructing K&R C Is Dead (2015)
c.learncodethehardway.org

Low level languages are tough. Gaining a mastery of C does require knowing quite a few strange rules and quirks, and it's certainly a bit harder than learning Python. The C FAQ does a good job of illustrating some of the more confusing parts. Sure, I wish I could write Go instead, but that isn't going to happen on the many embedded systems I work on.
This is a rather strange and insulting article. I'm not sure why Zed can't help "old programmers" nor do I understand why he's angered that individuals know about undefined behavior in C. Is there any background to this or did he have the misfortune of being insulted on IRC?
Edit -- I googled for a bit and discovered this was in response to someone doing a pretty good job technically reviewing the book for free! http://hentenaar.com/dont-learn-c-the-wrong-way Perhaps the title was a bit inflammatory.
Zed's rebuttal is at https://zedshaw.com/2015/09/28/taking-down-tim-hentenaar/ and is a great example of how not to react to constructive criticism. My favorite part is his safercopy function and the lack of size_t.
And finally, to leave us all with a quote from Zed's rebuttal:
"Over this next week I’m going to systematically take down more of my detractors as I’ve collected a large amount of information on them, their actual skill levels, and how they treat beginners. Stay tuned for more."
Wow.
Read the review over at hentenaar.com, though I've never read Zed's work.
All I can say is the order of topics, the choice of topics and the quoted explanations would make for a very confused beginner. Especially the crusade he seems to have against strings and functions called incorrectly. That makes me think he should be teaching the language as it is, not the language he wishes it were. Of course these are selective quotations so I can't draw too many conclusions.
Going by my time teaching C, I wouldn't even mention Duff's device or safer, better strings at this level. There are better ways to introduce defensive programming, along with a discussion of the pros and cons.
Oh, I'm past 50 so am clearly "doomed" and beyond help. Not that I'm sure what I need help with. Oh well. :)
> nor do I understand why he's angered that individuals know about undefined behavior in C
If you read the first part of that same sentence, it should give you a clue.
> Low level languages are tough
I disagree. Low level languages, especially C, are the easiest to master. K&R book is the only book you need to read to know everything about C. All you need after you understand the fundamentals is a bit of discipline.
C++ on the other hand is extremely difficult to master. Just have a look at the rules for Rvalue references and you will see what I mean.
It may be easier for a complete novice to write some code that doesn't crash in C++ than it is in C, but that's not the same as mastering it, or even being good at it.
> K&R book is the only book you need to read to know everything about C. All you need after you understand the fundamentals is a bit of discipline.
I am a huge C fan but this is not true at all. C has tons of pitfalls, especially with modern UB-aggressive optimizing compilers. There are a lot of rules you need to be aware of that are not naturally-occurring results of the fundamentals.
> especially with modern UB-aggressive optimizing compilers.
You put your finger on the problem: "modern UB-aggressive optimising compilers". C, the language, is actually quite simple (if not easy). The crazy stuff that compiler writers have been doing recently while aggressively mis-reading the C standard is the problem and does make things very complicated.
Why "misreading"?
From 1.1:
"The X3J11 charter clearly mandates the Committee to codify common existing practice."
Their emphasis, not mine. So is there a mandate to use the definitions of the standard to invalidate common existing practice? Clearly not. Yet that is what is happening.
More from the standard (defining UB):
"Undefined behavior gives the implementor license not to catch certain program errors that are difficult to diagnose. It also identifies areas of possible conforming language extension: the implementor may augment the language by providing a definition of the officially undefined behaviour."
Does it say "Undefined behaviour gives implementors license to add new optimisations that break existing programs"? Clearly and unambiguously not.
Your interpretation of "codify common existing practice" would imply that no new compiler optimizations could be implemented since 1990 (when the first version of the standard was published), as any optimization could potentially change the observable execution behavior of an erroneous program that contains UB.
> More from the standard (defining UB):
Your quote is not from the normative text of the standard, but from the non-normative rationale. Note however that it explicitly says that programs that contain undefined behaviors are erroneous, and that the implementation is not required to emit diagnostics for the UB. Pretty clearly this allows implementations to optimize erroneous programs into whatever they think is funny this week.
The normative text of the standard is pretty unambiguous:
http://www.iso-9899.info/n1570.html#3.4.3

"undefined behavior: behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements"

> Your interpretation of "codify common existing practice" would imply that no new compiler optimizations could be implemented since 1990
Utter nonsense. I use that word carefully, but in this case it is absolutely appropriate.
Compiler optimisations, per an old but very useful definition, aren't allowed to change the visible behaviour of programs (in terms of output; obviously they are allowed to change execution times).
For example, even just a couple of years ago the compilers I used would actually execute a loop that sums the first n integers. Nowadays compilers detect this pattern and replace the loop with the closed-form result. While this isn't particularly useful, because probably the only reason you're summing the first n integers in a loop is to do some measurements, it is (a) a perfectly legal optimisation and (b) happened after 1990.
Unsurprisingly, you left out the second part of the (later) definition:
"NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message)."

Notably absent is "use the undefined behaviour to shave another 0.2% off my favourite benchmark".

> Unsurprisingly, you left out the second part of the (later) definition:
It is not part of the normative definition, which says "for which this International Standard imposes no requirements". In ISO standards, notes are without exception non-normative.
Although I think they really should add your proposed text as an additional example, as their current set of examples is evidently confusingly incomplete :-)
>Note however that it explicitly says that programs that contain undefined behaviors are erroneous
No it doesn't say that. It says that they are either "nonportable" or "erroneous". I'll take "nonportable" for 400, please.
As the "rationale" document points out, implementations are free to do something well-defined in the cases that the standard considers UB. For example, an implementation may document that it detects out-of-bounds array reads and these always return the value "0", and a hypothetical "C" program could rely on that. But implementations explicitly aren't required to do that, hence code that relies on a particular interpretation of UB in a particular implementation is nonportable, since it is a program written in an extended dialect of C, not ISO standard C.
Options like GCC's -fwrapv/-ftrapv and -fno-strict-aliasing are examples of language extensions that are essentially implementation defined UB.
Edit: Of course you could argue that things where hardware difference are a likely motivation such as signed integer overflow ought not to be UB in the first place, but instead left as implementation defined in the standard, but in that case your issue is with the C standard committee, not with implementers.
Out of curiosity, do you have an example?
Maybe I live in a C reality distortion field. :)
There are so many to choose from. Here is one I just thought up:
Can you spot the undefined behavior?

    void free_circularly_linked_list(struct node *head) {
        struct node *tmp = head;
        do {
            struct node *next = tmp->next;
            free(tmp);
            tmp = next;
        } while (tmp != head);
    }

This is a great example because if it wasn't presented as "spot the UB", I'd expect very few people would raise a concern.
I've written up a demo with your code, running it through several analysers:
https://gist.github.com/technion/1b12c9b4581e915241d9483c5c2...
The tl;dr here is that tis-interpreter is a fantastic new tool, as it correctly complains about this.
Edit: I also note a departure from yesteryear, when every linting tool would only manage to complain about unchecked malloc() returns.
The `tmp != head` comparison is UB because `head` is a dangling pointer after the first loop iteration, right?
Yep! To do this properly requires something more like:
    void free_circularly_linked_list(struct node *head) {
        struct node *tmp = head->next;
        while (1) {
            if (tmp == head) {
                /* Has to be a separate case since even assigning
                 * a dangling pointer is UB I believe? */
                free(tmp);
                break;
            } else {
                struct node *next = tmp->next;
                free(tmp);
                tmp = next;
            }
        }
    }

I'm not sure what you are trying to achieve by using the infinite loop. There's a more direct way.
Great example, by the way!

    void free_circularly_linked_list(struct node *head) {
        struct node *a = head->next;
        while (a != head) {
            struct node *b = a->next;
            free(a);
            a = b;
        }
        free(head);
    }
Why is it UB?
Let's say head value is "10" and the memory at "10" is {..., next: "10"}
After the first iteration we will have:
Head: "10" Next: "10" Temp: "10"
With "10" pointing to freed memory. But why do we care? We are not dereferencing it, are we?
(I think I am missing something very obvious)
I think you’re missing the fact that "There are a lot of rules you need to be aware of that are not naturally-occurring results of the fundamentals."
I was indeed. Thanks for the insight!
Because the standard says even comparing a dangling pointer is UB, which was haberman's point about the standard being non-intuitive.
Thanks. I wasn't aware of that. I stand corrected!!
For someone who wants to learn C from the ground up, do you have some kind of learning path or books you'd recommend?
I was a big fan of this post from a few days ago. Has a great list of resources and different areas to cover: http://blog.regehr.org/archives/1393
Thanks, that's what I was looking for.
These days I'd say the ISO standard is the only "book" you need to know everything about C.
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf is the most recent draft before the official, purchase-only C11 was published according to http://www.open-std.org/jtc1/sc22/wg14/www/standards. I don't know if it's identical, but it should be close, and it's free.
K&R book is the only book you need to read to know everything about C.
Absolutely not.
I wouldn't say that C is easy to master, but it's not very difficult either.
The problem with C is that even a C master can't necessarily write correct code, because C is a very programmer-unfriendly language, making developers remember to do various actions manually and perform error-prone calculations.
C++ is definitely harder to master (after many years, I can't say I master every corner of the language), but it's much easier to write correct code in C++ and it will be just as fast, run on as many platforms, etc, etc.
C lost this battle a long time ago, it's surviving because of nostalgia, still having good street cred and inertia. The number of domains where one must use C is shrinking and now that we also have Go and Rust this will accelerate. All for the better, really.
> C lost this battle a long time ago... The number of domains where one must use C is shrinking
I doubt that. Kernels, drivers, embedded devices (not IoT), the GNU world, are all highly C oriented. Want to develop for a customer with an unknown unix variant? Want to develop a tool everyone is going to use on Linux/BSD/Solaris? C is the only option.
> but it's much easier to write correct code in C++ and it will be just as fast
Writing correct and fast C++ code at the same time was never an option; even today, with "safe" pointers, people are still confused how to correctly use shared_ptr<>.
> now that we also have Go and Rust this will accelerate
Some places where C is still a strong contender:
* good tooling - debuggers, memory leak detectors, years of experience with compilers on various platforms
* well understood language - C has dark corners and they are documented well
* interfacing with everything else - from devices to libraries and languages
Rust can piggyback on almost any C tooling (emits DWARF debug info), and has very strong C interoperability, if C can talk to it, rust probably can too.
This can be summed up as: Languages like Python are easy to learn but hard to master whereas C is hard to learn but easy to master.
This is not true because K&R does not address multithreaded programming at all.
I can't really speak for Zed's expertise and/or value to the programming community. From what I gather, a few of his projects are widely used (Mongrel comes to mind), and he seems to know his stuff pretty well. I also identify strongly with his Programming Motherfucker[0] rant.
But man, the guy is insecure to the point of requiring therapy or something. He seems obsessed with his image and status, and the slightest criticism will cause him to lash out in an immature and ridiculous manner. Past rants have him making lewd comments about penis-sizes and challenging others to a physical fight[1].
It's a shame, because if he just relaxed a bit and took criticism gracefully, he'd probably find himself to be a bit more valuable to the community and employers, and would actually be a pretty decent dude. Instead, his writing seems to reek of a constant need to validate and defend himself.
This is probably an unfair comparison, but I can't help but think of Terry Davis: a brilliant programmer hindered by mental issues. Schizophrenia is obviously not the same as insecurity, but I think the situation here is somewhat similar.
[0] http://programming-motherfucker.com/
[1] http://harmful.cat-v.org/software/ruby/rails/is-a-ghetto
Being passionate about things and having opinions can mean that you eventually burn out on something and for the sake of your own mental health have to move on.
I don't think Zed's doing anything wrong. He's saying what he thinks needs to be said, he's challenging the complacent, and he's not pulling any punches. If you don't like his attitude there's plenty of other people to listen to. I appreciate that he's out there making noise, getting people to re-think their assumptions about programming.
If you live life by particular principles sometimes you have to take the hard road. You can't argue it hasn't been an interesting path.
Maybe being "passionate" is not a healthy state of mind. The community blabbers so much about passion, which is nothing more than an almost uncontrollable emotional state.
Screw passion, I'd rather have discipline and a healthy interest instead.
Casually blaming character differences on assumed mental issues is pretty crappy.
Wow, [1] is pretty far out there, what with taking on the worldwide conspiracy to deny him recognition as History's Greatest Genius. Has he at least mellowed out since 2007?
> Has he at least mellowed out since 2007?

Yes. Not by much, though.
I know I really shouldn't be picking on the mentally ill, but the thought processes of extreme narcissists are honestly more alien to me than those of schizophrenics, who at least react in an understandable way to being told by God that what we call "reality" is actually an illusion.
Whereas the guy we're discussing seems to actually think that Paul Graham, billionaire, wakes up every morning thinking "how, today, can I enable the vicious internet slander campaign against Zed Shaw? (I am so intimidated by his genius)."
> What would happen if I decided to pay you back for HN Paul? What would happen if I started honestly reviewing your startups’ products? If I just picked the worst ones, and then started tearing them in half? What would happen if I went on every HN hiring post and started posting dirt about the various shit HR practices your companies have? What if I took all this writing and got my friends you’ve fucked over to help me broadcast it? What if I started posting this writing as replies to many of your comments? What if I started offering to advise new coders, the millions I teach a year (yes, millions Paul) to avoid all of your company’s startups? What would happen if I just started putting anti-YC ads on my properties? What if I started telling everyone how you take 7% and don’t give startups any real guidance? What would happen if I started talking about the crazy bullshit I know has happened at YC startups I’ve worked for and others have told me about?

Er… nothing?
Despite having a veneer of a good comment, well written, sourced with links, and starting with some (faint) praise - it's actually just an ad hominem, and not appropriate here.
It's not really ad hominem. More like constructive criticism of his rhetorical style.
> It's not really ad hominem. More like constructive criticism of his rhetorical style.
No, it's a blatant ad hominem.
The guy invested his time and effort trying to improve the world by writing a technical book, which he then proceeded to give away for free, and to this we see people like barbs replying with personal attacks accusing the author of being mentally disturbed to the point of requiring therapy.
This is a personal attack at its worst.
Perhaps the issue here is the C programming language and how teaching it can be improved, not what insults and personal attacks a random user online is able to throw at the author of a technical book.
People have more to learn from writing on undefined behavior than from puerile complaints regarding comments on penis sizes and ironic accusations of immaturity.
The term "ad hominem" is typically used to describe the fallacy of attacking the person making an argument, rather than the argument itself. Sure, I'm discussing his character, but I'm not trying to win any argument here - he may be correct in what he's saying.
> The guy invested his time and effort trying to improve the world by writing a technical book, which he then proceeded to give it away for free
And I think this is certainly laudable, especially since they seem to have helped so many people. But he also called the Rails community "pricks, morons, assholes, and arrogant fucks who didn’t care about the art or the craft." and I think he should be held accountable for that, amongst other things.
I wrote a comment about the author's behaviour in public forums and in blogs, something I think he should be held accountable for, and something which I believe hurts both him and the communities he participates in. I believe this is relevant, and I'm entitled to discuss this here.
> The term "ad hominem" is typically used to describe the fallacy of attacking the person making an argument,
This is a discussion on a book on the C programming language written by someone, and here you are going full throttle on your personal vendetta against the author while saying absolutely nothing regarding the book or the programming language.
> Sure, I'm discussing his character
Precisely.
Go vent your frustrations somewhere else.
Please stop dictating what this discussion is about, and what people can and cannot discuss here. This is perfectly relevant.
Thanks for putting your finger on that.
In commenting on how Zed thinks too much about his self-image, you criticise how he comes across image-wise...
It's pretty difficult to sail upwind by heading directly into the wind. So, don't point into the wind if you want to get further upwind.
I don't have an opinion on the actual topic. But whether someone's goal is others' perception of them, or they are merely optimizing that perception poorly as a proxy for their sense of self-worth, attacking every criticism head on could undermine how others perceive them, and it could waste time and mental energy better spent on things they care about more than how others measure their achievements.
> I don't have an opinion on the actual topic, but whether someone's goal is others' perception of them
Perhaps if you don't have an opinion on the actual topic, you should refrain from commenting on it when the best thing you have to add is a series of ill-advised ad hominems.
How mature do you think it is that you admit not to know or understand something, and dismiss it as crazy?
This is obviously a bitter rant, and devolves into uncomfortably ageist territory about halfway through.
I do agree that we should be moving away from C and C++, though. It's pretty simple, really: C was a pretty good language in 1978. We didn't know a lot of things in 1978 that we do now in 2016. It now makes sense to revisit those decisions in light of nearly 40 years of practice. The so-called "PL Renaissance" has given us a whole host of new languages which have steadily chipped away at the dominance of C and C++, and I think this is a healthy trend that ought to continue.
I'm ready for the hate, so here we go... C was not a well-designed language in 1978.
The fact that C arrays decay to pointers without any bounds is single-handedly responsible for a huge chunk, possibly even the majority, of all RCEs, worms, malware, and exploits. Ever. In the history of computing.
It was a bad design.
It was a bad design in 1978.
It was known to be a bad design in 1978.
Other languages knew that checking array bounds was important, including for security. The internet made the impact of using C much more devastating but people were exploiting buffer overflows in the 80s to great effect. Some of C's predecessors/contemporaries passed a length as the first part of an array so bounds-checking was possible, though that has the downside of not being able to pass slices of an array without copying.
C could have included an arrayref type that was a length + base pointer, and let array l-values decay to an arrayref instead of a pointer. Then taking a slice of an array would not require copying elements. You could still take the address of an individual element. This would not have required much work to implement, even in 1978! Maybe the first compilers didn't insert array bounds checks, but at least the entire design wouldn't preclude them. Let's say you even spell arrayref as []. It would mean sizeof() works on arrays passed to functions.
    void wat(int[] values) {
        for (int i = 0; i < sizeof(values); i++) {
            printf("look ma, no buffer overflows! %d", values[i]);
        }
    }
(Yes, I know this is not K&R syntax)
Maybe you can forgive C for the stupid header compilation model (why let the compiler do what you can make the programmer do by hand?). You can understand why they might not have foreseen the need for namespaces. K&R didn't invent the macro system so that's not even their fault.
What is unforgivable is the horribly stupid design of C's arrays.
I actually think it would be beneficial if the standards committee added arrayref now. It won't fix all the busted C code but at least you could start improving the #1 problem. Compilers could eventually adopt a flag to prohibit arrays from decaying directly to pointers. You'd probably have to introduce lengthof() to avoid confusion and use some other syntax to declare one, maybe array(int) or something.
I suspect this has a -lot- to do with performance.
When C was designed, and even today, there are systems without pipelining, where it is expensive (in time) to de-reference a memory address and follow that pointer.
I do not argue that the design you suggest would be safer, and even have advantages for slicing; but that's really not the kind of programming C was intended to serve.
Also, C is supposed to scale down to //really// simple systems. Systems that lack indirect addressing modes, caches, MMUs, etc. It is literally intended to be a thin veneer over actual assembly for those systems, and why so many operations are specified in terms of /minimum standard unit size/ (for portability of that almost machine code between systems).
What you advocate is more like what C++ actually /should/ have been; a reason to use something more than C to gain advances in safety and ease of design.
>I suspect this has a -lot- to do with performance.
It's questionable whether people wanted that performance though, at least when it resulted in less security. About bounds checking in ALGOL 60: https://en.wikipedia.org/wiki/Bounds_checking
A consequence of this principle is that every occurrence of every subscript of every subscripted variable was on every occasion checked at run time against both the upper and the lower declared bounds of the array. Many years later we asked our customers whether they wished us to provide an option to switch off these checks in the interest of efficiency on production runs. Unanimously, they urged us not to—they already knew how frequently subscript errors occur on production runs where failure to detect them could be disastrous.
> It's questionable whether people wanted that performance though, at least when it resulted in less security.
There's no question about it, the "ANSI C Rationale" makes it very clear what they considered "the spirit of C"[1]:
> - Trust the programmer.
> - Don't prevent the programmer from doing what needs to be done.
> - Keep the language small and simple.
> - Provide only one way to do an operation.
> - Make it fast, even if it is not guaranteed to be portable.
> The last proverb needs a little explanation. The potential for efficient code generation is one of the most important strengths of C. To help ensure that no code explosion occurs for what appears to be a very simple operation, many operations are defined to be how the target machine's hardware does it rather than by a general abstract rule. An example of this willingness to live with what the machine does can be seen in the rules that govern the widening of char objects for use in expressions: whether the values of char objects widen to signed or unsigned quantities typically depends on which byte operation is more efficient on the target machine.
> One of the goals of the Committee was to avoid interfering with the ability of translators to generate compact, efficient code. In several cases the Committee has introduced features to improve the possible efficiency of the generated code; for instance, floating point operations may be performed in single-precision if both operands are float rather than double.
[1] http://www.lysator.liu.se/c/rat/title.html Quoted section is found here: http://www.lysator.liu.se/c/rat/a.html#1
"The block structure of ALGOL 60 induced a stack allocation discipline. It had limited dynamic arrays, but no general heap allocation. The substantially redesigned ALGOL 68 had both heap and stack allocation. It also had something like the modern pointer type, and required garbage collection for the heap. The new language was complex and difficult to implement, and it was never as successful as its predecessor."
-- http://www.memorymanagement.org/mmref/lang.html
Adding runtime bounds checking of automatic storage arrays (i.e. arrays on the stack) is relatively easy in C, at least until the compiler runs into illegal type punning. The real problem in implementing these compiler safeguards comes with crossing translation units, or with heap blocks. There's a reason languages like Rust and Go rely heavily on static linking and stack allocation; it's more difficult or more costly to implement those safeguards when the compiler can't see all the source code, or pointers pass through an opaque layer. Nothing in C precludes automatic bounds checking of all array access, via fat pointers or lookup tables. Fabrice Bellard's Tiny C compiler implemented precise bounds checking for both automatic and dynamic storage-allocated objects a decade before UBSan and ASan. Even deriving an invalid pointer crashed the app at the precise point where it happened. That widely-used C compilers don't do that is a strong hint there are other, real-world constraints in place.
Also, in a language like Java it's not uncommon to see people reinventing dynamic heap allocation using char arrays, susceptible to all the same overflow problems. When you see people doing that, that should be a hint that a language like C might work well.
I don't understand all the C hate. Then again, I have no problem employing various languages according to the task, or creating DSLs. I suppose if I was wedded to a single language or to the idea of a single language, C would look much worse to me.
> There's a reason languages like Rust and Go rely heavily on static linking and stack allocation
This is untrue: Rust certainly does not do any optimisations based on static linking by default, nor is there a difference between putting an array on the stack or on the heap. While it is true that code can benefit from whole-program optimisation, it isn't the default in either language, just like it isn't the default in C.
Languages which bake in automatic bounds checking at every access rely on optimization to recover the performance hit. Without static linking, automatic GC, and other constructs that's very difficult.
LTO notwithstanding, once you add those more sophisticated constructs, iterating the language becomes more difficult. You don't hit upon the best method for implementing various types the first time, or the second time, or even the third time. glibc is backwards compatible for programs compiled over 15 years ago (GCC's fixinclude hacks notwithstanding). You'll never see that with Rust's or Go's standard library, just like you never saw that with C++.
My point wasn't that static linking was necessary. My point was that static linking is indicative of other tradeoffs that most people don't understand. Static linking isn't just about making packaging easier. It's also about making it easier to write and implement the compiler and standard environment.
My more abstract point is that people who think C is on its last legs don't understand the whole picture. There's nothing intrinsic to C that makes it unsafe. Fabrice's compiler was perfectly capable of implementing the C standard to the letter. What makes C unsafe are the requirements found in the niches where C exists, and those requirements don't magically disappear because the name of the language changes.
Rust supports unsafe code, but implementing code in Rust which is rigorously robust in the face of OOM situations, or where you need to implement use-case-specific memory management strategies, requires relying almost exclusively on unsafe code. (Try using Rust without boxing, for example, as is necessary if you want to catch OOM.) If you don't need those things, you probably don't need a low-level language, either. I love C, but I also love languages like Lua with lexical closures and stackless coroutines. To me, languages like Rust and even C++ exist at a middle ground that is very unappealing to me.
C isn't standing still, either. Strategies like SafeStack (see http://dslab.epfl.ch/proj/cpi/) can provide substantially the same safety guarantees as Rust in terms of real-world attack vectors, without having to modify any existing C software, and without giving up performance.
None of this is to say languages like Rust are useless. Just that the harms and inevitable demise of C per se are, IMHO, greatly exaggerated. And if and when a language like Rust grows in usage, I doubt it will supplant C so much as open and populate virgin territory.
> C isn't standing still, either. Strategies like SafeStack (see http://dslab.epfl.ch/proj/cpi/) can provide substantially the same safety guarantees as Rust in terms of real-world attack vectors, without having to modify any existing C software, and without giving up performance.
That paper indicates that you do in fact give up performance, and the performance is comparable to existing SFI techniques. SafeStack itself is insufficient to prevent UAF problems with the heap. CPI prevents them, but with significant overhead. And you still don't get full memory safety.
> Try using Rust without boxing, for example, as is necessary if you want to catch OOM.
It's not necessary, you can plug in a custom allocator that works differently and use boxing as usual.
There are plans for more robust custom allocator APIs that make this even easier to handle.
Also, really, even if Rust didn't have this, the situation wouldn't be worse than C. In C you have to malloc and free things manually. In Rust you can do that too. Rust's abort-on-OOM is a stdlib thing (which can be overridden, as previously mentioned).
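For what it's worth, stable Rust later grew fallible allocation APIs that let you catch OOM per call site rather than aborting. A minimal sketch using `Vec::try_reserve` (an API stabilized well after this thread was written):

```rust
// Sketch: handling allocation failure in Rust without aborting,
// via the fallible `Vec::try_reserve` API.

fn try_allocate(len: usize) -> Result<Vec<u8>, String> {
    let mut buf = Vec::new();
    // try_reserve returns Err(TryReserveError) instead of aborting
    // the process when the allocation cannot be satisfied.
    buf.try_reserve(len)
        .map_err(|e| format!("allocation failed: {}", e))?;
    buf.resize(len, 0);
    Ok(buf)
}
```

A caller can then decide how to degrade gracefully; an absurd request like `try_allocate(usize::MAX)` comes back as an `Err` rather than killing the process.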
> Languages which bake in automatic bounds checking at every access rely on optimization to recover the performance hit.
The performance hit is generally negligible, especially with abstractions like iterators in Rust that avoid the checks entirely, and standard optimisations that can lift the checks out of loops... optimisations that need none of the things you say compilers want. The cost of calling code in a different dynamic library (e.g. getting the dynamic symbol address and then doing the actual call) is going to be much greater than whatever bounds checks it does in almost all situations.
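A small sketch of the iterator point: indexed access carries a bounds check (which LLVM can often hoist out of the loop), while the iterator version never materializes an index, so there is nothing to check:

```rust
// Sketch: indexed access is bounds-checked (and panics if out of
// range), while iterator-based code never produces an index at all.

fn sum_indexed(xs: &[i64]) -> i64 {
    let mut total = 0;
    for i in 0..xs.len() {
        total += xs[i]; // bounds check here; usually hoisted by LLVM
    }
    total
}

fn sum_iter(xs: &[i64]) -> i64 {
    xs.iter().sum() // no indices, hence no bounds checks to elide
}
```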
> There's a reason languages like Rust and Go rely heavily on static linking and stack allocation.
As I just said, this is factually false. Static linking is entirely orthogonal to bounds-checking optimisations (neither Rust nor Go do whole program optimisations when linking statically, so it can't be the motivation for it), as is putting data on the stack. GC seems even more irrelevant, especially to Rust which doesn't have one.
> My point was that static linking is indicative of other tradeoffs that most people don't understand. Static linking isn't just about making packaging easier.
But it isn't indicative! In Rust's case, linking statically is about packaging: the ABI is unstable, so dynamic linking is very annoying to manage and many of its benefits are inhibited.
> There's nothing intrinsic to C that makes it unsafe.
The forever-growing list of CVEs caused by basic mistakes in C code says otherwise. Things like overrunning a buffer or reusing a freed pointer are not at all caused by domain-specific constraints; they're the price one pays for using 40-year-old technology. You can see this in modern tools that try to assist with writing safer C: they often use things that didn't exist when C was created. (And, don't get me wrong, C is here to stay, even if all new C development stopped today, and so efforts to make it safer are very good, but at some point we have to face the reality of C and stop the C apologism.)
> Fabrice's compiler was perfectly capable of implementing the C standard to the letter.
This is essentially meaningless for two connected reasons: the major problem with C is the holes in the standard (undefined behaviour), not compiler bugs; and people want fast code, so they need optimisations, which often exploit undefined behaviour.
> (Try using Rust without boxing, for example, as is necessary if you want to catch OOM.)
Boxing or not is irrelevant to safety: using Box in fact allows more aggressive `unsafe` code (one can rely on address stability to correctly sidestep the compiler's normal checks). Rust-the-language effectively knows nothing about the stack or heap when reasoning about safety: it does reason about stack scopes, but it doesn't care where the data is actually positioned in memory. Box<T> is isomorphic to a plain T in this respect.
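A minimal illustration of that isomorphism: the borrow checker applies the same aliasing rules to a stack value and a boxed value, caring only about ownership and scopes, not where the bytes live:

```rust
// Sketch: the borrow checker treats a stack value and a boxed value
// identically; only ownership and scopes matter, not memory location.

fn double_in_place(x: &mut i32) {
    *x *= 2;
}

fn demo() -> (i32, i32) {
    let mut on_stack = 21;
    let mut on_heap = Box::new(21);

    double_in_place(&mut on_stack); // &mut to a stack slot
    double_in_place(&mut on_heap);  // &mut through the Box; same rules
    // In both cases a second simultaneous &mut would be rejected at
    // compile time, regardless of where the i32 actually resides.

    (on_stack, *on_heap)
}
```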
In any case, the power of Rust is the ability to wrap code into safe abstractions: if there is a particular feature the standard library doesn't provide (yet), external libraries have the power to create APIs that have the same level of safety, maybe with a bit of `unsafe` internally. You can see this even in "use-case memory management" situations like a kernel: http://os.phil-opp.com/modifying-page-tables.html
The lack of bounds checking is one of the biggest problems in C, but there are worse problems (use after free) that nobody has even thought of a solution for.
> That widely-used C compilers don't do that is a strong hint there are other, real-world constraints in place.
Yes. Those constraints are self-inflicted wounds caused by the fact that C wasn't designed for this. If you have a proper iterator API, a culture of unsigned array indexing, widespread use of a size_t equivalent instead of int for loops, etc. etc. these issues vanish.
> Also, C is supposed to scale down to //really// simple systems. Systems that lack indirect addressing modes, caches, MMUs, etc. It is literally intended to be a thin veneer over actual assembly for those systems, and why so many operations are specified in terms of /minimum standard unit size/ (for portability of that almost machine code between systems).
This is C snake oil, sold by a C community that usually omits the fact that computers built a decade before the PDP-11, like the Burroughs machines, already had much better systems programming languages, like ESPOL and NEWP, two Algol derivatives.
Algol 68 already had slices and modules (Algol68 RS) among many other nice features, running in early 70's hardware.
There are a few other examples.
Maybe you can forgive C for the stupid header compilation model (why let the compiler do what you can make the programmer do by hand?).
This model enables binary-only distribution of libraries, you get the code as a .a (or lib, .so, .dll or whatever) and the API declaration as a header file.
You can write code against a library without having the library, using only the header. You can't do the final linking of course, but you can write the code.
The alternative, I guess, would be to embed this information in the library itself, and have the compiler extract it, which sounds as if it would have been scary from a performance point of view 40 years ago (and also somewhat hard).
Why do you think that parsing a compiler-generated binary representation of an API is more expensive than parsing a human-readable textual representation of the same API? Also, have you ever built a large template heavy C++ program?
When I've mentioned what you propose to other programmers I always get baleful looks and a statement that 'that's not how C is supposed to work'. Mention bounds checking, same thing.
I used to think it was hopeless, especially as each new language that came out required garbage collection or, worse, targeted the JVM. Perhaps cloud services will motivate people to fix this stuff, now that computing costs are a hard line item on the books.
It sure wasn't a bad design for Unix Implementation Language (tm)
> C was a pretty good language in 1978. We didn't know a lot of things in 1978 that we do now in 2016. It now makes sense to revisit those decisions in light of nearly 40 years of practice.
We surely did know that Burroughs had been selling an operating system written in ESPOL, later NEWP, since 1961. Nowadays Unisys still sells it as MCP.
We did know that the Flex machine was written in ALGOL 68RS in 1980.
We did know that VME was written in S3 in 1970.
We did know that Pilot was written in Mesa in 1977.
We did know that Lilith was written in Modula-2 in 1997.
There are lots of other examples.
The main difference was that UNIX and consequently C, source code were available for free because AT&T could not sell it, while people had to pay for the other ones or they were behind research walls.
Correction to myself: Lilith was written in Modula-2 in 1977.
It's remarkable, though, that it's taken 40 years to make much headway in replacing C.
There's still a lot of C code out there, and a lot of new C code still being written.
> It's remarkable, though, that it's taken 40 years to make much headway in replacing C.
I don't agree: it's been a long process, but the trend is unmistakable. It's hard to remember now, but in the early '90s C and C++ were completely dominant. Nowadays they're much more specialized: you're as likely to build your company on Java or even Python/Ruby as you are to build it on C++. People talk about how it's hard to hire C++ engineers nowadays, while in the '90s "C++ engineer" was pretty much synonymous with "programmer". And so on.
That's quite normal, because the market today is much larger and more diverse. E.g. it doesn't make any sense to build web apps in C or C++, and this type of software is widespread nowadays but was virtually nonexistent back then.
The interesting question is what will mobile devices, the IoT and embedded devices in general be programmed in? C and C++ are popular choices today, so the trend is not really "unmistakable".
Yes, but the majority of those devices, at least the ones with enough KB, use C or C++ for hardware integration or some interpreter, with the rest of the stack in something else.
Everything else usually has other languages available as well, one just needs to search for what is out there.
> Nowadays they're much more specialized: you're as likely to build your company on Java or even Python/Ruby as you are to build it on C++.
Some of today's dominant platforms are developed mostly in Java (see Android), and web development targets the LAMP stack. This means the business is centered on platforms that exclude most languages, not that the alternatives won on technical merit.
I'm sure it's possible to gather individuals more than willing to badmouth Java and Python with the same passion we see here from people complaining about C.
Guess which language was used to write your OS, browser, games, compilers, runtimes and so on. If you count the number of hours that people spend on software written in C/C++, I think all other languages would be left in the dust.
Lots more will be written until there is a viable replacement for C in embedded systems.
A few alternatives available to buy today:
Oberon for ARM Cortex-M4, Cortex-M3 Microcontrollers and Xilinx FPGA Systems
http://www.astrobe.com/default.htm
Pascal and Basic for lots of micro and pico-processors
http://www.mikroe.com/compilers/
Ada,
http://www.ghs.com/products/ada_optimizing_compilers.html
http://www.ptc.com/developer-tools/apexada
Java for MCUs
Exactly. It's beyond my authority to change the language in use, but I would love to have alternatives to argue for. C isn't a terrible option, or is perhaps the least terrible option, but it's not leaving my corner of the corporate universe until a proven alternative establishes itself.
What in particular makes Rust (or Go or Swift) unsuitable?
GC languages are almost certainly not even in consideration for most embedded applications. For starters, there aren't good strategies for general garbage collection that don't insert random pauses, and you also have a non-negligible impact on RAM usage on systems where RAM might be at a premium.
A lot of the things rust brings to the table aren't always relevant on embedded platforms. Dynamic memory allocation on embedded is the exception, not the rule. Everything is statically allocated, so memory management is relatively simple -- everything sticks around forever.
The things that make C/C++ good for embedded are sorta what make them unfortunate for general-purpose use. The things that make Rust/Go/Swift good for general-purpose use make them unfortunate for embedded use.
> A lot of the things rust brings to the table aren't always relevant on embedded platforms. Dynamic memory allocation on embedded is the exception, not the rule. Everything is statically allocated, so memory management is relatively simple -- everything sticks around forever.
In that case, you can simply not use the dynamic allocation features of Rust, just as you can simply not use malloc() in C.
Absolutely, but then what does it buy you over C/C++? It has a few 'nice to haves' over that, but in this domain those won't tend to be nice enough to motivate most people to bother setting up toolchains purposed for it.
> GC languages
Rust is not a GC'd language; it essentially uses RAII to deterministically determine at compile time when memory will be freed.
> Dynamic memory allocation on embedded is the exception
> Rust is not a GC'd language
Rust fully supports running entirely without dynamic allocation. There is a subset of the standard library defined explicitly for this purpose.
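As a sketch of that static-allocation style: everything lives in fixed-size storage, nothing touches the heap, and "full" is an ordinary return value rather than an allocation failure. (A real embedded build would add `#![no_std]`; it's omitted here so the example compiles under std.)

```rust
// Sketch: a heap-free ring buffer in the static-allocation style
// typical of embedded code. All storage is a fixed-size array
// inside the struct; push/pop never allocate and never panic.

struct RingBuffer {
    data: [u8; 8], // storage lives in the struct, never on the heap
    head: usize,
    len: usize,
}

impl RingBuffer {
    const fn new() -> Self {
        RingBuffer { data: [0; 8], head: 0, len: 0 }
    }

    fn push(&mut self, byte: u8) -> bool {
        if self.len == self.data.len() {
            return false; // full: the caller decides what to do
        }
        let tail = (self.head + self.len) % self.data.len();
        self.data[tail] = byte;
        self.len += 1;
        true
    }

    fn pop(&mut self) -> Option<u8> {
        if self.len == 0 {
            return None;
        }
        let byte = self.data[self.head];
        self.head = (self.head + 1) % self.data.len();
        self.len -= 1;
        Some(byte)
    }
}
```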
Right, I was referring to Go and Swift with the comment about GC. Since rust didn't have that problem, I mentioned why major adoption might not be forthcoming.
I understand that you can statically allocate all you want in Rust, but its memory safety, particularly with regard to object ownership, is one of its major selling points. Object ownership and lifetimes are trivial when everything is static.
The other arguable selling point of Rust is its standard library, but much like C++'s, the standard library would be left aside in most embedded applications.
So it doesn't buy you much of anything at all, but it takes some work to setup, plus you are fighting the momentum that C/C++ has. Unless there is some other compelling reason to use it, I don't expect much adoption.
Doesn't the design of Go, the language, basically require that the implementation involve a runtime with a GC system? So that would make it a non-viable choice for programming in applications where memory footprint and real-time performance must be tightly controlled.
Rust is a better choice, and it's designed with this in mind. It's still young, though, and I think there might be some as-yet-unsolved issues (these are things I've vaguely heard of and could be totally off-base) like binary size, ease of dealing with raw pointers, etc.
If I was doing this sort of programming for a personal project, I'd probably try using Rust, because I like it.
Dunno about Swift, though IIRC the current reference implementation may also rely on GC.
> like binary size, ease of dealing with raw pointers, etc
The majority of the size of typical Rust binaries is the large amount of space (400 KB or so) it takes to statically link jemalloc. But if you're building for a device that doesn't support dynamic allocation, then you're not going to be including jemalloc, so binary size shouldn't be a problem. As for raw pointers, they're exactly as capable as raw pointers in C, though they're deliberately more verbose as well, because even in embedded contexts one should favor references over raw pointers: references are still fully checked for safety even in embedded mode, yet are represented as raw pointers at runtime and hence have zero runtime overhead.
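A tiny sketch of the reference/raw-pointer distinction: both are a plain address at runtime, but only the raw-pointer dereference requires `unsafe`, because the compiler can no longer vouch for it:

```rust
// Sketch: a reference and a raw pointer have the same runtime
// representation, but only the raw pointer needs `unsafe` to read.

fn read_both(x: &u64) -> (u64, u64) {
    let p: *const u64 = x;       // the cast itself is safe: just an address
    let via_ref = *x;            // checked at compile time, no unsafe
    let via_ptr = unsafe { *p }; // the compiler can't verify this read
    (via_ref, via_ptr)
}
```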
Rust's binary size issues are just a matter of defaults, and they're an additive bloat, not multiplicative (if the corresponding Rust binary for a 5 KB C binary is 200 KB, then the corresponding Rust binary for a 100 KB C binary is 295 KB). Most people don't care about a few extra kilobytes in their binary, so Rust has chosen defaults that make some things easier but also add some extra binary size. You can turn these off and get tiny binaries without much effort.
Ironic, but the one thing they are missing is an easy, convenient way to call C libraries.
Rust is a very promising language, but unless you plan to write everything from scratch, you have to depend on third-party driver implementations for most stuff. Databases, for example. There isn't a single database vendor that ships Rust drivers.
Calling C from Rust is very convenient. All you have to do is declare the structs and function signatures and then it's like calling any other unsafe function.
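For concreteness, the boilerplate is roughly this; a minimal sketch declaring and calling the C standard library's `abs`, which links with no extra setup:

```rust
// Sketch: minimal FFI boilerplate for calling a C function from Rust;
// here, the C standard library's `abs`.

use std::os::raw::c_int;

extern "C" {
    fn abs(input: c_int) -> c_int;
}

fn c_abs(x: i32) -> i32 {
    // Crossing the FFI boundary is unsafe; we wrap it in a safe fn.
    unsafe { abs(x) }
}
```

For real libraries the `extern` block would be generated from the header (by hand or by a bindings generator) and then wrapped in a safe Rust API, as discussed below.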
IMHO it is OK if you are only going to call a few simple ones, but for calling a few hundred complex ones (callbacks with variable argument lists, etc.), it becomes a bit cumbersome.
BTW I am not implying that it's Rust's fault; actually, I can't think of a syntax that would make it less verbose, and I am a huge Rust fan.
Not really, there are tools that can take C header files and spit out bindings. Rust also supports vararg C functions even though Rust itself doesn't support varargs. Writing safe Rust wrappers around these unsafe C bindings can sometimes be tedious, but this isn't worse than using C directly (which is inherently unsafe).
I didn't say it is not supporting them. I said that is a bit cumbersome to have to write the shims for them.
I wasn't aware of the tools that do this automatically. Just found one and it looks promising.
Yeah, my point was that thin shims are free, and "thick" safe shims take effort, but as far as comparing with pure C is concerned you only need to compare the cost of thin shims. C doesn't have safety, so the concept of a thick, safe shim is nonexistent.
> actually I can't think of a syntax that would make it less verbose
something like C::function_from_c() is one option
For very extensive C APIs like this (Lua, GTK, etc.) we typically see people create thin, Rust-friendly wrappers to reduce the amount of manual API calls that need to be written out.
Go is unsuitable as a replacement for C because it is garbage collected. End of discussion.
Rust is unsuitable as a replacement for C because its memory management is poorly thought out (i.e., it's a joke). Here are the relevant paragraphs from the Rust FAQ. Really?
"Rust avoids the need for GC through its system of ownership and borrowing, but that same system helps with a host of other problems, including resource management in general and concurrency.
For when single ownership does not suffice, Rust programs rely on the standard reference-counting smart pointer type, Rc, and its thread-safe counterpart, Arc, instead of GC.
We are however investigating optional garbage collection as a future extension. The goal is to enable smooth integration with garbage-collected runtimes, such as those offered by the Spidermonkey and V8 JavaScript engines. Finally, some people have investigated implementing pure Rust garbage collectors without compiler support."
> For when single ownership does not suffice, Rust programs rely on the standard reference-counting smart pointer type, Rc, and its thread-safe counterpart, Arc, instead of GC.
This is a library thing, not a language thing.
If single ownership is enough for you, go ahead and use it. But if you need a different memory management strategy, that is available too.
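For example, shared ownership is opt-in through the library type `Rc`, with no language-level GC involved; a minimal sketch:

```rust
// Sketch: opting into shared ownership only where needed, via the
// library type Rc. Reference counts are managed deterministically;
// no garbage collector is involved.

use std::rc::Rc;

fn shared_ownership() -> (usize, i32) {
    let a = Rc::new(10);
    let b = Rc::clone(&a);            // bump the refcount, share the value
    let count = Rc::strong_count(&a); // two owners at this point
    (count, *a + *b)
}
```

`Arc` is the drop-in thread-safe counterpart; code that never needs shared ownership simply never pays for it.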
Rust, the language, provides a single clear memory management strategy. It also provides the ability to design your own abstractions for different strategies, and implements some of these in the stdlib.
C/C++ have refcounting and GC libraries too. Does that make them a joke?
You don't describe why you think it's a joke.
Steve - did you read the FAQ? They really don't know what direction to go. At least Swift stuck with ARC, being a necessity as they were tasked with merging Swift with the Objective-C runtime for interoperability. Rust's multiple methods of memory management makes it strange, at best, for programmers to decide how to build a program or an API. Are the owners of Rust planning on adding a GC? Really?
I helped edit the FAQ.
> They really don't know what direction to go.
That's incorrect. A systems language needs to be flexible; it cannot dictate that everyone must do something a single way. Does the presence of libraries for refcounting or even GC in C (most famously Boehm) mean that C has an incoherent story around memory?
It does not. Each has its place. Need single ownership? Use a type that has it. Need multiple ownership? Use a type that has it.
> Rust's multiple methods of memory management makes it strange, at best, for programmers to decide how to build a program or an API.
Not really. There are a few different things here. The first is integration with other systems that have a GC. As an example, consider Servo: it has to interact with SpiderMonkey's GC, since it interfaces with JavaScript code. Consider the opposite: Rust embedded inside another language, let's say Python, where you want to be able to talk to Python's GC for various reasons.
> Are the owners of Rust planning on adding a GC? Really?
The second is something like Boehm: if a system wants to use GC for some reason, there is an interface to add a GC'd type. But Rust proper, the language, will not have GC. It's completely contradictory to the goals of the language.
> Are the owners of Rust planning on adding a GC? Really?
To address this specifically: Current plans for Rust are mostly along the lines of adding the bare necessities in the stdlib to allow GC implementations to be written.
There are mostly-niche use-cases for having a GC in Rust. I've written some of the motivation here[1] (note that that blog post is about a pure library GC independent of Rust, which is different from what is planned; but the motivations are similar).
One major use case is if you want to talk to a language which has a GC. Say you're writing a native extension for Ruby or Node and want to deal with the GC'd types within Rust code in a safe way without pausing the GC. Or you're writing an interpreter for a GC'd language. Or you're writing code that deals with complicated cyclic, graph-like data structures.
These are all pretty niche, but the workarounds in these cases aren't pretty so it's nice to have some form of GC capabilities in Rust. This is not a price you pay by default, and it's not something that affects anyone but the people who need these types. It will probably take the form of some low level APIs that use LLVM stack rooting to collect roots, and some traits in the stdlib, which can be used by an independent GC library (not part of the stdlib) or a language bindings library (also not part of the stdlib).
Rust itself will never get a GC as part of the language.
[1]: http://manishearth.github.io/blog/2015/09/01/designing-a-gc-...
Steve wrote the FAQ and most of the docs
I would hesitate to say I wrote the FAQ these days; after Brian's efforts to revamp it, it was VERY MUCH a community effort.
This post is a joke without a material argument to support your claim.
> C was a pretty good language in 1978. We didn't know a lot of things in 1978 that we do now in 2016.
We also have nearly 40 years of infrastructure built on C, which needs to be maintained and updated.
This is the same old argument advocating for rewriting everything from scratch just because someone somewhere managed to develop a new flavor of the month.
There are plenty of reasons why the whole world still has heavy demand for COBOL and FORTRAN developers, and the appearance of a new flavor of the month isn't a good enough reason to eliminate that demand.
> This is the same old argument advocating for rewriting everything from scratch just because someone somewhere managed to develop a new flavor of the month.
I'm not saying rewrite everything for no reason. I'm saying that there are reasons, and we've gotten a very good idea of what those are over the last 40 years.
C can be replaced with a reasonable amount of effort with C++.
But C++ won't be easy to replace, and I'm not sure it needs to be, since rewrites are highly risky, time consuming and disruptive. With some luck and depending on how the language evolves we might be moving from C++ to a safer C++.
I agree, and that was one of my motivations to adopt C++ instead of C when Turbo Pascal wasn't any longer an option.
But using the C++ features that make it safer than C is only an option in small security motivated teams.
Sadly the majority of C++ teams, at least in the enterprise space, tend to use it as "C with classes", thus voiding most of the improvements the language has to offer over plain C.
You've hit on an important point - culture. Every programming community has it and it can enhance or hinder the adoption and usability of a language.
C++ is split between multiple factions. I'm doubtful that the one programming in C with classes is interested in learning e.g Rust.
I am one of the many stalwarts whose bookshelf contains a prominent copy of K&R C. But over the last 10 years or so I find myself referring to it less and less often. It's a huge problem that it stopped at the second edition. The 2nd ed was great in 1999. It is not great in 2016, it is only good.
> "You're right, but you're wrong that their code is bad." I cannot fathom how a group of people who are supposedly so intelligent and geared toward rational thought can hold in their head the idea that I can be wrong, and also right at the same time.
Zed, you're right, period. But I think you probably just hurt people's feelings because they revere Kernighan and Ritchie and this is one prominent item of their legacy.
> But C? C's dead. It's the language for old programmers who want to debate section A.6.2 paragraph 4 of the undefined behavior of pointers. Good riddance. I'm going to go learn Go (or Rust, or Swift, or anything else).
Amen. The union of those three is likely to address all the use cases that C handled in the past.
BTW the blog post would be clearer if titled: " 'Deconstructing K&R C' is dead". Gotta love mixing up C with natural language operator precedence ambiguity. :)
The thing is, I think you can simultaneously have all of these opinions: (a) K&R were/are top-notch computer scientists; (b) K&R was a fantastically written book; (c) C was a great language in 1978; (d) we should be moving away from C in 2016. The fact that we didn't know as much about programming languages in 1978 as we do now in no way diminishes the significance of the work.
I think that C should rapidly be moving toward obsolescence, and I hold K&R in great esteem.
Agreed on all accounts.
>Zed, you're right, period. But I think you probably just hurt people's feelings because they revere Kernighan and Ritchie and this is one prominent item of their legacy.
This is hilarious, because programmers by and large love to pride themselves about being stoic, logical, and practical in lieu of letting emotion dictate what they do.
Since when do programmers give a shit if people's precious fee-fees contradict what is technically correct? (The best kind of correct!)
> Since when do programmers give a shit if people's precious fee-fees contradict what is technically correct?
Programmers, at least the ones I've seen in my life, are not from Vulcan :). In other words, we humans are all driven by our emotions, like it or not. The problem is that some people choose to believe they are purely rational beings, and therefore always right.
> Gotta love mixing up C with natural language operator precedence ambiguity. :)
Well, then, in that case shouldn't you really refer to it as K&&R?
Maybe this made Zed feel better, but communicates almost nothing to any outside reader.
Not a single actual quote from any of his detractors, for the reader to judge for him or her self if their criticisms have any validity.
The categorical declaration of "I cannot help old programmers," without providing the evidence he has for this claim. Lots of name calling, though.
No link to the original content, to determine for ourselves whether or not it was fair to K&R's work.
I suppose Zed just meant this to be personally cathartic, and didn't realize he posted it on a public web site where other people can read it?
> Maybe this made Zed feel better, but communicates almost nothing to any outside reader.
Yes. I can't figure out exactly what he's ranting about. He writes "I will make it clear that my version of C is limited and odd on purpose because it makes my code safe." Does this mean he defined a safer subset of C? (There are lots of those. I've taken a crack at that myself [1], but it's politically hopeless. Rust is the way forward.)
Why would anyone want to write K&R C today? It's awful. It didn't even check function parameter types. Struct fields were just offsets; you could use one on a pointer of the wrong type and the compiler wouldn't complain. (Considering that Pascal predated C by some years, and had a sane type system, this was kind of lame. But they were trying to compile in 64K of 16 bit words in one pass. That was an adequate excuse in the 1970s.) The first ANSI C at least had a sane type system.
[1] http://www.animats.com/papers/languages/safearraysforc43.pdf
> I suppose Zed just meant this to be personally cathartic, and didn't realize he posted it on a public web site where other people can read it?
He's done these kinds of rants repeatedly. It's his counterproductive style. I can't judge his arguments on a technical level (I do think his introductory guides to various languages are excellent), but these kinds of rants surely alienate more people than they persuade?
I looked at Zed's books but didn't find them to contain much useful material, or to be written well enough to be worth reading. Granted, I'm not a novice in the subjects he writes about. But still, I find it peculiar that his high opinion of his own work seems rather detached from reality. The books are written in a simplistic and sometimes demeaning style, and it's obvious that many people will not like it. But when someone writes an honest review, he seems to get too upset about it. While he obviously likes to critique other people's work (such as K&R), he's very sensitive to criticism of his own books. Zed thinks it's acceptable to insult the reviewer (Tim Hentenaar) in response. Reading his response made me cringe. Insulting reviewers is just not what respectable authors do.
Oh Zed. Really?
There is nothing wrong with carefully crafted C code for applications where it is the best-suited tool. Sure, there are sharp edges. True, you can write crappy, security-nightmare code.
You do make some good points. I agree Go is fantastic. Rust is coming along as well. However, C still runs the world. That's not changing anytime soon. Not with the explosion of IoT and GPU-type devices. And, hello, the Linux kernel and all the glorious command-line tools on *nix.
Try using Go or Rust (love both, x2 for Go) to allocate say a hundred GB of memory for some huge/fast in-memory data processing. Let me know how far you get.
Your rant is as polarizing as those who are blind to C's flaws (yes, there are a few). Stop saying "don't write C", that's just childish. Rather, what about "let's write better, less security flaw prone C."
As an engineer, one ought to choose one's tools wisely. This means weighing pros and cons and making balanced, unemotional decisions. Not waging a holy war against a given tool.
And I am a professional programmer.
Let's do C where C makes sense.
(Edit: fixed typos)
> Try using Go or Rust (love both, x2 for Go) to allocate say a hundred GB of memory for some huge/fast in-memory data processing. Let me know how far you get.
I'm currently working on a couple of bugfixes for a Rust program I wrote last year which regularly allocates north of 500GB of RAM per-node on a cluster. It's wicked fast (regularly matching or beating comparable workloads implemented in C/C++), and Rust's ergonomics and safety guarantees made it very easy to extract much greater amounts of parallelism than the previous C++ version had, while never once having to chase down a bug from memory corruption, data races, or iterator invalidation.
> Try using Go or Rust (love both, x2 for Go) to allocate say a hundred GB of memory for some huge/fast in-memory data processing. Let me know how far you get.
Um, what's wrong with that in Rust?
> Rather, what about "let's write better, less security flaw prone C."
We've been trying this for the past 40 years and we've completely failed to stem the constant tide of new game-over security flaws. I think it's time to admit that if we couldn't do it in 40 years, we've failed.
> Try using Go or Rust (love both, x2 for Go) to allocate say a hundred GB of memory for some huge/fast in-memory data processing. Let me know how far you get.

Why would this be a problem in Rust? It literally doesn't impose any overhead on memory consumption, at least not any that C doesn't (e.g. padding). Dropbox has clusters of machines that manage exabytes of data whose core is written in Rust.
There is no fundamental reason why this should be slower or harder in Rust. Rust generally compiles down to more or less the same code C does.
There are reasons why this could be slower in Go, but it really depends on what program you're writing, so it might even just work fine. If you don't hit the GC, for example (and Go gives you ample opportunities to not hit the GC), data processing should be quite fast. But it depends.
I'd love to hear real-world experiences with such systems in Go.
We have a few Go processes with high memory usage. For one in particular, while it's been higher in the past (~150GB), we're sitting at 40-80GB per node right now.
The busiest node traffic-wise had average GC time over the past 20min of 3.4ms every 54.5s. 95th percentile on GC time is 6.82ms.
That node is sitting at 36GB in-use right now, and has allocated (and freed) an additional 661GB over the past 20min.
Can't really speak to how fast this is vs other environments, but it's smooth sailing overall. /shrug
That sounds much better than the Java stories I've heard, which makes sense since Go is better at avoiding the heap.
No idea how it compares with others; and not sure if it is representative, but to me that sounds pretty decent.
> Stop saying "don't write C", that's just childish. Rather, what about "let's write better, less security flaw prone C."
A consistent theme throughout the article is that he's actually more interested in teaching people to write C well than fight with pedants. He's not torching his book, he's updating it and removing the contentious chapter. "let's write better, less security flaw prone C." is exactly what he's trying to say - the "don't write C" bit at the end is more about it being a dinosaur than a childish huff, though there is a little of that in that comment.
I went to the Internet Archive just to see what all the fuss was about. Hate to say it, but I agree with the detractors. He seems to be completely missing the concept of preconditions. If the preconditions are met, the code is good; if the preconditions aren't met, undefined behavior occurs. Most people programming C or C++ for more than 10 minutes learn to pay attention to these things. The chapter would have been much better if it would have stuck to the importance of validating preconditions, rather than simply pretending they don't exist.
Good riddance.
To be perfectly blunt, C does not need Zed Shaw "saving" it. He can go ahead and ignore it or end his book or rewrite the chapter or spout his vitriol over how stupid C programmers are; nothing he does will make an impact.
We have CVE for that.
> I cannot help old programmers.
Unfortunate he uses this categorization. The problem is a mindset that can exist in any generation.
I've read/maybe even briefly participated in a discussion about Zed's book a couple years ago, and the technical debate went like this:
Z: K&R's strcpy is broken, e.g., you can forget to null-terminate the string. Mine is safer.
Others: It's not broken; of course it'll do something unpredictable if you break its preconditions.
Z: strcpy is still broken.
Others: Your function will break too if you pass it the wrong length.
Z: This cannot happen, K&R strcpy is broken, mine is safe.
Dammit, I was tricked into reading something by professional troll Zed Shaw. The hypocrisy of him complaining about "the dark side of programming" is hilarious considering he is a very good example of that. His style of debate is to insult and call names the people who have offered non-judgemental and constructive criticism, and I'm sure nothing I'm saying is news to anyone who has a passing familiarity with him.
K&R C was my introduction to real programming. I treated it ever since as a book, not a reference. Does that make sense? For me it was a glance into the mind of the creator of the language. Yes, some of its ways of programming were flawed in ways that today result in many terrible things, not the least of which is death. But I was able to outgrow K&R, to learn better things, with its succinct language reference as my wings. I was fine not learning security-oriented programming immediately. And I certainly enjoyed learning it with minimal snark and curse words. Maybe my method was harder, really? Learning things without you telling them to me? Maybe; don't really care. Maybe people don't like your essay because it's essentially shitting on a reference for a language. That is what K&R was in the beginning. It morphed into something else by your demands, not the authors'. Let my hero rest in peace. Learn to breathe. C is as dead as Latin is.
> C is as dead as Latin is
I wish that was true, but you will be surprised how many things you use everyday are written in C. Even the ones you would never imagine.
Node.js for example, a large part is in C. Redis, C. Memcached, C. PHP itself is written in C.
I'm sorry this wasn't clear; I was subconsciously waxing poetic. I meant to say that C's presence is constantly fading but its influence is widespread.
There's a difference between fading away and the universe expanding.
Once upon a time most Unix software was written in C, shell, and awk. Then Perl came along. Did that diminish C? No. Then Java. Did Java diminish C? No. Then Python. Did Python diminish C? No. (You can throw C++ somewhere in there; not sure where. Though IME C++ use really seemed to explode with Windows developers migrating to Linux.)
In each case the universe of software expanded, but C was never diminished. People who think Rust, Go, or whatever will diminish C are ignorant of history. Of course, maybe the predictions will bear out. But I seriously doubt it, and it will be despite their underlying premises, not because of them. Rather, much more likely is an expanded ecosystem.
As I explained else thread, there's nothing intrinsic to the C standard which makes it unsafe. Compilers are free to add bounds checking at every point in the program; in most cases it would be just as cheap as in C++ or even Rust. It would require much rebuilding and retooling, but not much rewriting existing software. (Relying on undefined behavior is dangerous not only because of optimizations, but because undefined behavior can also preclude automatic bounds checking.)
That C compilers don't do that is a function of 1) baggage and 2) other functional constraints, like strong ABI compatibility. But neither of those are set in stone. People who think C is hopelessly unsafe make the same mistake every C newbie (and some die-hard C-is-just-assembly people) do: conflating the language semantics with implementation and machine details.
People assumed that clang would quickly overcome GCC because it was so new and nimble. But clang still hasn't unequivocally really overtaken GCC, and certainly hasn't obsoleted GCC. Rather, the competition merely spurred GCC to evolve faster. I see much the same happening with C.
In the future, look to systems like OpenBSD, FreeBSD, and Alpine Linux, which are more free to upgrade their toolchain and runtime environments with backwards-incompatible changes, to field enhanced C environments with better bounds checking and mitigations. Approaches like stack canaries and ASLR are only the tip of the iceberg for what's possible.
> Compilers are free to add bounds checking at every point in the program; in most cases it would be just as cheap as in C++ or even Rust.

It would not be as cheap as in Rust, because Rust uses an explicit standard library feature (iterators) to obviate the need for bounds checks in the vast majority of loops to begin with. But in C, indexing is pervasive within loops, so you'd need to come up with much cleverer compilers that could manage to prove that bounds checks were unnecessary (compilers can already do this in some cases, for C/C++/Rust, but it's not perfect). Likewise, one could make integer overflow in C well-defined, but this would also make C slower than Rust, because the use of iterators means that Rust doesn't need to check for overflow on each loop iteration. Via language (or rather, library) features, Rust reclaims the performance that it otherwise would have lost to C by dint of being free of undefined behavior. I think you'd have a hard time doing this in C without rewriting every `for` loop in existence.
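A rough C sketch of the distinction being drawn here (function names are mine): in the indexed form, a checking compiler must prove `i < len` for every access or insert a per-iteration check, while the pointer-range form, closer in spirit to a Rust iterator, establishes the bounds once up front and performs no indexing in the loop body.

```c
#include <stddef.h>

/* Indexed form: each xs[i] is a candidate for a bounds check unless
   the compiler can prove i < len on every iteration. */
long sum_indexed(const int *xs, size_t len) {
    long s = 0;
    for (size_t i = 0; i < len; i++)
        s += xs[i];
    return s;
}

/* Pointer-range form: the valid range [xs, xs + len) is computed
   once, and the loop body just dereferences a pointer known to be
   inside it -- the shape an iterator gives you for free. */
long sum_range(const int *xs, size_t len) {
    long s = 0;
    for (const int *p = xs, *end = xs + len; p != end; p++)
        s += *p;
    return s;
}
```

Optimizing compilers can often turn the first form into the second, but not always, which is why the per-loop cost of mandatory checking differs between the two styles.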
(y) :)
The specific section of his 'Learn C the Hard Way' book that he's referring to was mostly, as I recall, complaining that the C string functions defined in K&R will fail when you don't pass them valid data, and therefore, they're fundamentally broken.
Make of that what you will, but it seems to me that given all of the other ways that C can blow up due to programmer error, it seems reasonable to expect programmers to pass a valid string to a string function.
I'm with Zed on this one. Giving the programmer fewer things to have to remember, by design, is a de facto improvement. Forcing a human to repeatedly do a task which could have been designed out is evidence of a bad design.
Mind you, we're talking about the stdlib here. You can swap this stuff out. Some people do: djb is a fairly well-known example.
Yeah, I agree. This is similar to complaining that the plane was broken when you fly it into a mountain. I can see this being a minor issue when you're passing data from an unknown source to something like strdup(3), but you should be following the rule of "users do dumb shit" and check these things. C isn't designed to prevent you from hanging yourself, because if it prevented you, you wouldn't have enough rope to climb a mountain.
I have mixed feelings about this, but I cannot disagree with it.
a. I haven't written a program in C in over 10 years. I wrote software 5 days a week for those 10 years.
b. I wouldn't want to write a program in C now.
c. The first "high level" programming language I learned was C, from a book (not K&R C), while travelling in Asia, without a computer. It taught me well, but I immediately went on to other languages.
d. I can't shake the idea that there is some value to knowing that low level stuff, even though I don't use it much myself.
Maybe linux kernel hackers will keep it alive. I know game programmers use it a lot as well. But for the majority of us, it's kind of an arcane skill now.
> a. I haven't written a program in C in over 10 years. I wrote software 5 days a week for those 10 years.
That's fine. Perhaps the kinds of programs you have been writing are not a good fit for what C is great at doing. That does not take away from C or its use for appropriate work.
I use it all the time in embedded systems. It's very common. Basically there are no good alternatives until you get to much bigger chipsets. However, in the embedded world you tend to go with a limited subset of C. Especially no use of dynamic memory.
Kernel hackers will definitely keep it alive, even if you manage to avoid every other hugely prolific C codebase out there. Being an arcane skill makes it an appealing skill to learn. There are more kernel jobs than there are university graduates capable of properly writing C.
> b. I wouldn't want to write a program in C now.
When all else fails... come back and say this again ! But for the time being ignorance be bliss.
I'm sensing a huge incomprehension in a great number of posts. The key is to know the purpose of tools. C is a "close-to-the-metal" type of language. You can control low-level things: execution time, "number of hops" when writing data, etc. If you want a friendly language with "no segfaults, no memory leaks", then go higher level (which in many cases is a better choice, e.g. a GUI desktop application with no performance constraints). If you have problems writing in C, then you simply still can't C and are using the wrong tool for the task.
"But C? C's dead. It's the language for old programmers who want to debate section A.6.2 paragraph 4 of the undefined behavior of pointers"
Someone has to build the low-level stuff. Dear boys in too-tight pants and hippie mustaches: your high-level things and gluten-free snacks do not grow on trees.
> Someone has to build the low-level stuff.
Some of us were already doing it in much better languages, before C had any meaning outside AT&T's walls.
Joke? The author is self-important in precisely the way that K&R weren't.
You had my sympathy until I read the "error prone shitty language like C".
Next time, before getting pissed off about the response you get, think about what you may have said or done that triggered it.
To me, C is like PHP (ignoring for a moment that PHP was written in C).
You can document its shortcomings, its dangers and all the headache-inducing choices. But while you're doing that, people all over the world are building wonderful and terrible things with it.
So you're moving on to Go or Rust? Great! Good choices! But remember that there are people who may disagree and be wrong and also do something interesting with that wrongness.
No language makes it the least bit difficult to write bad code. This is not an argument in favour or against _any_ language.
FYI, this rant is from January 2015. Surprised to see it showing on HN today.
If anyone is interested in what he removed, you can find it here: https://web.archive.org/web/20150101224641/http://c.learncod...
What he removed is actually a pretty fine essay on the pitfalls of classic C. Maybe I'm just old enough to be the target demographic, but I found it pretty illuminating.
Zed is frustrating sometimes.
It wasn't obvious when this dates from, but we'll take your word for it and add 2015 to the title.
Thanks. Archive has the 'dead' article first indexed Jan 6th, 2015.
https://web.archive.org/web/20150106191636/http://c.learncod...
K&R taught fundamentals and a good style. It is a timeless classic, because its principles don't change with each successive wave of mass hysteria.
The Plan 9 dialect of C is another example. There is a portable mk package, which includes core libs (libbio, libutf, etc., which also served as core libs for earlier versions of Golang), to appreciate what C was supposed to be.
I would paraphrase: attention-seeking by attacking classics is poor style.
Part of his point is that K&R isn't good style. It's a clear, consistent, and well-demonstrated style, which made it popular, but that doesn't make it good.
Do remember that this guy wrote:
"I’ve more or less kept my mouth shut about some of the dumb and plain evil stuff that goes on in the Rails community. As things would happen though I’d take notes, collect logs, and started writing this little essay. As soon as I was stable and didn’t need Ruby on Rails to survive I told myself I’d revamp my blog and expose these fucks."
and:
"After Mongrel I couldn’t get a gang of monkeys to rape me, so forget any jobs. Sure people would contact me for their tiny little start-ups, but I’d eventually catch on that they just want to use me to implement their ideas. Their ideas were horrendously lame. I swear if someone says they’re starting a social network I’m gonna beat them with the heel of my shoe."
So that is very much his style of writing.
I don't think bringing out a list of generically outrageous things someone said in the past rises to the level of discourse we're trying for here.
We detached this subthread from https://news.ycombinator.com/item?id=11727718 and marked it off-topic.
The man has a long history of writing angry rants. You don't think that would influence how people might read his current rant? You don't think the readers of Hacker News might like to know about his past history of similar behavior? You don't think it helps interpret the level of anger in his current rant?
I do think all those things. But the cost of moving in the direction of personal vendettas or witch hunts is higher than the benefits you listed (if benefits they are).
I don't mean that's what you intended, but that's the direction it points in, which it isn't in the long-term interests of HN to allow.
He had this big rant about Debian that was about Debian trying to lock people in because of 'business' or something absurd like that. Debian is a non-profit! I found it irritating because it wasn't even "he's being a jerk, but he's right". It was just plain wrong.
Punctuation seems to be dead. It took me a while to understand the title.
Lots of insecure butthurt and resentment in this article, and not much substance.
I can't see how you could say C is dead when there isn't really anything that can replace it.
I'll take it as dead when the Linux kernel, or its futuristic replacement, is written in something other than C.
If you are talking about at the user-space level, then yes I can see that. But you shouldn't assume your single use case, higher level user space apps, is the only use case.
In what way, specifically, is Rust unsuitable for building a kernel?
There's no argument that the Linux kernel is currently written in C. But that doesn't prove that nothing exists that can replace C.
We can't say if Rust is a suitable replacement unless a team tries to write a kernel in Rust, and then comes up with a comparative result.
Right now C is only the tried and true solution. The rest are possibilities only.
There are two ways to interpret your post. The first is "there's no kernel written in Rust that is as complete as Linux". The second is "Rust is unsuitable for a kernel". The first interpretation is obvious and completely uninteresting; the second is something you haven't supported at all.
People are writing a kernel in rust[0], and a pretty good unix-like at that. I don't like rust, but it's a fine language for that and in general. [0] http://www.redox-os.org/
Well if you can implement a kernel that does everything that Linux does, and is entirely backwards compatible, then I'll say "ok, the reign of C is over". Till then, C is going no where.
This childish rant is embarrassing. With millions and millions of lines of C code basically running the internet and of vital importance to countless devices, calling it a s*y language is beyond ridiculous.
Asbestos was also used in millions of buildings and was vitally important as insulation. It was also something that was a bad idea and something that we needed to move away from.
I agree with where you're coming from. Another analogy that comes to mind is knob-and-tube wiring (https://en.wikipedia.org/wiki/Knob-and-tube_wiring).
It's an older home wiring technology that works fine for years if undisturbed, is still present and working OK in homes all over, was invented in the early days of electrified homes, requires considerable skill to install properly, tends to be unsafe if not handled skillfully, is expensive and delicate to modify, has no hidden components, allows interesting wiring layouts because conductors are separated, ...
One could go on with the obvious parallels. (I learned on a PDP-11.)
Yeah. The weird thing is that in other industries, people have no trouble admitting that the old stuff is often problematic and needs to be replaced. In the supposedly forward-looking tech industry, though, we stick with our tools from 1978 and stubbornly resist admitting that we have learned anything since then. It's strange.
I don’t buy that at all. There are huge amounts of path-dependent cruft throughout all human endeavors:
Some of this stuff is decades old. Some is thousands of years old.

- A base ten number system
- Lack of useful structure in the symbols and names for numerals, and lots of weird inconsistencies in number names
- Inconsistent, confusing, and arbitrary names/notation for basic mathematical operators and functions
- Use of inferior Gibbs/Heaviside vector algebra instead of Clifford/geometric algebra
- Very poor notational conventions in many advanced math/physics fields
- A highly irregular calendar
- Poorly designed measurement systems
- English spelling
- Very distorted dominant world map projections
- Most nutrition "science", including federal dietary guidelines
- Bogus forensic "science" used to imprison innocent people
- The methodology and writing style used in political science
- Many essentially debunked economic models which continue to be taught
- A legal system chock full of incidental complexity and inconsistencies
- Inadequate species taxonomies
- Poor color models used in art/design
- Even worse, specification of colors using proprietary, arbitrary Pantone chips
- Lots of poor/obsolete metrics used for evaluating lighting
- Audio mastering with heavy-handed dynamic range compression
- Lectures as primary pedagogy in high school/college
- Grammar drills as a method for teaching foreign languages
- Modern zoning requirements in many countries
- Many unsafe and inefficient street design requirements
- The rigid design of modern shoes (let's not even start on heels)
- Terrible user interfaces for most household appliances
- Mediocre user interfaces for many musical instruments
- An inefficient and dangerous typewriter / computer keyboard (which persists on tiny phone screens!?)
- Unhealthy design of office furniture, car/airplane seats, child strollers, etc.
- .....

Things look different from the outside than they do on the inside.
The old saw of "science advances one funeral at a time" is true in lots of fields, you just probably see a lot more examples in programming because you spend more of your time there.
Only, it isn't knob and tube or asbestos.
It's much more like comparing crawling (machine code), walking (assembly), bicycling (C), and higher-level languages (faster to write, more built-in safety features, etc).
Each is a good fit for a given role, and sometimes you need to get through tight spaces where using one of the lower impact methods is more effective; or maybe you just can't afford something 'nicer'.
Use the correct tool for the job.
Except that we now have better bicycles (Go/Swift/Rust).
My C programs have crashed much more often than my bicycle...