50 years of C, the good, the bad and the ugly [video] (streaming.media.ccc.de)
A different take on the matter: I write code mainly artistically, and I found that C is one of the best languages available for coding as a form of art. It allows you to be both brutal and honest, or abstract and deceiving. Not many languages are so semantically powerful.
It’s mostly semantically very poor, to be honest. C really is a nicer assembly in a lot of ways. It’s extremely limited. You have branching logic, function calls, pointer arithmetic, a way to define data structures which are really memory layouts, and that’s pretty much it. I guess you can appreciate that as an aesthetic statement, but what you wrote would apply equally to any language with more complex semantics.
C would be a better language if it really lived up to its old ideal of being a "portable assembly language". That stopped being true as compilers started optimizing undefined behavior (e.g. signed overflow). Instead of a "+" in the source code yielding an honest-to-goodness hardware add instruction, it could be "optimized"--i.e. constant folded, CSE'd, strength-reduced, value-range-analyzed, among others--by an optimizing compiler in ways that completely ignore what the hardware instruction actually does upon overflow. That meant the C language became more like a vague suggestion of assembly and it became pretty much impossible to code to the machine anymore.
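A minimal sketch of the kind of surprise this causes (not from the talk; the function name is made up for illustration):

    #include <limits.h>
    #include <stdio.h>

    /* Looks like a plain overflow check, but signed overflow is undefined
       behavior in C, so an optimizing compiler may assume x + 1 > x is
       always true and fold the whole function to "return 1;". */
    int will_not_overflow(int x)
    {
        return x + 1 > x;
    }

    int main(void)
    {
        /* With many compilers at -O2 this prints 1, even though the
           hardware add wraps INT_MAX + 1 around to INT_MIN. */
        printf("%d\n", will_not_overflow(INT_MAX));
        return 0;
    }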
> if it really lived up to its old ideal of being a "portable assembly language".
What "old ideal" ? Where do you believe this "ideal" was expressed? Should I expect to find it in the First Edition K&R perhaps? Or in the documentation for the original C compiler? In the ANSI standard ?
C was never this mythical "portable assembly language". That's just something people say about it, mostly to ridicule it; you are engaging in nostalgia.
From Dennis himself,
"The language is also widely used as an intermediate representation (essentially, as a portable assembly language) for a wide variety of compilers, both for direct descendents like C++, and independent languages like Modula 3 [Nelson 91] and Eiffel [Meyer 88]. "
That same document also claims that "C's approach to strings works well". I admire Dennis Ritchie, but on that point he's just wrong.
Moving goalposts, you were asking who said it, the language authors did.
If they are right or not, it is another matter.
Regarding that matter, he was wrong. I wish he'd never written the garbage that is C. It's a stain on his reputation.
I'm willing to forgive those who move on to better ideas, but we are really stuck with this 50 years later, when most of PL land knows better than to keep digging in this hole.
But Dennis doesn't express it as an ideal, and that document is written really late -- it's citing a 1991 document in the part you just quoted, so that's after C89.
In the absence of a CPU instruction that does a saturated ADD, how do you solve the overflow problem in a bare-metal language without introducing a performance hit?
It depends on how you define your language semantics. If you choose 2's complement wraparound semantics, which is what nearly all CPUs in the world have standardized on, there is nothing for the compiler to do. It's a pure operation, a single instruction, and can be constant-folded, strength-reduced, CSE'd, moved, dead-code-eliminated, etc. If you want a different semantics at the source level, e.g. overflow is an exception, then the compiler needs to emit additional code.
For Virgil I chose to settle on 2's complement because it's what hardware gives and has no overhead. It comes with the full complement of fixed-width integers of widths 1-64, and odd-width ones come with zero-extension or sign-extension as necessary to make the underlying hardware width unobservable.
> If you want a different semantics at the source level, e.g. overflow is an exception, then the compiler needs to emit additional code.
Yes, which is exactly what I hinted at with my question to your comment. With the HW we have today, implementing such semantics without an extra hit is not possible, which is why I thought your comment wasn't completely fair but came from a more theoretical stance.
Checking for overflow is basically adding a conditional branch (usually to out-of-line code) after arithmetic. It's very cheap, almost free, if there are enough execution units, I-cache isn't a bottleneck, etc etc. But yeah, it is theoretically an overhead, which is why I avoided it for Virgil.
I suppose range analysis can eliminate many overflow checks, e.g. in counted loops, but it's not completely zero cost.
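A rough sketch of what such a check looks like in practice (this assumes the GCC/Clang __builtin_add_overflow extension; it is not anything from Virgil):

    #include <stdio.h>
    #include <stdlib.h>

    /* Checked addition: one add plus one conditional branch to out-of-line
       error handling when the add overflows. */
    long checked_add(long a, long b)
    {
        long sum;
        if (__builtin_add_overflow(a, b, &sum)) {
            fprintf(stderr, "overflow in checked_add\n");
            abort();
        }
        return sum;
    }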
> Checking for overflow is basically adding a conditional branch (usually to out-of-line code) after arithmetic.
Yes, but now you also have to return a value to communicate the overflow to the call site. And the call site has to check for that value, and that's yet another branch... and another. Depending on how deeply you want to propagate that error and how you decide to deal with it, this will grow the code size, which can contribute to a higher frequency of I-cache misses and page faults, and it can also inhibit compiler optimizations.
On the CPU level, I think there could also be an attached cost in case some of those branches end up as entries in the branch-target buffer (BTB), the branch-order buffer (BOB), or both. These buffers are quite small, so ending up with an unfavorable ratio of check-for-overflow entries versus entries occupied by the other kinds of branches found in the code puts more pressure on the branch-prediction unit. More "important" branches will now more frequently lack their entry in the branch history simply because we started sprinkling check-for-overflow branches around. And we know that a branch misprediction is one of the costliest events (15-20 cycles) we can hit in the CPU pipeline.
Also, I think the bigger picture must be observed in this context. E.g. what percentage of arithmetic operations do big real-world binaries contain? I'd figure that on average it would be a sizeable amount, and in ones with a lot of math even more so. And then I wonder what we would observe if we applied the check-for-overflow transformation to all such signed-arithmetic operations.
I'm aware that there are some artificial benchmarks showing that there's no cost attached to branches which are essentially never taken, but it makes me wonder whether that cost would really be zero if we made that change to actual code instead, for at least the reasons above.
The performance hit of overflow has nothing to do with lacking assembly instructions. It's all about having to preserve error states whenever overflow occurs and inhibiting optimization.
I don't follow. If we had a saturated ADD instruction supported in hardware we wouldn't have to deal with saturation logic in software. The compiler would simply emit an SADD instruction whenever we asked it to and would have a chance to spit out more optimized code, because that would essentially be branchless code.
The reality, though, is that we don't have such hardware and we have to deal with it in software, so instead of a single instruction there will be a bunch of them emitted, and there will certainly be branches involved.
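A sketch of what the software fallback can look like (it reuses the GCC/Clang overflow builtin from the earlier sketch; with a hardware SADD this would collapse to a single instruction):

    #include <limits.h>

    /* Software saturating add: clamp to INT_MAX / INT_MIN instead of
       wrapping. The overflow check plus the select is the extra cost
       a hardware saturated add would avoid. */
    int sat_add(int a, int b)
    {
        int sum;
        if (__builtin_add_overflow(a, b, &sum))
            return a > 0 ? INT_MAX : INT_MIN;
        return sum;
    }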
> that’s pretty much it.
Only if you ignore
1. variadic arguments as in printf,
2. function pointers which are a very elementary (type-unsafe) form of closures,
3. a nice way of casting to void
4. setjmp and longjmp goodies which allow you to code up co-routine libraries and exception handling mechanisms (a sketch follows below)
I'm sure there are more such facets I am missing.
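For point 4, a minimal sketch of exception-style error handling with setjmp/longjmp (the function names here are made up for illustration):

    #include <setjmp.h>
    #include <stdio.h>

    static jmp_buf on_error;

    static void parse(const char *input)
    {
        if (input[0] == '\0')
            longjmp(on_error, 1);      /* "throw": unwind back to the setjmp */
        printf("parsed: %s\n", input);
    }

    int main(void)
    {
        if (setjmp(on_error) != 0) {   /* "catch" */
            fprintf(stderr, "parse failed\n");
            return 1;
        }
        parse("hello");
        parse("");                     /* triggers the longjmp */
        return 0;
    }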
The elegance of the specification may be questionable, but the scope of what C tried to achieve is breathtaking. It actually is superior to most of its improvements.
There is no elegance there. Casting to void is not a plus. That’s just C having no proper type system.
Function pointers as an elementary form of closures, come on, what’s next? Closures are defined by capture. It’s nearly as fun as pretending C has coroutines because of setjmp.
How can a statement like C being semantically poor even be seen as controversial? For god sake, we are talking about a language which semantically doesn’t even have proper arrays.
> The elegance of the specification may be questionable, but the scope of what C tried to achieve is breathtaking. It actually is superior to most of its improvements.
Seriously? It wasn’t even a good language when it was released. Lisp and Pascal were far better. It won because of compiler availability and adequate performance on limited platforms.
HN really is a joke sometimes.
> HN really is a joke sometimes.
A lot of what I'm seeing from your comments here is you can't handle people who have something positive to say about C.
They're not saying it's the one true way or something. Just that they like it in some respect.
And your attempts to dismiss that and call "HN" a joke for harboring someone who thinks this way look kind of childish to me.
Then you are seeing what you want to see. People are straightforwardly arguing that C is semantically rich, which is indeed laughable. I have addressed the point below, but clearly a lot of you don’t understand what language semantics are.
Considering I was having interesting discussion about the subtleties of the Hindley-Milner type system on this same website a decade ago, yes, I do think HN is becoming a joke. The joke is on me however because apparently I keep commenting for reasons which are not always apparent to me I must confess.
There are plenty of interesting C works out there, as well as data structures or algorithms that C can express elegantly.
Small example, linked lists. I don't think non-C linked list code tends to be as straightforward as I've seen in C.
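For instance, a head-insert singly linked list takes only a few lines; a generic sketch, not from any particular codebase:

    #include <stdlib.h>

    struct node {
        int value;
        struct node *next;
    };

    /* Push a new node onto the front of the list; *head may be NULL. */
    int push(struct node **head, int value)
    {
        struct node *n = malloc(sizeof *n);
        if (!n) return -1;
        n->value = value;
        n->next = *head;
        *head = n;
        return 0;
    }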
Or the character-at-a-time style of string processing. It's kind of unique to C.
You can say there is stuff about that you don't like. That's fine. Linked lists suck with modern CPU caches anyway. C strings have lots of misadventures in terms of buffer overflows. But it's unique and interesting. Lots of elegant things have been written this way. Your unfamiliarity with it doesn't make it "a joke" to point this out.
I have my own gripes with HN. Try mentioning politics and it brings out all sorts of fascist-sympathizing crazies. But saying good to neutral things about C (while not even universally praising it) is not one of those issues.
I am extremely familiar with C. I used to work on a static analyser for it. The issue is that there is very little to like in how C is designed from a PL point of view.
You are confusing what the language can do with what its semantics are.
C strings are just a contiguous allocation of bytes and a bunch of functions which interpret it as ASCII characters and stop on a specific value. That’s literally the worst representation of a string you can have.
I think it's a nice representation, allowing easy recursion.
It's a good format for storage and communication.

    const char *my_strchr(const char *str, int ch)
    {
        if (*str == 0) return NULL;
        if (*str == ch) return str;
        return my_strchr(str + 1, ch);   /* recurse on the rest of the string */
    }

Name any other string data structure and I will cite you all the disadvantages compared to the C string.
- If the format contains pointers, you have to marshal it to a flat representation to communicate or store it. The C string is already marshaled.
- If the format contains size fields, their own size matters: are they 16, 32, 64. What byte order? Again, needs marshaling. You might think not if going between two processes on the same machine. But, oops, one is 32 bit the other 64 ...
I shudder to think of what FFIs would look like, if C strings were something else. It's the one easy thing in foreign interfacing.
C programs themselves sometimes come up with alternative string representations, for good reasons; but those representations don't interoperate with anything outside of those programs, or groups of closely related programs.
C null-terminated strings are not magic. Protocols between the sender and receiver have to be defined. You face exactly the same problems regarding byte order, size considerations and marshalling with C strings.
You are basically arguing that C strings are a good default because they are the default. If they were something else, well, we would have saner FFIs in a lot of places.
C strings are a terrible default. They are fundamentally unsafe. Mishandle the null char for any reason and you now face a serious security issue.
> You face exactly the same problems regarding byte order, size considerations and marshalling with C strings.
Not char/byte C strings. You might be thinking of wchar_t strings.
> Mishandle the null char for any reason and you now face a serious security issue.
With what string data structure can applications peek into and mishandle an implementation detail and not risk creating a security problem?
Same if you have a structure with a length and a pointer. Mishandle the length field or the pointer and you have a problem.
It certainly is a problem and it's common for C applications to take on responsibilities for manipulating the internals of the string structure.
> Not char/byte C strings. You might be thinking of wchar_t strings.
No I’m talking about the classic byte arrays C pretends are strings. If the receiver you are talking to is not expecting a chain of bytes where each byte is a character and one special value means the chain is over, well, you will have to transform what you are sending to what your receiver is expecting.
If somebody's expecting something other than a C string then we unsurprisingly have gratuitous complications. If everyone agrees on the C string though, it just interoperates as-is. Code compiled for some 16-bit microcontroller can just shove the C string into its transmit buffer, and on the other end, code compiled for 64 bit big-endian Power PC receiver just uses it as-is.
C is the biggest mistake in the software industry since the beginning of software. The people that worship C only do so because they haven't seen the better paths the software industry could have chosen.
E.g. Donald Knuth hadn't seen better paths.
"I think C has a lot of features that are very important. The way C handles pointers, for example, was a brilliant innovation; it solved a lot of problems that we had before in data structuring and made the programs look good afterwards. C isn't the perfect language, no language is, but I think it has a lot of virtues, and you can avoid the parts you don't like. I do like C as a language, especially because it blends in with the operating system (if you're using UNIX, for example).
All through my life, I've always used the programming language that blended best with the debugging system and operating system that I'm using. If I had a better debugger for language X, and if X went well with the operating system, I would be using that."
Dec 7, 1993, Computer Literacy Bookshops Interview.
This is ridiculous hyperbole.
Nobody "worships" C and people who have positive experiences with C often have exposure to higher level languages.
You can find strengths or upsides in C and still acknowledge faults, and still acknowledge merits elsewhere.
This routine only works with ASCII, yea?
It will also find ASCII characters in UTF-8. That's a common situation because delimiting characters are often in the ASCII range.
We can make a more complicated example that nevertheless uses recursion in essentially the same way, which looks for a UTF-8 character. We can examine and decode a prefix of the string as UTF-8. If there are bad bytes or the character doesn't match, then we recurse.
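A hedged sketch of that idea, taking a slightly simpler tack: instead of decoding, it matches the sought character's encoded bytes directly (my_strchr_u8 and its parameters are invented for illustration; UTF-8's distinct lead and continuation bytes make a byte-wise match safe):

    #include <stddef.h>
    #include <string.h>

    /* Find a UTF-8 encoded character, given as its encoded bytes `seq` of
       length `len`, in a NUL-terminated UTF-8 string. Same recursive shape
       as my_strchr above. */
    const char *my_strchr_u8(const char *str, const char *seq, size_t len)
    {
        if (*str == '\0') return NULL;
        if (strncmp(str, seq, len) == 0) return str;
        return my_strchr_u8(str + 1, seq, len);
    }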
We can also write a wide character (wchar_t) version of the function which looks the same. That will handle all of Unicode on sane platforms, and the basic multilingual plane (BMP) on Windows.
If that static analyzer was itself written in C, you have one program's worth of C experience, which is very little.
> as straightforward as I've seen in C
To be fair, the most straightforward definition of a linked list is generic, and C completely lacks such a facility.
> fascist-
Sounds familiar.
Generic facilities can be opinionated. There are various ways to skin the cat, with different tradeoffs.
Template mechanisms cause code bloat and encapsulation violations: having to reveal the entire implementation to the clients so they can instantiate it for whatever type they want.
Some generic mechanisms have dumb restrictions, like you can make a "list of integer", whose elements, sadly, all have to be integers.
In C you can envision what you want genericity to look like at the detailed bit-and-byte memory level, and how you'd like to be able to work with it at the language level, pick some compromises where the two are at odds, and make it happen.
It is false to say that C completely lacks any generic facility because it has union types. Using a union we can put several types into a structure along with a type field which indicates which. We can overlay an integer, character, floating-point number, or pointer to something outside of the node.
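For example, a "generic" node via a tagged union might look like this (a generic sketch, not quoted from anywhere):

    /* The tag records which member of the union is currently live. */
    enum node_kind { NODE_INT, NODE_DOUBLE, NODE_STRING };

    struct node {
        enum node_kind kind;
        union {
            int    i;
            double d;
            char  *s;
        } as;
        struct node *next;
    };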
> C completely lacks such a facility.
Generic linked lists can be done with pointer casts. See the way the Linux kernel does it.
The list manipulation routine takes a pointer to a member, and the more specific type is opaque. The code that knows what the actual structure is can derive it from a node pointer.
A structure can even have multiple node pointers in this scheme.
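Roughly like this; a stripped-down sketch of the intrusive-list idea (the kernel's own list.h has more machinery, e.g. doubly linked nodes and iteration helpers):

    #include <stddef.h>

    /* The list node knows nothing about the payload... */
    struct list_node {
        struct list_node *next;
    };

    /* ...it is embedded in the payload instead. */
    struct task {
        int id;
        struct list_node link;
    };

    /* Recover the containing structure from a pointer to its embedded node. */
    #define container_of(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof(type, member)))

    static int task_id(struct list_node *n)
    {
        return container_of(n, struct task, link)->id;
    }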
> Sounds familiar.
I cannot decide whether it's an honor or an insult to be called a _fascist sympathizer_ for opposing the authoritarian suppression of speech.
I didn't want to address this because it's a tangent, but I understand that free speech protects expressing fascist thought. I can still disapprove of it and not want to be around a lot of it, consider high concentrations of it to be a sign of an unhealthy community, without saying it should be somehow illegal.
Much of what you say is true. But I came to C from Pascal. Pascal circa 1982 was unusable without all the numerous extensions each vendor tacked on. C worked out of the box. I could get things done in C without constantly fighting the Pascal and Fortran compilers of the day. This was true despite the terrible C compilers of that time.
Pascal wasn't ready for real-world use because it came from an academic setting and was initially designed to run P-code in a VM. Its successor, Modula-2, fixed all of Pascal's issues and added a lot of great functionality.
I remember that Small-C and RatC were hardly useful without tons of inline assembly.
As proven by books like "A book on C" from 1984 (Robert Edward Berry and B. A. E. Meekings).
You really don't understand, but that's okay. C is a wonderful language, and that's the end of it really. You just try to think of C as a language that SHOULD HAVE all the stupid bells and whistles you find convenient. Many of us more advanced coders have found that those stupid bells and whistles are, in fact, inconvenient. Casting to void is a powerful technique and the possibilities are literally endless. C is semantically rich, way beyond your dull imagination.
The argument is not about what can and can’t be done in C. C is Turing complete and low level. You can do anything in C. The debate is not even about if it’s good idea (I don’t think it is but that’s separate).
The question is about C semantics. Given the reply I get it’s pretty obvious that some here don’t understand what language semantics are. It’s about the amount of concepts you can express in the language. Haskell - a language I personally despise - is semantically very rich. So is modern C++ for what it’s worth. C simply isn’t.
It’s even deceiving sometimes, because it has the appearance of having some semantic elements (arrays for example) which are not there in reality and are really only syntactic sugar on top of other semantic constructions (pointers).
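A small illustration of that point (not from the comment):

    #include <stdio.h>

    void print_size(int a[10])          /* looks like an array parameter... */
    {
        /* ...but it is really just an int *, so this prints the size of a
           pointer, not 10 * sizeof(int). */
        printf("%zu\n", sizeof a);
    }

    int main(void)
    {
        int a[10] = {0};
        /* a[3] is defined as *(a + 3): the "array" indexing is sugar over
           pointer arithmetic after the array decays to a pointer. */
        printf("%d %d\n", a[3], *(a + 3));
        print_size(a);
        return 0;
    }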
You’re whitewashing history if you think Pascal was superior to C. There was no conspiracy among compiler developers as you imply.
Also you complain about semantic richness of C, but then only point to semantically rich languages you despise. A curious reader would wonder: why care about semantically rich languages then? An observant reader would wonder: so aren’t there times where semantic richness is not useful?
What conspiracies are you talking about? C won because its compiler was widely available for free. That’s pretty much a fact. Pascal was indeed a superior language but limited by the proliferation of extensions, paid compilers and its compiler performance.
I’m not even complaining about the semantic richness of C. I’m just stating the fact that it doesn’t really have semantic richness. Then again don’t get me wrong. I do think C is a terrible language and I say that having worked on a C static analyser. It’s full of avoidable undefined behaviours and silly sharp edges.
Thankfully, nice languages with rich semantics exist. I was just pointing to some I don’t like to separate the issue of semantic richness from likability. I enjoyed working in Ada a lot. Ocaml is awesome. I have never used Rust but from what I have seen it seems nice.
Semantic richness is useful because it helps programmers express what they want to do in ways which are clearer and therefore more likely to be correct.
Pascal, at least the original specification, does have a few flaws. It was intended to run as P-code in a VM so you wouldn't expect it to be as good as running directly on the hardware.
The successor to Pascal, which fixed its problems, was Modula-2, which is vastly superior to C. Unfortunately, by the time it came out and started becoming available, it was too late for it to compete fairly with C. Ada is comparable to Modula-2, especially before Ada added OOP, but Ada was too late to compete with C as well.
It's taken 30 years but finally we are starting to see safe languages, e.g. Rust, compete with C and C++. I don't know whether Rust will be the language that overtakes C and C++, but if not, I believe a successor to it will, and C and C++ will become like COBOL: something no one uses to write any new code.
Pascal was and is continually evolving. A strong argument can be made that Object Pascal was the "successor" of Pascal, and that happened in 1985/86. After Apple and Wirth developed Object Pascal, Borland added the OOP extensions to their version soon afterwards. Turbo Pascal had quite a long run, and some are still using it. The Delphi dialect of Object Pascal came out in the early 1990s and Free Pascal came out in the late 1990s.
Many of the so-called issues with Pascal were resolved way back in the mid to late 1980s. In regards to Pascal's competition with C, this arguably revolves around AT&T and Unix. It's AT&T and their huge government, industry, and business influence that pushed C over the top.
Modula-2 has many new features that make it better than Pascal. It is not merely a Pascal with some issues fixed.
One small thing I like about Modula-2 is an improvement in the syntax where BEGIN is not required after conditionals and loops.
The first thing you need to learn about semantics is that it's a singular: "what semantics is", not "what semantics are".
The nice thing about C is that if you're opinionated about what you want your ideal language to look like, you can use C in that pursuit. You can write your run-time in C and parts of the implementation. C is good for interfacing with operating system platforms.
You can fully bootstrap yourself off C entirely, or stick with C for the run-time and perhaps other parts. If your language isn't very good at expressing low-level manipulations, you can keep C in there as an option for writing those kinds of modules.
Some languages have used C as a target language for compilation.
C can be an enabler in the pursuit of other languages.
C is a garbage language. It has a lot of undefined behaviour. It isn't safe. It isn't even strongly typed. It's only marginally better than a macro assembler. C programmers consider that a virtue but programmers who want to write safe, robust code, know C is garbage.
"One man's garbage is another man's treasure."
Those people are called hoarders.
A closure is not an anonymous function. It "encloses" variables that were in scope at the point of closure creation. Therefore function pointers cannot serve that purpose as is.
I agree, I should have used anonymous function specifically instead of a closure (for example, Ruby/Smalltalk closures are blocks, not functions.) Thanks for the correction.
C is also relatively poor at providing “a way to define data structures which are really memory layouts”. Integer sizes (historically) and padding are implementation-defined, AFAIK bit fields are underspecified in that you cannot specify in what bits of a byte they end up.
So true.
I wish that C had a richer way to define struct layouts and low-level representations for integral types.
I really like how Ada does it:
https://en.wikibooks.org/wiki/Ada_Programming/Representation...
If you need to read or write specialized hardware registers, being able to define a data structure with a custom representation is very nice and can save significant time and effort.
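For comparison, the usual portable fallback in C, given that bit-field layout is implementation-defined, is explicit masks and shifts over a raw register word; a hypothetical status register as an example:

    #include <stdint.h>

    /* Hypothetical 32-bit status register: bit 0 = ready, bits 4..7 =
       error code. Spelling the layout out with shifts and masks pins the
       representation down in a way C bit fields cannot. */
    #define STATUS_READY_BIT  (1u << 0)
    #define STATUS_ERR_SHIFT  4
    #define STATUS_ERR_MASK   (0xFu << STATUS_ERR_SHIFT)

    static inline int status_ready(uint32_t reg)
    {
        return (reg & STATUS_READY_BIT) != 0;
    }

    static inline unsigned status_error(uint32_t reg)
    {
        return (reg & STATUS_ERR_MASK) >> STATUS_ERR_SHIFT;
    }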
I wish C had never been born.
Can you share some examples of your "art"? I'm trying to wrap my brain around the concept of writing code for any reason other than work or trying to build something.
Some background here: http://antirez.com/news/133
C++ should be even better in those categories.
Art is often about constraint, and mastery of simple tools.
I identify with @antirez's sentiment. I know both C and C++ very well. I find C definitely more artistic. C++ is utilitarian.
Prolog is the other language I code in "artistically". It too is quite simple and constrained, though it is on the opposite end of the high/low-level spectrum as C.
What is your take on Perl and Ruby? They seem to be the two languages where the communities themselves talk the most about poetry and elegance.
Perl and Ruby are very expressive. Python on the other hand feels like a scripting language designed by a 1990s corporate Java developer.
/unpopular-hot-take
I mostly agree with this. I'll come to python's defense though. Python is a gorgeous language wearing an ugly hat. The hat has __multiple out of place brims__, all of which are about two underscores wide.
It's not a hat, it's a snake that swallowed a Java engineer :)
The name came from the author's love of Monty Python's Flying Circus, so perhaps it's an experiment in low-key absurdist humor or surrealist art. Like chindōgu[1].
Perl (5) I find quite complex, the opposite of C and Prolog. I suppose the elegance some find comes from packing lots of behavior into very few characters of code. When I want to code as art though, I prefer simpler, less "magical" tools.
I've never used Ruby but from what I've seen I understand it to be similar to Perl in that regard.
I agree that perl can be complex; its feature list is longer than C's. Trying to be artistic in it, to me, means trying to make my code read as much like pseudo-code as is reasonably possible. At its best, the users are able to read the business logic portions and understand what is going on.
Of course good naming and abstractions are of key importance, and comments and inline documentation are important finishing touches, which so many programmers don't seem to have the time for. It doesn't hurt that Larry Wall, the designer of perl, was a linguist. Perl was meant to be flexible and expressive.
And everyone knows that there's nothing worse to read than obfuscated perl. Perhaps some think that is artistic in a different way, to make the shortest and most unreadable program possible. Not my thing, personally.
True; on the other hand, personally, I think that normally C and C++ shouldn't be even mentioned in the same context. (My C++ code bears virtually no resemblance to my C code.)
This shouldn't be downvoted. C++ is a superset of C; it must be capable of being at least as "brutal and honest" and "abstract and deceiving" as C.
I would argue that C++ can be dramatically more deceiving than C --- see inheritance and operator overloading, just to name two.
C++ is not a superset of C, there are plenty of legal C constructs that will trigger errors when compiled with a C++ compiler.
C is not a superset of C either. C++ is a superset of a C version, even if they then went on to add stuff that wasn't added to the other
C++ is a sane defaults superset of C.
But most of them are new constructs, not found in the venerable C90.
Operator overloading is the one thing that makes C++ strictly better than any language that does not allow it.
Overloading is miserable. C++ has to provide this as an overload because it still, decades after standardisation and half a lifetime after it was created, doesn't have a way to just extend types.
If you can just extend the types then you can provide operators that way and there's less opportunity for ambiguity. See Rust.
Also so many of the C++ operator overloads are broken instead of just not existing, which tempts you to overload things instead of saying "Nope, that's a bad idea" and just walking away altogether.
Example, boolean short-circuiting AND and OR. If we write

    if (this() || that()) foo();

in C++ then that OR is short-circuiting, so that() won't get called if this() is true. But if the return type used overloads the boolean OR operator, short-circuiting is disabled, and now both this() and that() are always called...
How so? To me, operator overloading seems like a non-feature that at best creates more ambiguity.
Without it, you cannot express arithmetic wrappers or SIMD wrappers. So far, no language with any users has managed to provide adequate arithmetic and SIMD types built in, despite several attempts like Odin and CosmiC/HolyC. I'm not sure it's even possible to design such types that satisfy everyone, and they must be extensible somehow because satisfying anyone is a moving target.
Note how terrible doing any amount of math is in C.
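For example, a 3-vector expression that would read as a + b * s with overloading has to be spelled out as nested calls in C (a generic sketch):

    struct vec3 { float x, y, z; };

    static struct vec3 vec3_add(struct vec3 a, struct vec3 b)
    {
        return (struct vec3){ a.x + b.x, a.y + b.y, a.z + b.z };
    }

    static struct vec3 vec3_scale(struct vec3 v, float s)
    {
        return (struct vec3){ v.x * s, v.y * s, v.z * s };
    }

    /* d = a + b * s, written the only way C allows: */
    static struct vec3 example(struct vec3 a, struct vec3 b, float s)
    {
        return vec3_add(a, vec3_scale(b, s));
    }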
Also without operator overloading, it wouldn't be possible to implement lazy evaluation, which along with the use of expression templates happens to be one of the most crucial aspects that any linear algebra library will want to take advantage of in order to generate the most optimal code.
> without operator overloading, it wouldn't be possible to implement lazy evaluation
I don’t understand that. You can have functions that return promises that then can get passed to other functions returning promises, and leave it either to the compiler or to an expression evaluator you write (that ideally runs at compilation time as much as possible) to optimize away anything not needed. For example (pseudo-code)

    vector3D a = …
    vector3D b = …
    promise<vector3D> c = addLazily(a,b)
    print c.x

in the end, could do the equivalent of

    print a.x + b.x

I think you could make a modern C++ compiler do that for this simple example.
Of course, you can do it that way as well. However, the problem with the promise approach is that it introduces extra dynamic memory allocations underneath, and these cannot be optimized out or elided. At least, not to my knowledge. With expression templates and operator overloading you're basically avoiding exactly that as much as possible.
Where do you see dynamic memory allocations being needed in that example? I only see locals that a good compiler can fairly easily optimize away.
Also, there’s a simple bijection between expressions with operators and two-argument function calls:
    a + b * c
    +(a, *(b,c))
    plus(a, times(b,c))

Because of that, I don’t understand why operator overloading should give better optimization opportunities than function calls.
Unless you're talking about some theoretical promise, C++ promise implementation imposes dynamic allocation.
Operator overloading is syntax; lazy evaluation is semantics.
That is true, but it's a bit of an unfair comment because I expanded with the context in the very next sentence. Or perhaps you're aware of an alternative way to implement lazy evaluation, without expression templates and operator overloading, with all of its advantages?
What are your goto sources of knowledge for C? I’m learning it to write embedded stuff and so far it’s mainly the k&r book, one from no starch press, and a udemy course. Thanks!
I think you’ll like Fluent C: Principles, Practices, and Patterns by Christopher Preschern (O’Reilly, 2022).
The books you have are excellent ways to learn the language, and this one complements them in that it lists common practices around, for example, error handling, and other common topics. It shows different approaches and mentions practices from real-world open-source C projects that follow them.
It seems like a good book to continue exploring C after learning the language.
Thank you!
Not the GP, but cppreference.com [1] is the best reference I've found for C (and C++) in 20+ years, short of just reading the standards themselves.
Thank you!
Can you expand on that?
I bought a C compiler at my job in 1984 because I thought it was the future, then spent nearly a decade writing MacOS apps in C. I even added object extensions to it (for our use) in 1989 because C++ was not an option yet.
I worked with Objective-C in the late 90s and again in the 2010s, which is basically C with funky object stuff.
I don't miss it at all. C is very low level and so easy to write bad code in if you don't have solid discipline; the language doesn't help at all, which was not really a design consideration back then. The first C compiler we used didn't even support prototypes.
I exclusively use Swift now.
What about MPW, MacApp and Metrowerks PowerPlant?
Only a mother could love Swift's syntax. The classic K&R C, on the other hand, was the very definition of simplicity and elegance. (Objective-C is a whole another topic, of course.)
Classic K&R C? Where you declared the function arguments Pascal style out of line and had to rely on separate “lint” programs to check for typesafety in function calls?
Surely you mean C89, where functions have some type safety but C’s variable declaration syntax remains a horrible inelegant kludge?
Have you written much code in Swift?
I really don't understand this take at all. Swift has fairly standard syntax as far as modern languages go.
His final conclusion is that C has to go, just like COBOL, Fortran, and PL/I. I wonder how long it will take before C is gone completely, when you realize how much COBOL and Fortran are still around. Not so long ago, I came across a module for Python, 'scipy.interpolate', that happens to be programmed in Fortran.
C is the best abstraction for many problem domains and that isn't going to change. I understand why folks coming from higher-level languages would dislike it, but for anyone coming from assembly it's a godsend.
The speaker discourages C for new projects, but that says more about the problem domains they work in than C itself. C is what it is because the hardware and assembly language are what they are. Folks who want a safer C should design a new hardware architecture with a "safe" assembly language and a new low-level language that targets it.
Anyone discouraging a particular tool without actual context of the problem being solved gets zero respect from me.
It's a massive red flag that they don't know enough to be useful.
C is great for the things C is great for, however small that range may or may not be now and in the future.
Any other stance is reductive and misleading.
Exactly what, in the year 2023 C.E., is C great for?
I’ve been writing C since 1991 and I can’t think of anything where I wouldn’t start a new project in some other language. There are many interesting choices of varying maturity in the low-level systems programming space: Zig, Rust, Crystal, D, Swift (if the standard library ever gains support for system-level programming). Even the “better C” subset of C++ will allow you to avoid certain classes of security bugs completely.
I don’t think C is even merely adequate for anything at this point; defaulting to memory safety is table stakes in any domain where C was once dominant.
The consequences of C in long-lived software are somewhat known. Granted, the track record of C is mixed, but many important projects have been successful with it.
Rust is exciting and proving itself about as rapidly as such a language could be expected to do. But there are big questions around how all of this plays out in a long-lived project. There's a good argument that C's simplicity is an asset here.
I like Rust a lot and have done some cool stuff with it. I believe the challenges of long-lived projects will be solved. But at the same time, I admit that there are a lot of other factors (including non-technical ones). If you pick a language that doesn't last for whatever reason, and you have a couple decades' worth of code, the options are grim.
> Exactly what, in the year 2023 C.E., is C great for?
Implementing low-level code that is important enough to prove correct, without going through the additional effort of de novo proving your entire toolchain is correct.
Memory safety / security is important for a subset of all possible applications. It is not important for _all_ applications.
If I use a micro controller to control a string of LEDs I do not care about security/memory safety. But I do care about being as close to the metal as possible, and being able to understand the compiled code.
To control a string of LEDs, something like Forth (or some other p-code virtual machine) is going to be just as good as C. PIC microcontrollers have always had BASIC as an option for example.
But many microcontrollers these days will likely have a TCP/IP stack, perhaps even crypto, even if it is to control a string of LEDs via MQTT or Modbus TCP.
Forth would also be a nice choice.
Give you another example. A watch, not an apple watch, but a simple Casio watch. One that does time, alarm and a stopwatch. Not connected to anything. What is important here is the battery life, so the less code the better. C would be a fine choice here. All additional code to prevent security breaches would be a complete waste here.
Give another example. As my day-job I develop embedded software for the railways. Current system I work on operates the brakes when the train goes too fast. Not connected to the internet, and no connection would even be allowed. Written in C. One because it is simple to understand and the developer can focus on getting the functionality correct. Secondly because there is a wide choice of additional tooling and standards that is required to get the application certified.
Only because liability still isn't a thing across all levels of software development.
Thankfully returns in digital stores, warranties in consulting projects, lawsuits in business losses and cybersecurity bills are slowly changing that.
If people rely on the LEDs working correctly or if the controller has any kind of connection to another system (which could be leveraged by an attacker to penetrate deeper into the network) it seems to me that security and memory safety would still be important.
Not everything is or needs to be connected to the Internet.
There are plenty of things that are controlled by a micro controller that do not have a network connection.
Not everything needs to be connected to the Internet, and not every micro controller is running a non-mission critical task.
A micro controller hitting a memory safety bug that causes the LEDs to present invalid output can have real-world consequences. Even if the micro controller isn't directly actuating machinery, it might cause an operator to act incorrectly because they were misled.
If the micro controller is running Christmas lights, it might be fine if it falls over, no one will die, but I would still call that a manufacturing defect.
Yeah, Ada is pretty great, it takes the best things from Pascal and combines it with unparalleled reliability.
> C.E.
There.
> Exactly what, in the year 2023 C.E., is C great for?
Writing readable code. I'm a huge fan of Zig and Rust too but at the end of the day they just aren't C.
C is a great language to program a PDP-11 in the 1980s. And that's it. There are SO MANY alternatives for so many different use cases nowadays, it's just hard to justify using C anywhere whatsoever.
It was practical for self-hosted PDP-11 development, which was tough. They could have cross-compiled a few other languages on a bigger machine, if only they had one.
> C is the best abstraction for many problem domains
It's really not, though. What C is an abstraction of is computer architecture as it existed by the late 1960s.
> It's really not, though. What C is an abstraction of is computer architecture as it existed by the late 1960s.
I don't follow. Barring micro-code, hardware is designed for executing either RISC or CISC instructions. If anything, the industry has matured and we see less esoteric ISA's today than in the 1960s.
A modern low-level language would expose you to the concepts of cache lines and homogeneous and heterogeneous cores directly, with locality awareness to recognize false sharing. It would be something suitable to program the Cell architecture from the PS3.
What about those of us working on microcontrollers that use TCM rather than cache, and that don't have multiple cores? C sure seems like a good fit…
Those are fair points, but they're probably too specific to the architecture to incorporate in C's general execution model. They're better exposed through platform specific programming interfaces like OpenMP or CUDA. Even a domain specific language, like GLSL, may be more appropriate.
I think that the general point being made here is that C’s general execution model, as you call it, really is just a PDP-like computer because that is what they had when making C.
The confusion I have is that the original commenter makes it sound like the abstract machine C is designed for is not relevant today. While modern desktop hardware may use various techniques to improve performance, like cache lines or specialized cores, they are an implementation detail of that abstract machine. I would not expect C to expose these hardware specifics because C targets a lowest common denominator, from consumer desktop hardware to microprocessors. If the criticism is that C is too general, then that's fair but also applies to every C competitor: Rust, Zig, D, etc... It's debatable how much these languages should include versus exposed by vendor specific API's. Perhaps we need a fork of C designed exclusively for modern desktop hardware.
> Folks who want a safer C should design a new hardware architecture with a "safe" assembly language and a new low-level language that targets it
Would that be LLVM's IR or MLIR (https://mlir.llvm.org)?
Languages like PL/I, NEWP, BLISS, Modula-2 did it much better, but they didn't come with a free beer OS.
I think it will go when we have a sufficiently popular and useful systems programming language that will replace it. It has to be a language that isn't just C with some extra bits, which is why SafeC and CheckedC aren't more popular. I actually think the closest language will be Zig. It's a much simpler language than Rust and people who like C really value that simplicity. It also removes a lot of C's baggage that makes it annoying to program in. But since Zig 1.0 is unlikely until 2026 I'd say we wouldn't start seeing Zig majorly displace any C programs until about ten years after 1.0, which would be 2036. Rust 1.0 was in 2015 and we are just starting to see it in the Linux kernel ~8 years out, so that seems like a good timeline. Then just give it another 40 years and I could see a future where Zig and Rust replace all code where C is currently used. Sure there might be some ancient legacy systems that use C, but just like COBOL, it would not be something that you would come across unless you wanted to.
SafeC and CheckedC do nothing for temporal safety. Zig is in the same boat. There's not really a simpler alternative to Rust with the same featureset, even its direct predecessor Cyclone was in fact quite a bit harder to use. Rust itself is also improving very quickly and becoming easier to use over time.
Unfortunately, Rust's core design philosophy is fundamentally opposed to much of the design philosophy that made C and C++ so popular and flexible.
It's almost the exact opposite extreme on the pendulum, where C allowed anything while Rust limits to only what the language designers conceive as proper and not just safe.
Zig can gain memory management systems like Nim's ARC which works well for system design and adds temporal safety. On the other hand Rust's trait system likely will never become an "open ended" type system like say Julia's. Heck, even overloaded function types don't seem likely in Rust.
What task can you do in C that you can't in Rust (or Zig, for that matter)?
No, I'd count Zig as having more of an "open ended" type system / philosophy.
Though, it looks like Zig doesn't do function overloading either [1]. That's a disappointment. So you end up with `array_count`, `map_count`, etc instead of just `count`. In my way of thinking that's more work and reduces readability. It's one of the pain points of C vs C++ to need `array_list_count` and `hash_map_add` instead of just saying `vec.insert(...)`.
The biggest ones for me in Rust is that it disallows extending traits for types you don't own, and the lack of function overloading. Neither of those are required for the borrow checker or safety, but it's a philosophical design decision.
One of the main drawbacks of function overloading is that it can make code harder to read and understand. When the same function name is used for multiple different purposes, it is confusing for developers who are reading the code. This makes it more difficult to maintain and modify the code in the future, as developers spend extra time trying to understand the various function definitions and how they are being used. Even finding what file the function is in can be a non-trivial task.
Another issue with function overloading is that it can make code more difficult to debug. If a bug is found in one of the overloaded functions, it can be difficult to determine which function is causing the issue. This can make it more time-consuming to fix the bug and can lead to frustration for the developer. I remember debugging an issue at OkCupid and we lost many hours due to debug information being collapsed for overloads, making it look like the wrong function was being called in the debugger.
Finally, function overloading can lead to code that is more prone to errors. When the same function name is used for multiple different purposes, it can be easy to accidentally call the wrong function with the wrong arguments, which can lead to unintended consequences or runtime errors.
In conclusion, good riddance. This is what makes Zig a great language, that it doesn't have garbage like function overloading.
> it is confusing for developers who are reading the code. … as developers spend extra time trying to understand the various function definitions and how they are being used.
This is a similar argument to Hungarian notation, IMHO. A decent LSP makes it trivial to see which function is being called. The other issue you mention is a problem with the debugger/compiler, not function overloading.
> Even finding what file the function is in can be a non-trivial task.
Not really, control-click and you’re at the function def.
> it can be difficult to determine which function is causing the issue. This can make it more time-consuming to fix the bug and can lead to frustration for the developer.
Not any harder than “method” overloads, which it sounds like Zig does have.
Personally I find the opposite. Having 20 names for functions that all equate to `len` requires more mental overhead.
> A decent LSP makes it trivial to see which function is being called.
We're talking about replacing C here. Requiring everyone depend on a hefty LSP to make sense of source code is a big ask.
Source code is text. I firmly believe that all semantic information of a piece of source code should be expressed as text in said piece of code. I already see code where the language allows the programmer to omit types from variables for 'brevity', and the assumption then is that everyone working on that code is using a fancy enough text editor that can stick in extra labels to show the missing type information. Absolutely baffling to me how people find that acceptable in any way.
> The biggest ones for me in Rust is that it disallows extending traits for types you don't own, and the lack of function overloading. Neither of those are required for the borrow checker or safety, but it's a philosophical design decision.
The orphan rules are definitely necessary for coherence: otherwise, you could end up with a situation where two different crates try to implement the same trait for the same type, and there would have to be some (likely unwieldy) mechanism to resolve that.
Also, it's not that you can't implement any traits for types you don't own, it's that you can't implement traits you don't own for types you don't own. So you can still, e.g., create your own extension trait and implement it for whatever type you want. (But you can't do that while also creating a blanket impl for types implementing the original trait, which is a bit of a pain.) And, of course, if you need an object to implement a trait you don't own, you can define a newtype wrapper over it, but that can also be difficult to work with sometimes.
Perhaps this situation could be improved by one of the "crate-local impl" proposals that have been floating around. I'm not entirely sure how those would interact with existing implementation from the defining crates.
> The orphan rules are definitely necessary for coherence: ... and there would have to be some (likely unwieldy) mechanism to resolve that.
I think Julia, D, Nim, and others show it's possible and generally easy to work with open ended type systems. I think Haskell does as well?
Though yes those come at the cost of possible conflict or user confusion, which is why I consider it a philosophical decision. It matches with the decision to not allow user code to use trait specializations in stable despite the stdlib having it.
Imports generally seem fine for controlling what gets used. Want a trait impl, import it into a module. Cargo crate features might also be a route to enforce package level decisions.
> And, of course, if you need an object to implement a trait you don't own, you can define a newtype wrapper over it, but that can also be difficult to work with sometimes.
Unfortunately that means you can't define 'default' or 'clone' traits for a type. That prevents you from using derives on your newtypes as well. That means manually implementing clone, or serde which is a PITA.
> Perhaps this situation could be improved by one of the "crate-local impl" proposals that have been floating around.
At least that'd make it somewhat easier to work with. It'd still not let end users / programmers to mix and match types and traits from different libraries without a lot of unnecessary work.
> I think Julia, D, Nim, and others show it's possible and generally easy to work with open ended type systems.
Sorry, could you give an example of this? I can't find any way to extend existing types in those languages with some brief Googling.
> It matches with the decision to not allow user code to use trait specializations in stable despite the stdlib having it.
Keeping specialization unstable is much more a practical decision than a philosophical position: if they could, they would've stabilized it years ago. The problem is that specialization very quickly becomes unsound in combination with lifetimes. The compiler erases all lifetimes on types after checking them, since monomorphizing a new type for each lifetime would lead to an exponential explosion (this can't be changed at this point without redesigning the language). Therefore, users must not be able to specialize a trait impl on certain lifetime combinations (or certain lifetimes like 'static), since the compiler would have absolutely no way to tell which impl to use. And in turn, completely barring lifetime specialization becomes a daunting challenge with the existence of associated type projections and blanket impls. Specialization definitely isn't kept unstable just because they think users can't be trusted with it.
> Imports generally seem fine for controlling what gets used. Want a trait impl, import it into a module.
I don't think this would be compatible with blanket impls, since you couldn't just import every single potential impl in existence (and if you could, you'd run into conflicts). I suppose you could have a system of exporting impls, where to use a blanket impl you have to pass it another impl as input, but at that point you have new idiosyncratic system that would scare away users and would likely be far more noisy than a good newtype system.
> Unfortunately that means you can't define 'default' or 'clone' traits for a type. That prevents you from using derives on your newtypes as well. That means manually implementing clone, or serde which is a PITA.
If a foreign crate doesn't implement Default or Clone for its types, then how is the compiler supposed to derive it for your local newtypes? It can't just look into the foreign type's fields, if they aren't all public. Are you often having to work with fully public foreign types?
Overall, I get that the type system can be pretty frustrating as it stands today, but I don't see any better alternative than building better tools for defining newtypes.
> Though, it looks like Zig doesn't do function overloading either [1]. That's a disappointment. So you end up with `array_count`, `map_count`, etc instead of just `count`. In my way of thinking that's more work and reduces readability. It's one of the pain points of C vs C++ to need `array_list_count` and `hash_map_add` instead of just saying `vec.insert(...)`.
Zig doesn't have function overloading but it does have namespaced functions, so you can define your types and your "methods" on them.
Rust seems to have a lot of C++ influence to me, specifically later C++ where RAII and smart pointers became the norm. This makes sense that it came out of Mozilla, who have always had large C++ code bases.
This is honestly my biggest problem with Rust. C is a small language where the std libraries and language are defined in hundreds of pages. C++ last I checked in the latest standard was getting to almost 2000 pages.
Rust does not have a defined standard yet, but I suspect it also would be quite large.
True dat. But I want to note:
- Swift has plans to incorporate temporal safety in upcoming releases.
- Zig isn’t fully baked yet.
Rust will be king of that space for a while, though.
D (in betterC mode) is C but with proper arrays, modules, advanced metaprogramming, member functions, lots of memory safety features, compile time function execution, nested functions, etc.
Is there a reason Zig is taking so long to go 1.0? The language itself feels mature.
the self-hosted rewrite took a long time so it felt like the project had stalled for the past ~year. the team has started working on new features like the package manager, though.
Is there any other systems language that's even moving towards a mature ecosystem with a compiler like CompCert? It feels like this is still decades away for any realistic competitor to C.
There are certainly some key libraries that use Fortran still. But I wouldn't say it's still "around". As a language for new projects, it's extremely niche. That's probably what "go" means in your comment.
I assume you're being facetious with the "recently came across", since scipy is so common, but I will add that nearly every math-intense library is a clever wrapper for some Fortran code.
I would be perfectly happy to see many hardened C libraries become the foundation of the next gen systems/ embedded languages. It does bother me slightly when we abandon the past entirely and attempt to "rewrite it in X"
In part that is because once a piece of software has been part of a certification process it can be very hard to replace it by something newer, no matter how shiny or how much faster. You can see quite a bit of this in aerospace, civil engineering and so on. Nobody wants to be the one to replace the Fortran based FEA package with something novel and end up being liable for a bug.
No, I was not being facetious. I am not a regular user of Python, mostly have been writing/maintaining software in C++ and (in the past three years) C#. I came across it because a colleague (fluent in Python) had used scipy.interpolate in some experimentation and now the algorithm he came up with has to be implemented in C#, so I investigated whether scipy.interpolate could be called from C# and then found out that its source was in Fortran.
But just like Fortran is still used for nearly every math-intense library, mostly invisible for most of the users, I suspect that C will still be around in 50 years from now.
I agree with his observation that C should no longer be your language of choice for new projects. I personally still prefer using C/C++ for my private software projects, simply because it is the language I am most fluent in. This year I used it for AoC.
Significant chunks of numpy and scipy, the "main" python packages for numerical algorithms, are wrappers around classic libraries such as BLAS, LAPACK and ARPACK, which are Fortran. Fortran was the de-facto language of scientific and numerical programming for a long time, back to the 60s even. These libraries are battle tested more than anything else out there, they literally have decades behind them, so they are industry standard and are used wherever possible.
Is that even possible as long as the Linux kernel is written in C? I wonder if telling people not to learn C will have a long-term effect on being able to find competent contributors to the kernel.
> Linux kernel is written in C
Not exclusively in C, not anymore: https://docs.kernel.org/rust/index.html
It might take another decade for a C-free build to be possible, though.
Only a decade? I don't think so.
AFAIK Torvalds has also stipulated that any Rust code needs to be mirrored in C.
This right here. Rust only JUST got added to the kernel in 6.0 and I doubt that it will replace 30 years of C development any time soon.
> AFAIK Torvalds has also stipulated that any Rust code needs to be mirrored in C.
Do you have a cite for this? I hadn't heard this at all.
Apologies, I can't find a source. I probably read it on LWN somewhere. Linus wouldn't dip both feet into the water at the same time. I'm confident he's said that the kernel needs to be buildable without Rust if need be.
> I'm confident he's said that the kernel needs to be buildable without Rust if need be.
FWIW that's not the same thing? That's simply CONFIG_RUST=n, not "You have to write this driver twice in two different languages".
Wow that sounds fantastically optimistic, or pessimistic depending on your point of view I guess.
Rewriting that amount of code in ten years sounds very very hard, at least.
A significant amount of the lines of code in the kernel are for drivers (7 of 12 million lines?).
A potential strategy would be to set a date where any new drivers must be written in Rust. Deprecate every non-Rust driver on a date after that. Focus on rewriting only the drivers necessary for current hardware platforms (and some sensible/arbitrary cut-off going back x years). Then set a final deadline for a New Linux kernel release that removes all deprecated/C drivers.
I would blindly estimate that entire process to take 10-20 years (not including the time needed for debating the whole thing).
Then somebody "just" needs to rewrite the rest of the code.
So, uh, any volunteers?
They said a C-free build. For that you'd need to "just" rewrite the core kernel, which will be 150-200k lines, and the drivers + arch specific parts for one system. Still a tall order, but a decade isn't unrealistic if Rust proves itself.
Now, rewriting everything, including all the legacy drivers? Yeah, never happening even if Rust succeeds utterly.
Once they get to that point, a new language will be available that makes up for the things Rust is still short on.
Maybe there's a case to be made for a multi-language kernel. Rust is the first step. Maybe it'll make sense at some point to keep Rust, but add another language.
I don't see such a feat taking place in the next decade; you need highly dedicated people for that. I can see new code being written in Rust, but a C-free build sounds like something that really won't be happening anytime soon.
I wish something like it could happen, but I am cautiously optimistic.
As long as there isn't a Rust compiler written in Rust (there is a transpiler to LLVM bytecode, but that gets compiled by a C++ compiler, same for GCC-rs) I don't think a C-free (or a C++-free) build will be possible at all, so I'm guessing it'll take somewhere in the 60-100 year range.
There are some open issues and features like inline assembly that aren't supported, but a moderately complete backend exists now:
https://github.com/bjorn3/rustc_codegen_cranelift
The compiler frontend is already written in Rust.
Honest question: what would the engineering rationale be behind dropping LLVM entirely and having rustc use its own code generation backend? There are two out there that might be interesting: one uses GCC and another is called Cranelift, but it isn't the default and focuses exclusively on compilation speed for debug binaries.
Well people keep telling me that C/C++ code should be replaced by Rust for ~safety~ reasons, and LLVM is a C++ project, so...
Let's say it would avoid all the memory bugs in LLVM; that would doubtlessly make it worth throwing away the LLVM implementation (avoiding the sunk cost fallacy).
If anything, it's more likely that the desirable choice is to shift LLVM to Rust.
The Linux kernel comes with many old drivers that modern programmers have little knowledge about. It is probably more realistic to write a new kernel in rust (like Redox), targeting modern machines only.
By the time it's rewritten in Rust, old drivers will be dropped due to lack of demand for them.
> that happens to be programmed in Fortran.
SciPy is a highly optimized library and Fortran is faster than C for some tasks:
In a way, it's already happening. Many big tech companies are now defaulting to other languages than C and are actively discouraging the use of C or C++ for new code. They still have a big vested interest in maintaining existing code of course. That's not going to disappear overnight obviously.
Do you have examples? Is it all going towards Rust or are there other big tech-backed contenders?
I read recently that parts of the code for Artemis are written in Ada. Ada is also used on the ISS. Makes me wonder who besides NASA might be using Ada due to its safety.
Can C really completely go away as long as there's embedded programming? Are there any other alternatives for that domain?
Rust is rapidly picking up steam in the embedded sphere :-)
Unfortunately it's an industry with a lot of stubborn old people, so I don't see any major changes happening until they've retired, but I'm willing to bet that async is going to revolutionize embedded development. Being able to await interrupts and do efficient cooperative scheduling while writing straightforward code is a massive QoL improvement, which is what embassy is enabling: https://embassy.dev/
There are embedded devices programmed in other languages, you know, and there have been for decades. In Ada, for example, which is an even better fit for the domain than C (better ways to describe hardware idiosyncrasies).
Ada is a much better and more modern alternative for embedded, real-time and/or systems programming. It's mature (although still evolving) and very scalable, from very, very small to very, very large systems. It also supports interoperability with C and C++. It is a primary language in GCC. See https://ada-lang.io/ and https://learn.adacore.com/
Unfortunately Ada is held back by a genuine lack of awareness and some old misinformation baggage.
This! I have mainly programmed in C in my career (note: I also have experience with C++ on multiple projects and, to a lesser extent, with C# and Java). In almost all the embedded projects I have been on, Ada would have been a far better choice, because it would have helped avoid whole classes of common software bugs and thus been less costly to maintain. This is especially true for bug reports and vulnerabilities discovered on systems that were already deployed.
People normally think that C is the best for low-level embedded programming, but from my experience it pales in comparison to the amount of low-level control and type safety that Ada provides, even when you restrict yourself to a small subset of the language (note: Ada is often used on bare-metal systems and without an Ada runtime). Given the interoperability Ada has with C, you don't have to rewrite everything in order to incorporate it into existing code bases.
As strange as it might sound, the success of Rust seems to have reopened some doors for Ada. People in both communities are aware of this and already collaborating on various projects to mutual benefit.
I haven't used Rust, but many of the comments and articles I have read about the benefits of the language and the strong desire to develop robust and/or safe software are virtually the same as what Ada supporters have been claiming for decades.
It's a good thing more people are realizing we really need to stop settling for C and C++, stop investing more money into tools that simply compensate for the weak foundation of those languages, and finally adopt safer alternatives.
Rust and Ada are both seen as 'safe' languages that prevent many unsafe actions by programmers.
Yes, compilers for C++, BASIC, Pascal and Ada are around, mostly commercial, and people pay to keep those companies in business. For powerful microcontrollers where real-time GC is an option, there are Oberon, .NET and Java as well.
There are surely other factors that weigh in favour of using C, but it's not for lack of options (in many cases).
It won't go away, but just like it happens with COBOL, Fortran, and PL/I, you won't see anyone dreaming of coding C until the end of their working days.
Or maybe they will, given how much consultants in those languages happen to be paid, as no one else wants to touch them.
> you won't see anyone dreaming of coding C until the end of their working days.
I do dream about that. Just being able to tag along while some programmers learn their Xth framework as a language.
Maybe someday I might concede and move on from C89 to C99.
Same here. I've got so much experience and so many libraries I've written that I can marshal. They've been ported across small and large platforms since before GitHub and Google. I've been maintaining them since the 5.25" disk age. New C projects aren't precarious to me. Watching a bunch of architects with no embedded experience try to make C++ happen on microcontrollers, on the other hand...
If I may ask, what projects do you work on where C89 is still useful? Embedded ROMs for tiny processors in some niche segment?
Yeah, well, Linux is moving from GNU C89 to GNU C11, so maybe it is time for me to move on.
I don't think there is any reason to choose C89 over C11 other than fear of new compiler bugs and compatibility with old compilers?
But coding C is actually fun!
Sure. But debugging sure isn't. Null, null terminated strings, data races, segfaults...
No, no - debugging is actually where all the fun is!
I'm not masochistic enough, I guess.
C has enough actual fans that I bet it never really goes away, but it won’t be useable in professional contexts after a while other than for maintenance, because people outside tech will start to call bullshit on the EULA liability shield.
> C has enough actual fans that I bet it never really goes away, but it won’t be useable in professional contexts after a while other than for maintenance, because people outside tech will start to call bullshit on the EULA liability shield.
That's an extraordinary claim indeed; if people outside tech were going to call bullshit on EULAs as a liability shield, they would've done so in the last 50 years of software sales.
Reliability or the lack thereof has never been an impediment to some piece of software getting popular, but to me, you appear to believe that people want more reliability from software than they have been getting thus far.
Your belief is at odds with reality.
They already started with small steps.
- Return of goods in digital stores
- Warranties in consulting projects, requiring free of charge fixes up to one year
- Cybersecurity bills
It will only get better from now onwards.
The problem is that there isn't anything to replace C, that would be acceptable enough. C has become a quasi protocol, and used in exchanges between languages. One could argue it's better for newer languages to embrace and excel in interop with C, while providing more advantages, being safer in comparison, or greater ease of use. As is arguably the case with Vlang, Dlang, Nim, etc...
If anything, C may still be going strong for another 20 to 30 years.
A good bet is that a technology will be around at least as long as it has been around (I think there's also a 'named law' for it). So I fully expect C to be around in one way or another for the next 50 years, it probably won't be as important anymore, just as COBOL or Fortran are not as important as they used to be.
One reason for C to disappear completely would be if computer architectures would change so much that current programming languages no longer even map to those new architectures (e.g. all the existing programming languages would need to be dumped anyway).
https://en.wikipedia.org/wiki/Lindy_effect
> The Lindy effect (also known as Lindy's Law[1]) is a theorized phenomenon by which the future life expectancy of some non-perishable things, like a technology or an idea, is proportional to their current age. Thus, the Lindy effect proposes the longer a period something has survived to exist or be used in the present, the longer its remaining life expectancy.
While C and C++ have now been around longer than COBOL had been when building new systems in it was halted, at least COBOL does its job well, while C and C++ don't due to their lack of safety. Also, COBOL is more difficult to replace than C and C++ are. I expect C and C++ code to start being seen as dangerous, and money will be spent to eradicate it due to its potential to be a security risk in this online world.
My prediction is that C will die off as a language long before Fortran and COBOL. What keeps a language like Fortran alive is that it is used in an application niche where rewriting is done at best ship-of-Theseus style, and the friction of using a new language for a component far outweighs the benefits of doing so (weather models are a good example here). For COBOL, it remains in applications where the cost of a switch or rewrite includes a low risk of catastrophic, and therefore expensive, failure (see Southwest Airlines for a very public and recent example of such a failure).
C does have a similar kind of niche at first glance: systems programming is of course conservative, rewriting everything is unlikely, and of course, C is the language used for ABI. Except on closer inspection, that moat is remarkably shallow. Being the language of ABI means that every competitor language has some way to speak C guaranteed, so the friction of rewriting systems software Ship-of-Theseus-style is much lower (though still nonzero). Systems software is rewritten from scratch on a much higher cadence: in the last 20 years or so, most of the userspace system glue for Linux has been replaced (e.g., systemd, iproute2, pulseaudio, wayland). And we've learned over the past few decades that there's no practical way to fix C's fundamental unsafety issues with software engineering practices, and C's committee is too conservative to consider retrofitting the necessary features to be able to fix unsafety at a language level (to say nothing of getting people to use it).
There is already a small clutch of languages that can serve C's niches that don't have the same fundamental unsafety issues, and right now, we're sort of at an experimental stage of system software trying them out. It's not unreasonable to believe that within a decade or so, one or more of these languages would be considered a standard, safe choice for implementing new systems software--and the use of C in new projects will start dropping. At some point, the proliferation of non-C systems projects will make people point out that having these components talk through C's ABI is too limited in functionality, and a system will change its ABI from C to some other language. And once it is no longer the language of ABI, C will lack its moat that keeps it alive, and it will start dying, though its death will be a slow, agonizing death.
We can replace COBOL with a Java backend to keep the enterprise feeling going, but what would C's replacement be in this case?
Rust is already replacing C and C++. Ada is too. As the importance of safety increases, more safe languages will appear that can replace C and C++.
C++ for starters would already be an improvement, provided string, array and vector classes with bounds checking get used instead of raw C pointers.
Alongside RAII for resource management.
For most embedded stuff those are just extra hassle and not worth it. C and its raw pointers really are all you need most of the time. Needing some discipline and expertise to produce solid code with confidence is not a bad thing in the embedded domain. If the code truly is critical you need to dive deep into verification techniques anyway.
Sure we all know that the S in IoT stands for security.
Rust has already begun to replace C in the embedded space.
C syntax is already way too rich and complex, not to mention the bazillions of gcc extensions required to compile the linux kernel.
Namely, if it has to be "replaced", that would be with something with a much simpler syntax, which will require a bit more finger power. We don't want to find ourselves locked in by very few compiler vendors (open source or not), if only because it is not reasonable to code a real-life alternative with a small team of averagely skilled devs in a reasonable amount of time.
This language should build on C though: no enum/typedef/_Generic/switch/etc, only one loop keyword (loop{}), only explicitly sized types, no integer promotion, no implicit casts (except maybe for void* pointers; number literal casts should still be explicit) but explicit casts (compile-time and runtime, without that horrible C++ syntax), explicit compile-time const (we have only runtime consts which may happen to be optimized into compile-time consts), enforced extern for functions (and don't try to put binary-format semantics, elf/coff/etc, into the language syntax, or worse, OS interface semantics), etc.
With enough discipline (and compiler warnings), we could get close to such a language.
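As a rough illustration of how far plain C plus discipline can get toward that wish list, here is a hedged sketch using only explicitly sized types, explicit widening and a single loop form; it is purely illustrative, not a proposal:

    #include <stdint.h>

    /* "Disciplined C": explicitly sized types, explicit casts, one loop form. */
    static uint32_t sum_u16(const uint16_t *v, uint32_t n) {
        uint32_t total = 0;
        uint32_t i = 0;
        for (;;) {                    /* single loop construct, exit via break */
            if (i >= n) break;
            total += (uint32_t)v[i];  /* widening written out, no silent promotion */
            i += 1;
        }
        return total;
    }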
I did not check the latest and greatest Rust syntax, but is what's described above among its explicit goals?
That said, I am an "everything in 64-bit RISC-V assembly with x86_64/arm64 legacy ports" kind of guy... if RISC-V is successful (I wish). We could then think of high-level language interpreters coded in assembly, for instance RISC-V-coded python/lua/javascript/etc. interpreters.
TBH that sounds a lot like Zig (it has two loop keywords though: for and while, but those are for different use cases - for is only for iterating over ranges, and while is the 'vanilla loop' for everything else). Zig does introduce a bit of syntax pollution for its comptime features though (mainly the 'inline' variants of existing keywords), and it adds some syntax sugar for the builtin error handling and optionals - so it's essentially a dismantled C which is then slightly extended into a different direction.
This is one loop keyword too many.
And something must be done about breaking changes of the syntax (or any never ending "features" additions).
We all know that some syntactic constructs will turn out nasty in the end and will reasonably need fixing (basically, to keep them excruciatingly simple from a compiler-writing point of view). I am thinking of something like "syntax breakage provisions" from the language authors: for instance, no more than 4 iterations of major syntax breakage, after which the language syntax is frozen forever.
Whatever, I am an "everything in assembly" kind of guy (with high-level language interpreters themselves written in assembly).
All that is a thought experiment for me, nothing more.
Doesn't Modula-2, Oberon or Ada fill your checkboxes?
These are good ideas. The simplicity here is what I am going for.
In the long run, it all depends on syntax stability, which is tightly coupled with its complexity.
The "benchmark" for evaluating how toxic a compiled language's syntax complexity is: how long an alternative compiler written by a small team of averagely skilled devs, or even one individual averagely skilled dev, can stay a "real life", "working" alternative.
Mid-term/long-term planned obsolescence of computer language syntax is really...
Oh, and in my previous post: for compile time constants we can use static consts, my bad.
The more I learn about new and less new languages (Zig, Jai, C#) the more I like C and its true simplicity.
The language has a few irritating historical artefacts and the stdlib API is completely outdated and full of bad design, but it is still versatile enough for my needs.
> the stdlib API is completely outdated and full of bad design
It is also completely unnecessary on Linux. I switched to freestanding C and discovered it was a much better language. Made programming fun again. All I needed was one system call function and some entry point code.
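For readers curious what "one system call function and some entry point code" can look like, here is a hedged sketch of a freestanding hello-world; it assumes x86-64 Linux and GCC-style inline assembly and is not the commenter's actual code (build with something like gcc -ffreestanding -nostdlib -static):

    /* Raw syscall wrapper: number in rax, args in rdi/rsi/rdx, result in rax. */
    static long sys_call3(long n, long a, long b, long c) {
        long ret;
        __asm__ volatile ("syscall"
                          : "=a"(ret)
                          : "0"(n), "D"(a), "S"(b), "d"(c)
                          : "rcx", "r11", "memory");
        return ret;
    }

    static const char msg[] = "hello from freestanding C\n";

    void _start(void) {
        sys_call3(1, 1, (long)msg, (long)(sizeof msg - 1));  /* write(1, msg, len) */
        sys_call3(60, 0, 0, 0);                              /* exit(0) */
    }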
It's gotten to the point that it bothers me that gcc could potentially generate calls to mem* functions even in freestanding mode.
llvm does that too. It's really tempting to assume there's a libc available for the target architecture. On the list of things to clean up that hasn't hit top of stack for anyone.
The simplicity is a blessing and curse. I enjoy writing C and sticking to a hard discipline of writing unit tests for all functions where possible. Abusing `assert/1` in debug mode. But concurrency is a hard thing to build. Identifying critical sections, making sure they have mutex locks where necessary. I want to love Rust but it's hard for me to get used to the syntax. I've started using it and failed multiple times because it's terse. Inevitably I will try again and hope it pans out.
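As a small illustration of the kind of hand-managed critical section being described, here is a hedged C sketch using pthreads plus the debug-mode assert discipline; the names are illustrative:

    #include <assert.h>
    #include <pthread.h>

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static long counter = 0;

    void counter_increment(void) {
        int rc = pthread_mutex_lock(&lock);   /* guard the critical section by hand */
        assert(rc == 0);                      /* "abuse assert" in debug builds */
        counter += 1;
        rc = pthread_mutex_unlock(&lock);
        assert(rc == 0);
    }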
I like C89 for its simplicity too but after using the Jai beta for over a year I have a hard time seeing myself ever going back.
I'll probably give Jai a go when it is released, but I am a bit afraid the idiosyncrasies are already creeping up.
any particular examples?
> The combination of BASED and REFER leaves the compiler to do the error prone pointer arithmetic while having the same innate efficiency as the clumsy equivalent in C. Add to this that PL/1 (like most contemporary languages) included bounds checking and the result is significantly superior to C.
https://www.schneier.com/blog/archives/2007/09/the_multics_o...
"Multics B2 Security Evaluation"
https://multicians.org/b2.html
But naturally ignoring it was more fun,
> Although we entertained occasional thoughts about implementing one of the major languages of the time like Fortran, PL/I, or Algol 68, such a project seemed hopelessly large for our resources: much simpler and smaller tools were called for. All these languages influenced our work, but it was more fun to do things on our own.
"Speaking as someone who has delved into the intricacies of PL/I, I am sure that only Real Men could have written such a machine-hogging, cycle-grabbing, all-encompassing monster. Allocate an array and free the middle third? Sure! Why not? Multiply a character string times a bit string and assign the result to a float decimal? Go ahead! Free a controlled variable procedure parameter and reallocate it before passing it back? Overlay three different types of variable on the same memory location? Anything you say! Write a recursive macro? Well, no, but Real Men use rescan. How could a language so obviously designed and written by Real Men not be intended for Real Man use?"
Well, at least the Morris worm wouldn't have happened with PL/I.
He says something like
>"if you want to write new program in C now think long and hard and pick something else"
I would not use plain C to write enterprise backend servers. I happily use modern C++ for that.
For some very low-power microcontrollers, however, I absolutely would. The amount of high-quality free tooling and libraries beats everything else.
From a practical point of view: I've written enough firmware for very lowly microcontrollers like the AT90USB1286. It runs like a charm (the oldest for 10 years already), did not require even a single bug-related update, and there have been zero complaints from customers. Changing the language in this particular case would bring no benefits, only extra expense.
> I would not use plain C to write enterprise backend servers. I happily use modern C++ for that.
I tend to think that unless performance per dollar is really important, one should use a managed language for that, one that isn't Javascript.
But yeah, I'm not really sure what some other language would buy me in the small embedded space that earns me my beer money. I recently had an issue where the corporate spyware was convinced make/gcc were up to no good, resulting in minute-and-a-half compile times instead of the usual 10 seconds. I spent a bunch of time with IT getting that fixed. Well, that's the build time I'd get with Rust. So Rust is a big nope. C++ is not that slow, but still slow. And C++ without malloc... well, who are we trying to kid here.
>"I tend to think unless perform/$ is really important one should use a managed language for that, that isn't Javascript."
For my particular project managed language would simply not work. Way too slow. Besides the code base is relatively small and higher level language would not offer any compelling benefits.
>"minute and a half compile times"
Just checked. My particular project compiles under 1 second using Atmel Studio 7.
I often wonder why C did not deprecate the bad bits. The dodgy string functions should at least be behind a switch to enable, like --enable-strcat or something. Even the printf bug when passing a single argument is easily fixed by requiring two arguments at a minimum, etc. Then level up the std library to force the use of bounds-checked strings and buffers, again hiding the unsafe ones behind switches or unsafe keywords. This would allow backwards compatibility, while making newer code safer by default.
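For readers unfamiliar with the two hazards mentioned, here is a short C sketch contrasting them with the bounded, format-safe calls that already exist; it is illustrative only:

    #include <stdio.h>

    void greet(const char *name) {
        char buf[64];

        /* Dangerous patterns the parent wants gated off:
           printf(name);         format-string bug if name contains '%'
           strcat(buf, name);    unbounded write into buf                   */

        /* Safer equivalents available today: */
        printf("%s\n", name);                       /* data as argument, not format */
        snprintf(buf, sizeof buf, "hi, %s", name);  /* bounded write */
    }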
The C language and the C stdlib are not the same things. You can use C without the stdlib.
You simply have to build or use a different framework offering similar or better APIs.
Yes, of course. But making the stdlib more sane would dramatically improve things. Deprecating the dangerous functions and requiring unsafe keywords to use them could still be done.
WG14 never cared that much about security, as simple as that.
I'm trying to build a language that translates directly to C. I will just implement some features of the language in C, with some headers I can already find.
It feels like the best way to do what I want. That way, a C compiler can do a lot of work I really don't want to do; C already has backends, optimizers, etc.
All I want is a C-like language with native strings, hash maps and lists, tuples, Python indentation, vector math, and nothing else, and to make it as simple as possible.
I'm a bit tired of new languages trying to do new things; I just want something less verbose than C, but not as powerful as C++, with the feeling of Python.
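As an illustration of what such a transpiler's emitted C for a "native" string might look like, here is a hedged sketch of a length-prefixed string type; the names and layout are hypothetical, not from any existing project, and error handling is omitted:

    #include <stddef.h>
    #include <stdlib.h>
    #include <string.h>

    /* Length-prefixed string, so the source language needn't rely on NUL termination. */
    typedef struct {
        size_t len;
        char  *data;
    } Str;

    static Str str_from_c(const char *s) {
        Str out;
        out.len  = strlen(s);
        out.data = malloc(out.len + 1);   /* allocation failure check omitted in sketch */
        memcpy(out.data, s, out.len + 1);
        return out;
    }

A source-level declaration like s := "hi" could then lower to Str s = str_from_c("hi"); with the C compiler doing the rest.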
Yes I know about all of them, they're all a bit too complex for my taste.
They're great languages, but they're not what I want.
I understand. I wrote a language once, a while ago. To get the speed I wanted, I wrote it in assembly language - and it took over 2 years of fairly consistent effort to get something that really worked. (I would call it "Pythonic" but this was before Python existed.)
So, go for it, and keep us updated on your progress. Good luck.
or Vlang (https://vlang.io/), which can compile to C, and has a C2V transpiler.
Yes, it's quite a nice language, but it still has features I wish it didn't.
It's close to what I want to do, except it doesn't have python indentation, and it doesn't have tuples.
Great talk. Borland should have rated a mention though, their C compiler really popularized C development on Windows.
Most sane Borland customers were using C++ alongside Object Windows Library, not raw C alongside Win16.
Even Petzold embraced C++, even if superficially,
"This third edition has several changes. First, all programs are now compilable with either the Microsoft or the Borland compiler. All make files are generic and use environment variables for compiler flags, link libraries, and so forth. Second, all programs are now compilable in C++ mode. Although I don’t use any C++ specific features, compiling first in C++ mode is helpful if C++ features are to be added later to the code"
-- https://archive.org/details/programming-windows-31-3rd-ed/pa...
> Most sane Borland customers
Veiled insult noted ;)
I used both, but Turbo C was a real game changer for me. I came from the ST using MWC and ended up in very unfamiliar territory on Windows; Borland Turbo C made all the difference for me. It allowed me to be productive on an unfamiliar platform, compiles were absolutely lightning fast compared to anything the competition put out, and it was rock solid.
Edit: I just realized I still have ctrl-f9 more or less in my muscle memory. It's been decades...
One day later: some more thoughts on this, I think the biggest reason to pick Borland was bluntly put the price. There was simply nothing else that came even close to being that affordable and, more importantly, that would run on my relatively anemic machine. I was dirt poor back then, my computer didn't even have a case, it was essentially a motherboard and a power supply bolted to a steel frame of one of those office folder hanger carts (I don't know what those are called in English, but they were pretty common in the days of paper file folders). I just plugged cards into the top and a harddrive (a 20 MB MFM Seagate which cost as much as my car) was sitting loose on the same base with some cables to keep it connected to the motherboard. It was about as ugly as it could be but it worked and I was pretty happy with it.
Happy New Year by the way, if you are still up and reading this!
It is a real shame Borland went from being the company providing the best and most affordable development tools ever, to a would-be big-enterprise vendor that nobody needed, and then to oblivion.
Yes, it is. Philippe Kahn also strikes me as an all-around sympathetic guy and a true serial entrepreneur in the original sense of the word.
I maintain that C is fine, and optimising compilers converting 'bad' C into dangerously broken binaries is not fine. We don't need to replace C to make it safe; we need to take the edge off undefined-behaviour-justified compiler rewrites.
Saying C is fine does not make it fine, though. The facts speak for themselves: even the best programmers in this world, who are extremely conscious about security, have vulnerabilities in the software they wrote because of C. There have been two remote vulnerabilities in OpenBSD since its creation; more recently there was a vulnerability in the ping utility of FreeBSD, etc.
C without undefined behaviors is not C anymore by the way. It would be something else.
Strictly C where your implementation has provided sane definitions of undefined behaviour is still C. The implementation is free to provide definitions of undefined behaviour, and no diagnostic required doesn't mean no diagnostic permitted. Offhand I think all the undetectable parts are a quirk of separate compilation which is solvable by linking an IR instead of machine code. That would probably be a high value project to implement.
I believe the main culprit is that the compiler guys want to optimize away C++ templates, where programmer intent is not as explicit as in C.
C has enough UB of its own, no need to bring C++ into the picture.
I meant that the compiler code shared with C++ is the problem. At least in gcc. I don't know about Clang.
I attribute it to benchmarks, SPEC and the like, where you can indeed make the benchmarks faster by leaning into UB. No signed overflow? Branch gone from that loop. Prove these two things can't alias because the pointer provenance model says they don't, and you can rearrange the loads and stores. The collateral damage from that line of work is high, though.
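The classic worked example of that branch removal, hedged as an illustration of what an optimizer is allowed to do rather than what any particular compiler will do:

    #include <limits.h>

    /* Because signed overflow is UB, a compiler may assume x + 1 never wraps
       and fold this intended wraparound test to "return 0" at -O2. */
    int will_overflow(int x) {
        return x + 1 < x;
    }

    /* A defined way to ask the same question: */
    int will_overflow_defined(int x) {
        return x == INT_MAX;
    }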
Yeah, you might be correct. I think they are overreaching a bit in their eagerness to optimize.
An important part of C history is that in the early 1980s the IBM PC and clones became the most popular computer in the world and it was not really compatible with C. You could say your program was "tiny" and limit it to 64KB or fill your code with "near" and "far" pointers (if using Microsoft tools) or use @ instead of * (if using QNX tools), but the cost was not being able to port to/from the VAX/68000 world. All this went away with the 386, but without this problem it is likely that C would have overtaken Pascal even sooner.
It is not hard to say what is unique about C: it and Forth are the only high-level languages with seamless access to memory. If other languages offer it at all, like PEEK and POKE in BASIC, it is far more awkward and interrupts your flow. That might be a good thing - the ESPOL compiler mentioned in the talk would print a big fat warning "YOU MUST KNOW WHAT YOU ARE DOING!" after any line in your code doing C-like tricks.
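A hedged sketch of what that "seamless access" looks like in practice, using a made-up memory-mapped register address purely for illustration:

    #include <stdint.h>

    /* Treat a raw address as a device register; the address is fictitious. */
    #define GPIO_OUT (*(volatile uint32_t *)0x40020014u)

    void led_on(void) {
        GPIO_OUT |= 1u << 5;   /* read-modify-write the register, no PEEK/POKE ceremony */
    }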
C is like your wife: you love her; you are afraid of her a little; sometimes you wish she was someone else.
Speaker asks: "What can and should replace it?" (i.e. if you started a new project today, what language should you pick instead of C, because C would be the wrong choice).
He goes on to list the following options (and says C++ does not count), but we'll only know far in the future which would have been the right pick:
- Rust
- Go
- Zig
- V
- Nim
- Swift
- ...
The problem with C++ is that while it offers the language features to write safer code than C, it is also copy-paste compatible with C (most of it, anyway).
So while a security-conscious group can write C++ code that takes advantage of those features, another group can basically compile C++ code that is hardly any different from C.
So it is a better option than raw C, if it is the only viable alternative (like in HPC), but for security conscious scenarios one of the others is a better answer, if available.
None of them are the wrong pick. They're all better than C.
I was in college in the early '90s in CS&E (Comp Sci & Engg) and we were taught Pascal first, then COBOL and then C: Pascal and COBOL in the second year and C in the third year. C++ was already there and gaining traction, and I taught myself C++ (through the Annotated C++ Reference Manual by Stroustrup). By the time I graduated most jobs were expecting C/C++ coding skills, and my first job involved coding in C++.
Nothing compares to the raw power and control we get with C.
Around '97-98 Java was all the rage. Then Bill Gates did his internet pivot and we had .NET arriving in the '00s. The allure of being able to program in any framework language and interoperate looked like a good thing.
Over the last 20+ years I had been in the .NET world and just this year I went back to writing in C and it all came back. It is quite refreshing to be completely responsible for every aspect of your code's working. While it is sometimes tedious, nothing beats the power and control of working close to the machine abstraction.
Just my 2cents.
The most complicated and the simplest language at the same time.
I wonder what form a "TypeScript for C" language would take, and whether it would be as revolutionary as TypeScript was for JavaScript applications. I would assume that all the existing C tooling could still be used.