Jens Gustedt, author of Modern C, senior scientist at the French National Institute for Computer Science and Control (INRIA), deputy director of the ICube lab, and former co-editor of the ISO C standard, speaks with SE Radio host Gavin Henry about the past 5 years in C, C2Y, and C23. They discuss what has happened in the C world since we last spoke 5 years ago, including how the latest C standard is going and what to expect. Jens discusses how the latest changes in the Modern C book apply to you, how a C transition header can help you get up to C23 if you’re not there already, and presents a comprehensive approach for program failure. This episode explores C2Y, C23, bit-precise types, stdckdint.h, stdbit.h, 128 bit types, enumeration types, nullptr, Syntactic annotations, auto and typeof keywords, if let, as well as what’s being added and removed in C2Y (possibly called “C28”), and Gustedt’s four categories of program failure.
Transcript brought to you by IEEE Software magazine.
This transcript was automatically generated. To suggest improvements in the text, please contact [email protected] and include the episode number and URL.
Gavin Henry 00:00:18 Welcome to Software Engineering Radio. I’m your host Gavin Henry. And today my guest is Jens Gustedt. Jens is a senior scientist at the French National Institute for Computer Science and Control, Deputy Director of the ICube lab, former co-editor of the ISO C standard, and author of Modern C, all editions. Jens, welcome to Software Engineering Radio. Is there anything I missed in your bio that you’d like to add?
Jens Gustedt 00:00:44 Hello. No, that’s great. You got it.
Gavin Henry 00:00:48 Thank you. So, the goals for the listener today for our show are to understand what’s happened in the C world since we last spoke, if you can believe it, five years ago now, back in 2020 when your book’s second edition came out. We’re going to understand how the latest C standard is going and what to expect if there’s anything coming up, explore the latest changes in your Modern C book and how they apply to the reader, and explore your C transition header to help people get set up on C23 if they’re not there already, given it’s 2025. And you talk about a new comprehensive approach for program failure that I’d like to explore if we have time. So, let’s get started. So what, if you can recall, has happened in the past five years in the C world? And bear in mind what happens in other programming languages in five years. Has a lot happened to C, or not a lot compared to, say, Rust or JavaScript? How would you describe what’s going on?
Jens Gustedt 00:01:47 First of all, you have to know that C is really slow. So maybe that’s also the reason why we talked five years ago and not every year or something like that. Development goes slowly and we really only standardize things that are already implemented. So there’s a long cycle for new features to come in before you even see them. So compared to other languages, for example the one I know best, C++, they have a lot more excitement and fuss and things going on: they do things and then they take them back and then they do another thing and then they do something which is not consistent. So, we try not to do that. We try to have a steady pace, slow but steady, to go forward. One of the things which was outside of the C standard was another, say, standard, a technical specification which talks about pointer provenance. So, this helps, for those who know what that is, to deal with aliasing of pointers and objects and things like that. And this is work that took us 10 years, something like that, and was mainly led by Peter Sewell from the University of Cambridge. So, this was a big effort which now is concentrated in one technical specification, which in some years perhaps might go into the C standard. This is really one of the things that went on in addition to the C standard itself for the moment.
Gavin Henry 00:03:28 I’ll get that link from you in the show notes.
Jens Gustedt 00:03:30 Yeah, yeah, yeah, yeah. I’ll give you a link on that so we can share that. There had been a lot of discussions also about the security of C code compared to other languages. I was not always happy with these kinds of discussions because my main impression always has been that there have been some major misunderstandings about what C is and what C can do and what C wants to do, and what other languages do differently and what the cost-benefit ratio is. C is a relatively specialized language where you can do many, many things. So, all security buttons are somehow off, and this is by design and that’s not an accident. So, if you don’t want that, you probably are programming in the wrong language in some sense.
Gavin Henry 00:04:19 Did the work on the pointer and object aliases help with security side of things?
Jens Gustedt 00:04:25 In the long run we hope that this will help. Yes, but this still has to be integrated in compilers or, stated a little bit differently, the compiler vendors really have to check if they are conforming to that model. So basically, everybody’s convinced that this is the right model, that basically everybody should be there and have that. But there are so many details to check for the compiler producers that they wanted it to be a technical specification, so they already have something written up, but they have to check this over the next years, implement stuff which is perhaps missing, and things like that.
Gavin Henry 00:05:04 Would it be practical to sort of explain an example of these aliases and how they help with security?
Jens Gustedt 00:05:11 The main issue, which is in the title, is provenance. So, the provenance of a pointer, as we define it, is somehow which object, in a wider sense, this pointer is pointing to. So every allocation, say, if you call malloc or calloc or something like that, or if you have a declaration of a variable, gives rise to a different provenance. And if two pointers point to two different such things which were allocated at different times, the compiler can know that they are really different because they come from something very far apart. And this can help with doing aliasing analysis and then being able to optimize, because the compiler can prove that two pointers actually point to different things. So, this is basically the idea behind this. And then, getting this through all levels of the language and all levels of the compilers and everything, this is quite technical then in the details.
Gavin Henry 00:06:14 So this would do away with dangling pointers completely?
Jens Gustedt 00:06:18 This will not do away with dangling pointers by themselves. This will do away with, you have two parameters to the same function which are pointers and then the question is whether these can point to the same thing. And so, the compiler has to be careful or not whether when you change one of these things will change the other at the same time.
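To make the two-pointer-parameter case concrete, here is a minimal sketch in C. The names `add_twice`, `distinct_case` and `aliased_case` are hypothetical and not taken from the provenance specification; the sketch only shows why the compiler has to care whether two pointers can refer to the same object.

```c
#include <assert.h>

/* Hypothetical helper: adds *src to *dst two times.
 * If dst and src may alias, the compiler has to reload *src after the
 * first store, because that store might have changed it. */
void add_twice(int *dst, const int *src) {
    *dst += *src;
    *dst += *src;
}

/* The two pointers refer to different objects. */
int distinct_case(void) {
    int a = 1, b = 2;
    add_twice(&a, &b);          /* 1 + 2 + 2 */
    assert(a == 5);
    return a;
}

/* Both pointers refer to the same object. */
int aliased_case(void) {
    int a = 1;
    add_twice(&a, &a);          /* 1 + 1 = 2, then 2 + 2 */
    assert(a == 4);
    return a;
}
```

If the compiler can prove that the two pointers have different provenance, it may keep `*src` in a register across both additions; if it cannot, it must assume the aliased behavior is possible.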
Gavin Henry 00:06:41 Okay. That sounds a lot like what Rust enforces with borrow checking and things like that, and sort of what garbage-collected, memory-managed languages might do where they track every variable.
Jens Gustedt 00:06:53 Yeah, this is related to these kinds of questions. In C, yeah, you only have these things implicit, and so it’s basically the compiler that is then able to prove some things, and this is where things get technically really nasty. So, we need a really good model, and there’s a real formal specification in the mathematical sense in that paper, which was a really difficult thing to do.
Gavin Henry 00:07:20 So we could get in a situation where once you’re using an updated compiler, it could find all these bugs for you because it’s tracking everything at that point.
Jens Gustedt 00:07:28 And so one thing to note is also that in the last five years compilers have gotten really much better than they were 10 years ago or something like that. They’re really able to track things down much more easily and to warn you about stuff and things like that. So, there has been a real leap forward, I think, in what compilers are capable of, at least the compilers I know best, which are these public domain compilers, GCC.
Gavin Henry 00:08:00 Exciting times huh?
Jens Gustedt 00:08:01 Yeah, it is, we are in exciting times.
Gavin Henry 00:08:04 So our latest standard is C23 and that’s what your book’s been updated to K24. Can you give us a pitch of what it is? C23?
Jens Gustedt 00:08:15 Yeah, so C23 is the first major edition of C since C11, so 12 years. So you have to keep in mind how long the development cycle for C has been. There was an intermediate standard with basically bugfixes in 2017, and so C23 is the first which has really new features compared to C11. Among these features are small ones and big ones. There’s really a lot. I could point you, for example, to one of the things you probably don’t even notice, which is that we reduced the possible sign representations to two’s complement. Before there had been other possibilities, and now in C we restrict to two’s complement. Everybody uses two’s complement anyhow, and you learn that in school and everything, but now it’s official in C23 that this is the only way to represent signs. We changed some spelling of keywords. We added the constexpr facility, we added auto and typeof for type inference, and we added attributes, which give you the possibility to annotate constructs in the language in a systematic way.
Gavin Henry 00:09:36 Yeah, we’ll be going over some of those shortly in the next section.
Jens Gustedt 00:09:39 Yeah, we updated the Unicode support. So, identifiers in C now are really ruled by Unicode. For what’s possible to have in an identifier, there are clear Unicode rules, and before we had some home-brew tables of what was possible and what not. And so, we changed to be conforming to this other standard. We also added more facilities to the standard C library to transform between different encodings. We added bit-fiddling utilities. Everybody probably knows how to do a pop count, so counting the set bits of an integer, and everybody probably does it wrong if they only have half an hour to program this. So, this gives you a unique way to do that with a function call, which is then optimized by your implementer for the platform that you have. We have checked integer arithmetic, so you can check for overflow, and we have added const safety for some library functions. So, it’s just a rough, very rough overview.
Gavin Henry 00:10:49 Thank you. We’ll dig in a bit more in the next section. So how does that work? Say you’ve just explained there’s a new function to help count bits in an integer. So practical wise that would appear in glibc among other places?
Jens Gustedt 00:11:05 This appears, or let’s say, all the different compilers already had or have it with different names, and so we are just putting some glue on top of it, which is a unique name. And glibc then would provide that in a header too. So you can,
Gavin Henry 00:11:25 You can get an updated version of GCC that comes with the headers.
Jens Gustedt 00:11:29 Yeah. So you can use that, portably for different.
Gavin Henry 00:11:34 And that function is then defined in the GCC source code? Someone’s written it?
Jens Gustedt 00:11:39 Yes. Most of these functions actually exist, for example in GCC, as built-ins. So, this will not even give rise to a function call at the end. This will be directly transformed into some assembly by the compiler. This is already built in, with this prefix __builtin_ and something behind it. GCC has a lot of them, and what the header of glibc then would do is only give you basically a macro interface which translates to this built-in of the compiler.
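The "glue on top of a built-in" idea can be sketched as follows. The standard name `stdc_count_ones` and the header `<stdbit.h>` are C23; the fallback macro is only an assumption about what such glue might look like on GCC or Clang when the header is missing, not how glibc actually implements it. `naive_count_ones` is a hypothetical hand-written loop for comparison.

```c
#include <assert.h>

/* Use the C23 header when the C library ships it. */
#ifdef __has_include
#  if __has_include(<stdbit.h>)
#    include <stdbit.h>
#  endif
#endif

/* Assumed fallback for GCC and Clang: map the standard name onto the
 * long-standing built-in. */
#ifndef stdc_count_ones
#  define stdc_count_ones(x) ((unsigned)__builtin_popcountll(x))
#endif

/* Hand-written loop for comparison: clears the lowest set bit each round. */
unsigned naive_count_ones(unsigned long long v) {
    unsigned n = 0;
    while (v) {
        v &= v - 1;
        n++;
    }
    return n;
}

/* Usage: both approaches agree, but only one has a portable name. */
void count_ones_demo(void) {
    assert(stdc_count_ones(0xF0u) == 4u);
    assert(naive_count_ones(0xF0u) == 4u);
}
```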
Gavin Henry 00:12:13 And the GCC team would’ve done all the work inside?
Jens Gustedt 00:12:16 Yeah, the GCC team had done most of this work already before.
Gavin Henry 00:12:21 And then to validate that, they’d look up the C spec and make sure the inputs and outputs match what the spec says should happen.
Jens Gustedt 00:12:26 Yeah. Exactly, exactly. So this is the bit of work they have to do because when you standardize things like that, everybody has chosen a different name for these kind of functions and so we have to agree upon one name for a specific function and then everybody has to do some homework getting things right, with exactly the right name and right order of parameters and things like that.
Gavin Henry 00:12:52 The stakeholders and all the different compilers would be represented in the C committee, is that how?
Jens Gustedt 00:12:57 Not all but some of them, yeah.
Gavin Henry 00:12:59 Some of them. And probably not for this function but for other functions. Would the internal implementation of these functions obviously be different depending on what team has put them into GCC or Clang or wherever?
Jens Gustedt 00:13:12 Yes. Even for these functions, these could be different because they might depend a lot on the processor that you have. So, the processor, say an Intel versus an ARM, may have different primitives, and so the goal for the compiler team would be to match that to the good primitive, which makes this quick and easy. And there are certainly different choices to make for the implementation of certain things. And what’s only guaranteed by the standard is the semantics of such a call, basically that it does the same thing.
Gavin Henry 00:13:46 Would there be situations where your project or program would be faster, not compile speed, but actually at runtime, if it’s built with GCC or Clang because of those internal implementations?
Jens Gustedt 00:13:59 Yes, sure. There are always differences in these things between the different compilers.
Gavin Henry 00:14:04 And that’s something that you have to figure out I suppose when you do benchmarking?
Jens Gustedt 00:14:08 Yeah, if you do benchmarking you can figure this out.
Gavin Henry 00:14:11 It’s nice to think about how it goes from standard to implementation and actually getting in your hands, and how that could be different as well. And if you’re on the GCC team or just a C programmer in general, how do you recommend people keep up to date with what’s in the standard? Do they just publish the standard and we’re all left to go and read it, or is there another source of information that says these things would benefit you, or does it just trickle downstream until your compiler says something?
Jens Gustedt 00:14:42 No, it’s really a process of compiler people adding things to their compilers because they have a demand. So usually when there’s a new feature, there are people that would like to have it, then compiler vendors implement it, and if a feature appears in several compilers and the community says, oh yeah, this one is now something that we think is stable, then we are going to standardize that. So, this is really a feedback loop in that sense.
Gavin Henry 00:15:14 Ah, okay. So, it’s back to front.
Jens Gustedt 00:15:15 The standardization often then is only, yeah, giving a unique name to that function or something like that.
Gavin Henry 00:15:23 That makes sense because you’re not just going to make up stuff in a standard just because you wanted something to do. Yeah, it’s because you’re trying to
Jens Gustedt 00:15:30 There’s tendencies to do so also for certain things. But for all what I said, at least for the C library, this is usually really the case that this comes from somewhere and is then standardized.
Gavin Henry 00:15:43 Okay, thank you. So is there going to be such a thing as C26 or is it going to be C34 or how do you think that will pan out?
Jens Gustedt 00:15:54 Yeah, so there will be no C26. Okay. We are working on C2Y, so C26 would not exist. C2Y where Y is not yet fixed is in the works. So, there are already a lot of papers accepted for that. There’s an intermediate version of the standard where the editor has already integrated most of these changes. But this would certainly take us at least two or three years still. So, this might be 28, 29, something like that.
Gavin Henry 00:16:28 That’s pretty good though because the last one C23, that’s another seven, six years for the next one.
Jens Gustedt 00:16:33 Yeah. So, six years would be our usual schedule, let’s say, by experience. But this time this will not be just a bugfix release; this time this will be a real feature release also.
Gavin Henry 00:16:47 Okay. The one before was C11, so it’s quite fast.
Jens Gustedt 00:16:49 There was C17 in between, which was this bugfix release.
Gavin Henry 00:16:54 Okay. Yeah. And is there any major, you said there’s new features in C2Y?
Jens Gustedt 00:17:00 Yeah, there are some new features. For example, one thing we get from C++ actually is the definition of variables inside if and switch statements. I don’t know if you know about that feature: in the expression part of an if in C++ you can first declare a variable and then evaluate that variable. For example, if you’re making a system call, you can assign the result of the system call to a temporary variable which only lives inside the if statement and then check the value of that.
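As a rough illustration of what this feature replaces, the sketch below shows the pre-C2Y workaround of an extra block to limit the temporary's scope. `parse_digit` and `digit_or_default` are hypothetical names, and the C2Y form in the closing comment is an assumption modeled on the C++17 syntax; it is not compiled here.

```c
#include <assert.h>

/* Hypothetical stand-in for a call that can fail: returns -1 on error. */
static int parse_digit(char c) {
    return (c >= '0' && c <= '9') ? c - '0' : -1;
}

/* C23 and earlier: an extra block keeps the temporary out of the
 * enclosing scope. */
int digit_or_default(char c, int fallback) {
    {
        int d = parse_digit(c);
        if (d >= 0) {
            assert(d <= 9);     /* parse_digit never returns more than 9 */
            return d;
        }
    }
    /* d is no longer visible here */
    return fallback;
}

/* With the C2Y feature, the same logic could presumably be written in
 * the C++17 style (syntax assumed, not compiled here):
 *
 *     if (int d = parse_digit(c); d >= 0) {
 *         return d;
 *     }
 */
```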
Gavin Henry 00:17:37 Yeah, recently I’ve been doing mainly Rust programming. And that’s something you do in Rust quite often.
Jens Gustedt 00:17:42 Yeah. So, this is the kind of neat feature which is not very difficult to implement. We have added things to the pre-processor; for example, we now have the counter feature. So, you have __COUNTER__, which gives you a new number each time it’s evaluated in the pre-processor. So, you can use that for, say, uniquely naming some variables which are produced by macros or something like that. We have removed imaginary types. So, before there was a concept of imaginary types which nobody implemented.
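A small sketch of the unique-naming use case, relying on `__COUNTER__` as it already exists as an extension in GCC, Clang and MSVC. The macro names `UNIQUE`, `SWAP` and `SWAP_IMPL` are hypothetical and chosen for illustration only.

```c
#include <assert.h>

/* Two-level concatenation so that __COUNTER__ is expanded before pasting. */
#define CONCAT_(a, b) a##b
#define CONCAT(a, b)  CONCAT_(a, b)
#define UNIQUE(prefix) CONCAT(prefix, __COUNTER__)

/* The helper receives one freshly generated name and uses it for the
 * temporary, so it cannot clash with the caller's own variables. */
#define SWAP_IMPL(T, a, b, tmp) \
    do { T tmp = (a); (a) = (b); (b) = tmp; } while (0)
#define SWAP(T, a, b) SWAP_IMPL(T, a, b, UNIQUE(swap_tmp_))

/* Usage: each expansion of SWAP gets its own temporary name. */
int swap_demo(void) {
    int x = 1, y = 2;
    SWAP(int, x, y);            /* x == 2, y == 1 */
    SWAP(int, x, y);            /* back to x == 1, y == 2 */
    assert(x == 1 && y == 2);
    return x * 10 + y;
}
```

The generated name is passed down as a macro argument because evaluating `__COUNTER__` three times inside one macro body would produce three different names.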
Gavin Henry 00:18:25 Is this one of the things that appeared in the spec first and then didn’t appear?
Jens Gustedt 00:18:31 Yeah. So at one point in history there was one implementation for that. But the way complex numbers were described, they were basically described as being a real part plus this imaginary part, and how this imaginary part, which is then imaginary in both senses of the term, would look, and this complicated the write-up of the standard a lot and made it really difficult to read at some point. And so we decided that we get rid of that and removed all that from the standard and have just complex numbers. We have added complex literals, which surprisingly didn’t exist before. So now if you just have a real number with the suffix i, then you get a complex number with this number as imaginary part, things like that. And there are a lot of things which are already voted in favor but still need adjustment before they can really be included. So there’s still a long way to go for this. We have a really large backlog of papers, of which we can only treat so many in the sessions when we’re meeting.
Gavin Henry 00:19:47 The only thing I remember about imaginary numbers is at university with engineering, just i and j used to represent them.
Jens Gustedt 00:19:54 Yeah, yeah.
Gavin Henry 00:19:58 Yeah. So, these things that are getting changed, what’s the process for that? You said they publish a paper; you review it as a board and then you chat about it. Or I think we might have spoken about this last time we spoke, but in the next two years or C28 or 29, what goes on in those years?
Jens Gustedt 00:20:16 As I said, for example this counter pre-processor feature is basically already implemented in every compiler. So, this exists. So, you have to write a paper to first advertise why you want that, and then you have to provide text for how this would be integrated into the standard. And then this usually goes over some iterations with the committee. There’s discussion on the email list and then there’s discussion in session.
Gavin Henry 00:20:49 Is this like a working group type thing? The mail list?
Jens Gustedt 00:20:52 There are working groups, but most of the things done in C are done in plenary. We are only about 30 people or so who meet regularly.
Gavin Henry 00:21:01 So somebody writes a paper and publishes it to that list for you all to consider?
Jens Gustedt 00:21:06 Yeah.
Gavin Henry 00:21:07 I see. Okay. I was just trying to figure out where people find those papers, but it’s all a proper process.
Jens Gustedt 00:21:12 There’s a website for the C committee where you can find all these papers.
Gavin Henry 00:21:17 We can put that. Okay, I’ll get that off you later.
Jens Gustedt 00:21:18 Yeah, we can put that in links too and you can see all revisions of all papers and it’s really long list.
Gavin Henry 00:21:27 Okay, I think we’ve covered our introduction a bit more than that. So, we might speed through the next section. So, I’ve called this section the latest C standard [C23] and upcoming changes. So, I think we’ve touched on the upcoming changes quite a lot with C2Y, which sounds really cool. But let’s pull apart some of the things we just mentioned. So, I’ve got a few bullet points here that I’ll take you through as questions. So why are these improvements to integers and enums that are in C23 needed? And I’ve got a quote here where you say, in the updated version of your book intro (or the marketing pitch for it, I think), there are new bit-precise types coined _BitInt(N), new C library headers <stdckdint.h> for arithmetic overflow checks and <stdbit.h> for bit manipulation, possibilities for 128-bit types on modern architectures, et cetera. So immediately I thought, integers are integers; what’s changed and why are these things needed? So could you give me some insight into those?
Jens Gustedt 00:22:35 First of all, integers are not integers. There are 15 different integer types (standard integer types), and you have a lot of problems with overflow and with conversion, so whether the conversion is narrowing or widening and things like that. So, because we don’t have a BigInt type as other languages have, you are always reduced to some given precision, and whatever that precision is that you choose, you really need to be aware of it. If you are going too far towards big numbers, then you have wraparound or something like that. You have the problem of, if you are doing, for example, a comparison between two integers when one of these integers is signed and the other is unsigned, what should the semantics of this comparison be, and so on. So, it is not completely trivial in C how to deal with these, and we already talked a little bit about this bit manipulation thing. So, this offers new opportunities, new interfaces.
Gavin Henry 00:23:46 So that would be the <stdckdint.h> header?
Jens Gustedt 00:23:49 Yeah, this is this thing. Standard checked integer gives you an interface where you can check for overflow. So, you basically do arithmetic, say an addition between two integers, and the functions in this header will not only give you the result of that addition but will also give you a flag back saying whether or not you had an overflow in the addition. So, you can do checks, you can abort your program if the numbers are getting too large, or whatever you want to do when this goes down the drain.
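The signed versus unsigned comparison problem can be shown in a few lines. This is a minimal sketch with hypothetical function names; the first function spells out with an explicit cast what the language does implicitly when an `int` is compared with an `unsigned`.

```c
#include <assert.h>
#include <stdbool.h>

/* What a mixed comparison `s < u` effectively does: the signed operand is
 * converted to unsigned first, so -1 becomes UINT_MAX. */
bool less_as_converted(int s, unsigned u) {
    return (unsigned)s < u;
}

/* A value-correct comparison: any negative number is below any unsigned one. */
bool less_by_value(int s, unsigned u) {
    return s < 0 || (unsigned)s < u;
}

/* Usage: the two functions disagree exactly when s is negative. */
void comparison_demo(void) {
    assert(!less_as_converted(-1, 1u));   /* surprising: -1 is "not less" */
    assert(less_by_value(-1, 1u));        /* mathematically expected */
}
```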
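The "result plus overflow flag" interface can be sketched with `ckd_add` from the C23 header `<stdckdint.h>`. The fallback to `__builtin_add_overflow` is an assumption that applies to GCC and Clang when the header is not available, and `saturating_add` is a hypothetical helper chosen as one possible reaction to overflow.

```c
#include <assert.h>
#include <limits.h>

#ifdef __has_include
#  if __has_include(<stdckdint.h>)
#    include <stdckdint.h>
#  endif
#endif
/* Assumed fallback for GCC and Clang without the C23 header. */
#ifndef ckd_add
#  define ckd_add(r, a, b) __builtin_add_overflow((a), (b), (r))
#endif

/* Hypothetical helper: adds two ints, saturating at INT_MAX or INT_MIN
 * instead of overflowing. */
int saturating_add(int a, int b) {
    int sum;
    if (ckd_add(&sum, a, b)) {
        /* The flag says the mathematical result did not fit into int.
         * That can only happen when both operands have the same sign. */
        assert(a != 0);
        return (a > 0) ? INT_MAX : INT_MIN;
    }
    return sum;
}
```

Aborting, returning an error code, or switching to a wider type would be equally valid reactions; the interface only reports the overflow and leaves the policy to the caller.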
Gavin Henry 00:24:29 So up until this was standardized, people would be doing this anyway but just have created something themselves?
Jens Gustedt 00:24:34 Yeah, well, again there were built-ins in compilers which did that. So, this wasn’t so easily portable, or people did their checks by themselves, with all the possibilities of getting this wrong. So, if you have time, look up the questions on how to do that on Stack Overflow, for example: you have a lot of answers and a lot of them are wrong. So, you have some special cases which are covered. And the idea here really is to have that done, on one hand, correctly, which is already a good thing. And then also, as for the bit manipulation part, using the right assembly instruction for that, because most processors already do that correctly. They have this overflow bit, and you just have to do the addition in the assembler and then read the overflow bit to see whether or not this worked.
Jens Gustedt 00:25:34 This is much more efficient and much safer. This is basically the idea behind it. The other thing which you also mentioned is this thing with 128-bit types. The problem has been that we actually had an upper limit on the width of integers, which was the type uintmax_t, which on most modern architectures was 64 bits. And on the other hand, there are really applications where you’d like to have wider types, for example for cryptography or checksum calculations or things like that. And the modern processors all implement this 128-bit arithmetic, but there was no way to map it correctly into the C library or the C language, because we had this artificial restriction on how wide an integer could be. So, this was a lot of work of fiddling with the standard text, actually, to allow that to happen and to now offer the possibility to go beyond these limits that we have.
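To illustrate why a 128-bit type is convenient, here is a sketch that computes the upper 64 bits of a 64 by 64 bit product. It assumes the GCC and Clang extension type `unsigned __int128` (detected through `__SIZEOF_INT128__`), which is not the standard spelling, and falls back to a manual 32-bit decomposition elsewhere. `mul_high` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Upper 64 bits of the full 128-bit product a * b. */
uint64_t mul_high(uint64_t a, uint64_t b) {
#ifdef __SIZEOF_INT128__
    /* Extension type, assumed available on 64-bit GCC and Clang targets. */
    static_assert(sizeof(unsigned __int128) == 16, "128-bit type expected");
    return (uint64_t)(((unsigned __int128)a * b) >> 64);
#else
    /* Portable fallback: split both operands into 32-bit halves. */
    uint64_t a_lo = a & 0xFFFFFFFFu, a_hi = a >> 32;
    uint64_t b_lo = b & 0xFFFFFFFFu, b_hi = b >> 32;
    uint64_t lo_lo = a_lo * b_lo;
    uint64_t hi_lo = a_hi * b_lo;
    uint64_t lo_hi = a_lo * b_hi;
    uint64_t hi_hi = a_hi * b_hi;
    /* Sum of the middle terms plus the carry out of the low word. */
    uint64_t cross = (lo_lo >> 32) + (hi_lo & 0xFFFFFFFFu) + lo_hi;
    return hi_hi + (hi_lo >> 32) + (cross >> 32);
#endif
}
```

The single-line branch is what a wide integer type buys: the fallback is exactly the kind of hand-written code that is easy to get subtly wrong.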
Gavin Henry 00:26:50 And does the term wide int, is wide the bit size?
Jens Gustedt 00:26:53 Yeah, wide is the bit size, yeah.
Gavin Henry 00:26:56 And do you get input from going back to the standard from CPU companies that say we now support this, we should put something in the standard or did they release a tool set with the CPU as well?
Jens Gustedt 00:27:09 Yes and no. We have one big player which is really invested also in the C standard, which is Intel, but they are mainly invested because of their compilers. So they’re really supporting Clang and they’re really interested in C from that perspective. We had these _BitInt(N) types, which actually also came from Intel, but for a very different reason than one might think. They had this because they wanted to have such a type in FPJS and things like that, where you really want to be able to design your integer type with every bit, such that you can lay out your network of computing correctly in this FPJS with exactly the resources that…
Gavin Henry 00:28:02 It’s not an FPGA? I haven’t heard that word since university and I think it’s Field-Programmable Gate Arrays I was thinking of. But that’s not what you said.
Jens Gustedt 00:28:12 Yeah, but that’s what I meant so okay.
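A bit-precise type with exactly the width a design needs might look like the following sketch. It assumes a compiler that predefines `__BITINT_MAXWIDTH__` when `_BitInt` is supported (recent GCC and Clang do), and falls back to masking otherwise. The 12-bit width, the typedef `u12` and the function `next12` are hypothetical choices for illustration.

```c
#include <assert.h>

#ifdef __BITINT_MAXWIDTH__
static_assert(__BITINT_MAXWIDTH__ >= 12, "12-bit _BitInt expected");

/* C23 bit-precise type: exactly 12 value bits, arithmetic wraps at 4096. */
typedef unsigned _BitInt(12) u12;

unsigned next12(unsigned v) {
    u12 x = (u12)v;             /* conversion reduces modulo 2^12 */
    x += (u12)1;                /* 12-bit arithmetic, no promotion to int */
    return (unsigned)x;
}
#else
/* Fallback for compilers without _BitInt: emulate the width by masking. */
unsigned next12(unsigned v) {
    return (v + 1u) & 0xFFFu;
}
#endif
```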
Gavin Henry 00:28:14 FPGAs. Yeah I remember doing that in my degree. Do you think these wider types, are they getting driven by machine learning type stuff, AI type stuff or where’s the need for them coming from?
Jens Gustedt 00:28:27 Machine learning actually goes in the inverse direction, which is not yet covered by the C standard. The machine learning guys want to have floating point which is really short. So, for them float, for example, with its 32 bits is already much too large. They want to have floating point sometimes with 8 bits only or something, because they really want to do approximate computations very, very fast and with a really low memory footprint for everything, et cetera, to do more and more. And then they only need to do some steps for the real precision afterwards. So, they really have that, and this is one of the main reasons we haven’t standardized it: I think for 8-bit floating point there are at least 10 different specifications by different providers, which are all different in some aspect somewhere, in what the mantissa is or how they represent things. So, this is probably something we will see standardized at some point, but there’s still a long way to go because everybody now has their own 8-bit floating point implemented somewhere and it’s not compatible with what the other guys are doing, and so it’s not easy to standardize.
Gavin Henry 00:29:53 No, that’s interesting. I thought maybe it was like the vectors and things they use and just to speed everything up.
Jens Gustedt 00:30:00 No, it’s more the crypto guys, things like that. So, it’s really interesting for crypto to at least to have bit operations on large vectors, and this works quite well for them.
Gavin Henry 00:30:13 Excellent. Okay, so I’ll take us through a couple of more bullet points but we’ve still got a bit to cover because I want to talk about additions to your book, so we might move a bit quicker through these ones. So, what is a null pointer constant in regard to the standard?
Jens Gustedt 00:30:28 Yeah, so before we had NULL, all in capitals, which is just a macro which was somehow ill-designed or not well-specified. So, underneath the hood this can be different things, and this led to potential problems, for example when you use it as an argument to printf or something like that. So, we had this sitting there; we cannot easily change things like that because it’s used by everybody. And so, we invented something new which doesn’t have this ambiguity which all-capital NULL has. So we integrated that really into the language and not into the macro pre-processor.
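The variadic-argument problem can be sketched with a sentinel-terminated argument list. If `NULL` happens to expand to a plain `0`, an `int` is passed where the callee reads a pointer; a constant that always has pointer semantics avoids that. The macro `END_OF_LIST` and the function `count_strings` are hypothetical, and the pre-C23 branch is a stand-in for compilers without `nullptr`.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* C23 provides nullptr as a language keyword; older modes use a cast. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
#  define END_OF_LIST nullptr
#else
#  define END_OF_LIST ((void *)0)
#endif

/* Counts string arguments up to, but not including, the null sentinel. */
size_t count_strings(const char *first, ...) {
    size_t n = 0;
    va_list ap;
    va_start(ap, first);
    for (const char *s = first; s; s = va_arg(ap, const char *)) {
        n++;
    }
    va_end(ap);
    return n;
}

/* Usage: the sentinel always has pointer size and representation. */
void sentinel_demo(void) {
    assert(count_strings("a", "b", END_OF_LIST) == 2);
}
```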
Gavin Henry 00:31:14 All caps NULL macro. Has that remained the same since C was first created 50 years ago or so?
Jens Gustedt 00:31:21 Yeah, basically.
Gavin Henry 00:31:22 So all of a sudden in the 20s they decided no we need something else. It’s never come up before?
Jens Gustedt 00:31:28 It has come up before. So this nullptr comes from C++. C++ did this 12 years ago at least, I think in C++11, I don’t remember exactly, because they had exactly the same problem, since they inherited that from us. They invented that, and so we just applied that same change to C.
Gavin Henry 00:31:54 Cool. And what are syntactic annotations with attributes?
Jens Gustedt 00:32:00 So, perhaps people know this __attribute__ feature from GCC, and Clang also implements that, where you can add properties to a function, to a parameter, to a variable. There are a lot of attributes like that which were implemented before. The problem with the approach that they had was that it was syntactically not clear to which construct an attribute applies. So, sometimes you put the attribute in front of the function declaration or behind the function declaration, and these different places had different meanings, and it was something, yeah, which engineers do. So, somebody invented something here and somebody there, and then you had a lot of these kinds of features adding up. And the attributes, which is also a C++ feature which we adapted to C, are now a syntactic annotation where it’s unambiguous to which construct an attribute applies.
Jens Gustedt 00:33:12 Basically, the big rule is, just as for other things, it always applies to the left. So, if you put an attribute, for example, after a function declaration, after the closing parenthesis, this would apply to the type of the function; if you put it after the identifier, it would only apply to the specific identifier, and things like that. So, you can annotate things like that. There are some standard attributes. For example, something like deprecated: you can mark an identifier as being deprecated, and so the compiler will tell the user who uses it, at every compilation, hey guys, remember, this one you should not use. And then there are a lot of vendor-specific attributes where compiler vendors have the possibility to invent everything they need, whatever they want to add to their compilers for flow analysis or for whatever feature they want to have. But it’s clear by the syntax to which structure of the program this applies.
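A brief sketch of the C23 attribute syntax with two standard attributes, `deprecated` and `maybe_unused`. The wrapper macro `ATTR` is a hypothetical convenience that expands to nothing on compilers that do not report support through `__has_c_attribute`; the function names are also made up for illustration.

```c
#include <assert.h>

/* Use the C23 attribute syntax only when the compiler reports support. */
#ifdef __has_c_attribute
#  if __has_c_attribute(deprecated) && __has_c_attribute(maybe_unused)
#    define ATTR(x) [[x]]
#  endif
#endif
#ifndef ATTR
#  define ATTR(x)
#endif

/* Attribute at the start of the declaration: marks the function itself.
 * Callers get a diagnostic with this message at every compilation. */
ATTR(deprecated("use rect_area instead"))
int old_area(int w, int h) {
    return w * h;
}

/* Attribute after an identifier: applies only to that identifier, here
 * the parameter `unit`, which is intentionally not used. */
int rect_area(int width, int height, int unit ATTR(maybe_unused)) {
    return width * height;
}

/* Usage: calling rect_area is silent, calling old_area would warn. */
void attribute_demo(void) {
    assert(rect_area(3, 4, 0) == 12);
}
```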
Gavin Henry 00:34:27 I’ve got four bullet points coming up. I’m going to just say them all out loud and then you can pick two of your favorites because we need to move on I think to talk about your book.
Jens Gustedt 00:34:36 Okay.
Gavin Henry 00:34:37 So the first of the four is what is generic programming with regards to type inference with auto and typeof? What is default initialization with curly brackets? What are compound expressions with lambdas? And what is so-called internationalization? So, which two of those get you excited the most that you’d like to talk about?
Jens Gustedt 00:35:00 I won’t talk about compound expressions and lambdas because they’re not even in the standard yet, and I won’t talk about internationalization because we already talked about Unicode and things like that a little bit. So let’s talk about generic programming, auto and typeof, and also initialization with braces; auto and the braces are both features we also inherited from C++. So auto is the possibility to keep your code consistent: if you have a function which depends on a parameter x, and this has some type and you don’t remember that type, but you need another variable, an auxiliary variable somewhere, which depends on that and which should have the same type, you just type auto y = x, and then y gets exactly the same type and value as x and then you can continue.
Gavin Henry 00:35:57 Yeah, it’s kind of like a dynamic language then, isn’t it?
Jens Gustedt 00:36:00 This just helps you to stay consistent. So, if anybody changes x during development, then y will change automatically with it. And typeof is a little bit more complicated. It takes an expression or a type as an argument and then derives the type from that. This is useful in a lot of macro development or things like that, where you like to be able to deduce types automatically from what you get as macro arguments, for example. Then default initialization with braces. Before, for default initialization in C, you always had to put at least one initializer into these braces. So, if you have an array A of type double of size five, you would have to put at least one zero into these braces for initialization, and the rest would then be default initialized. And now you can just omit everything and just put the braces, and by that you are saying that every member of your array is default initialized. And this even works for the so-called VLA, variable length arrays. So even for arrays where you don’t know the length, where it’s determined not at compile time but at runtime, you can do that. So, the compiler will produce code that does this initialization for you. You don’t have to write a for loop to initialize all your elements, which you had to do before.
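A small sketch of the three features just described: auto, typeof in a macro, and empty-brace initialization. It assumes GCC or Clang; the fallback at the top is an assumption of this sketch so that it also builds outside C23 mode, and the names SWAP, three_times and second_of are invented for the example. The variable length array case is left out to keep the sketch buildable on older compilers.

```c
#include <stddef.h>

// Fallback for compilers not yet in C23 mode (assumption: GCC or Clang, which
// offer __auto_type and __typeof__ as extensions). In C23 mode this is skipped.
#if !defined(__STDC_VERSION__) || __STDC_VERSION__ < 202000L
# define auto __auto_type
# define typeof __typeof__
#endif

// typeof deduces the type of a macro argument, so SWAP works for any
// assignable type without naming it.
#define SWAP(a, b) \
    do { typeof(a) swap_tmp = (a); (a) = (b); (b) = swap_tmp; } while (0)

double three_times(double x) {
    auto y = x;            // y gets the type (double) and the value of x
    double acc[5] = {};    // empty braces: all five elements start at 0.0
    acc[0] = y;
    acc[4] = 2 * y;
    double total = 0.0;
    for (size_t i = 0; i < sizeof acc / sizeof acc[0]; ++i) {
        total += acc[i];
    }
    return total;
}

int second_of(int first, int second) {
    SWAP(first, second);   // typeof(first) is int here
    return first;
}
```

If the parameter of three_times were later changed to float or long double, y would follow along without further edits, which is the consistency argument made above.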
Gavin Henry 00:37:42 To save the programmer some time?
Jens Gustedt 00:37:44 Yeah, and to ensure that it’s consistent. It’s always these two aspects, actually: you really have to be sure that people don’t make too many errors.
Gavin Henry 00:37:56 So with these types of things, if you switch to using them, would you feel confident in using them or would you need to create some tests to make sure they’re doing what they should be or how do you advise people to get rid of code? Basically delete stuff because it’s now done automatically somewhere.
Jens Gustedt 00:38:12 So first, again with these braces, compilers have implemented that for a long time, because C++ also has that. So it’s already there and the compilers work, people use it, and now this gives you the official stamp that this is how it’s supposed to be, and you can change all your code to that without any difficulties. So even if you have to backport it to a slightly older version of your compilers, this should work.
Gavin Henry 00:38:44 Excellent. And you said compound expressions and they aren’t really standardized yet?
Jens Gustedt 00:38:49 Yeah. Unfortunately, I put a lot of work into lambdas, and it was not taken for C23. I’m not sure that we will get something like that for C2Y. I’m not sure. People are very divided on that, and then it’s really difficult to make progress.
Gavin Henry 00:39:08 What’s the use case for it in C?
Jens Gustedt 00:39:11 The use case for?
Gavin Henry 00:39:13 The lambdas and compound expressions in C, why do you want it?
Jens Gustedt 00:39:16 Yeah, lambdas exist in many languages. So for example in C++ they are a very useful tool if you have callbacks which you want to pass into a function. In C, the official way now is always that you have to write this callback as a separate function, outside of the function you are currently writing, far away from where you’re going to use it, and then you pass a function pointer to that function as a callback somewhere else in your code. So, you get these things very far apart, and that makes it difficult to maintain and everything. So, a lambda would be some sort of short function which you could actually put in place where you pass it as a callback, and so everybody would see immediately at that place: here I’m doing a callback for, I don’t know, a button press or whatever, and you see how that works and where that is, all in one place.
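The current callback style can be sketched with the standard library function qsort. The helper names compare_ints and sort_ints are made up for this example. Note how the comparison logic has to live in its own named function, away from the call that uses it.

```c
#include <stdlib.h>

// The callback has to be a separate, named function at file scope.
static int compare_ints(const void *lhs, const void *rhs) {
    int a = *(const int *)lhs;
    int b = *(const int *)rhs;
    return (a > b) - (a < b);   // negative, zero, or positive; avoids overflow
}

// ...possibly many lines later, the function pointer is passed as a callback.
void sort_ints(int *values, size_t count) {
    qsort(values, count, sizeof values[0], compare_ints);
}
```

With lambdas, the body of compare_ints could be written directly at the qsort call; since that feature is not part of C23, no syntax for it is shown here.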
Gavin Henry 00:40:18 I suppose a lot of arguments against these types of things is that your IDE will help you find things.
Jens Gustedt 00:40:23 Yeah, this is part of the argument, and then there are more or less small design questions on how to do that, how to do that syntactically. The question is, if you have a lambda in a certain place, what does this lambda see from the surroundings? Is it possible to access local variables, for example, in the surroundings or not? Or what would be the mechanism to integrate them? And there are people, it’s a little bit exaggerated, but sometimes you think people would kill each other for these things.
Gavin Henry 00:40:59 Well people are very passionate about what they love.
Jens Gustedt 00:41:01 Yeah, yeah. Stated like that they’re very passionate about things.
Gavin Henry 00:41:05 Yeah, that’s the nice way to say it. I suppose we should talk about the bit that you’re most excited about and that’s your book.
Jens Gustedt 00:41:12 Yeah.
Gavin Henry 00:41:13 Modern C Third Edition. I can’t believe it. I actually got the hard copy posted to me a month or two ago when it came out.
Jens Gustedt 00:41:21 Okay.
Gavin Henry 00:41:22 So was very excited to receive that. So yeah, let’s do the chat about that now. So given what we’ve just spoken about and we went quite deep in some new things that I wasn’t expecting to talk about which was great, what does the latest edition of your book bring to the C world?
Jens Gustedt 00:41:38 So it has been completely reworked for C23. So, it integrates all the shiny new features from C23. It’s difficult on the hard copy, so you would have to go to the electronic copy, but if you search for mentions of the string C23 in it, you get about 200 mentions. So, it’s all over the place, and if you really want to know what C23 is, that is a really good way to do it. You see it here and there: there is this little thing which makes things easier, where you don’t have to think about that, where you can do things more easily with another feature, etc. In that sense it’s deeply integrated now into the book. That really was also the motivation to do that.
Gavin Henry 00:42:27 It’s also really good that the standards are quite far apart in time. So, you get to update a book to the latest standard. It’s not like every couple years.
Jens Gustedt 00:42:38 Yeah, yeah, yeah, that’s right. That’s right. So the process of generating this new edition was also quite lengthy. I did it in some parts in parallel to what we did on the standard. Still, it took a lot of time to actually appear at the end.
Gavin Henry 00:42:55 How many pages did you add?
Jens Gustedt 00:42:56 Oh I don’t know. I don’t know. There’s some…
Gavin Henry 00:43:00 I’ll work it out. I’ve got both copies here.
Jens Gustedt 00:43:01 Yeah.
Gavin Henry 00:43:03 I’m a bit of a fan boy.
Jens Gustedt 00:43:05 So it’s thicker, it’s thicker, that’s for sure. But I didn’t do page counting.
Gavin Henry 00:43:10 You talk about one of the features in your book, that you’ve got an easy transition header to get across to C23?
Jens Gustedt 00:43:16 Yeah.
Gavin Henry 00:43:17 Can you take us through that a little bit?
Jens Gustedt 00:43:19 As I said previously, many of the features are already there in compilers. Often a feature has a slightly different name, or perhaps even the arguments are in a different order, or things like that. What this header does, which you can download with the examples from the book, is to provide you already with the interfaces that match that on your current, perhaps not so updated compiler. I only tested that for GCC and Clang, so I wouldn’t know how this really works for other compilers. But let’s say that these are probably the major compilers nowadays. So to give you an example, there is a macro in C23 which is called unreachable, and GCC and Clang already have a built-in, __builtin_unreachable. And so what my header does is just map this new name to the existing one. So this is a little bit the way this works. I’m not sure that you even would need that nowadays. I’ve written that, what, two years ago or something. Now these compilers have officially transitioned to C23 already with their newest versions. So you probably won’t even need that. But if you are still stuck with a Clang from two years ago or three years ago or something like that, you could use that.
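The following is a sketch of the mapping idea only, not the actual header that ships with the book's examples. It assumes GCC or Clang, where __builtin_unreachable exists; the enum color and the function color_name are invented for the illustration.

```c
#include <stddef.h>   // in C23 this header provides the unreachable() macro

// Older compiler: map the C23 name onto the existing builtin.
#ifndef unreachable
# if defined(__GNUC__) || defined(__clang__)
#  define unreachable() __builtin_unreachable()
# else
#  define unreachable() ((void)0)   // no known equivalent: plain no-op
# endif
#endif

enum color { COLOR_RED, COLOR_GREEN, COLOR_BLUE };

const char *color_name(enum color c) {
    switch (c) {
    case COLOR_RED:   return "red";
    case COLOR_GREEN: return "green";
    case COLOR_BLUE:  return "blue";
    }
    // Promise to the compiler: every valid value has returned above.
    unreachable();
}
```

Code written against the C23 name then compiles unchanged once the compiler provides unreachable itself, because the #ifndef simply skips the fallback.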
Gavin Henry 00:44:55 Yeah in the C world that’s not unheard of is it?
Jens Gustedt 00:44:58 This is not unheard of, exactly. But you can use that and you can be sure that you’re future-proof for future changes.
Gavin Henry 00:45:06 Do you know how many versions of the C standard compilers keep around in their own source code? You know, if you mark your code as C11 or C89, still C89, but
Jens Gustedt 00:45:17 GCC still has all versions of C.
Gavin Henry 00:45:21 Wow, that’s amazing isn’t it?
Jens Gustedt 00:45:23 It’s a lot of work to keep that consistent, so yeah.
Gavin Henry 00:45:27 Wow, cool. So another thing that was really exciting in your book was what you’ve called a comprehensive approach for program failure that really piqued my interest. Can you take us through that if it’s possible?
Jens Gustedt 00:45:40 That’s nice to hear, because I put a lot of brain effort into that, and one of the reasons I started writing this up was because I had the impression that there were a lot of misconceptions about what failure is in C, or what a fault is in C, and things like that. So basically, I was categorizing program failures into four main groups. The first one is wrongdoing. So it’s your fault when you do something wrong: I don’t know, you dereference a bad pointer, or you do whatever bad thing you can do in C. This is basically the easy part, where, whenever it’s possible, you should get a diagnostic from your compiler, or your program should really fail very quickly or something like that. Then there is the next layer of faults that lead to program failure, which is program state degradation.
Jens Gustedt 00:46:41 So for example, if you have storage exhaustion. So, this is generally something which you can check, if you check for example the return of system calls. So, if malloc really gives you a null back, then you know that something went deeply wrong with your program, there’s no storage available anymore, and so you really should stop and do an emergency exit from your program or something like that. So that was the second level. The third level is unfortunate incidents. Unfortunate incidents are things that you don’t really notice, which are difficult to track and things like that. For example, a race condition between two threads in a threaded program. There are two entities doing their things, two threads for example, and seen locally everybody’s doing the right thing, but together they don’t work. This is something which you can basically only avoid by really careful design.
Jens Gustedt 00:47:56 So program failures from unfortunate incidents usually only appear when you have a bad design. But what is a good and what is a bad design obviously is a really difficult question. And then you have the fourth thing, which is a series of unfortunate incidents, where you are basically caught in a bubble. You have been making decisions on your own. You’re going around the block in the city, for example, and at each corner you decide because of something to go to the left, and if you do that consistently you will always walk around the block and never get out. So you are caught in some sort of bubble because you’re taking local decisions, decisions which don’t have you make progress at some point. So this is also a really difficult design question to answer. So this is what failure is. The problem then is that you really have to distinguish the error.
Jens Gustedt 00:49:07 So if you talk about sickness or something like that, this would be distinguishing the thing which is wrong or goes wrong from the way it manifests, from the symptoms. And this is something which is often conflated by people in C, and generally in programming languages, I think. So symptoms can be compiler diagnostics, that can be compilation failure, that can be program termination, or even worse, that can be real crashes, that can be erratic behavior, some malfunctioning. All these different symptoms can appear if you have a failing program. You really have to navigate these and know why your program failed and what to do against it: first of all, try to work so that it never happens, or if you can’t avoid it, then see how to manage it if it occurs and terminate your program, for example, in a controlled way.
Gavin Henry 00:50:08 Okay. Just to summarize those four categories you created, you got what you call user errors?
Jens Gustedt 00:50:14 Yeah, wrongdoings I call them. Yeah.
Gavin Henry 00:50:17 Memory system issues.
Jens Gustedt 00:50:19 Yeah, state degradation. So, memory system issues, but this can be, I don’t know, file system full or things like that.
Gavin Henry 00:50:29 And then three would be unexpected events.
Jens Gustedt 00:50:32 Yeah.
Gavin Henry 00:50:33 And four would be a combo series of those.
Jens Gustedt 00:50:37 Yeah, a combo series of those, which makes you seem to be making progress but actually you don’t.
Gavin Henry 00:50:42 Yeah. That could be something like you see in the news when AWS goes down or something, and something triggers something else and it’s not supposed to spike that much, and then that doesn’t feed back to that and it keeps going and keeps going. That type of issue. Okay. How do we approach these four things then?
Jens Gustedt 00:50:59 So the first is discipline. For one’s wrongdoing, have a programming style, always keep to that programming style. Listen to your compiler. If it’s telling you here’s something fishy, there is probably something fishy, and so generally you should do what the compiler suggests. Against program state degradation, you always should check the return of system calls. So, every system call should either be in an if statement, or you should take the return value into a variable and then decide whether or not this was successful.
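A minimal sketch of this check-every-return discipline, using malloc as the call that can signal storage exhaustion. The wrapper name zeroed_doubles and the policy of exiting immediately are assumptions of this example; a real program might prefer a different emergency path.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

// Hypothetical helper: returns count doubles set to 0.0, or terminates the
// program in a controlled way if the request cannot be satisfied.
double *zeroed_doubles(size_t count) {
    // A zero or overflowing request is a caller error, caught early.
    if (count == 0 || count > SIZE_MAX / sizeof(double)) {
        fputs("zeroed_doubles: invalid element count\n", stderr);
        exit(EXIT_FAILURE);
    }
    double *buffer = malloc(count * sizeof *buffer);
    if (buffer == NULL) {
        // State degradation: no storage left, so stop instead of limping on.
        fputs("zeroed_doubles: out of memory\n", stderr);
        exit(EXIT_FAILURE);
    }
    for (size_t i = 0; i < count; ++i) {
        buffer[i] = 0.0;
    }
    return buffer;
}
```

Callers get either a usable buffer, which they release with free, or no return at all, so the null case cannot be forgotten further down the call chain.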
Gavin Henry 00:51:39 Does the standard or compiler help with that type of thing?
Jens Gustedt 00:51:42 Yeah, so usually in the C library, for example, for all functions you have return values and it is specified what happens when an error occurs, and so you can program against that.
Gavin Henry 00:51:57 No, but does the compiler force that you check those?
Jens Gustedt 00:52:00 There are possibilities for compilers to force you to check these, yeah. This is possible. So, these are the first two. In that sense, this depends on the quality of your code directly, how you write it, that you follow a style which is consistent in itself, and that you have the right tools, compilers or analyzers or something, which give you hints on what to do. The other two are much more difficult to handle. So, they really need careful algorithm design. For example, race conditions between threads you really have to look and learn into, and there’s a chapter in the book about threads and atomics and mutexes and all that stuff, how you can avoid race conditions occurring in the first place, for example. For the fourth, for these series of unfortunate incidents, it’s much more difficult, because you basically have to have proofs on your algorithms that there is always progress. And this is not something where in general there are rules or something for how to do that. You really have to do something here too.
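One standard C23 mechanism in this direction is the nodiscard attribute, which asks the compiler to diagnose callers that throw a return value away. The sketch below assumes a compiler that accepts C23 attributes; the function parse_port is invented for the illustration.

```c
#include <errno.h>
#include <stdlib.h>

// Hypothetical parser: returns 0 on success and stores the port in *port,
// returns -1 on any error. nodiscard makes ignoring the result a diagnostic.
[[nodiscard]]
int parse_port(const char *text, int *port) {
    char *end = NULL;
    errno = 0;
    long value = strtol(text, &end, 10);
    // Check everything strtol can report: range error, no digits consumed,
    // trailing garbage, and a value outside the valid port range.
    if (errno != 0 || end == text || *end != '\0' ||
        value < 1 || value > 65535) {
        return -1;
    }
    *port = (int)value;
    return 0;
}
```

A bare statement such as parse_port("8080", &p); should then trigger a warning, whereas if (parse_port("8080", &p) != 0) { ... } compiles cleanly.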
Gavin Henry 00:53:17 And you could probably apply those four rules that you’ve created to any language really, any programming.
Jens Gustedt 00:53:23 I guess so, yeah. Probably you could go through that for different languages, where you certainly have different kinds, and this is probably something where you could somehow rank languages: whether or not the first two points are detected automatically, or basically the syntax prevents you from making certain types of faults or something. So, C is probably on the lower end. So, you can probably try at some point to improve C and the diagnostics that are given. For the last two, basically all languages have these problems, and basically all languages have the same amount of detail or whatever with which they can help you, or don’t.
Gavin Henry 00:54:16 How do you yourself, when you’re creating this approach, say, with the threads example, do you have a set of example programs that you write to prove these things or work around them, or how do you translate this sort of real-world experience you’ve gained into your writing?
Jens Gustedt 00:54:36 For all threading, for example, and problems like this, I think the C standard gives the right tools with atomics and mutexes and things like that. So, you really only have to apply them rigorously and consistently to get things right.
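As a small illustration of one of these tools, here is a sketch of a shared counter built on C11 atomics from <stdatomic.h>. The names hit_count, hits_record and hits_total are made up for this example, and thread creation is left out so that the sketch does not depend on <threads.h> being available.

```c
#include <stdatomic.h>

// Shared between threads. With a plain int, two threads doing hit_count++
// at the same time would be a data race and increments could be lost.
static atomic_int hit_count = 0;

// Safe to call from any number of threads concurrently: the read, the
// addition and the write happen as one indivisible step.
void hits_record(void) {
    atomic_fetch_add(&hit_count, 1);
}

// Reads the current total without tearing.
int hits_total(void) {
    return atomic_load(&hit_count);
}
```

For state that spans more than one variable, a mutex around the whole update would be the tool to reach for rather than individual atomics.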
Gavin Henry 00:54:55 I was just thinking maybe you have your own C project that you break things and fix things and break things just to have a play and compare.
Jens Gustedt 00:55:02 Yeah, so what I’m currently working on is having contracts for C, which is something like that to make some guarantees about things automatically, which is, I already published a paper for that but which hasn’t yet even been discussed by the C committee. But this would be a way to go forward to have actually more tools in the language in the syntax to guarantee certain things and have automatic more or less automatic proofs for certain things. So, this would be one direction to go, but this is early stage I’d say.
Gavin Henry 00:55:42 Well, I think that’s probably time up for us. Is there anything in your book that we’ve missed that you’re really proud of, or even something not in the new edition that you want to highlight, that you think will make people’s lives much better in the C world?
Jens Gustedt 00:55:58 No, I think we covered it well. There’s so many things in the book, so you really have to think through it, learn it.
Gavin Henry 00:56:07 So obviously the C programming language and standards and world is powering on, but if there was one thing a software engineer should remember from our show, what would you like that to be?
Jens Gustedt 00:56:19 I think the most important thing is to stick to some sort of coding style or something like that. Be consistent. That’s really the most important thing. And listen to your compiler.
Gavin Henry 00:56:30 That would solve 50% of your four.
Jens Gustedt 00:56:32 Yeah. So basically compile things without warnings and you’re already quite good.
Gavin Henry 00:56:40 Yeah. Because it’s there for a reason. So pay attention.
Jens Gustedt 00:56:44 Yeah.
Gavin Henry 00:56:45 Do you think the C standard helps any other programming languages?
Jens Gustedt 00:56:49 Yes, a lot. Basically, C is not only a programming language, C is also somehow a lingua franca for how to talk between different languages. So basically, all languages that you see today interface to C at some level, and then some of the work is done in C.
Gavin Henry 00:57:10 Yeah, that’s what I think about. Like I was saying, I’m doing a lot of Rust work at the moment and I’m seeing more and more libraries be written in Rust and then they expose a C interface. And every other language then uses that interface to use that language.
Jens Gustedt 00:57:24 And often it goes the other way around. For example, for Python, once you do want to do things efficiently and fast, people create some C library which they interface from Python.
Gavin Henry 00:57:36 Yeah. And then that means everyone else can get it because it’s written in C.
Jens Gustedt 00:57:39 Yeah. It’s written in C and then easy to use in Python. So, there are these compromises which are done today.
Gavin Henry 00:57:47 Should someone learn C today that’s never programmed before?
Jens Gustedt 00:57:50 Yes, sure, certainly. It’s always among the first, second or third most used programming languages in the world.
Gavin Henry 00:58:00 I suppose for me, I kind of think of it like learning to speak English, you know, everything speaks C.
Jens Gustedt 00:58:06 You’ll get by.
Gavin Henry 00:58:07 If you can translate ideas and things from C and you can always understand what’s under the hood if you can.
Jens Gustedt 00:58:13 Yeah.
Gavin Henry 00:58:16 I don’t know where you hang out, you know? Can people follow you somewhere or how can they get in touch if they’ve got some questions?
Jens Gustedt 00:58:22 I am on Mastodon. Okay, so you would easily find me there. Yeah, also LinkedIn, but I’m not very active so basically it’s Mastodon nowadays. I try not to go on too much on these commercial platforms.
Gavin Henry 00:58:37 Thank you for coming on the show. Just as much fun as five years ago, more so. So, thanks for coming back. This is Gavin Henry for Software Engineering Radio. Thank you for listening.
[End of Audio]