C and Undefined Behavior


“The problem with programmers is that you can never tell what a programmer is doing until it’s too late.”

  -- Seymour Cray

Posted by Lelanthran

2026-02-08

See, Undefined Behaviour

Much has been said about the C programming language’s undefined behaviour.

Widely and correctly regarded as a footgun, it’s used to demonise the language, to demonise the Standards Authors, to demonise the compiler authors, and in fact to demonise nasal emissions.

The audience for my blog is rarely non-technical so I rarely descend into arcane and obscure details. I generally stick to what is already widely known, and try to add a tiny original thought of my own…

WTF is a UB?

Undefined Behaviour is, briefly, behaviour that is neither specified by the language nor required to be specified by the implementation.

As a quick example, overflowing an unsigned integer type (size_t, uint8_t, etc.) is well-defined behaviour: the result wraps around modulo the type’s range. In other words, adding 1 to a uint64_t (64 bits) holding the value 0xffffffffffffffff results in zero.
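
A minimal sketch of the well-defined case (variable names are just illustrative):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint64_t u = UINT64_MAX;                   /* 0xffffffffffffffff */
        u = u + 1;                                 /* well-defined: wraps around to 0 */
        printf("%llu\n", (unsigned long long)u);   /* prints 0 */
        return 0;
    }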

What is undefined is adding 1 to a signed int that has the value INT_MAX. This is Undefined Behaviour, and the compiler may do anything it wants, including things like:

  1. Let it overflow with modulo reduction,
  2. Ignore it altogether (so that the result is still INT_MAX),
  3. Set some other variable to 42,
  4. Reformat your hard drive,
  5. Impregnate your (possibly male) cat,
  6. Turn the milk in your refrigerator, …

Weeellll… maybe not those last two, but you get the idea. When you do something in C or C++ that invokes UB, anything can happen, including nothing at all.
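
To make the signed case concrete, here is a minimal sketch (will_fit is just an illustrative name). Because the compiler is allowed to assume signed overflow never happens, it may fold the comparison below to “always true”; whether it actually does depends on the compiler and optimisation level:

    #include <limits.h>
    #include <stdio.h>

    /* Undefined behaviour: the compiler may assume x + 1 never overflows,
     * so it is free to rewrite (x + 1 > x) as 1, even when x == INT_MAX. */
    int will_fit(int x)
    {
        return x + 1 > x;
    }

    int main(void)
    {
        printf("%d\n", will_fit(INT_MAX));   /* may print 1, 0, or anything else */
        return 0;
    }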

Isn’t that dangerous?

Pretty much. It’s terrifying that if the programmer makes a mistake and processes 11 items in an array containing 10 items, the resulting program can do absolutely anything that the user who ran it could do.
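
The classic off-by-one looks as innocent as this (a sketch; process_item is a made-up stand-in):

    #include <stddef.h>
    #include <stdio.h>

    static void process_item(int item)
    {
        printf("%d\n", item);
    }

    int main(void)
    {
        int items[10] = {0};

        /* Off-by-one: '<=' visits items[10], one element past the end.
         * That out-of-bounds read is undefined behaviour. */
        for (size_t i = 0; i <= 10; i++)
            process_item(items[i]);

        return 0;
    }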

In practice, such an innocent mistake can (and, at least sometimes in the past, has):

  1. Result in your passwords being exfiltrated to a malicious party (who will then empty your bank accounts).
  2. Delete important files from your hard disk.
  3. Download CSAM to your computer, making you an unwitting criminal facing severe jail time.[1]
  4. Use the computer to join a malicious bot network, executing attacks against a foreign state (hope your nation was already at war with them, because… awkward!)
  5. Download malicious programs that will, in the future, steal any new passwords you created for your (now empty) bank account.

I dunno about you, but to me that is pretty damn horrifying.

How do we fix this?

There are two popular ways:

1. Dev practices

Turn on all linting and all warnings, and use memory checkers (Valgrind) and sanitisers; together they will catch almost all of these errors. The remaining ones can be mitigated by using well-known C patterns (in C++ this is harder to do), cleanup conventions, etc.
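
As a rough sketch of what that looks like in practice (the exact flags are an assumption; adapt them to your compiler and toolchain):

    /* demo.c -- a warnings-plus-sanitisers build, e.g.:
     *
     *   cc -std=c11 -Wall -Wextra -g -fsanitize=address,undefined demo.c -o demo
     *
     * With UBSan enabled, the signed overflow below is reported at runtime
     * instead of silently invoking undefined behaviour.
     */
    #include <limits.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        (void)argv;
        int x = INT_MAX - 1 + argc;   /* argc is 1 with no arguments, so x == INT_MAX */
        x = x + 1;                    /* UBSan reports a signed integer overflow here */
        printf("%d\n", x);
        return 0;
    }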

Additionally, sticking to simple and explicit code with local reasoning as the only dependency does make it easier to visually spot these errors in code review.
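
One pattern that keeps the reasoning local is the single-exit cleanup convention, sketched below (copy_first_line is just an illustrative name):

    #include <stdio.h>
    #include <stdlib.h>

    /* Every resource is released in exactly one place, and every error path
     * funnels through it, which keeps the reasoning local and easy to review. */
    static int copy_first_line(const char *path, char **out)
    {
        int rc = -1;
        FILE *fp = NULL;
        char *buf = NULL;

        fp = fopen(path, "r");
        if (fp == NULL)
            goto cleanup;

        buf = malloc(256);
        if (buf == NULL)
            goto cleanup;

        if (fgets(buf, 256, fp) == NULL)
            goto cleanup;

        *out = buf;
        buf = NULL;          /* ownership transferred to the caller */
        rc = 0;

    cleanup:
        free(buf);
        if (fp != NULL)
            fclose(fp);
        return rc;
    }

    int main(void)
    {
        char *line = NULL;
        if (copy_first_line("notes.txt", &line) == 0) {   /* any readable text file */
            printf("%s", line);
            free(line);
        }
        return 0;
    }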

This discipline is why there have been millions of life-critical devices running C since the mid-80s, with very few incidents (I can only think of two, TBH) of C programs going haywire and killing people. Millions and millions of devices, from industrial mills to cars, microwaves, rockets and bombs, all controlled by C code, and next to no lives lost to UB.

2. Switch languages

Use a different language, dummy!

Use languages that don’t have UB, such as, well, anything else, actually. No UB in C#, Java, Python, PHP, Lisp, etc.

It’s going to be very rare that you actually need to use a non-GC language, right? Outside of OSes and systems programming (such a vague term), you can probably tolerate a GC.

So, all good?

Well, Ackshually…

Ordinarily that would be the end of the conversation; this is where it would die. This would be a pretty pointless blog post if the takeaway were something already well-known and acknowledged on millions of other sites across the net and in material everywhere.

Unfortunately, that is not the case. As of today,[2] there is a large and persistent drive not just to incorporate LLM assistance into coding, but to (in the words of the pro-LLM-coding group) “Move to a higher level of abstraction”.

What this means is that the AI writes the code for you, you “review” (or not, as stated by Microsoft, Anthropic, etc), and then push to prod.

Brilliant! Now EVERY language can exhibit UB. Your Java project to provide a simple Kanban board? Did the LLM produce code to do something you don’t know about? How would you know, if you didn’t write it?

The LLM does not deterministically do what it is asked; it probabilistically does what is probably correct. Note the repeated use of “probably” in that previous sentence.

Look, I get it; C has UB that makes it especially dangerous (though still fewer footguns than C++, at least), but WTF would anyone move from C to (for example) Rust to avoid UB, and then move on to a tool with orders of magnitude more distinct classes of UB?

Your Rust, C++ (or whatever) LLM output is going to have the same kind of UB as an overflowed array in C, except that it lives at the logic layer, which no automated tool is ever going to catch.

You can’t make a sanitiser for “Ensure that no function puts your credentials on the net”, can you? You can ask the LLM, but that doesn’t mean it will comply. Compare that with sanitisers and Valgrind.

Is self-delusion hip these days? Must’ve missed that memo.

Which brings us to…

From 2026 onward, we are in a weird collective cognitive dissonance, where a bunch of people are vociferously arguing that Rust should be used over C while simultaneously generating oodles of code with a “this is probably correct” black box, not even realising that, in 2026, a human choosing to write C[3] is almost certainly going to produce fewer errors[4] than a black box generating Java/Python/Rust that is then “checked” by a human on autopilot.

So please, don’t be one of those people!

Don’t be a person who wastes valuable time annoying C programmers about their choices while simultaneously hitting more classes of UB in a single product than most C developers have seen in their entire lifetime.

If you’re churning out code from a probabilistic black box, your take on the dangers of C is, frankly, like my 80s twelve-year-old self: Cute But Dumb. If you find yourself on HN or Reddit complaining about how C coders refuse to move to C++, Rust, whatever, make damn sure that you haven’t also told the world how efficient the LLM is at churning out your C++, Rust or whatever code.

You can call these people dinosaurs, but look… the apex-predator human has only been around for a few tens of thousands of years. The dinos were the apex predator for 180 million years.

I have no problem being the dinosaur in this analogy.




  1. Notice how easily I slipped in an off-by-one error in that list? Wouldn’t it be a shame if such an error caused your bank account to empty?↩︎

  2. 08-Feb-2026, in case the date isn’t populated in the correct field by the shell script that produces the pages on this site.↩︎

  3. In 2026, anyone choosing to seriously start a new project in C is doing so because they are more comfortable in it than any other language. This comfort can only come if the dev in question has decades of experience churning out working programs, because a younger dev won’t have had any reason to learn C for the last 15 years at least. It’s not even being taught anymore, as far as I know. Anyone choosing C today is one of those dinosaurs from way back when, which means that they have been battle-tested and have probably got more than a few strategies for turning out working products. No C developer spent the last 30 years without developing at least some defensive strategies.↩︎

  4. Also in 2026 - I am developing a security critical product, which has a foundational component written in C. IOW, I have skin in the game! My confidence levels only fall when I occasionally get Claude Code to write some code for me (like, a single function), and then get annoyed at the lack of any defensive strategies, and am forced to rewrite it myself.

    Vibe-coding has no place in a security product.

    ↩︎