Understanding effective type Aliasing in C [pdf]
open-std.org
It’s fun to consider how C devolves into assembler. In my mind, C and its derivatives dissolve into 68K assembler as I’m writing or debugging. Thinking about code this way lets me get a feel for how all the bits fit together.
It seems like a lost art to think that way. It’s disturbing to me how many candidates couldn’t write Hello World and compile it from the command line.
Everyone should spend some time with godbolt.org or better, the -save-temps compiler flag, to see how changes affect your generated code. Right now. I’ll wait. (Shakes cane at kids)
In the kernel developer world on the other hand, it's still very common to think about how the C code one writes translates into assembly. (Honestly, I think C should not be used much outside developing kernels anyway, and even there it's just legacy, but that's just personal opinion.)
But it's rough, and dangerous. Optimizers do a lot these days, and I really mean a lot. Besides completely mangling your program order, which includes shoving entire blocks of code into places that you might not have guessed, they also do such things as leveraging undefined behavior for optimizations (what the article is partly about), or replacing entire bits of code by function calls. (A compiler might make code out of your memcpy(), and vice versa; the latter can be especially surprising.)
If you care about the assembly representation of your C code (which kernel developers often do), you will spend a lot of time with the "volatile" keyword, compiler barriers, and some obscure "__attribute__"s.
But I agree, even with those caveats in mind, it's a very useful skill to imagine your C code as what it translates to (even if that representation is just a simplified model of what the compiler will actually do).
>leveraging undefined behavior for optimizations
that is a poor way to handle UB as it introduces bugs (which are UB themselves). If a compiler detects UB, it should flag an error so the source code gets changed. compilers (or any software really) should never be maliciously compliant.
That's not what I mean. The C standard contains some rules that exist for the sole purpose of providing better optimization, and some of these rules give rise to undefined behavior. The compiler leverages undefined behavior by allowing optimizations to not have to care about code that exhibits such undefined behavior.
If compilers did not take advantage of this, then a lot of behavior would not have to be undefined in the first place. Undefined behavior isn't conjured up from a magical place, it was deliberately specified for a reason.
The subject of the linked article, strict aliasing, is a prime example of exactly that: Surprisingly strict rules for aliasing, giving compilers the opportunity to better optimize code that follows these rules, at the risk of breaking code that does not follow the rules in arbitrary and perhaps unintuitive ways.
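A minimal sketch of the kind of optimization being described, assuming a typical optimizing compiler such as GCC or Clang at -O2. The function names are hypothetical and not taken from the article. Because an int object may not be accessed through a float lvalue, the compiler is allowed to assume the two pointers never refer to the same object:

```c
#include <assert.h>

/* Hypothetical helper: under the effective type rules the compiler may
   assume that ip and fp point to different objects. */
int store_both(int *ip, float *fp)
{
    *ip = 1;
    *fp = 2.0f;   /* assumed not to modify *ip */
    return *ip;   /* an optimizer may fold this into "return 1" */
}

/* Well-defined use: the two pointers refer to distinct objects. */
int demo_distinct_objects(void)
{
    int i = 0;
    float f = 0.0f;
    int r = store_both(&i, &f);
    assert(r == 1 && f == 2.0f);
    return r;
}
```

Calling store_both((int *)&f, &f) instead would break the rules, and the result could then differ between optimization levels, which is the "arbitrary and perhaps unintuitive" breakage mentioned above.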
Now, these particular rules are controversial, and the article acknowledges this:
> If you read this document, you may find to your horror an awful lot of C code, probably code you have written, is UB and therefore broken. However just because something is technically UB doesn’t mean compilers will take advantage of that and try to break your code. Most compilers want to compile your code and not try to break it. Given that almost no one understands these rules, compilers give programmers a lot of leeway. A lot of code that technically breaks these rules will in reality never cause a problem, because any compiler crazy enough to assume all code is always in 100% compliance with these rules would essentially be deemed broken by its users. If you are using fwrite to fill out a structure, you are just fine. No reasonable compiler would ever break that code. The issue is not that implementations don’t give users leeway, the issue is that it’s unclear how much leeway is given.
Nevertheless, there are many other rules that are much more readily accepted where similar things are taking place.
>The compiler leverages undefined behavior by allowing optimizations to not have to care about code that exhibits such undefined behavior.
that's pure maliciousness. if the programmer has written code that exhibits undefined behavior, it should be flagged as an error so it can be changed to code that does not exhibit undefined behavior.
programs need to have one unambiguous meaning, and it should be the meaning intended by the programmer. if meanings can be detected as ambiguous or as not what the programmer intended, that should be flagged, not magically swept under the carpet because it's "faster".
The compiler generally cannot know when the program runs into undefined behavior, because of the halting problem. For detecting undefined behavior at runtime, there’s UBSan. It’s good, but it makes things slower.
or to put it another way, divide by zero is undefined behavior. do you think it should be trapped? or just optimized away so the program can get more quickly back to defined behavior...
Nobody declared divide by zero as undefined behavior for any optimization benefit.
I kind of share this feeling (I knew 68K assembler before learning C), but having spent ~30 years writing C, publishing some open source software in C, reading comp.lang.c and draft standards, as well as answering many C questions on Stack Overflow back when it was cool, let me tell you: it's not a good model any more (if it ever was). :)
C is specified against an abstract (not virtual) machine, and it matters.
All the talk about how undefined behavior gives the compiler the right to shuffle and/or remove code really breaks the analogy with assembler, where most things become Exactly What You Say.
How is this related to the linked article? Assemblers won’t delete your code for treating a register with a float in it like an integer.
> Any access of memory using a union, where the union includes the effective type of the memory is legal. Consider:
union {
int i;
float f;
} *u;
float f = 3.14;
u = &f;
x = u->i;
> In this case the memory pointed to by “u” has the declared effective type of int, and given that “u” is a union that contains int, the access using the “i” member is legal. It’s noteworthy in this that the “f” member of the union is never used, but only there to satisfy the requirement of having a member with a type compatible with the effective type.
Is this a typo? Should it say "declared effective type of float" and "“u” is a union that contains float"?
It's interesting to see type-punning using a union - I've read that it should be avoided and to use `memcpy` instead. Are there any issues with the union approach in C? Or is the advice to prefer `memcpy` specific to C++, where AFAICT the union approach is undefined behaviour?
> type-punning using a union - I've read that it should be avoided and to use `memcpy` instead
The other day we had standard committee members confirming union punning is good in C: https://news.ycombinator.com/item?id=43793225
Looks to me like union-based type-punning in C is indeed "better than C++" (in C++ it's just plain undefined behavior). In C, it looks like the behavior is defined unless you hit a trap representation.
https://port70.net/~nsz/c/c11/n1570.html#6.2.6.1p5
> Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. [...] Such a representation is called a trap representation.
https://port70.net/~nsz/c/c11/n1570.html#6.5.2.3p3
> A postfix expression followed by the `.` operator and an identifier designates a member of a structure or union object. The value is that of the named member. [Footnote: If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called ''type punning''). This might be a trap representation.]
I'm fuzzy on exactly what a "trap representation" might be in real life. I have the impression that a signaling NaN isn't. I suspect that a visibly invalid pointer value on a CHERI-like or ARM64e-like platform might be. Anyway, my impression is that sane platforms don't have trap representations, so indeed, you have to go out of your way to contrive a situation where C's paper standard would not define type-punning (whether union-based or pointer-cast-based) to have the "common-sense" physical behavior.
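For comparison, here is a rough sketch of the two C idioms under discussion, union punning and `memcpy`. It assumes a 32-bit float, and the function names are made up for illustration. uint32_t has no padding bits, so reading the float's bytes as a uint32_t cannot hit a trap representation:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

/* Union punning: store through one member, read through another. */
uint32_t bits_via_union(float f)
{
    union { float f; uint32_t u; } pun;
    pun.f = f;
    return pun.u;   /* object representation reinterpreted as uint32_t */
}

/* memcpy punning: copy the object representation byte by byte. */
uint32_t bits_via_memcpy(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

Both functions should return the same bits for any input. On a platform where float is IEEE 754 binary32 (an assumption, not something the C standard guarantees), 1.0f comes out as 0x3F800000.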
Again this is different from C++, where both union-based type-punning and pointer-cast-based type-punning have UB, full stop:
https://eel.is/c++draft/expr.prop#basic.lval-11
> An object of dynamic type Tobj is _type-accessible_ through a glvalue of type Tref if Tref is similar to Tobj, a type that is the signed or unsigned type corresponding to Tobj, or a char, unsigned char, or `std::byte` type.
> If a program attempts to access the stored value of an object through a glvalue through which it is not type-accessible, the behavior is undefined.
Thanks, I didn’t see that discussion at the time.
The writer seems to be Eskil! https://www.youtube.com/@eskilsteenberg/videos
What drugs were they on? Why on earth is there any distinction between variables allocated statically, on the stack, or on the heap? I allocate a struct, copy data to it, and those data have no Effective Type? Because I started with malloc? Give me a break.
The point of the type system is to define types. It’s not to make the compiler’s job easier, or to give standards committees clouds to build their castles on. No amount of words will justify this misbegotten misinvention.
> If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value
As I read it, this means that
struct foo *x = malloc(sizeof(*x))
will have an effective type of "struct foo*", which seems like what you would expect. But if you then write to that memory through an int pointer, the effective type is int. Unlike if you had allocated the struct on the stack.
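A small sketch of how the quoted rule plays out for malloc'd memory, as I understand it. The function name is hypothetical. The allocation starts with no effective type, and each store through a non-character lvalue sets the effective type for the reads that follow:

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 1 on success, 0 if the allocation fails. */
int effective_type_demo(void)
{
    size_t n = sizeof(int) > sizeof(float) ? sizeof(int) : sizeof(float);
    void *p = malloc(n);      /* no declared type, no effective type yet */
    if (p == NULL)
        return 0;

    int *ip = p;
    *ip = 42;                 /* store: effective type is now int */
    int a = *ip;              /* reading it as int is fine */

    float *fp = p;
    *fp = 1.5f;               /* new store: effective type is now float */
    float b = *fp;            /* reading it as float is fine */
    /* Reading *ip at this point would violate the effective type rules. */

    free(p);
    assert(a == 42 && b == 1.5f);
    return 1;
}
```

With a declared object (for example a local `int i;`), the effective type is fixed by the declaration, so the float store in the middle would not be allowed to retype it.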