Making a low level Linux debugger, part 2: C
blog.asrpo.com
> By convention, the stack is between the value of registers rbp (lower address) and rsp (higher address) and rsp increases when there are more stack frames added. We're on 64-bit Linux so each frame takes up 8 bytes.
Stack frames grow in size based on local variables allocated on the stack. And $rbp and $rsp need not point at the correct places in the stack for leaf functions, at least on Linux, because it uses the System V ABI.
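To make the direction of growth concrete, here is a minimal sketch (the function names are hypothetical, and it assumes gcc or clang on x86-64 Linux, since it relies on `__attribute__((noinline))`). It compares the address of a local in a caller with the address of a local in its callee. On this platform the callee's local should sit at the lower address, i.e. the stack grows toward lower addresses as frames are added.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: stores the address of one of its own locals.
 * noinline keeps it in a separate stack frame from its caller. */
__attribute__((noinline))
static void record_local_address(uintptr_t *out) {
    volatile int local = 0;          /* volatile keeps the local in memory */
    *out = (uintptr_t)&local;
}

/* Returns 1 if the callee's local lives at a lower address than the
 * caller's local, which is what a downward-growing stack implies. */
__attribute__((noinline))
static int stack_grows_down(void) {
    volatile int local = 0;
    uintptr_t caller_addr = (uintptr_t)&local;
    uintptr_t callee_addr = 0;
    record_local_address(&callee_addr);
    return callee_addr < caller_addr;
}

/* Small self-check; assumes an x86-64 System V target. */
static void check_stack_direction(void) {
    assert(stack_grows_down());
}
```

Calling `check_stack_direction()` from `main` should pass silently on x86-64 Linux. Other ABIs may lay out frames differently, so treat this as an illustration rather than a portable test.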
Thank you. I'll add an edit to the post later (here's hoping this[1] is accurate enough). I do not know the conventions well, and this debugger/editor in part helps me see what's going on.
In general, is there some "don't do anything funky" compiler flag so it sticks to a simpler internal model?
> In general, is there some "don't do anything funky" compiler flag so it sticks to a simpler internal model?
Most of my experience is with LLVM/Clang, so I can't say too much about how gcc differs in its thought model.
Modern compilers use SSA as the basis for common optimizations. In SSA form, every variable has exactly one definition (different writes originating on different control flow paths are represented with special phi constructs). Conversion to SSA form pretty much irrevocably destroys the original notion of variables. The standard big optimization passes will further destroy any easy mapping to the source code: code is pushed out of loops if possible, unexecutable control flow paths are removed, redundant computations (both within statements and across the entire function) are eliminated, etc. This becomes particularly tricky in the backend, where register allocation means that some variables just don't exist in state anymore (because you needed that space for something else, and it's dead, so why keep it somewhere?).
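As a rough illustration of what "exactly one definition" means, here is a hand-written sketch in C (not actual compiler output; the function names are made up for this example). The first function reassigns `x` on two control flow paths. The second renames every write to a fresh name and uses a conditional expression to stand in for the phi construct that merges the two paths.

```c
#include <assert.h>

/* Ordinary C: the single variable `x` is written three times. */
static int clamp_then_double(int input) {
    int x = input;
    if (x < 0) {
        x = 0;
    }
    x = x * 2;
    return x;
}

/* SSA-style rewrite by hand: each name is assigned exactly once. */
static int clamp_then_double_ssa(int input) {
    const int x0 = input;
    const int x1 = 0;                   /* the write on the "then" path */
    const int x2 = (x0 < 0) ? x1 : x0;  /* stands in for phi(x1, x0) */
    const int x3 = x2 * 2;
    return x3;
}

/* Both versions should agree on every input. */
static void check_ssa_equivalence(void) {
    for (int i = -3; i <= 3; i++) {
        assert(clamp_then_double(i) == clamp_then_double_ssa(i));
    }
}
```

Note that the source-level `x` has become four separate values, none of which needs to live in the same place. That is roughly why a debugger can struggle to say "where `x` is" in optimized code.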
What this means is that, when optimizing code, the maintenance of debugging information is very much best-effort. If you disable optimization, you get something that is relatively akin to a very literal translation of C code to assembly. Even very basic optimizations, however, will almost immediately destroy the basic guarantees. -O1 (or -Og for gcc) will generally avoid the optimizations that do the truly insane manipulations, but you're still liable to lose the simple mapping between source variables and machine state.
The basic representation for debugging information on Linux and OS X is DWARF. DWARF is a nasty specification to read, and it doesn't insulate you from having to learn all of the C or C++ ABI implications. There is a facility to use DWARF to indicate variables that aren't located in the stack, but it doesn't look like compilers maintain debugging information well enough if variables are promoted to registers instead.
Thanks. The choice of gcc was somewhat arbitrary so clang could work too. I actually fiddled a bit with lldb before this.
From your description, it sounds like I really ought to remove optimizations (with -O0 from what's suggested here).
For variables, local variables can be optimized out (something I recall seeing in gdb without -O0) but all global variables are still kept, right? (At least, the ELF has names and addresses.)
Compilers are free to delete global variables, if nothing references them, just like local variables. That said, if you don't declare them static or some other form of private variable, then compilers generally need to assume that some unknown entity can refer to them, which generally prevents their removal.
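A small sketch of the distinction, with made-up variable names and assuming gcc or clang for the `__attribute__((unused))` syntax. Whether a given variable actually survives depends on the compiler and optimization level, so the comments describe what is permitted, not what is guaranteed.

```c
#include <assert.h>

/* Internal linkage and never referenced: the compiler may drop it.
 * The `unused` attribute only silences the warning; it does not
 * force the variable to be kept. */
__attribute__((unused))
static int unused_private_counter = 42;

/* External linkage: some other translation unit could refer to it,
 * so the compiler generally has to keep it. */
int visible_counter = 7;

/* Internal linkage but referenced below, so it must stay observable
 * (although it may be folded into a constant). */
static int private_but_used = 3;

int read_private(void) {
    return private_but_used;
}

/* Sanity check on the values that are guaranteed to be reachable. */
static void check_globals(void) {
    assert(visible_counter == 7);
    assert(read_private() == 3);
}
```

One way to see what survived is to list the symbols of the compiled object file, for example with `nm`, at different optimization levels.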
Do you mean -O0?
-O0 will generate the most predictable code.
In fact, if you compile with -fomit-frame-pointer (the default for optimized code), $rbp can be repurposed by the compiler (for holding the value of local variables for instance).
Where's part one? Author states "Last time", but that hyperlink just goes to the same page it's on.
https://blog.asrpo.com/making_a_low_level_debugger_part_1 gives 404
Thanks to both you and christophergray. The broken link should be fixed now.