My application programmer instincts failed when debugging assembler

landedstar.com

41 points by lifefeed 2 days ago · 30 comments

xg15 16 hours ago

> Abstractions. They don’t exist in assembler. Memory is read from registers and the stack and written to registers and the stack.

[...] But my application-coded debugging brain kept looking at abstractions like they would provide all the answers. I rationally knew that the abstractions wouldn’t help, but my instincts hadn’t gotten the message.

That feels like the wrong takeaway for me. Assembly still runs on abstractions: You're ignoring the CPU microcode, the physical interaction with memory modules, etc. If the CPU communicates with other devices, this has more similarities with network calls and calling the "high level APIs" of those devices. For user space assembly, the entire kernel is abstracted away and system calls are essentially "stdlib functions".

So I think it has a different execution model, something like "everything is addressable byte strings and operates on addressable byte strings". But you can get that execution model occasionally in high-level languages as well, e.g. in file handling or networking code. (Or in entire languages built around it like brainfuck)

So I think assembly is just located a few levels lower in the abstraction pile, but it's still abstractions all the way down...

  • TacticalCoder 9 hours ago

    > Assembly still runs on abstractions: You're ignoring the CPU microcode ...

    Yes and no. There's no way to "get" to these. Arguably assembly is an abstraction on top of codes (hexcodes or binary if you want to see it that way), but the assembly instructions are the lowest level we get to access. For as a programmer you don't get to access the microcodes emulating an amd64 architecture and you cannot decide to use these microcodes directly.

    Otherwise it's just electricity. Then it's just electrons.

    So it's not false that it's all abstractions but it doesn't help much to view it that way.

    • david-gpu 4 hours ago

      > Otherwise it's just electricity

      There is a lot going on at the hardware level that is hidden from the view of assembly. Hardware is not magic; a ton of design decisions go into every architecture, most of which aren't immediately obvious from looking at the ISA.

  • leptons 4 hours ago

    No, assembly doesn't always inherently deal with abstractions. It depends on the system involved. I don't really count "microcode" as an abstraction, it's essentially part of the hardware and doesn't even exist on many embedded CPUs. The assembly instructions for all intents and purposes operate directly on the hardware. If you wanted to get really absurd with it, you could say that all of it is an abstraction of electrons.

    Embedded CPU assembly is what I've done most often for the last 40 years, and there aren't really any abstractions at all - not even microcode. You have a few KB of ROM and maybe a few KB of RAM, ALU, registers, peripherals, and that's it - no APIs, no kernel, no system calls, no stdlib. Just the instructions you burn into the ROM.

    • david-gpu 4 hours ago

      Assembly can be quite abstracted away from the actual underlying hardware, at least from the viewpoint of a computer architect. How much depends on the ISA, of course.

      Many common hardware features, like out-of-order instruction issue, register renaming, and even caching and segmentation, are largely or entirely hidden at the assembly level.

      • leptons an hour ago

        Maybe you missed the part where I described that not all CPUs work the same, not all have out of order instruction issuing, register renaming, or even any cache at all. OP was blanket talking about assembly, without realizing the many different kinds of CPUs out there, and apparently you're here cherry-picking too.

        I have no interest in continuing this pointless internet interaction.

  • iberator 2 hours ago

    That's BS.

    You can totally convert your assembly into machine code by hand. There is no lower level. Not all CPUs have microcode.
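
    To make "by hand" concrete, here is a minimal sketch of that conversion for three 6502 instructions. The opcode values are the documented 6502 encodings; the `assemble` helper, the tuple format, and the target address are made up for illustration and are not taken from the article or any real assembler.

```python
# Hypothetical mini "hand assembler" covering only three 6502 instructions.
# Keys are (mnemonic, addressing mode); values are the 6502 opcode bytes.
OPCODES = {
    ("LDA", "imm"): 0xA9,  # LDA #nn   (load accumulator, immediate)
    ("STA", "abs"): 0x8D,  # STA nnnn  (store accumulator, absolute)
    ("RTS", "imp"): 0x60,  # RTS       (return from subroutine)
}


def assemble(program):
    """Encode (mnemonic, mode, operand) tuples into raw machine-code bytes."""
    out = bytearray()
    for mnemonic, mode, operand in program:
        out.append(OPCODES[(mnemonic, mode)])
        if mode == "imm":
            out.append(operand & 0xFF)
        elif mode == "abs":
            # The 6502 is little-endian: low address byte first, then high.
            out.append(operand & 0xFF)
            out.append((operand >> 8) & 0xFF)
    return bytes(out)


program = [
    ("LDA", "imm", 0x01),    # LDA #$01
    ("STA", "abs", 0xD020),  # STA $D020 (address chosen only as an example)
    ("RTS", "imp", None),    # RTS
]
machine_code = assemble(program)
print(machine_code.hex(" "))  # a9 01 8d 20 d0 60
```

    The table lookup is all there is to it: each mnemonic plus addressing mode maps to one opcode byte, followed by its operand bytes.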

userbinator 18 hours ago

Asm is simple enough that "mental execution" is far easier, if more tedious, than in HLLs, especially those with lots of hidden side-effects. The concept of a function doesn't really exist (and this is even more true when working with RISCs that don't have implicit stack management instructions), and although there are instructions that make it more convenient to do HLL-style call and return, it's just as easy to write a "function" that returns to its caller's caller (or further), switches to a different task or thread, etc. If you're going to learn Asm, then IMHO you should try to exploit this freedom in control flow and leverage the rest of the machine's ability, since merely being a human compiler is not particularly enlightening nor useful.
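
The "return to its caller's caller" trick can be sketched with a toy machine. Everything here is hypothetical: the instruction names (`call`, `ret`, `drop`, `log`, `halt`) and the `run` function are invented for illustration and do not model any real ISA. The only idea borrowed from real hardware is that a return address is just a value on a stack that the callee is free to discard.

```python
def run(program):
    """Run a toy program of (op, arg) pairs and return the list of logged messages."""
    stack, pc, trace = [], 0, []
    while True:
        op, arg = program[pc]
        if op == "call":
            stack.append(pc + 1)  # push the return address
            pc = arg
        elif op == "ret":
            pc = stack.pop()      # jump to whatever address is on top
        elif op == "drop":
            stack.pop()           # throw away our own return address
            pc += 1
        elif op == "log":
            trace.append(arg)
            pc += 1
        elif op == "halt":
            return trace


program = [
    ("call", 3),               # 0: main calls outer
    ("log", "back in main"),   # 1
    ("halt", None),            # 2
    ("call", 6),               # 3: outer calls inner
    ("log", "back in outer"),  # 4: never reached
    ("ret", None),             # 5
    ("drop", None),            # 6: inner discards the address of line 4
    ("ret", None),             # 7: so this returns straight to main (line 1)
]
print(run(program))  # ['back in main']
```

Nothing in the machine enforces that `ret` pairs with the most recent `call`, which is the freedom in control flow described above.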

  • zahlman 8 hours ago

    > Asm is simple enough

    The general conceptual model of "asm" is simple.

    Some instruction sets and architectures are hideous, though.

    > merely being a human compiler is not particularly enlightening nor useful.

    I don't think I can agree with that. At least it teaches you what the compiler is doing. And abiding by conventions (HLL-esque control flow, but also things like "put the return value in r0" and "put constant pools after the function") can definitely make it easier to make sense of the code. (Although you might share a constant pool across a module or something, if the instructions reach far enough.)

    Not to say that you can't do interesting things, and can't ever beat the compiler. One of the things I most enjoyed discovering, in mid-00s era THUMB (i.e. 16-bit ARM) code, is that the compiler was implementing switch statements with tables of 32-bit constants that it would load into an indirect jump. I didn't get around to it, but I figured I could mechanically replace these with a computed jump into a "table" of 16-bit unconditional branches (except for very long functions, but this helped bring the branch distances under thresholds).

  • streetfighter64 16 hours ago

    I agree entirely, great insight! I'd like to add that assembly is best enjoyed in a suitable environment for it, where "APIs" are just memory writes and interrupts. Game programming for the C64 is way more fun than dealing with linux syscalls, for example. A lower level interface enables all the fun assembler tricks, and limited resources require you to be clever.

  • jiehong 18 hours ago

    Then you goto hell…

  • mathisfun123 18 hours ago

    > Asm is simple enough that "mental execution" is far easier, if more tedious, than in HLLs

    Ya totally I can also keep 32 registers, a memory file, and stack pointer all in my head at once ...fellow human... (In 2026 I might actually be an LLM, in which case I really can keep all that context in my "head"!)

    • RobotToaster 16 hours ago

      there's an interesting new API skill for the human cortex v1.0, that allows for a much larger context window, it's called pen and paper.

      • ExtremisAndy 13 hours ago

        For real! I occasionally write assembly because, for some reason, I kind of enjoy it, and also to keep my brain sharp. But yes, there is no way I could do it without pencil and paper (unless I’m on a site like CPUlator that visually shows everything that’s happening).

      • mathisfun123 10 hours ago

        What do the words "mental execution" mean?

    • userbinator 17 hours ago

      8 registers are sufficient; if you forget what one holds, looking up at the previous write to it is enough.

      Contrast this with trying to figure out all the nested implicit actions that a single line of some HLL like C++ will do.

Surac 11 hours ago

I was lucky to learn asm on a very simple 8-bit CPU (6502). It had a very limited register set (3) and instruction count. If you really want to dive into asm, I think you should find a small, simple CPU model and use an emulator to run your code.

Chaosvex 18 hours ago

Not sure what to take away from this. __abstract works because GCC allows it as an alias to __abstract__, not because parsing the syntax is forgiving.

Abstractions do exist (disagreeing with the single other post in here) and they also exist in most flavours of assembly, because assembly itself is still an abstraction for machine code. A very thin one, sure, but assemblers will generally provide a fair amount of syntactic sugar on top, if you want to make use of it.

Protip: your functions should be padded with instructions that'll trap if you miss a return.

  • rep_lodsb 14 hours ago

    >Protip: your functions should be padded with instructions that'll trap if you miss a return.

    Galaxy brained protip: instead of a trap, use return instructions as padding, that way it will just work correctly!

    Some compilers insert trap instructions when aligning the start of functions, mainly because the empty space has to be filled with something, and it's better to use a trapping instruction if for some reason this unreachable code is ever jumped to. But if you have to do it manually, it doesn't really help, since it's easier to forget than the return.

jagged-chisel 14 hours ago

I think lots of commenters are being unintentionally pedantic. It’s clear that there are different types of abstractions one is concerned with when programming at the application level. Yes, it’s all abstractions on top of subatomic probability fields, but no one is thinking at even the atomic level when they step through the machine code execution with a debugger.

  • throwaway94275 9 hours ago

    The one abstraction you would have to keep in mind with assembler (writing more than reading tho) is the cache hierarchy. The days of equal cost to read/write any memory location are ancient. Even in the old 8 bit days some memory was faster to access than others (e.g. 6502 zero page).

    The flags are another abstraction that might not mean what it says. The 6502 N flag and BPL/BMI instructions really just test bit 7 and aren't concerned with whether the value is really negative/positive.
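
    A small sketch of that point, with hypothetical helper names (`n_flag`, `as_signed`): the N flag is just a copy of bit 7 of the result, and whether that means "negative" depends entirely on how the programmer interprets the byte.

```python
def n_flag(value):
    """Return the 6502 N flag for an 8-bit result: a copy of bit 7."""
    return (value >> 7) & 1


def as_signed(value):
    """Interpret the same 8 bits as a two's-complement number."""
    return value - 256 if value & 0x80 else value


for byte in (0x7F, 0x80, 0xC8):
    print(f"${byte:02X}: N={n_flag(byte)} unsigned={byte} signed={as_signed(byte)}")
# $7F: N=0 unsigned=127 signed=127
# $80: N=1 unsigned=128 signed=-128
# $C8: N=1 unsigned=200 signed=-56
```

    So BMI would branch after loading $C8 even if the code treats that byte as the unsigned value 200.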

    • leptons 4 hours ago

      Ooof I remember the bank switching on PIC microcontrollers was particularly awful. I still got it to work, but it wasn't very fun.

nurettin 11 hours ago

Coming from pascal to C as a highschooler, my biggest wtf moment happened when I forgot a ; after a struct in a header. The compiler kept complaining about the code below the include and for the life of me I couldn't figure it out. Took me another hour to reason that the includes must be concatenating invalid code.

  • zahlman 8 hours ago

    Ah, that's nostalgic.

    I haven't done serious work in C in quite some time. I wonder if modern compilers are better at reporting that sort of thing.

Kiboneu 19 hours ago

Neat. The author is about to stumble onto a secret.

> In Sum

> Abstractions. They don’t exist in assembler. Memory is read from registers and the stack and written to registers and the stack.

Abstractions do not exist, period. They are patterns, but these patterns aren’t isolated from each other. This is how a hacker is born, through this deconstruction.

It’s just like the fact that electrons and protons don’t really exist, but the patterns in energy gradients are consistent enough to give them names and model their relationship. There are still points where these models fail (QM and GR at the Planck scale, or just the classical-quantum boundaries). It’s gradients all the way down, and even that is an abstraction layer.

Equipped with this understanding you can make an exploit like Rowhammer.

https://en.wikipedia.org/wiki/Row_hammer

  • wiz21c 17 hours ago

    Abstractions pretty much exist and in assembler they matter even more because the code is so terse.

    Now, there are abstractions (which exist in your brain, whatever the language) and tools to represent abstractions (in ASM you've got macros and JSR/RET; both pretty leaky).

    • Kiboneu 13 hours ago

      That wasn’t my point. You almost got there when you wrote “there are abstractions (which exist in your brain, whatever the language)”. And your point on leaky abstractions is exactly the indication that they exist in your mind, not out there.

      My point is that we settle with what we see for convenience/utility and base our models on that. We build real things on top of these models. Then the result meets reality. If only that transition were so simple.

      When an effect jumps unexpectedly between layers of abstraction we call it an abstraction leak. As you mentioned. The correct response is to re-examine these leaks and make other frameworks to cover the edge cases, not to blame the world.

      Hackers actively seek these “leaks” by suspending assumptions that arise out of the abstractions that humans tend to rely on.

      I’m not surprised that my OP got downvoted. It can be very upsetting when one’s conceptual frameworks are challenged without prescription. No one even mentioned the specific example that I referenced. Well, if they can’t parse it, they don’t deserve it. Keeps me in the market.

david-gpu 13 hours ago

My unsolicited friendly advice to software folks who are curious about assembly languages is: ask yourself what it is that you expect to get out of it.

If you want a better understanding of the architecture, reading the documentation from the hardware vendor will serve you better.

If you want your code to be faster, almost certainly there will be better ways to go about it. C++ is plenty fast in 99% of the situations. So much so that it is what hardware vendors use to write the vast majority of their high-performance libraries.

If you are just curious and are doing it for fun, sure, go ahead and gnaw your way in. Before you do so, why not have a look at how hand-written assembly is used in the rare niches where it can still be found? Chances are that you will find C/C++ with a few assembly intrinsics thrown in more often than long chunks of code in plain assembly. Contain that assembly in little functions that you can call from your main code.

For bonus brownie points, here is a piece of trivia: the language is called assembly and the tool that translates it into executable machine code is called the assembler.

  • AlotOfReading 12 hours ago

    > For bonus brownie points, here is a piece of trivia: the language is called assembly and the tool that translates it into executable machine code is called the assembler.

    IBM has a long history of using "assembler" as a shorthand way to refer to languages. IBM was dominant enough historically that you'd find it used in all sorts of other places. It's bad terminology, but it's not wrong.
