x86 architecture 1 byte opcodes

sandpile.org

90 points by eklitzke 3 months ago · 34 comments

GuB-42 3 months ago

Hello sizecoders ;)

Additional resources:

http://www.sizecoding.org/wiki/DOS

A nice PDF with similar content:

https://pnx.tf/files/x86_opcode_structure_and_instruction_ov...

Sharlin 3 months ago

Need a couple of instructions for accessing memory (and possibly loading immediates) but otherwise seems like a perfectly adequate general-purpose instruction set. Might be fun (for some values of "fun") to write a compiler backend for it.

  • sparkie 3 months ago

    They're one-byte opcodes, but not one-byte ops. Most of them have operands which are encoded in a ModRM byte that follows the opcode. The ModRM may be followed by a SIB byte, and that may be followed by a variable-size immediate or displacement. There are also optional prefixes to the opcode.
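
A minimal sketch of the ModRM layout described above, for readers who want to see the bit fields. The function names `split_modrm` and `has_sib` are made up for illustration, and the SIB rule shown assumes 32-bit or 64-bit addressing (16-bit addressing has no SIB byte).

```python
from typing import Tuple


def split_modrm(byte: int) -> Tuple[int, int, int]:
    """Split a ModRM byte into its (mod, reg, rm) bit fields."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("ModRM must fit in one byte")
    mod = byte >> 6             # top 2 bits: addressing mode
    reg = (byte >> 3) & 0b111   # middle 3 bits: register or opcode extension
    rm = byte & 0b111           # low 3 bits: register or memory operand
    return mod, reg, rm


def has_sib(byte: int) -> bool:
    """True when a SIB byte follows (assumes 32/64-bit addressing)."""
    mod, _, rm = split_modrm(byte)
    return mod != 0b11 and rm == 0b100


# 01 D8 is `add eax, ebx`: opcode 01, then ModRM D8 = 11 011 000.
print(split_modrm(0xD8))  # (3, 3, 0)
print(has_sib(0x04))      # True
```

Here mod = 3 means both operands are registers, so nothing follows the ModRM byte and the whole instruction is two bytes long.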

  • jeffbee 3 months ago

    Tons of these have immediate operands. The question becomes: is ADD with an implicit register destination and an immediate value in the next byte a "1-byte opcode"?

    • Sharlin 3 months ago

      Yes, indeed. I'd allow only mov to have a memory or immediate parameter as the only exception to one-byte encoding.

  • benlivengood 3 months ago

    Push, pop, inc, and dec with a 16-bit register argument are one byte, so is ret. That technically gives you enough to do anything, but you can include jz/jnz (which do take immediate bytes, maybe cheating?), stosw, lodsw, clc, and stc to implement Brainfuck (a little harder to perform input/output with single byte instructions, but maybe pretend the OS uses int1 or int3 for calls).
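
A rough sketch of the one-byte register forms mentioned above, assuming 16-bit or 32-bit mode (in 64-bit mode the bytes 0x40 to 0x4F are REX prefixes, so the one-byte inc/dec forms are gone). The `encode` helper and its tables are hypothetical names for illustration; the register number is simply added to a base opcode.

```python
# Register numbering used by the "opcode + register" encodings.
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]

# Base opcodes for the one-byte register forms (16/32-bit mode only).
BASES = {"inc": 0x40, "dec": 0x48, "push": 0x50, "pop": 0x58}


def encode(mnemonic: str, reg: str = "") -> bytes:
    """Encode ret, or inc/dec/push/pop with a 16-bit register, as one byte."""
    if mnemonic == "ret":
        return bytes([0xC3])  # near return takes no operand
    return bytes([BASES[mnemonic] + REG16.index(reg)])


program = [("push", "bx"), ("inc", "bx"), ("pop", "bx"), ("ret", "")]
print(b"".join(encode(m, r) for m, r in program).hex(" "))  # 53 43 5b c3
```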

  • themafia 3 months ago

    You've always got the stack segment (SS) to play with and there's also:

    https://www.felixcloutier.com/x86/xlat:xlatb

GeorgeTirebiter 3 months ago

I don't understand, without further description of the symbols.

  • jcranmer 3 months ago

    The explanation of the symbols is largely found here: https://www.sandpile.org/x86/opc_enc.htm

    Essentially, the uppercase letter of an operand is a combination of the operand type (immediate, register, memory) along with how that is encoded (as ModR/M bytes have a register and a register/memory field), while the lowercase letter is the size of the operand (largely 8-bit/16-bit/32-bit/64-bit for the 1-byte opcodes).
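
To make the two-letter scheme concrete, here is a small sketch that splits an operand code into its two parts. The handful of letters listed follows the conventions of the Intel opcode-map appendix; sandpile's table may differ in details, and the `describe` helper is a made-up name covering only a few codes.

```python
# Uppercase letter: where the operand comes from (a few common cases only).
ADDRESSING = {
    "E": "register or memory operand selected by the ModR/M r/m field",
    "G": "general register selected by the ModR/M reg field",
    "I": "immediate value",
    "J": "offset relative to the next instruction",
}

# Lowercase letter: operand size.
SIZE = {
    "b": "byte",
    "w": "word",
    "d": "doubleword",
    "q": "quadword",
    "v": "word, doubleword, or quadword, depending on operand size",
}


def describe(code: str) -> str:
    """Describe an operand code such as 'Eb' or 'Gv'."""
    method, size = code[0], code[1:]
    return f"{ADDRESSING[method]}; {SIZE[size]}"


for code in ("Eb", "Gv", "Jb"):
    print(code, "=", describe(code))
```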

  • mras0 3 months ago

    Not sure why you're being downvoted. You need quite a bit of esoteric knowledge to parse this, beyond knowing x86 opcodes (or even x86 assembly).

    It's more or less the same information you get from the Intel manuals (specifically appendix 2A of https://www.intel.com/content/www/us/en/developer/articles/t...). There you can also see what e.g. "Jb" means (a byte-sized immediate following the instruction that specifies a sign-extended relative offset to the instruction).

    One-byte opcodes here differ from two-byte opcodes (386+ IIRC), which are prefixed by a 0F byte, and from the even more convoluted stuff added later.

    • charcircuit 3 months ago

      >Not sure why you're being downvoted.

      I downvote people when they say they don't know what something is when they could have used an LLM to explain it to them.

      • bigstrat2003 3 months ago

        So you would rather people ask a machine that is known to be unreliable and have no idea what it's talking about, than ask a forum of technically skilled people who will give them a good answer. That doesn't seem very reasonable to me.

        • 9rx 3 months ago

          Why's that? It is most advantageous to ensure that other people are kept in the dark.

          If they are willing to pay to level the playing field perhaps it might be worth your while to fill them in. The old scholastic business model: gotta pay to play. But to take precious time out of your day to fill them in to your own personal disadvantage...?

          In other circles where people have well-rounded feelings you might find someone willing to do it just for the warm fuzzies it gives them. But technically skilled people are generally void of such emotion. That is often what compels them towards technology in the first place.

      • mras0 3 months ago

        The link is to an opcode map with strange abbreviations and no apparent explanation. Asking "What am I looking at?" without doing any research (with an LLM or otherwise) is entirely reasonable.

        • charcircuit 3 months ago

          It is entirely reasonable, but these kinds of comments are essentially wishing sites could cater to their knowledge level.

          It's like complaining that the article is not written in French. It's noise in the comment section of an article. If someone wants such a thing, browsers have functionality to translate pages to French. Not every site needs to have their own French translation to suit such a person.

          • mras0 3 months ago

            I understand what you're getting at, but in this case even I (who know what most things on that page mean) struggle to understand why it was submitted. Are we looking for the 0E opcode? New optimization opportunities?

            Genuinely asking: for this post, did you click on the link and say "yeah, I got the point", or did you involve an LLM? If you did, what did you ask it? I'm asking because I want to get better at LLM use. (Another example post and prompt where you've used one would also be fine!)

          • wewtyflakes 3 months ago

            They were not asking for the website to change; they were asking for context so that they can appreciate the website.

            • charcircuit 3 months ago

              In this case the person was not asking anything. The person was stating they didn't understand. The equivalent in my analogy is a French speaker commenting that they don't understand English without further translation into French.

              • GeorgeTirebiter 3 months ago

                Geez. I was the first one to comment. It was "This may be great, but... would you please give us more explanations / context." It's not "laziness" but trying to understand how this table is useful / teaches us something. And, to the OP, that a 'typical' HN reader didn't get it.

                I know 8008, 8080, z80, 80186, 80286, 80386, 80486, and some fancy opcodes for SSEx. The table still, IMHO, needs further explanation. Some have provided pointers to more info; thank you.

      • Rietty 3 months ago

        What if the LLM gives them bad information and they don't know it? I personally would also rather ask in a thread than risk the LLM info.

      • jrockway 3 months ago

        I never punish people for asking a question. It's how you learn!

      • sparkie 3 months ago

        You realize that LLMs are trained on human discussions right?

        If everyone stops asking questions and asks the LLM instead, there is no new training data for future LLMs to learn from. They will stagnate, or consume their own slop, and regress.

ryanschneider 3 months ago

A reverse engineer friend once taught me I could patch an x86 function with `0xEBFE` to get the CPU to spin forever. It wasn’t until much later that I understood that (IIRC) 0xEB is the “single byte” jump instruction and that of course 0xFE is -2 as a signed byte. The offset is counted from the end of the two-byte instruction, so the jump lands back on itself. Hence the spin.
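
A small sketch of the arithmetic behind that patch, assuming the short jump form (opcode 0xEB followed by one signed displacement byte). The helper name `rel8_target` and the address 0x1000 are made up for illustration, and address wraparound is ignored. Note that 0xFE sign-extends to -2 (0xFF would be -1), and the displacement is applied to the address of the next instruction.

```python
def rel8_target(address: int, code: bytes) -> int:
    """Return the destination of a short jump (EB rel8) located at `address`."""
    if len(code) != 2 or code[0] != 0xEB:
        raise ValueError("expected a two-byte EB rel8 jump")
    disp = code[1]
    rel = disp - 0x100 if disp >= 0x80 else disp  # sign-extend the byte
    next_instruction = address + len(code)        # offsets count from here
    return next_instruction + rel


spin = bytes([0xEB, 0xFE])
print(hex(rel8_target(0x1000, spin)))  # 0x1000: the jump targets itself
```

By the same arithmetic, `EB 00` is a jump to the very next instruction, which makes it a two-byte no-op.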

hornd 3 months ago

What does the 0eh comment mean?
