I made a programming language with M&Ms



What if a little pile of M&Ms on a table was a real program?

I mean literally. Imagine you arrange M&M-like candies into a specific pattern, and that pattern is executable code.

Alright, story time. Featuring an inline interactive interpreter embedded right inside this post.

It all started when I spilled a full packet of GEMS (sort of an Indian version of M&Ms) on the floor cus I opened (ripped?) the packet a bit too hard.

It fell into an interesting pattern that I could only describe as the shape of an arrow.

Random patterns, fractals, and interpreting nonsense into structure are hobbies that entice me. I am somewhat of an apophenic when it comes to these things.

The colors, the placements, and the structure of what I saw dropped a silly idea into my mind's eye. What if I could write programs with M&Ms? This is the story of one of my many silly little projects.

Poster illustration for MNM Lang showing candy code on the left and a terminal-style branching tree on the right

abstract art of m&ms being parsed

Seeing the spilled candy on the floor, a few constraints dawned on me...

  • There are only six useful colors.
  • A photo is a terrible place to store exact symbolic data.
  • Candy is round, glossy, messy, and inconveniently physical.
  • Strings are a disaster if you try to cram them into an image.
  • If this thing is going to be funny, it still has to actually work.

So I built it.

The result is MNM Lang, a tiny programming language where:

  • source code is written as runs of six letters: B G R Y O N
  • those runs compile into a PNG made from candy sprites
  • the PNG decompiles back into source exactly
  • and a controlled photo decoder can recover programs from mildly skewed images (I hope that works)

There is a CLI, a browser playground, example programs, tests, and a sprite pack generated specifically for the project.

And this is obviously not a practical language. It is a serious implementation of a silly idea.

The core problem

If you only have six candy colors, how do you build a language that is:

  • easy to place by hand
  • easy to read from a photo
  • expressive enough to run real examples
  • and small enough that the whole bit stays funny?

My answer was: encode instructions by color family, and encode operands by count.

That means a token like this:

BBB

isn’t “three arbitrary blue things.” It means a specific opcode.

And a token like this:

RRRR

means the integer literal 3, because operand values are len(token) - 1.

That single rule ended up doing a lot of work for me:

  • it is easy to author in text
  • it is easy to render into image cells
  • it is easy to reconstruct from image geometry
  • and it feels appropriately ridiculous
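The whole rule fits in a few lines. Here is a hypothetical sketch (the names and structure are mine, not the project's actual implementation):

```python
# Hypothetical sketch of the color-run rule; not the project's real code.
# The first token on a row is an opcode (its length picks the variant);
# later tokens are operands whose value is len(token) - 1.

COLOR_FAMILIES = {
    "B": "control flow",
    "G": "stack and variables",
    "Y": "math",
    "O": "I/O",
    "N": "labels and strings",  # N stands for brown
    "R": "stack shuffling and logic",
}

def decode_token(token: str) -> tuple[str, int]:
    """Return (color family, count) for a candy run like 'RRRR'."""
    letter = token[0]
    assert token == letter * len(token), "a token is one repeated letter"
    return COLOR_FAMILIES[letter], len(token) - 1

print(decode_token("RRRR"))  # ('stack shuffling and logic', 3)
```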

You can explain the language to someone in about thirty seconds:

“Blue clusters are control flow, green is stack and variables, yellow is math, orange is I/O, brown is labels and strings, red is stack shuffling and logic. If you want the number five, use six red candies.”

Whether or not that's intuitive is not a question I can answer at this time.

Images are bad at text

The earliest fork in the road was strings.

I could have tried to encode text directly into candy layouts. Maybe invent a micro-alphabet. Maybe use rows of yellow and red as bytes. Maybe do some cursed base-6 trick.

That would have been technically possible and spiritually awful.

The fun part of the project is the visual structure, not building an OCR-resistant QR code out of sugar shells.

So I pushed strings and initial variables into a sidecar JSON file.

That means a program has two parts:

  1. the visual candy layout in .mnm
  2. the non-visual runtime data in .mnm.json

For example, hello world is:

OO Y
OOOOOO
BBBBBB

And because the whole bit only works if that text turns into an actual candy program, here is the compiler output for it:

The hello_world MNM Lang program rendered as candy sprites

`hello_world`, but in snack form

And its sidecar is:

{
  "strings": ["Hello, world!"],
  "variables": [],
  "inputs": {
    "int": [],
    "str": []
  }
}

And because I apparently have no sense of restraint, this page can also run that exact little program inline:

That split ended up making the whole system cleaner:

  • the image only carries what images are good at: structure
  • runtime input can change without moving candy
  • the photo decoder does not have to pretend it can read prose from glossy candy

Sometimes the correct answer in a whimsical project is to stop being whimsical for one layer of the stack.

A language made of six colors

Once strings moved out of the image, the language itself fell into place pretty quickly.

I grouped instructions by color family:

  • blue: jumps, calls, halt
  • green: push/load/store/dup/pop/inc/dec
  • yellow: arithmetic and comparisons
  • orange: printing and input
  • brown: labels and string operations
  • red: swap, rotate, boolean logic

And then I made the first token on every row the opcode.

That gives the language a very physical feel. A line is an instruction. A cluster of candies is a token. More candies means a different variant.

It is almost closer to arranging game pieces than writing code.

The full programs still look absurd, which I consider a success. This is the opening stretch of the factorial example:

OOO O
GGG G
G RR
GGG GG
N B
GG G
G RR
YYYYYYYY
BB BB

Fed through the renderer, that opening section looks like this:

The opening stretch of the factorial example rendered as candy sprites

this is a real program

If you already know the rules, you can decode that as:

  • read integer queue 0
  • store into variable 0
  • push 1
  • store into variable 1
  • label 0
  • load variable 0
  • push 1
  • compare >
  • jump-if-zero to label 1

Which means, yes, I wrote a looping factorial program out of candy.
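That decoded listing is ordinary stack-machine code. A minimal stack VM in that spirit, with my own made-up opcode names (not the project's actual interpreter), can run the same factorial logic:

```python
# Hypothetical mini stack VM in the spirit of MNM Lang; not its real interpreter.
# Runs the factorial logic decoded above: read n, acc = 1, loop while n > 1.
def run(program, int_queue):
    stack, vars, ip = [], {}, 0
    labels = {a: i for i, (op, a) in enumerate(program) if op == "LABEL"}
    while ip < len(program):
        op, arg = program[ip]
        ip += 1
        if op == "READ_INT": stack.append(int_queue.pop(0))
        elif op == "PUSH":   stack.append(arg)
        elif op == "STORE":  vars[arg] = stack.pop()
        elif op == "LOAD":   stack.append(vars[arg])
        elif op == "GT":     b, a = stack.pop(), stack.pop(); stack.append(int(a > b))
        elif op == "MUL":    b, a = stack.pop(), stack.pop(); stack.append(a * b)
        elif op == "SUB":    b, a = stack.pop(), stack.pop(); stack.append(a - b)
        elif op == "JMP":    ip = labels[arg]
        elif op == "JZ" and stack.pop() == 0: ip = labels[arg]
    return vars

prog = [
    ("READ_INT", None), ("STORE", 0),                        # var0 = n
    ("PUSH", 1), ("STORE", 1),                               # var1 = acc = 1
    ("LABEL", 0),
    ("LOAD", 0), ("PUSH", 1), ("GT", None), ("JZ", 1),       # while n > 1
    ("LOAD", 1), ("LOAD", 0), ("MUL", None), ("STORE", 1),   # acc *= n
    ("LOAD", 0), ("PUSH", 1), ("SUB", None), ("STORE", 0),   # n -= 1
    ("JMP", 0),
    ("LABEL", 1),
]
print(run(prog, [5]))  # {0: 1, 1: 120}
```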

The only correct compiler target was an image

If the whole gimmick is “this program is candy,” the compiler cannot stop at an AST.

It has to emit an image.

So the compiler takes normalized .mnm source and renders it on a fixed grid:

  • one source character per cell
  • spaces become empty cells
  • cells hold transparent-background candy sprites
  • the output is a PNG
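A minimal sketch of that fixed-grid rendering, assuming Pillow and stand-in sprites (the real renderer and its exact cell size may differ):

```python
# Hypothetical sketch of the fixed-grid renderer; not the project's real code.
from PIL import Image

CELL = 128  # assume one 128x128 sprite cell per source character

def render(source: str, sprites: dict) -> Image.Image:
    rows = source.splitlines()
    cols = max(len(row) for row in rows)
    canvas = Image.new("RGBA", (cols * CELL, len(rows) * CELL), (0, 0, 0, 0))
    for y, row in enumerate(rows):
        for x, ch in enumerate(row):
            if ch != " ":  # spaces stay as empty cells
                canvas.alpha_composite(sprites[ch], (x * CELL, y * CELL))
    return canvas

# With stand-in solid-color "sprites", hello_world renders to a 6x3-cell PNG:
sprites = {c: Image.new("RGBA", (CELL, CELL), (255, 0, 0, 255)) for c in "BGRYON"}
img = render("OO Y\nOOOOOO\nBBBBBB", sprites)
print(img.size)  # (768, 384)
```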

That fixed geometry turned out to be a huge win, because it made the reverse direction almost trivial.

If an image came from the compiler, the decompiler can:

  • recover the exact row/column count from the canvas size
  • sample each cell
  • classify it as blue/green/red/yellow/orange/brown/blank
  • strip trailing spaces
  • and re-parse the result

That gives an exact round-trip:

source → PNG → source

with no heuristics at all.

In other words: the “compiler” is also a tiny image format.
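Going the other way, the exact decompile path is pure arithmetic on the canvas size. A hypothetical sketch, with the per-cell pixel sampling abstracted behind a stand-in `classify` callback:

```python
# Hypothetical sketch of the exact decompile path; not the project's real code.
CELL = 128  # assume the compiler emits one 128x128 cell per source character

def decode_grid(width: int, height: int, classify) -> str:
    """Rebuild source from a compiler-emitted PNG's fixed geometry.

    `classify(col, row)` stands in for sampling one cell and returning
    its letter ('B', 'G', ...) or ' ' for a blank cell.
    """
    cols, rows = width // CELL, height // CELL
    lines = []
    for r in range(rows):
        line = "".join(classify(c, r) for c in range(cols))
        lines.append(line.rstrip())  # strip trailing spaces before re-parsing
    return "\n".join(lines)

# A fake classifier standing in for real pixel sampling:
cells = {(0, 0): "O", (1, 0): "O", (3, 0): "Y"}
src = decode_grid(4 * CELL, 1 * CELL, lambda c, r: cells.get((c, r), " "))
print(src)  # "OO Y"
```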

I generated the candy sprites with an image model

One of my favorite parts of the project is that I didn’t hand-draw the sprites.

I used AI image generation (via a Codex Skill, a reusable capability you can give to Codex for specialized tasks like image generation) to create six M&M-style candy tokens:

  • blue
  • green
  • red
  • yellow
  • orange
  • brown

The raw generations were decent, but not directly usable. They came with a few annoying traits:

  • too much studio backdrop
  • a bit of inconsistent shadow
  • minor scale differences

So the final asset pipeline became:

  1. generate six isolated candies with transparent-background prompts
  2. normalize them with a small script
  3. crop and center them onto a canonical 128x128 canvas
  4. extract palette metadata for the decompiler and photo classifier

Not conceptually. Literally. The checked-in prompt bundle for the sprite pack looks like this:

{
  "prompt": "a single blue candy-coated chocolate lentil ... isolated on a transparent background",
  "composition": "one candy only, top-down, centered, consistent scale",
  "constraints": "transparent background; no logo; no text; no watermark",
  "out": "blue.png"
}

And the normalization script starts by estimating the backdrop and isolating the largest candy blob:

background = np.median(border, axis=0)  # border: edge pixels sampled from the image
distance = np.linalg.norm(rgb - background, axis=2)
threshold = filters.threshold_otsu(distance)
mask = distance > max(18.0, float(threshold) * 0.9)
labeled = measure.label(mask)  # connected components of the foreground mask
regions = sorted(measure.regionprops(labeled), key=lambda r: r.area, reverse=True)
return labeled == regions[0].label  # keep only the largest candy blob

Then it scales and centers that cutout onto the canonical sprite canvas:

available = CANVAS_SIZE - (PADDING * 2)
scale = min(available / cropped.width, available / cropped.height)
resized = cropped.resize((round(cropped.width * scale), round(cropped.height * scale)))
canvas = Image.new("RGBA", (CANVAS_SIZE, CANVAS_SIZE), (0, 0, 0, 0))
x = (CANVAS_SIZE - resized.width) // 2
y = (CANVAS_SIZE - resized.height) // 2
canvas.alpha_composite(resized, (x, y))

And finally it writes the palette metadata that the decompiler and photo classifier both use later:

# for each sprite: `array` is its RGBA pixel array, `color` its name
rgb = array[..., :3][array[..., 3] > 0]  # sample opaque pixels only
mean_rgb = tuple(int(round(value)) for value in rgb.mean(axis=0))
palette[color[0].upper() if color != "brown" else "N"] = mean_rgb

That normalization step mattered a lot more than I expected. If the shadows are too strong, candies that are supposed to be separate blobs start merging after blur and perspective transforms. That sounds like a silly implementation detail, but it is exactly the sort of thing that determines whether “photo decoding” is real or fake.

Projects like this are fun because the silly part and the engineering part keep interfering with each other in useful ways.

I almost talked myself into training a model

When you say “image decoding,” your brain immediately offers to make the project bigger than it needs to be.

I had the same impulse:

  • maybe I should train a tiny classifier
  • maybe synthesize candy crops
  • maybe build the MNIST-for-M&Ms pipeline

That would be fun. It is also not necessary for v1.

The version I shipped uses deterministic image processing for the photo decoder:

  • estimate background color from the border
  • segment candy-like foreground blobs
  • classify each blob against the canonical six-color palette
  • cluster the blobs into rows
  • infer spaces from centroid gaps
  • re-parse the reconstructed source
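The classification step is just nearest-color matching in RGB. A sketch with made-up palette values (the real palette comes from the sprite metadata):

```python
# Hypothetical nearest-palette classifier; the RGB values are made up
# for illustration, not the project's actual sprite palette.
PALETTE = {
    "B": (40, 80, 200), "G": (50, 160, 70), "R": (200, 40, 40),
    "Y": (230, 200, 40), "O": (230, 120, 30), "N": (110, 70, 40),
}

def classify(mean_rgb):
    """Pick the palette letter nearest to a blob's mean color."""
    def dist(letter):
        return sum((a - b) ** 2 for a, b in zip(mean_rgb, PALETTE[letter]))
    return min(PALETTE, key=dist)

print(classify((210, 50, 45)))  # R
```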

This works surprisingly well for the target use case:

  • overhead photo
  • plain contrasting background
  • separated candies
  • mild blur
  • small rotation or perspective skew

It absolutely does not solve “dumped a bag of candy on a messy kitchen table and took a dramatic iPhone shot.”

Real example programs are where the joke becomes a language

I didn’t want this to stop at “hello world with candy colors.”

So I added a few examples that push on different parts of the language:

hello_world

Pure output. Basically the proof that the whole pipeline exists.

echo_name

Uses a string queue and concatenation to greet the input name from the sidecar.

factorial

This is where it starts feeling real:

  • labels
  • variable mutation
  • arithmetic
  • conditionals
  • loops

fizzbuzz

Mandatory. Also unexpectedly good at showing off the design because it uses:

  • modulo
  • branching
  • string slots
  • repeated output
  • a small amount of state

Watching fizzbuzz compile into a candy grid and then run correctly is exactly the kind of payoff I wanted from the project.

At that point it stops being “a cursed novelty syntax” and starts being “okay, this is a legitimate little VM that happens to look like a snack.”

The browser playground made it feel like a real toy

The CLI is the serious interface:

  • compile
  • decompile
  • run
  • serve
  • list examples

But the browser playground is what makes the repo inviting.

It lets you:

  • load a shipped example
  • edit source
  • edit sidecar JSON
  • render the candy-sheet preview
  • run it immediately
  • upload an image and decode it back into source

I also added two views that made the whole thing feel much more like a real language toolchain instead of a cursed renderer demo:

  • a tree-formatted AST showing what the parser believes each candy row means
  • a tree-formatted execution trace showing which branches the interpreter actually took at runtime

For a tiny program like hello_world, the AST stays pleasantly readable:

Program (3 instruction(s))
|-- labels
|   `-- (none)
`-- instructions
    |-- [0] PRINT_STR @ line 1 (string[0] from Y)
    |   `-- source: OO Y
    |-- [1] NEWLINE @ line 2
    |   |-- source: OOOOOO
    |   `-- operands: (none)
    `-- [2] HALT @ line 3
        |-- source: BBBBBB
        `-- operands: (none)

And the execution trace is exactly the kind of thing I wanted once the language had loops and branches. Here is a clipped excerpt from factorial, right around the point where the loop either keeps going or breaks out:

Execution
|-- [step 8] [ip=7] GT @ line 8
|   `-- state: stack=[1] vars=[5, 1]
|-- [step 9] [ip=8] JZ (label[1] from BB) @ line 9
|   |-- branch: fallthrough -> instruction[9]
|   `-- state: stack=[] vars=[5, 1]
...
|-- [step 52] [ip=7] GT @ line 8
|   `-- state: stack=[0] vars=[1, 120]
|-- [step 53] [ip=8] JZ (label[1] from BB) @ line 9
|   |-- branch: taken -> label[1] @ instruction[15]
|   `-- state: stack=[] vars=[1, 120]

That same tree output now shows up in both the CLI and the browser UI, which is nice because candy code is way funnier once you can also inspect it like a real compiler/runtime pipeline.

So here is the same idea, but actually live:

We need tests

I designed the interpreter, but the code was mostly written by GPT 5.4 XHigh via Codex.

And vibe coding calls for tests, cus what if it reward hacked my idea into existence? (Reward hacking is when an AI optimizes for the metric you gave it rather than the goal you meant: passing tests without actually solving the problem.)

So I wrote tests for the actual guarantees:

  • parser validation
  • runtime semantics
  • example golden outputs
  • exact source/PNG/source round-trips
  • synthetic photo decoding with blur, rotation, and perspective skew
  • API behavior
  • a playground-style smoke flow
  • sprite asset sanity checks

One of the bugs I hit was that the photo decoder accidentally treated fully opaque RGB images as if their alpha channel meant foreground everywhere, which turned the entire canvas into a single blob. That sounds obvious once you know it, and it is exactly the kind of mistake I wanted to catch.
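The guard that bug implies can be sketched like this (hypothetical names, assuming numpy RGBA arrays; not the project's actual fix):

```python
# Hypothetical sketch of the alpha-channel guard; not the project's real fix.
import numpy as np

def foreground_mask(rgba):
    """Use alpha as the foreground mask only when it actually varies.

    A fully opaque image (alpha == 255 everywhere) says nothing about
    which pixels are foreground, so return None and let the caller fall
    back to background-color segmentation instead.
    """
    alpha = rgba[..., 3]
    if alpha.min() == alpha.max():  # constant alpha carries no information
        return None
    return alpha > 0

opaque = np.full((4, 4, 4), 255, dtype=np.uint8)
print(foreground_mask(opaque))  # None
```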

Another was that the sprite normalization kept too much drop shadow, which caused nearby candies to merge after blur. Again: a ridiculous bug, but a real one.

The tests are what separate “look, I rendered candy once” from “this is an actual system with constraints and failure modes.”

The best part of the project is the tradeoff it forces

Every joke project has a point where you decide whether you are going to protect the joke or protect the implementation.

MNM Lang kept forcing me to do both.

That is how you end up with rules like:

  • blue cluster width decides which branch instruction you mean
  • red run length encodes integer literals
  • strings live in JSON because candy OCR is a terrible life choice
  • compiled PNGs are exact but photos are “controlled” on purpose

None of that is language design orthodoxy.

All of it is completely justified by the premise... I tell myself.

If you want to try it

GitHub: mufeedvh/mnmlang

The repo includes:

  • the interpreter
  • the photo decoder
  • the candy sprites
  • example programs
  • the local playground

The best first command is probably:

uv run mnm serve

Load fizzbuzz, render it, and look at the compiled PNG for a second.

It really does look like a programming language you could pour out of a bag.

So stupid.

Oh and I have more silly projects. This is #1 of the series. Tune in for how I reverse engineered my keyboard's driver binary to play snake with the backlights while my agents run in the background.

Follow me on 𝕏