I decided to build a nine-bit computer

madned.substack.com

180 points by mad_ned 4 years ago · 70 comments

pjdesno 4 years ago

Long ago I interned for the group supporting the C-30 ARPANET IMP. At one point the IMP was a 16-bit machine, emulating the old Honeywell (?) minicomputer that the original IMP code was written for. At some point they needed more memory, so they lashed on another 4-bit bit slice, and it became a 20-bit machine.

There was an alternate microcode load for it, which implemented an instruction set similar to that of a PDP-11, and could run an ancient version of Unix. (maybe not so ancient back in 1985, but definitely pre-BSD) We used one or two of those for our development machines, and it was my job to write software tools on them, using C with 20-bit words and 10-bit bytes.

Man, it was a pain in the ass.

  • larsbrinkhoff 4 years ago

    The CPU was called MBB (Microprogrammable Building Block).

    Here's the C/30 Programmer's Reference for the "Native Mode Firmware System". That was the software that ran in the native 20-bit mode rather than emulating the 16-bit Honeywell mini.

    https://walden-family.com/impcode/c30-nmfs-programmers-refer...

  • googamooga 4 years ago

    I have a working PDP-11/23 in my possession, and for the last couple of months I've been trying to convince myself that I have enough soldering skill to solder four additional address lines onto its backplane to raise the memory limit from 256KB to 4096KB. Otherwise, even if I install a processor able to work with 22-bit addresses, only 18-bit addressing will be available.

  • linguistiliniw 4 years ago

    IIRC MSP430 microcontrollers do something like that, too. They have low-power 16-bit CPUs, but you can't do a lot with 64k memory space so newer models have instructions with 20-bit addressing.

    They're fun to work with; some versions have FRAM instead of Flash memory.

    • wildzzz 4 years ago

      FRAM is pretty neat. I considered using an MSP430FR5969-SP for a component on a satellite, but backed off when I learned more about the destructive read process. We were concerned about a transient bit flip during the write-back and had nowhere safe to store program code other than the FRAM itself. Instead, we went with a RISC-V core in an FPGA and used an 8-channel ADC to cover the telemetry reporting the MSP430 was supposed to handle. I spent a lot of time writing fast, safe code for it, so I was a little disappointed it never made it into the final design.

  • larsbrinkhoff 4 years ago

    The Unix machine was called C70, right?

giovannibajo1 4 years ago

The Nintendo 64 had 9-bit RAM (Rambus RDRAM). Only 8 bits of each byte were accessible from the MIPS CPU, for obvious reasons; the 9th bit was only used by the GPU (called "RDP") to store extra information while rendering (being a UMA architecture, the GPU used the same RDRAM as the CPU). Typically it contained a flag called "coverage" that was used to discriminate pixels on the edges of polygons, which were later subject to antialiasing. When reading back pixels with the CPU, you were unable to see the coverage flag.

  • klodolph 4 years ago

    The 9th bit was also used by the depth buffer.

    To add to this, the reason that the RAM was available with 9 bits in the first place is so that it could be used to make systems with ECC. It's just that you didn't have to use that 9th bit for error correction, you could use it for extra data, if you designed the system to use it that way.

  • phaedrus 4 years ago

    Earlier today I was idly wondering: if aliens invented computers and went through the whole 8-bit era, 16-bit era etc. - would they perhaps have developed architectures which had additional "shadow" or "tag" bits on every word? What might they use them for?

    The idea being maybe an intelligent race with different balance of motivations might not do only the minimum economical thing, instead being willing to trade X% of memory bits or X% of instructions per second or X% more chips for other purposes. For example extra tag bits might be used to encode where the data came from or its datatype, or additional clock cycles between instructions might be used to run a reliability check on the program as it executes, etc.

  • wolfram74 4 years ago

    Yet another weird bit of N64 lore. I'm amazed mupen64 works as well as it does.

googamooga 4 years ago

Setun, a ternary-logic-based computer with 9-bit "bytes", was developed in the late fifties in the USSR. Not much info about it in English, unfortunately.

https://en.wikipedia.org/wiki/Setun

  • buescher 4 years ago

    Ternary on binary systems usually uses a 2-bit "trit" with an extra potential state. That's even how the later Setun machines did it, from what I understand. Oh yes, and the Soviets even developed their own trinary Forth dialect for them.
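
    As a concrete illustration of that 2-bit "trit" scheme, here is a minimal C sketch; the names and the particular bit-pattern assignment are my own hypothetical choices, not anything from Setun:

    ```c
    #include <stdio.h>

    /* One balanced-ternary trit stored in 2 bits. Encoding 0 -> 0,
     * 1 -> +1, 2 -> -1 is an arbitrary choice; the fourth bit
     * pattern (3) is simply never used, wasting one state per trit. */
    enum trit { TRIT_ZERO = 0, TRIT_POS = 1, TRIT_NEG = 2 };

    /* Arithmetic value of a stored trit: 0, +1 or -1. */
    static int trit_value(unsigned t) { return t == TRIT_NEG ? -1 : (int)t; }

    /* Decode a 9-trit word packed into an 18-bit field,
     * least-significant trit first. */
    static long word_value(unsigned packed)
    {
        long value = 0, weight = 1;
        for (int i = 0; i < 9; i++) {
            value += trit_value((packed >> (2 * i)) & 3u) * weight;
            weight *= 3;
        }
        return value;
    }

    int main(void)
    {
        /* Trits (+1, 0, -1) from least significant: 1 + 0*3 - 1*9 = -8. */
        unsigned w = (TRIT_POS << 0) | (TRIT_ZERO << 2) | (TRIT_NEG << 4);
        printf("%ld\n", word_value(w));   /* prints -8 */
        return 0;
    }
    ```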

    • kragen 4 years ago

      Even the original Setun machine worked that way, as it turned out, if we believe Willis Ware's contemporary report on Soviet computers.

krallja 4 years ago

I hope your address bus is three nonads wide (3^3 bits = 128MiB address space).

It would be thematically even better if you used ternary logic, but I’m not sure that FPGA can handle more than two voltage levels.

  • mad_nedOP 4 years ago

    haha yes, excellent suggestions! I did think about ternary logic actually, but I don't know of an FPGA that supports it. I even considered creating a primitive that burns 2 register bits to approximate it, just throwing away the 4th state and pretending I have 3-state logic on all the layers above. But I have enough on my hands just trying to get the stupid timing working on a simple CPU. I'm not actually a CPU designer so I don't really know what I'm doing lol.

    • enriquto 4 years ago

      Throwing away 25% of your bits sounds wasteful... what you need is a moderately large power of 2 that is very close to a power of 3. These can be found by computing the continued fraction of log(3)/log(2). The sequence of convergents (with some intermediate fractions) starts: 2/1, 3/2, 5/3, 8/5, 11/7, 19/12, 46/29, 65/41, 84/53. Some good choices seem to be 2^8 - 3^5 = 13 (loses about 5%) or 2^46 ≈ 3^29 (loses about 2.5%).
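
      For the curious, a small C sketch (my own, purely illustrative) that reproduces this arithmetic: it walks the continued fraction of log2(3) and prints, for each convergent p/q, how much of the 2^p space goes unused when packing 3^q ternary values into it. A negative percentage means 3^q overshoots 2^p, since convergents alternate around the target.

      ```c
      #include <math.h>
      #include <stdio.h>

      int main(void)
      {
          /* Continued-fraction convergents p/q of log2(3) ~ 1.58496,
           * so that 2^p is close to 3^q. */
          double x = log(3.0) / log(2.0);
          long p0 = 1, q0 = 0;               /* convergent n-2 */
          long p1 = (long)x, q1 = 1;         /* convergent n-1: 1/1 */
          double frac = x - (double)(long)x;
          for (int i = 0; i < 8; i++) {
              long a = (long)(1.0 / frac);   /* next partial quotient */
              frac = 1.0 / frac - (double)a;
              long p = a * p1 + p0, q = a * q1 + q0;
              /* Fraction of the 2^p code space not covered by 3^q values;
               * negative means 3^q doesn't fit in p bits. */
              double waste = 1.0 - exp(q * log(3.0) - p * log(2.0));
              printf("2^%ld vs 3^%ld: waste %+.2f%%\n", p, q, 100.0 * waste);
              p0 = p1; q0 = q1; p1 = p; q1 = q;
          }
          return 0;
      }
      ```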

      • amelius 4 years ago

        You can also detect Z state by driving the input high, reading, then driving the input low, reading. If both reads are different, then you have a Z state. Otherwise, the input is the read state.

        Of course, drive the input through a resistor.
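
        A sketch of that probing trick in C; gpio_drive() and gpio_read() are hypothetical stand-ins for whatever GPIO API your platform provides, and the drive pin is assumed to reach the shared node through the series resistor mentioned above:

        ```c
        /* Tri-state line probe: drive_pin reaches the shared node
         * through a series resistor; sense_pin reads the node directly. */
        enum line_state { LINE_LOW, LINE_HIGH, LINE_FLOATING };

        void gpio_drive(int pin, int level);  /* drive pin high (1) or low (0) */
        int  gpio_read(int pin);              /* sample pin as 0 or 1 */

        enum line_state probe_line(int drive_pin, int sense_pin)
        {
            /* Pull the node high through the resistor and sample it:
             * a floating peer follows our weak pull, a driving peer wins. */
            gpio_drive(drive_pin, 1);
            int high_read = gpio_read(sense_pin);

            /* Pull low and sample again. */
            gpio_drive(drive_pin, 0);
            int low_read = gpio_read(sense_pin);

            if (high_read != low_read)
                return LINE_FLOATING;                 /* followed our pulls: Z */
            return high_read ? LINE_HIGH : LINE_LOW;  /* peer is driving */
        }
        ```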

    • buescher 4 years ago

      This is a fantastic hobby project. Have you thought about doing something with the "extra" bit along the lines of tagging bytes for type or garbage collection or whatever like the lisp machines?

  • tyingq 4 years ago

    >I’m not sure that FPGA can handle more than two voltage levels

    There is a high-Z (high impedance) state you can set I/O pins to for a third state, but no way to detect that high impedance state from the FPGA. It's just used to share an output line with more than one pin. You could make a peripheral that could detect the three states though, with a voltage divider and an analog input.
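
    For the analog variant, a minimal sketch of the classification step, assuming the divider biases a floating line to roughly mid-rail and a 12-bit ADC; the thresholds are illustrative only:

    ```c
    /* Three-state input read via voltage divider + ADC: driven low
     * sits near 0, driven high near full scale, and a floating
     * (high-Z) line is biased to mid-rail by the divider. */
    enum tri_state { TRI_LOW, TRI_HIGH, TRI_Z };

    enum tri_state classify_sample(unsigned adc_counts)  /* 0..4095 */
    {
        if (adc_counts < 1024) return TRI_LOW;   /* below ~25% of full scale */
        if (adc_counts > 3072) return TRI_HIGH;  /* above ~75% of full scale */
        return TRI_Z;                            /* near mid-rail: floating  */
    }
    ```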

onion2k 4 years ago

I took my Macbook to pieces and there were lots more than 64 bits.

  • selcuka 4 years ago

    > I took my Macbook to pieces

    How did you do that? I couldn't even remove the battery to replace it.

  • errcorrectcode 4 years ago

    2^(>64) combinations to put it back together. I find the empty set plus a Hackintosh seems to have fewer bits but work much faster. It must be those repairable qubits.

IshKebab 4 years ago

> The answer to why you would still want to build an FPGA system is (and always has been) speed.

> So I quickly gave up on creating something that could only exist on my FPGA board

I've been doing some FPGA stuff and I think that's the wrong way to look at it. Yes, FPGAs are often useful when you need raw speed, but that's not their only advantage over CPUs. You also get extremely low latency and direct control of I/O pins. With software you are limited to the existing hardware peripherals, but with an FPGA you can make your own!

  • horsawlarway 4 years ago

    I agree with you.

    I had a brief stint in hardware design, and an FPGA is almost always going to be worse than dedicated hardware for a task, but it's extraordinarily flexible.

    Most workflows I saw: you design the hardware on the FPGA (hugely useful for quick testing and prototyping), then you outsource and actually build a custom chip if you really want speed.

    It's also a great polyfill tool, since it can take the place of a lot of other hardware peripherals at a moment's notice.

    • bandq 4 years ago

      Hey horsawlarway,

      This is definitely the wrong way to ask but in another thread about the PineNote, you mentioned you have a flow to sync your RM2 to Bookstack -- are your scripts on github or any other place? If not, would you consider making them public or posting on Reddit /r/selfhosted/ and/or /r/RemarkableTablet/ ?

      I'm new to HN (as a poster, long-time lurker) and I didn't find a way to directly send you a message.

      • horsawlarway 4 years ago

        Unfortunately I have not made them public. I'd have to go and clean up a lot of places where I have hardcoded auth.

        It's not going to be tomorrow (or probably any time in the next month or two) but I have been intending to do the cleanup at some point.

        If I post them, I'll shoot you a link.

vient 4 years ago

See also cLEMENCy 9-bit middle-endian (sic) arch from DEF CON CTF 2017

https://2017.notmalware.ru/89dc90a0ffc5dd90ea68a7aece686544/... (link from https://blog.legitbs.net/2017/07/the-clemency-architecture.h...)

  • nneonneo 4 years ago

    Ah, I have fond memories of hacking on that architecture for DEF CON. We wrote a lot of tools for it: by the end (less than 3 days after getting the spec), we had disassemblers, debuggers, binary rewriters, and even rudimentary decompilation support. It was quite a fun journey :)

sillyquiet 4 years ago

Regarding the bit in there that I found interesting, about the advantage of FPGAs over an SBC like the Pi (speed): does anybody know of any blogs or projects where an FPGA's speed helped in a hobby project where software running on an SBC wasn't fast enough? I can imagine a few, mostly real-time projects involving expensive computations (image or pattern recognition, maybe), but I would love to see some concrete examples.

  • PragmaticPulp 4 years ago

    Basically anything with significant real-time requirements or high bandwidth requires an external FPGA or microcontroller.

    Embedded Linux is great, but if you’re trying to do something like read from a high-speed ADC then the only way to do it is with an FPGA. The FPGA reads from the ADC at precise intervals and buffers the data. The embedded Linux system can then periodically read the buffer with all of the jitter and latencies that come with using Linux.

    Virtually every Linux-based software-defined radio, oscilloscope, and logic analyzer works on this architecture. For lower speeds you can get away with a microcontroller running bare-metal code to do the buffering, but the high-speed stuff enters the domain of FPGAs.
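
    As a sketch of the Linux side of that split, assuming the FPGA exposes its sample buffer as a character device; the /dev/adc_fifo node name and chunk size here are hypothetical:

    ```c
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>

    #define CHUNK_SAMPLES 4096   /* read granularity; arbitrary */

    int main(void)
    {
        /* The FPGA samples the ADC on its own precise clock and
         * queues the results behind this (made-up) device node. */
        int fd = open("/dev/adc_fifo", O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        uint16_t samples[CHUNK_SAMPLES];
        for (;;) {
            /* Linux-side jitter is harmless here: the sample timing
             * was fixed in hardware, we only need to drain the buffer
             * faster than the FPGA fills it on average. */
            ssize_t n = read(fd, samples, sizeof samples);
            if (n <= 0) break;
            /* ... process n / 2 samples ... */
        }
        close(fd);
        return 0;
    }
    ```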

    • coryrc 4 years ago

      > read from a high-speed ADC

      You just have the peripheral DMA and flag/interrupt when done. If you need an "immediate" reaction you use a DSP. There are only so many useful calculations you can do with a single input stream and DSP can handle them.

    • andai 4 years ago

      Noob question, in this instance would a realtime OS or a unikernel also solve the problem?

      • tyingq 4 years ago

        It's better, but still not the same level of timing guarantees. I suppose, left to right, you would have something like:

        SBC/Linux -> SBC/Real-time OS -> General Purpose MCU -> Specialized MCU (Parallax Propeller, for example) -> FPGA/CPLD/DSP

        With perhaps some additions to the diagram to account for bit-banging vs actual drivers, speeds where some portion of the left side just isn't fast enough to even kind-of work, slow clock MCUs vs fast clock MCUs, etc.

        • sbierwagen 4 years ago

          There are also some hybrid SoCs, like TI's Sitara chips that the BeagleBone is built around, which have an ARM core for Linux plus a couple of MCU cores for doing fast-ish realtime stuff. (TI's FAQ says a four-instruction busy-loop that just toggles a pin can run at about 50MHz on a PRU.)

      • emteycz 4 years ago

        The problem is, machine code is not the lowest level, there is also processor microcode. Machine code doesn't give you a hard real-time guarantee, the execution is still too approximate. FPGA enables you to work on/below the microcode level.

  • FredFS456 4 years ago

    RF/radio is one solid application. Can't really do the signal processing fast enough on an SBC.

    • carlsonmark 4 years ago

      Not just the signal processing on the received data, but if you want to transmit something, you will probably be using one or more DDS channels to do so. Those may be in the FPGA, or external chips. Either way, if you are mixing the outputs of the DDS, being off by a single clock cycle can cause your transmitted data to be complete garbage.

      With an FPGA and external DDS chips, this is difficult to do just because of mismatches in PCB trace lengths and/or small temperature fluctuations. With a microcontroller, it is nearly impossible to do even when using DMA because of memory bus contention.

    • sillyquiet 4 years ago

      Ohh, good one. mmWave radar projects, I guess, will fall into this category too.

  • undersuit 4 years ago

    There's a growing community using an Altera Cyclone SBC to create faithful recreations of retro gaming machines. Software emulation on the similarly sized Raspberry Pi limits you, and the MiSTer is much more compact than a desktop computer that does have the power for accurate software emulation.

    https://www.retrorgb.com/mister.html

  • tyingq 4 years ago

    Driving LED matrix displays is a good example, since they require good adherence to timing on the output signal, especially at high refresh rates. There are lots of hobby projects that get away with just using the CPU, but you're throwing a lot of horsepower at something a cheap CPLD could handle fine. There are also solutions like using the "PRU" in a BeagleBone to drive the display... the PRU is basically a microcontroller that can share memory with the CPU, but can work in a more real-time fashion.

    So it's not always raw speed, per se, but anything that's sensitive to timing. Linux on a PI can be busy doing something else and miss a critical time to have output (or read) something. An FPGA based solution is working with known loop/io/etc times that don't change.

    • marktangotango 4 years ago

      That's interesting, looks like the PRU is built into the AM3358 SOC, is that correct?

      • tyingq 4 years ago

        Yes, I believe it's in all (most?) of the products within the "Sitara" line, or at least AM33XX models.

        Lowrisc.org also has a similar plan for what they call "minion cores" in their RISC-V-based product, whenever that happens. Some NXP processors also have something called an "eTPU" that seems similar.

  • al2o3cr 4 years ago

    This is a commercial product so it's not _quite_ what you asked for, but the production volume is pretty low (100s) and the implementation is literally an FPGA dev board mounted to the back of the interface panel.

    https://intellijel.com/shop/eurorack/cylonix-rainmaker/

    In this module, the FPGA's ability to do LOTS of computations in parallel is used to produce 16 taps of pitch-shiftable delay along with a 64-tap comb filter.

  • IshKebab 4 years ago

    I think you won't find many because an FPGA that is as fast at computation as a Raspberry Pi will be thousands of pounds. The real advantage is latency and low-level control.

    • tyingq 4 years ago

      There are cases where an FPGA is used to make a faster CPU. Old CPUs, of course, but it's still a pretty active niche. There are soft cores for Z80s, 6502, and other old CPUs that run circles around the real hardware.

  • coryrc 4 years ago

    Only useful if a microcontroller peripheral doesn't already exist for the thing you want to do and you have some sub-millisecond latency requirements. If a calculation can be vectorized a CPU or GPU is really fast.

    At normal speeds, an image can take 10-15 ms to clock out of the sensor. At that point, there's little reason not to run your image processing on a $3 CPU rather than $$$ FPGA because what's another < 30ms at that point and what would need a reaction that quick anyway?

thehappypm 4 years ago

Unbelievably tangential but your dog is very cute and I want to see more pictures!

  • mad_nedOP 4 years ago

    Wish granted! @the.bessie.report on Instagram

    • thehappypm 4 years ago

      I love this account! My dog also is a great pup, with some behaviors we’re working on, like barking and reactivity. From the scenery it looks like central MA if I had to guess too!

    • errcorrectcode 4 years ago

      Hey, we don't need any Aladdin djinns showing off their magical puppers here. Definitely against guidelines and regulations. ;)

JoachimS 4 years ago

I was hoping that he was building a CPU that worked on strong Kleene three-valued logic. https://en.wikipedia.org/wiki/Three-valued_logic

jacquesm 4 years ago

https://en.wikipedia.org/wiki/UNIVAC_1100/2200_series

It had a 36-bit word length, resulting in a 9-bit 'byte'.

  • klodolph 4 years ago

    36-bit was a common enough word length. Not just UNIVAC, but the IBM 700/7000 series, PDP-6/PDP-10, and some others. Convenient both for octal (a multiple of 3 bits) and for working with pre-ASCII, 6-bit character encodings (a multiple of 6 bits).

    Which is why we have UTF-9 and UTF-18, as defined in RFC 4042.

    https://datatracker.ietf.org/doc/html/rfc4042

    (Spoiler: It's an April Fool's joke.)

  • malkia 4 years ago

    I was coming to mention this, though I think my memory goes back to some LISP machine (and was related to car/cdr and related encoding if I'm not mistaken)

  • uvesten 4 years ago

    I went way down the rabbit hole on this one. Seems that they are still made and used, fascinating!

    • jacquesm 4 years ago

      They are pretty impressive machines. The loadable microcode store is especially interesting; it allows you to emulate an arbitrary CPU. Diagnostics in 'IBM' mode was a real possibility on these!

_nalply 4 years ago

When I was bored, I thought about a minimal six-bit computer with a three-byte address bus (18 bits).

I went ahead and tried to design a machine language for that computer. There are three registers (the three-byte accumulator, the two-byte stack pointer, and a 6-bit-wide flag register) and these addressing modes: accumulator, immediate, absolute, relative, and stack. A rough sketch follows below.

It's possible that I will try to implement this system with the help of an FPGA.

Just out of curiosity. As a hobby.
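
A minimal C sketch of that register model, with the widths taken from the description above; the names, masks, and layout are my own guesses, not the commenter's design:

```c
#include <stdint.h>

/* Everything is built from 6-bit "bytes"; wider registers are a
 * whole number of them. */
#define BYTE6_MASK 0x3Fu

struct cpu6 {
    uint32_t acc;    /* three bytes = 18 bits, masked to 0x3FFFF */
    uint16_t sp;     /* two bytes   = 12 bits, masked to 0xFFF   */
    uint8_t  flags;  /* one byte    =  6 bits, masked to 0x3F    */
};

/* The five addressing modes listed above. */
enum addr_mode {
    AM_ACCUMULATOR,  /* operand is the accumulator itself        */
    AM_IMMEDIATE,    /* operand follows the opcode               */
    AM_ABSOLUTE,     /* 18-bit (three-byte) address follows      */
    AM_RELATIVE,     /* signed offset from the program counter   */
    AM_STACK         /* operand addressed through the stack ptr  */
};

/* Keep registers inside their declared widths after any update. */
static inline void cpu6_wrap(struct cpu6 *c)
{
    c->acc   &= 0x3FFFFu;
    c->sp    &= 0xFFFu;
    c->flags &= BYTE6_MASK;
}
```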

einpoklum 4 years ago

> I decided to build a nine-bit computer

Somehow I was sure that sentence was going to end with "... in MineCraft!"

Koshkin 4 years ago

Wrote an emulator of a simple 12-bit CPU once, and ran a few examples (coded in binary, of course) on it. It's a fun exercise - highly recommended!

  • vba616 4 years ago

    Unfortunately, I don't have it anymore, but I was shown a souvenir IBM manual once for a machine with six bit bytes. Probably a 704.

teekert 4 years ago

I read it because of the steam-powered machine at the top… turns out it's an FPGA.

bencollier49 4 years ago

Related question: is there an FPGA simulator/designer which works on OS X?
