first-cc-gcc
A port of the earliest C compiler to modern GCC. The compiler outputs PDP-11 assembly code that can be compiled and run on a PDP-11 emulator (check out c72 if you want x86 code that runs on current Linux). The compiler runs only in 32 bit mode as the original code assumes that the pointer size and word size are the same.
Usage
To compile the compiler and run it simply do:
    make
    ./cc examples/fizzbuzz.c > fizzbuzz.s
Note: if you get errors about a missing "bits/libc-header-start.h" header, make sure you have the 32-bit libc installed.
Emulator
The hard part is to set up an emulator, transfer the file to it (!) and run the assembler. A very early UNIX implementation based on SIMH is available. For Windows, there is also a pre-built binary.
I could not get the tape emulators working, so I ended up with a hacky solution to transfer files. The simulator lets you log in via telnet, so a file is copied by starting a text editor on the simulator, streaming the characters into it, and then saving and closing the file.
Also, if you close the connection the session is lost, so it is important to keep the connection to the simulator alive with a hack using ncat.
    # Start emulator
    pdp11 simh.cfg
    # Open a pipe to the simulator
    # If you use the prebuilt Windows simulator, use port 12323
    ncat -lk -p 5556 | ncat localhost 5555
    # send login username to emulator
    echo root | emulator/emucat
    # copy file over by typing it into ed
    emulator/cpfile fizzbuzz.s /fizzbz.s
    # call assembler and linker
    emulator/emucc /fizzbz.s
    # execute the compiled program
    echo a.out | emulator/emucat
Note that the file is called fizzbz.s on the emulator. This is because the UNIX used here only handles filenames of up to 8 characters!
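
As a rough illustration of the copy step described above, the keystrokes streamed into ed could be assembled like this. It is only a sketch: `build_ed_script` and `name_fits` are hypothetical helpers written in modern C for this example, and the real emulator/cpfile script may work differently. The 8-character limit is taken from the note above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_NAME 8 /* assumption from the text: filenames of up to 8 characters */

/* Returns 1 when the last path component fits the emulator's name limit. */
int name_fits(const char *path)
{
    assert(path != NULL);
    const char *slash = strrchr(path, '/');
    const char *name = slash ? slash + 1 : path;
    return strlen(name) <= MAX_NAME;
}

/*
 * Builds the text that would be "typed" into the session: start ed,
 * "a" enters append mode, a lone "." leaves it, "w" saves, "q" quits.
 * Returns the script length, or -1 if buf is too small.
 * Caveat: a line in contents consisting only of "." would end input early.
 */
int build_ed_script(char *buf, size_t cap, const char *remote_name,
                    const char *contents)
{
    assert(buf != NULL && remote_name != NULL && contents != NULL);
    size_t len = strlen(contents);
    /* The text must end with a newline before the terminating "." line. */
    const char *eol = (len > 0 && contents[len - 1] != '\n') ? "\n" : "";
    int n = snprintf(buf, cap, "ed\na\n%s%s.\nw %s\nq\n",
                     contents, eol, remote_name);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}
```

Checking `name_fits("/fizzbz.s")` before building the script catches names such as fizzbuzz.s, which are too long for the emulator's file system.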
Old C features
This version of C is from around 1972. While the general syntax is pretty much the same as today, there are tons of missing features:
- no preprocessor, no for loops
- even though there are keywords for float and double, floating point calculations are not implemented; you cannot even write a floating point literal
- the type system is very weak: pointers, chars, ints can be freely converted into one another
- types of the function parameters are not checked, anything can be passed to any function
- compound assignment operators are reversed, they are =+, =*
- only integer global variables can be defined, and the syntax is strange (see helloworld example)
- variable names can be of any length but only the first 8 characters are used; e.g. deadbeef1 and deadbeef2 are effectively the same variable
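
The 8-character rule for variable names can be pictured with a small comparison function. This is a sketch in modern C, not code from the compiler: `same_identifier` and `SIGNIFICANT_CHARS` are made-up names, and the original symbol table may well be implemented differently.

```c
#include <assert.h>
#include <string.h>

#define SIGNIFICANT_CHARS 8 /* from the list above: only the first 8 characters are used */

/* Returns 1 when two identifiers would be treated as the same name. */
int same_identifier(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    /* strncmp stops at the first difference, at a NUL, or after 8 characters. */
    return strncmp(a, b, SIGNIFICANT_CHARS) == 0;
}
```

With this rule, deadbeef1 and deadbeef2 compare equal, while deadbee1 and deadbee2 stay distinct because they differ within the first 8 characters.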
Interestingly, some features already existed in this early version:
- function pointers
- the ABI is nearly the same as today's 32 bit ABI
- a[b] is implemented as *(a+b)
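
Two of these features still behave the same way today, which a short example can show. It is written in modern C (prototypes, const and size_t are not part of the 1972 dialect), so it demonstrates present-day behaviour rather than the old compiler's syntax. All function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Three spellings of the same element access. */
int index_bracket(const int *a, size_t b) { assert(a != NULL); return a[b]; }
int index_pointer(const int *a, size_t b) { assert(a != NULL); return *(a + b); }

/* b[a] means *(b + a), which is the same address as *(a + b). */
int index_swapped(const int *a, size_t b) { assert(a != NULL); return b[a]; }

/* Function pointers: any of the accessors above can be passed as a value. */
int fetch_with(int (*fetch)(const int *, size_t), const int *a, size_t b)
{
    return fetch(a, b);
}
```

Because indexing is defined through pointer addition, the odd-looking b[a] form compiles and returns the same element as a[b].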