Reading Code is an Essential Skill

Over the last couple of months I have spoken about “next generation software development” to audiences in Bangkok, Bangalore, and Tokyo. The goal of my talk is to show how our profession’s skill at tool building takes us from the simplest assemblers to today’s AI-powered coding assistants in a long, unbroken chain. The most common question that I get can be summarized as:

So what do we do to prepare for this new era?

After some thought I answered:

Learn to read code

While this took some people aback because it sounded too simple, I believe that it is an essential yet neglected skill. Search for advice to aspiring writers and you will quickly find the same recommendation: read a lot, preferably in a genre other than the one that you aspire to write in. So, why don’t we train developers to read code?

When an AI-powered coding assistant proposes some code, the developer must be able to read it, understand it, and make a decision as to its applicability and correctness.
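To make that concrete, here is a small sketch in C. The first function is a hypothetical suggestion of the kind an assistant might offer (I wrote it for this post; it is not output from any real tool), and the second is what it might look like after a careful read. The function names and the scenario are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical suggestion: reads fine at a glance, but seeding the
 * running maximum with 0 means an all-negative array returns 0,
 * a value that is not even in the array. */
int max_proposed(const int *values, size_t count)
{
    int max = 0;
    for (size_t i = 0; i < count; i++) {
        if (values[i] > max) {
            max = values[i];
        }
    }
    return max;
}

/* After reading: seed the maximum from the first element and make
 * the "at least one element" requirement an explicit precondition. */
int max_reviewed(const int *values, size_t count)
{
    assert(values != NULL && count > 0);
    int max = values[0];
    for (size_t i = 1; i < count; i++) {
        if (values[i] > max) {
            max = values[i];
        }
    }
    return max;
}
```

Both versions agree on an array such as {4, -1, 9}, which is exactly why a quick test run can miss the problem. Only reading the initialization line, and asking what happens when every element is negative, reveals the difference.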

Let me share some thoughts on the matter, mixed in with some personal experience…

Back to College

When I was in community college in the late 1970s, I used to dumpster-dive in the Montgomery College Computer Science building for operating system “build” listings for the department’s IBM mainframe. Back in those days, changing the basic parameters of the operating system required a complete reassembly from source code, resulting in a stack of green bar paper 5 or 6 inches thick.

While my classmates were having a fun and carefree college experience, I was poring over these listings in order to understand how the operating system worked and how to write better assembler code.

The Lions Book

In the early 1980s I bought my copy of The C Programming Language and started to use Version 7 Unix running on a PDP-11 while working at Contel Information Systems in Bethesda, Maryland. At some point one of my colleagues gave me a photocopy of the infamous Lions’ Commentary on the UNIX Operating System, also known as the Lions Book.

Once again I spent my precious free time diving deep in order to understand the finer aspects of system initialization, task switching, memory management, and I/O handling.

Several years later, all of this time spent reading code and understanding systems at a deep level paid off when I was asked to run four copies of a single-user operating system on the same hardware (read IN/MSX: Running 4 Copies of an Operating System at Once for that story).

Learning to Read Code

I do not believe that our educational system does enough to teach this skill. Most of the time we are taught to start with an empty file and build from there. Imagine if the first weeks of a typical CS 101 curriculum were spent studying the “classics”, with students asked to figure out the principal organization of the project, the primary data structures, and the algorithms. They should also be instructed to pay attention to aesthetics — variable naming, indenting, commenting, and so forth.

The curriculum should also look at the project history over time, commit-by-commit in some cases, again with the goal of building understanding.

Rather than studying idealized pseudo-code, the classics should be battle-worn production programs that have been patched, maintained, and enhanced over time by developers with varied levels of skill and familiarity with the code.

All of this reading would give students a better sense of what real-world production code looks like after it has spent some time out in the wild, scars and all!

Only after doing a lot of reading would students be allowed to write their own code. With the help of great teachers and carefully chosen examples, they would be better equipped to grow into highly capable developers.

Writing for Readability

As I was thinking about this post, I also thought about some efforts to write code in ways that would make it easier to read and to understand. Two (of many) came to mind right away: Hungarian Notation and Literate Programming.

Hungarian Notation

Invented by early Microsoft employee Charles Simonyi, Hungarian Notation attempts to encode the type or purpose of a variable in the first couple of letters of the name. For example, the p prefix indicates a pointer, sz a zero-terminated string, w a word, and b a byte. This system was widely used (but definitely not widely liked) when I was at Microsoft 25+ years ago. I don’t know if it is still used for new code today or if it has fallen out of favor.
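Here is a small sketch of what those prefixes look like in practice. The CopyString function and the local BYTE and WORD typedefs are made up for this example; they are meant to evoke the style, not to reproduce any particular Microsoft code. Note how prefixes combine: pb reads as “pointer to byte”.

```c
#include <assert.h>
#include <stddef.h>

/* Local typedefs for this sketch only. */
typedef unsigned char  BYTE;   /* b: a byte */
typedef unsigned short WORD;   /* w: a word (typically 16 bits) */

/* Copies the zero-terminated string szSrc into the buffer pbDst,
 * which holds wSize bytes. The result is always terminated, and is
 * truncated if it does not fit. Returns the number of bytes copied,
 * not counting the terminator. */
WORD CopyString(BYTE *pbDst, WORD wSize, const char *szSrc)
{
    WORD wCopied = 0;

    assert(pbDst != NULL && szSrc != NULL && wSize > 0);

    /* Leave room for the terminating zero byte. */
    while (szSrc[wCopied] != '\0' && wCopied < wSize - 1) {
        pbDst[wCopied] = (BYTE)szSrc[wCopied];
        wCopied++;
    }
    pbDst[wCopied] = 0;
    return wCopied;
}
```

A reader who knows the convention can tell from the names alone that pbDst is a pointer to bytes, wSize is a word-sized count, and szSrc is a zero-terminated string, without scrolling back to the declarations. Whether that is worth the visual noise is the part people argued about.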

Literate Programming

Invented by living legend Don Knuth, first described in his 1983 paper (Literate Programming), and fully laid out in his Literate Programming book in 1992, this system and methodology aims to consider programs as works of literature. As he writes in the paper:

Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.
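Knuth’s own tools for this are WEB and CWEB, which produce both typeset documentation and compilable source from a single file. The sketch below does not use them. It is plain C that only approximates the spirit of the idea by leading with the explanation and letting the code follow; the choice of Euclid’s algorithm as the example is mine.

```c
/*
 * Greatest common divisor, told as a short story.
 *
 * The idea is Euclid's: any number that divides both a and b also
 * divides the remainder a % b. So the pair (a, b) can be replaced
 * by the smaller pair (b, a % b) without changing the answer.
 */
unsigned int gcd(unsigned int a, unsigned int b)
{
    /* Keep shrinking the pair until there is no remainder left. */
    while (b != 0) {
        unsigned int remainder = a % b;
        a = b;
        b = remainder;
    }

    /* Once b reaches zero, a holds the last non-zero value in the
     * chain, which is the greatest common divisor. */
    return a;
}
```

For example, gcd(48, 18) walks through the pairs (18, 12), (12, 6), and (6, 0), and returns 6. The comments try to explain why each step is valid rather than restating what the statement does, which is the habit that literate programming asks for.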

TeX: The Program

Despite the appeal and the expectation that literate code is easier to read and hence a better teaching tool, it has never become mainstream. However, I do wonder if the detailed natural-language explanations found in literate code make it ideal fodder for the training data that is used to create the LLMs that power today’s coding assistants.

Learn to Read Code

My hope is that you will spend more time reading code, that it will make you a better developer, and that it will empower you to get even more value from your favorite AI-powered coding assistant. Let me know how that goes!