Why do our programs need to read input and write output?

7 min read Original article ↗

I wrote this post to challenge basic assumptions that people make about software architecture, which is why I chose a deliberately provocative title. You might not agree with all the points that I am about to make, but I do hope that this post changes the way that you think about programming

This post is an attempt to restate in my own words what Conal Elliot (and others before him) have been saying for a while: modern programming is a Rube-Goldberg machine that could be much simpler if we change the way we compose code.

Most programmers already intuitively understand this at some level. They will tell you that the programming ecosystem is deeply flawed, fragmented, and overly complex. I know that I felt that way myself for a long time, but in retrospect I believe that my objections at the time were superficial. There were deeper issues with programming that I was blind to because they are so ingrained in our collective psyche.

Disclaimer: This post uses my pet configuration language Dhall to illustrate several points, mainly because Dhall is a constructive proof-of-concept of these ideas. The purpose of this post is not so much to convince you to use Dhall but rather to convince you to think about software in a new way

Consider the title of this post for example:

"Why do our programs need to read input and write output?"

Most people will answer the title with something like:

  • "Our programs need a way to communicate with the outside world"
  • "Programs need to do something other than heat up CPUs"

Now suppose I phrased the question in a different way:

"What if only the compiler could read input and write output?"

"What's the difference?", you might ask. "Surely you mean that the language provides some library function that I can call to read or write input to some handles, whether they be file handles or standard input/output."

No, I mean that only the compiler implementation is allowed to read input or write output, but programs written within the compiled language cannot read input or write output. You can only compute pure functions within the language proper.

Again, this probably seems ridiculous. How could you communicate at all with the program?

Imports

Most languages have some way to import code, typically bundled in the form of packages. An imported function or value is something that our compiler reads as input statically at compile time as opposed to a value read by our program dynamically at run time.

Suppose I told you that our hypothetical programming language could only read input values by importing them

"Ridiculous!" you exclaim as you spit out your coffee. "Nobody would ever use such a language." You probably wouldn't even know where to begin since so many things seem wrong with that proposal

Perhaps you would object to the heavyweight process for publishing and subscribing to new values. You would recite the package management process for your favorite programming language:

  • Create a source module containing your value
  • Create a standard project layout
  • Create a package description file
  • Check your project into version control
  • Publish your package to a package repository

Perhaps you would object to the heavyweight process for configuring programs via imports? Your favorite programming language would typically require you to:

  • Retrieve the relevant program from version control
  • Modify the project description file to reference the newly published dependency
  • Modify project code to import your newly published value
  • Compile the program
  • Run the program

Why would a non-technical end user do any of that just to read and write values?

This is exactly the Rube-Goldberg machine I'm referring to. We have come to expect a heavyweight process for source code to depend on other source code

Importing paths

Distributing code doesn't have to be heavyweight, though. Consider Dhall's import system which lets you reference expressions directly by their paths. For example, suppose we saved the value True to a file named example.dhall:

$ echo 'True' > example.dhall

Another Dhall program can reference the above file anywhere the program expects a boolean value:

$ dhall <<< './example.dhall || False'
Bool

True

This is the exact same as if we had just replaced the path with the file's contents:

$ dhall <<< 'True || False'
Bool

True

Dhall doesn't need to support an explicit operation to read input because Dhall can read values by just importing them

Similarly, Dhall doesn't need to support an explicit write operation either. Just save a Dhall expression to a file using your favorite text editor.

"What if I need a way to automate the generation of files?"

You don't need to automate the process of saving a file because one file is always sufficiently rich to store as much information as you need. Dhall is a programmable configuration language which supports lists and records so any one file can store or programmatically generate any number of values. Files are human-generated artifacts which exist purely for our convenience but Dhall code does not behave any differently whether or not the program spans multiple files or a single file.

Most of the time people need to automate reads and writes because they are using non-programmable configuration file formats or data storage formats

Programmable configuration

You might object: "Configuration files shouldn't be Turing-complete!"

However, Dhall is not Turing-complete. Dhall is a total programming language, meaning that evaluation eventually halts. In practice, we don't actually care that much if Dhall halts, since we cannot guarantee that the program halts on reasonable human timescales. However, we can statically analyze Dhall and most of the Halting Problem objections to static analysis don't apply.

For example, Dhall can statically guarantee that programs never fail or throw exceptions (because obviously a configuration file should never crash). Dhall also lets you simplify confusing files by eliminating all indirection because Dhall can reduce every program to a canonical normal form.

In fact, most objections to programmable configuration files are actually objections to Turing-completeness

Importing URLs

Dhall also lets you import code by URL instead of path. Dhall hosts the Prelude of standard utilities online using IPFS (a distributed hashtable for the web),and you can browse the Prelude using this link, which redirects to the latest version of the Prelude:

For example, you can browse the above Prelude to find the not function hosted here:

... which has the following contents:

$ curl https://ipfs.io/ipfs/QmQ8w5PLcsNz56dMvRtq54vbuPe9cNnCCUXAQp6xLc6Ccx/Prelude/Bool/not
{-
Flip the value of a `Bool`

Examples:

```
./not True = False

./not False = True
```
-}
let not : Bool → Bool
    =   λ(b : Bool) → b == False

in  not

... and we can "apply" our URL to our file as if they were both just ordinary functions and values:

$ dhall <<< 'https://ipfs.io/ipfs/QmQ8w5PLcsNz56dMvRtq54vbuPe9cNnCCUXAQp6xLc6Ccx/Prelude/Bool/not ./example.dhall'
Bool

False

... or we could use let-expressions to improve readability without changing the result:

    let not = https://ipfs.io/ipfs/QmQ8w5PLcsNz56dMvRtq54vbuPe9cNnCCUXAQp6xLc6Ccx/Prelude/Bool/not

in  let example = ./example.dhall

in  not example

"That's a security vulnerability!", you protest. "You are ... literally ... injecting remote code into your program."

Playing the devil's advocate, I ask you what is wrong with remote code injection

"Well, for starters, if the URL is compromised the attacker can run arbitrary code like ..."

... reading and writing files? Dhall is totally pure and doesn't support any effects at all (besides heating up CPUs ☺).

This brings us full circle back to our original question:

"Why do our programs need to read input and write output?"

Conclusion

Software doesn't have to be so complex if we challenge our assumptions that got us into this mess

These ideas were heavily inspired the following post by Conal Elliot:

... and this post is not about Dhall so much as Conal's vision of an effect-free purely functional future for programming. I believe explicitly reading input and writing output will eventually become low-level implementation details of higher-level languages, analogous to allocating stack registers or managing memory.