The terminal, the TTY, and the shell

People say things like "I work in the terminal" or "drop into the console" all the time. These phrases are useful shorthand, and almost everyone who uses Linux knows what they mean in practice. But if you stop and ask which piece of software is being referred to, the answer is not simple. What is perceived as "the terminal" is not one program. "The console" can mean three or four different things depending on the context. The casual phrasing is fine, but the precise picture underneath it is worth having, because everything else in this chapter will assume you can point to the right layer.

This first article of the chapter does almost no shell scripting. We will not cover pipes, redirections, PATH, variables, or job control here. The point is to name the parts. Once that is in place, the rest of the chapter has a foundation to stand on.

Worth saying upfront: even though this series is about Linux, the picture below is not Linux-specific. The same three layers exist on macOS, FreeBSD, and any other Unix-like system. We'll come back to this near the end.

What is actually running when you "open a terminal"

When you click on a terminal icon, a window appears with a prompt that says something like uros@laptop:~$. You type ls and the output prints. Next, you type cd projects and the prompt changes. It feels like one unified program.

It is not one program, however. There are at least three independent pieces of software cooperating to produce that experience, with the kernel sitting between two of them. The pieces are:

  1. A graphical application that draws the window and renders characters. This is the terminal emulator.
  2. A kernel object that acts as the pipe between the emulator and whatever program is running inside it. This is the TTY, or in modern desktop use, a pseudo-terminal (pty).
  3. A userspace program that reads commands you type, parses them, and runs other programs. This is the shell.

Each of these is its own thing, owned by a different party, with its own contract. They are bolted together so seamlessly that most people never notice the seams. The rest of this article walks through each one and ends with a small program that lets you see the boundaries from inside a running process.

The terminal emulator

GNOME Terminal, Konsole, Alacritty, kitty, xterm, foot, you name it. On macOS, Terminal.app and iTerm2. On Windows, Windows Terminal. These are all the same category of program: graphical applications that draw a window, render a font, accept keyboard and mouse input, and paint characters on the screen.

The terminal emulator does not know what ls means. It does not parse commands, and it has no idea which program is on the other end of the connection it's writing into. From the emulator's point of view, it is reading bytes from a file descriptor and painting them as characters, and it is writing your keystrokes into the same file descriptor as bytes going the other way. That is the entire job.

The word "emulator" in the name is literal. These programs are emulating hardware from the late 1970s and early 1980s called video terminals: physical machines like the DEC VT100 and VT220 that had a screen, a keyboard, and a serial line going somewhere else. When the program on the other end wanted to move the cursor or change how text was displayed, it would send special byte sequences (escape sequences) that the hardware terminal would interpret. Today's terminal emulators still speak those same escape sequences, which is why your $TERM environment variable says things like xterm-256color. The hardware is gone, but the protocol stayed.

There is one important variant worth naming: on Linux, you can switch to a text-mode virtual console (typically with Ctrl+Alt+F2 through F6). There is no graphical application there. The kernel itself draws the glyphs directly to the framebuffer. The role of "terminal emulator" is played by the kernel in that case. Everything else in this article still applies.

The TTY and the pty

This is the layer that holds the whole thing together, and it is often the confusing bit.

The terminal emulator (a userspace program) and the shell (another userspace program) need to talk to each other. They cannot just share memory: they are separate processes with separate address spaces. They could in principle use a regular pipe or a socket, but historically there was already a well-defined kernel abstraction for exactly this kind of communication, and it predates Linux by decades.

The name TTY is short for "teletypewriter." In the early days of Unix, a terminal was a physical device: a keyboard attached to a printer (and later, a screen), connected to the computer by a serial line. The kernel had a driver for that serial line, and the device showed up as a file in /dev, like /dev/tty0. Programs read from it to get user input and wrote to it to display output. When graphical workstations arrived, the physical teletype was gone, but the abstraction was too useful to throw away. So the kernel grew a way to create a "fake" teletype in software: a pseudo-terminal, or pty.

A pty is a kernel object that exposes two endpoints. One end is called the master (or PTY master), the other is called the slave (or PTY slave). Whatever is written to the master can be read from the slave, and whatever is written to the slave can be read from the master. Both ends are file descriptors, so they're operated on with the same read and write system calls used for any other file.

When you open a graphical terminal emulator, this is what happens, in order:

  1. The emulator asks the kernel for a new pty. The kernel hands it back two file descriptors: the master and the slave.
  2. The emulator forks a child process. In that child, the slave is set up as stdin, stdout, and stderr.
  3. The child then executes a shell program (bash, zsh, or whatever the user has configured).

From that point on, the shell is reading commands from "stdin" without any awareness that stdin is actually the slave end of a pty. The emulator is reading from the master end and painting the bytes it gets as characters on the screen. When you press a key, the emulator writes one or more bytes to the master end, and the shell reads them from the slave end. Two userspace programs, talking through a kernel object that looks exactly like a file to both of them.

You can see this from inside your shell. Run:

$ tty

You'll get back something like /dev/pts/3. That is the slave end of your pty, presented as a file in the filesystem. The pts part stands for "pseudo-terminal slave," and the number is just a counter. Your shell has this file open as its stdin, stdout, and stderr. The "everything is a file" principle is doing real work here: the terminal you're staring at is, from the kernel's perspective, a special file under /dev.

The pty is not just a dumb pipe. The kernel runs something called a line discipline in the middle, which adds behavior that a plain pipe would not have. In its default mode it echoes characters back so you see what you typed, it buffers input by line so the reader receives a whole line when you press Enter rather than one character at a time, and it generates signals from special keys (Ctrl+C becomes SIGINT, Ctrl+Z becomes SIGTSTP). All of this happens inside the kernel, between the two endpoints. Neither the shell nor the emulator has to implement any of it, although a program can ask the kernel to switch individual behaviors off. We'll cover the line discipline in detail later in the chapter, because it deserves its own treatment.

The shell

Now come bash, zsh, fish, dash, ksh, ash, BusyBox's built-in shell, etc. These are all userspace programs and there is nothing special about them at the operating system level: from the kernel's point of view, a shell is just a process like any other, with stdin and stdout connected to whatever file descriptors it inherited (in our case, the pty slave).

The shell's job is to read bytes from stdin, treat those bytes as a programming language, and act on them. It parses commands, does word splitting and glob expansion, looks up programs in PATH, etc. It calls fork and exec to run those programs as child processes. It wires up pipes between commands. It handles redirections. It interprets control flow (if, while, for). It manages background jobs. It maintains variables and a history of past commands. None of this is in the kernel, and none of it is in the terminal emulator. It is all just code in the shell's process.

The shell is one program in a family, not a single canonical thing. The names you'll encounter most:

  • bash - the GNU Bourne Again Shell. The de facto interactive shell on most Linux distributions for a long time, and still the default login shell on Debian-derived systems and many others.
  • zsh - the Z Shell. Very popular for interactive use, especially with frameworks like Oh My Zsh. Default login shell on macOS since Catalina.
  • fish - a friendlier shell with autosuggestions and syntax highlighting out of the box. Not POSIX-compatible by design.
  • dash - a small, fast, strictly POSIX shell. On Debian and Ubuntu, /bin/sh is actually dash, not bash. Used for running shell scripts where startup time matters.
  • ash and busybox sh - tiny shells used in embedded systems and minimal containers, including Alpine Linux.

All of these are doing the same kernel-level job: reading from a pty, parsing a language, forking children, writing to a pty. They differ in what language they accept and what features they offer for interactive use.

The reason different shells can coexist and even substitute for one another in scripts is the same reason different C libraries can coexist on Linux: there is a standard. POSIX defines a shell language, often called "the POSIX shell" or just "sh." Most shells implement that language as a common core and add their own extensions on top (fish, as noted above, is a deliberate exception). This is the direct analog of what we saw with glibc and musl in the C series: a standardized interface, multiple implementations with different feature sets and trade-offs.

A small but real example of the fragmentation that follows from this: a shell script that begins with #!/bin/sh will run under whatever /bin/sh points to on the system that runs it. On Ubuntu, that's dash. On Alpine, that's BusyBox's ash. On some other systems, it's bash in POSIX mode. A script that works on one of these may quietly fail on another if it accidentally used a bash-specific feature. This is exactly the kind of friction Linus Torvalds has complained about, and it ties back to the broader fragmentation theme mentioned in other series.

Who owns what

It's worth stepping back and looking at the contracts.

The kernel owns the TTY and pty mechanism. It defines what read and write do on those file descriptors, how the line discipline behaves, which keys generate which signals, and which file under /dev/pts corresponds to your terminal. This contract is stable in the same way Linux's syscall contract is stable: code written against the pty interface fifteen years ago still works today.

The terminal emulator owns the rendering side. It decides which font to use, how to draw glyphs, which color palette to apply, and how to interpret escape sequences for cursor movement and styling. Different emulators support different sets of escape sequences and different terminal features (true color, ligatures, image protocols, hyperlinks). The $TERM variable is how the shell and the programs running under it know which dialect of escape sequences to speak.

The shell owns the command language. It defines what ls -la | grep foo > out.txt actually means, what *.c expands to, what $HOME evaluates to, and how to invoke fork and exec to make all of that happen.

Programs running inside the shell, like ls or grep, don't know or care which terminal emulator you use. They don't know anything about pty masters or line disciplines. They only ever ask the kernel one question: "is my stdout a terminal?" That question is answered by the isatty function, and tools use it to decide whether to colorize output, paginate, or behave differently when their output is being piped into another program.

None of this is unique to Linux. The same picture applies to macOS, FreeBSD, OpenBSD, and any other Unix-like system. Terminal.app on macOS is a terminal emulator in exactly the same sense as GNOME Terminal. macOS has ptys with masters and slaves. macOS has a line discipline. macOS's default shell is zsh, the previous default was bash, and there is a POSIX sh available. The three-layer model is at the POSIX level, not the Linux level. If you're reading this on a Mac, every paragraph above describes your system as accurately as it describes a Linux box.

Seeing the boundary from inside a process

Let's make this tangible with a tiny C program. It opens nothing, forks nothing, and asks the kernel two questions about its own standard input: "is this a terminal, and if so, what is its name?"

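#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    /* Question 1: is file descriptor 0 connected to a terminal? */
    if (isatty(STDIN_FILENO)) {
        /* Question 2: which file under /dev names that terminal? */
        char *name = ttyname(STDIN_FILENO);
        if (name == NULL) {
            perror("ttyname");
            return 1;
        }
        printf("stdin is a terminal: %s\n", name);
    } else {
        /* isatty sets errno when it returns 0. Save it right away,
           before any other library call has a chance to change it. */
        int saved_errno = errno;
        printf("stdin is NOT a terminal (errno=%d: %s)\n",
               saved_errno, strerror(saved_errno));
    }
    return 0;
}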

Build and run it interactively:

$ gcc -o tty-probe tty-probe.c
$ ./tty-probe
stdin is a terminal: /dev/pts/3

Now run the same program with its stdin coming from somewhere other than the terminal:

$ echo hello | ./tty-probe
stdin is NOT a terminal (errno=25: Inappropriate ioctl for device)

$ ./tty-probe < /etc/hostname
stdin is NOT a terminal (errno=25: Inappropriate ioctl for device)

The same program, the same binary, sees a completely different world depending on what is connected to its file descriptor 0. In the first case, stdin is the slave end of the pty your shell is using, and ttyname reports the path. In the second and third cases, stdin is a pipe or a regular file, and isatty returns 0. This is the exact mechanism ls uses to decide whether to print colors: it asks, "is stdout a terminal?" If yes, colors. If no (because you piped its output into grep), no colors. Now you know how it works.

This program compiles and runs without modification on macOS, on FreeBSD, and on any other POSIX system. The interface is part of the POSIX standard, not a Linux extension.

Speaking the protocol across the boundary

The tty-probe program asked the kernel a question about a file descriptor. This next one writes bytes that travel through the pty unchanged and get interpreted by the emulator on the other side as commands rather than characters. It's a "hello, world" where the two words are painted in different colors:

#include <stdio.h>

int main(void) {
    printf("\033[31mhello\033[0m, \033[32mworld\033[0m!\n");
    return 0;
}

Build and run it:

$ gcc -o hello-color hello-color.c
$ ./hello-color
hello, world!

You'll see "hello" in red and "world" in green. There is no "color" data type involved anywhere. The string is just bytes. The sequence \033[31m is five of them: ESC (0x1b), [, 3, 1, m. Our program writes those bytes to stdout, the kernel hands them through the pty unchanged, and on the other end the terminal emulator recognizes the ESC-[ prefix as a Control Sequence Introducer (CSI), reads 31m as "set foreground color to red," paints the following bytes - hello - in that color, then sees \033[0m and resets back to the default.

This is the same escape-sequence protocol, descended from the VT100 family, that we mentioned in the terminal emulator section. The hardware is gone but the bytes are the same. Your $TERM variable is what tells programs inside your shell which dialect of these sequences the emulator on the other side of the pty understands.

It also closes the loop on the isatty dance from the previous example. If you redirect ./hello-color into a file, the escape bytes go into the file along with the text and become noise that any non-terminal tool reading the file has to deal with. That is why ls --color=auto checks isatty first: it only emits the protocol bytes when something on the other end actually speaks the protocol.

Where this leaves us

You can now point at what is happening when you "work in the terminal":

  • The terminal emulator is the graphical program drawing the window. It knows nothing about commands. It paints glyphs and forwards keystrokes.
  • The TTY (a pty in the modern desktop case) is the kernel object connecting the emulator to the shell. It is a bidirectional pipe with a line discipline in the middle, exposed as a file under /dev.
  • The shell is a userspace program reading bytes from the pty, parsing them as a command language, and running other programs. There are many shells; POSIX defines the common language.

Saying "I work in the terminal" is fine shorthand. You now know which of the three pieces (or all three) you're really pointing at when you say it.

The picture so far is still missing a crucial part: how the kernel decides which process should receive Ctrl+C, why your shell does not die when you press it, what happens when you close the terminal window, and what binds a shell, its child processes, and a pty together as a single unit. That is the topic of sessions, process groups, and controlling terminals, and it is the next article in this chapter.

The rest of the chapter then digs into the pieces we have just hand-waved at: how the shell looks up commands through PATH, what fork and exec actually do, how pipes and redirections are wired together as kernel objects, what quoting and word splitting really mean, why exit codes matter, and what it takes to turn a sequence of commands into a real script. If any of that felt glossed-over here, that was on purpose. Each one earns its own entry.