What happens when you open a terminal and enter `ls`
Hey HN! I'm Suraj, one of the authors of this blog post. Andy (@acarl005) and I came up with the idea to write this post when we realized there weren't many existing resources on how a terminal works under the hood, end-to-end. If folks find this useful, we'd be happy to turn this into a series and dive deeper into sub-topics. Let us know what you think :)
Thank you for the article. I'd be grateful for a deep-dive series on how a terminal works.
> ASCII text would be transmitted character-by-character over the wire as the user typed.
How does that work exactly at a lower level, say on the wire? Would the ASCII text be encoded as binary, with 1s as high voltage and 0s as low? And if there's no data being transmitted, would the line be all low voltage?
On the old terminals, the wire sits at one voltage level until a character is transmitted. After that, the bits are spaced out at agreed-upon time intervals. So the receiver detects the first transition, and then reads and stores the bits by applying the same timing, then waits for the next character. Meanwhile, the value of that character, say an 8-bit number, is stored in a register, and typically the processor is interrupted. This gives the processor time to deal with each character before the next one is received.
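The timing scheme described above can be sketched in a few lines of Python. This is a toy model (it ignores voltages, baud rates, and real clock recovery), but it shows the essential idea: a start bit tells the receiver when to begin sampling, the data bits follow at agreed-upon intervals, and a stop bit returns the line to idle.

```python
# Toy model of asynchronous serial framing: the line idles at "mark" (1);
# a start bit (0) tells the receiver to begin sampling at fixed intervals,
# data bits follow LSB first, then a stop bit (1) returns the line to idle.

def frame(byte):
    """Encode one byte as a list of line levels: start, 8 data bits, stop."""
    bits = [0]                                   # start bit: line pulled low
    bits += [(byte >> i) & 1 for i in range(8)]  # data bits, LSB first
    bits += [1]                                  # stop bit: back to idle
    return bits

def receive(line):
    """Decode a frame; `line` is the sampled level at each bit time."""
    assert line[0] == 0, "no start bit seen"
    byte = 0
    for i in range(8):
        byte |= line[1 + i] << i                 # reassemble LSB first
    assert line[9] == 1, "framing error: stop bit missing"
    return byte

# An idle line is all 1s; the receiver waits for the first 0 (start bit).
print(receive(frame(ord("A"))))  # → 65
```

In hardware, the "agreed-upon time intervals" part is exactly what a UART does for you: it detects the start transition and samples the line in the middle of each bit cell.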
Oddly enough I could have answered the OP's question in an interview, 40 years ago, but stuff has gotten so complex that I can't even tell you what all of the layers of abstraction are.
In the olden days, I'd drive my Fred Flintstone car to work. Serial I/O was done by wire-wrapping a UART chip on the board, which handled all the dirty details.
https://en.wikipedia.org/wiki/8250_UART
Though the UART chip I used was years earlier than that. Perhaps the Wikipedia article is wrong on that point.
On my first PC, the serial port was an add-on board, and I wire-wrapped one to save a few bucks. My hands were trembling as I powered it up. I don't remember the chip either. Next time I "wired" a UART, it was built into a microcontroller. I also bit-banged one for the PIC16C84 MCU.
But asynchronous serial isn't completely obsolete yet. It just has more protocol layers on top of the physical layers.
I had a design notebook for the project, which included all the datasheets for the various chips. Damned if I know what happened to it.
It still works the same way, just abstracted into individual characters rather than the specific electrical representations of keys. Each character is stateless; whatever comes before or after doesn't matter on the wire. That makes some operations easier and others harder than they otherwise might be: you only ever have to make a decision about one character at a time, but you need to keep internal state if you want to display characters or do something with them as a whole.
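As a toy illustration (not any real terminal's code), here's what "keeping internal state" looks like when you receive stateless characters one at a time but need to recognize multi-character escape sequences:

```python
# Each byte arrives on its own, but recognizing an escape sequence
# (e.g. "\x1b[31m") requires remembering what came before.

class TinyParser:
    def __init__(self):
        self.state = "text"   # or "esc", "csi"
        self.printed = []     # characters that would reach the screen

    def feed(self, ch):
        if self.state == "text":
            if ch == "\x1b":
                self.state = "esc"      # ESC starts a sequence
            else:
                self.printed.append(ch)
        elif self.state == "esc":
            self.state = "csi" if ch == "[" else "text"
        elif self.state == "csi":
            # parameter bytes continue the sequence; a letter ends it
            if ch.isalpha():
                self.state = "text"

p = TinyParser()
for ch in "\x1b[31mhi\x1b[0m":
    p.feed(ch)
print("".join(p.printed))  # → hi
```

Real terminal emulators have far more states (see the VT100 parser literature), but the shape is the same: one character in, one transition.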
I binged CuriousMarc’s YouTube series on teletype restoration a few months ago, so here’s a simplified (and possibly somewhat inaccurate) explanation of how one works.
The only electrical components in a Teletype are (a) a continuously-running electric motor, and (b) an electromagnet and a few switches. Everything else is completely mechanical.
The Teletypes in a circuit, and the electromagnet and switches within them, are all connected in series, forming a current loop. (Current, rather than voltage, is used because you can use a constant-current power supply to get the same power at each electromagnet regardless of how many are in the circuit and how many miles of wire are between them.)
When the circuit is idle, current is flowing. This allows any Teletype to begin transmitting; if it were the other way around, each one would need its own individual line power supply. (Incidentally, this is the origin of the “break” key still found on many modern keyboards: when the circuit was disconnected, nobody could transmit, so “breaking in” to the circuit was how you interrupted somebody, and a lot of early computers used this as a primitive version of ^C.)
When you press a key, a rod is lifted, activating a clutch that connects the motor to the rest of the machine, which starts running. The first thing this does is break the circuit, deactivating the electromagnets in the receiving mechanisms of all the other Teletypes. When this electromagnet deactivates, it trips the clutch in those machines, starting them running so they can receive the character. (This is why it’s called the “start bit.”)
One-tenth of the way through the rotation, a cam in the sending machine switches from the “start bit” to the switch for the first data bit. If the rod on the key you pressed has a bump on it in the right spot, it will press this switch and send a “mark”; if it’s missing the bump you’ll get a “space” instead. On the receiving end, a clever mechanism connects the electromagnet to a lever: if the magnet engages, it pushes the lever one way; and if it doesn’t, the lever stays pushed the other way.
At two-tenths of the rotation, the first data bit switch is disconnected and the second data bit switch engages. This of course reads the second bump on the rod, and at the receiving end the clever mechanism has moved so the magnet pushes (or doesn’t) the second lever.
(Notable exceptions: the ‘ctrl’ key forces the first two bits low—this is why e.g. Ctrl+D and “End of Transmission” are the same—and ‘shift’ forces the first bit high.)
At three-tenths of the rotation, of course, the third bit is read, transmitted and received; this process continues for the remaining data bits.
The final two-tenths of the rotation are known as the “stop bits”, and consist of some signal that doesn’t matter much. The receiving end uses this time to trip the print mechanism: the levers that were pushed-or-not during the data bits engage with some bumps in some other rods, blocking all but one of them that matches the specific pattern, and then a thingie shoves forward, slamming the appropriate type bar up into the ribbon and paper, and as it returns the carriage advances one character. (The reason for using two bits is simply to give some time for all of this to happen.)
Finally, the mechanisms on the sending and receiving ends complete their full rotation, disengaging the clutch and coming to an abrupt halt; the transmitting end reconnects the line so the circuit is available, and everything is ready for the next character. All of this has happened in less than a tenth of a second, a mechanical ballet choreographed too fast for the human eye to see, and transmitting information potentially hundreds of miles at the speed of light.
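The ctrl/shift exceptions mentioned in the parenthetical above come straight from the character code's bit layout. In ASCII terms (the view on later teletypes, not the old 5-bit code), a control character is just the corresponding letter with the high bits cleared, which is why Ctrl+D and "End of Transmission" coincide:

```python
# The bit-level relationship between keys and control characters in ASCII.

EOT = 0x04  # ASCII "End of Transmission"

def ctrl(letter):
    """Map a letter to its control character, as old keyboards did in hardware."""
    return chr(ord(letter.upper()) & 0x1F)

print(ord(ctrl("D")))       # → 4, i.e. EOT, the ^D that signals end-of-input
print(ord("d") ^ ord("D"))  # → 32: upper/lower case differ in a single bit
```

The keyboard didn't need any logic for this: the ctrl and shift keys simply forced particular bit positions on the rod, and the code table was laid out so that would do the right thing.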
Google asked me this in an interview.
Haha! The post was actually motivated by a common (interview) question: https://github.com/alex/what-happens-when. We thought it would make for a fun adaptation for the terminal world
Would you pass this test if instead you answered what would happen if you open a terminal in a made-up operating system and you essentially designed the whole thing in your answer?
I have asked this question in interviews before, with focus on low level OS aspects. It depends on whether your made-up OS includes the things we care about.
I usually ask the question in an open way, but guide the candidate to "fast forward" past the (for us) boring parts, and focus on the interesting parts we are looking for along the way.
If your OS covers the concepts we are looking for and the answers seem competent, sure, let's give it a try, but don't be surprised if we veer off of that. If the OS is too simple to cover everything (not unlikely for actually existing toy OS projects, I have one myself), we'd want to branch out to those concepts no matter whether you included them in your hypothetical OS or not. And if your made up OS is just too different, it might not apply either.
In other words, ask yourself why this is part of the interview, and that should answer your question: Because you are going to work on a particular OS with particular characteristics.
I mean, points for creativity; it just depends on this made-up answer. If you're relying on transistors made up of unicorn farts and gumdrops, and the interviewer has no appreciation for whimsy, that's not gonna go over well. OTOH, if they do, that's bonus points. All I'm saying is that it's a gamble.
I would prefer this answer, it shows an ability to create anew rather than memorize rote.
I don't think rote memorization applies very much here. When I ask the question, we start out open, let the candidate drive, but ask lots of questions along the way, and at certain points get very detailed on aspects that we care about particularly (while skipping quickly over, for us, uninteresting ones).
If you "rote memorized" both the high level view and the raw details so well that you can freely think and talk about it during the interview, I'd say you'd have to have had some experience as well.
Now with this article memorized one could answer this question with rote memorization.
However most people are unlikely to have done that.
Instead if someone can provide a good answer they are demonstrating a deep understanding.
I had the same for SRE @ Google - really simple question that probes the entire free-form depth of your systems understanding.
I guess you could leave out all the hardware aspects of it - that's why I think it's not a great question myself, just too wide.
One thing that is missing is the termcap/terminfo system. For funsies, you can try to run your session with TERM=dumb to get a retro experience :)
terminfo would be a good topic for a couple of blog posts for sure. Back when I was first learning all this stuff I never felt like I truly understood all the intricacies of this system. I tried to understand it back when alacritty made its own entry, but a lot of things still feel somewhat arcane.
Basically, it's a capability list though, IIUC.
Yeah I like to think of it as a capability list as well. There was a brief footnote [12] about $TERM in the post, but terminfo could certainly be dissected further.
[12](https://www.warp.dev/blog/what-happens-when-you-open-a-termi...)
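For a rough idea of how $TERM gets consulted in practice, here's a simplified Python sketch. Real tools go through terminfo (via curses or tput) rather than string-matching the variable, so treat this as the idea, not the mechanism:

```python
# Consult the environment before emitting escape sequences, falling back
# to plain output for TERM=dumb (or no TERM at all). NO_COLOR is a common
# opt-out convention some tools also honor.

import os

def supports_color(env):
    term = env.get("TERM", "dumb")
    return term != "dumb" and env.get("NO_COLOR") is None

def red(text, env=os.environ):
    # wrap in an SGR escape only when the terminal claims to handle it
    return f"\x1b[31m{text}\x1b[0m" if supports_color(env) else text

print(red("error", {"TERM": "dumb"}))              # plain text, no escapes
print(repr(red("error", {"TERM": "xterm-256color"})))
```

Running with TERM=dumb as suggested above is a quick way to see which of your tools actually check this before spraying escape codes.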
Nice to see a fun topic on the front page.
ls is not recognized as an internal or external command,
operable program or batch file.
;-)

Haha, yeah, we probably should have added a caveat that this is only applicable to Unix
I assumed so, but then I got curious what would happen in a DOS window.
Thankfully it works in Powershell on Windows. I lost any muscle memory for `dir` a long long time ago.
Until you hit something like "Get-ChildItem : A parameter cannot be found that matches parameter name 'lh'" and are jarringly reminded that this is not your old friend `ls` after all.
This is why, if I use a shorthand in PowerShell at all, I avoid using "dir", "ls", and the like. "gci" is at least not like any common Unix/BSD/PC-DOS/Windows/OS/2 command.
Awesome content! Very interesting how a tool's process deals with the wires behind the scenes.
Thanks! It was fun to write this up. Glad you enjoyed :)
Great article. It's amazing how much ancient tech still pulsates in our sleek new machines.
Just like our rat brains still in our heads.
We aren't descended from rats, though. Lizard brains, maybe?
Thanks! Were there any topics in the blog that you think would benefit from more detail in a follow-up post?
Things you might want to add:
a) Escape sequence to set the "dynamic" terminal title
b) how e.g. GNOME notifies you that a long running command has completed
c) how e.g. GNOME asks if you are certain to close a terminal, but only in case you're not in a shell
d) maybe readline or /etc/inputrc
e) bash completions maybe? it's sort of in there already
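Re (a): the "dynamic" title is set with an OSC (Operating System Command) escape sequence that xterm popularized and most emulators honor. A minimal sketch:

```python
# ESC ] 0 ; <title> BEL sets both the window title and the icon name.
# (OSC 2 sets just the window title.) The emulator consumes the sequence;
# nothing is written to the visible screen.

import sys

def set_title_sequence(title):
    return f"\x1b]0;{title}\x07"

sys.stdout.write(set_title_sequence("build: running tests"))
```

This is also how shells show the current directory or running command in the tab title: the shell's prompt hook emits this sequence before each prompt.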
Excellent article! Thank you
I want to use Warp but it has forced telemetry. Any roadmap for making this more private? Would happily give you my money
Warp actually made telemetry optional last month https://www.warp.dev/blog/telemetry-now-optional-in-warp
Now let us compile the source without the code at all.
A terminal should only do anything when the user types something and presses enter, and then it should only do what the user told it to do. The idea that it goes off to the network unasked is beyond invasive.
But good move. The idea of a terminal tracking my typings and commands is pretty scary stuff, especially if I'm going to use this thing for security sensitive sessions.
I just found out about warp, installed it and having a similar issue.
The terms are really weird for something I thought was a terminal app, and their common questions talk about "cloud-oriented features", which I really don't know about and probably don't want.
I'd be happy to pay (even per month) for a version of this that asks you "do you want to use our cloud features or just the local version" and has different pricing and definitely no login.
Note: I didn't end up actually trying this as I really didn't like the sound of not knowing what my terminal is sending out. They do list what they send for telemetry, but not sure what is considered "cloud-features"
Thanks for the comment. I definitely feel your concerns.
To clarify, all cloud-oriented features are fully opt-in. For example, you have to explicitly share a block for us to store it in the cloud.
You can also opt-out of telemetry (which we use to determine feature usage and plan our roadmap FWIW). We even have a network log so you can see every network call we make. More details here: https://www.warp.dev/blog/telemetry-now-optional-in-warp
Another issue is that when an update is available I need to be able to tell Warp that I don’t care at the moment.
Due to the corporate software on my work Mac auto-updates fail. Ordinarily this isn’t a huge deal as I’ll just download the DMG and replace it manually. But Warp drops down an obtrusive overlay and you can’t dismiss it until you update.
I tried to like it, but little forced choices like that sent me back to iTerm 2.
Hey, sorry about that! We actually have a WIP fix for this exact issue (the obtrusive overlay that you can't dismiss). Should be out in the next week or two.
Great to hear, looking forward to trying Warp once again! It did have some really handy stuff built in.
That article starts with too many inaccuracies to recommend it to anybody:

- Teletypes were designed about a century before the era of mainframes.
- A mechanical teletype did not have an "I/O driver" in it.
- The OS on mainframe computers did not have a "kernel", nor an "I/O driver", "line discipline", or "TTY driver". This model was introduced with UNIX, the OS that ran on minicomputers.
- Was ChatGPT used to write this article? ;)
I recently stumbled upon an article on the same topic containing competent and accurate information; I still have the link because I recommended it to a friend: https://thevaluable.dev/guide-terminal-shell-console/
Hi. I definitely took some liberties with terms here. For example, I said "I/O driver" instead of, say UART driver because I didn't want to get overly specific on the historical part. That wasn't the focus of this article. Thanks for linking this other one, as it fills in those parts well.
That's quite an unnecessarily toxic comment, besides all your criticism seems fixated with the introduction on teletypes.
I enjoyed the blogpost, I trust the authors of a terminal emulator to give me a good overview on the topic and I enjoyed it.
I’ve seen many times descriptions of how original teletypes work and how that influences modern terminals.
However, something that I have not seen yet and I’d love to check is an attempt to redesign the terminal from scratch separating historical baggage from the parts that still make sense nowadays.
How would a designer create this tool without the historical precedence? Is tradition holding us back? What new standards could we reach ?
I love this sentiment. Your questions would make for a very interesting post.
While we haven't rebuilt from absolute ground zero, Warp is definitely trying to extend the capabilities of a terminal (emulator) from what's historically been possible. For example, we introduced a dedicated input editor so you can have an IDE-like experience in the terminal. It's fundamentally different from how input is entered in a traditional terminal. But with this innovation, we've had to be careful to ensure that all the input features you expect in a normal terminal (even obscure ones like `alt-.`) work how you'd expect, _and then some_.
Overall though, starting from scratch is hard because we need to stay backwards-compatible with all the CLIs we use everyday.
That was already done in the 1980s and early 1990s. Microsoft/IBM operating systems evolved the notion of consoles: still handle-based I/O devices, but ones that responded to additional first-class I/O system calls (beyond the read/write/ioctl model) for 2-D addressing, direct output buffer manipulation (including reading of the output buffer), keyboard input that did not hide the decoding of key chords into characters, and mouse input.
We like some of the baggage for two main reasons. 1) some designs were well thought out back when resources and performance really mattered, a lot, so they make for fast and efficient implementations enforced by spec. 2) perhaps more importantly, we like our old tools to continue working, plain and simple.
> This model was introduced with UNIX
The UNIX IO system was adapted from Multics.
> That article starts with too many inaccuracies to recommend it to anybody
Why would you even read it after reading the title?