A case against text protocols
unmdplyr-new.bearblog.dev

I don't really agree... with any of this, actually.
It IS simpler to use text - He claims "Text was never as portable as it's believed to be", but ASCII/Unicode are probably the most portable formats we've ever created. I can't think of a computer that won't be able to parse and display one of those two formats (from embedded hardware, to old F-16 parts, to my modern laptop, to the Raspberry Pi, to the fucking computer I designed in my EE classes).
Being able to type out messages is hugely helpful while debugging and developing. I copy and paste things that look exactly like the code he claims no one would ever write - it's like he doesn't understand the value of a clipboard, or of a text editor I can dump a message into and change a single value in. That's something I can conveniently do on pretty much any system ANYWHERE, without having to install any extra software, if the format is text.
His parsing example is hilarious - See that readable text above? Psh, folly! That's hard to read, so let's use specialized tools that depend entirely on system-specific details and configuration (int size, byte order, struct packing, etc.) and claim that's better!
Extensibility, meh - I find this one rarely matters as much as people believe it does, but to me, the big benefit of text is that I can easily craft messages with new fields myself without having to write code to do it.
Error recovery... I can sort of agree (in transit over a noisy channel, use a format that supports ECCs) but he misses that there are two different types of error here - An unexpected field value/type, and a generally malformed payload.
The first will break binary but not something like a json parser. The second will break both (he only talks about the second, since he assumed the failure happens at tokenization time...)
Basically - My whole point devolves into "It sure seems like he's arguing for premature optimization".
If you have a spot where text is particularly expensive or inefficient, suck it up and move to a binary protocol that requires more documentation, tooling, and work. Everywhere else... it seems like a bad move.
I've taken advantage of text protocols countless (hundreds? thousands?) of times in my career to troubleshoot, learn, and experiment.
Just a few weeks ago I needed to peek into some StatsD packets to ensure we were sending what we expected when monitoring wasn't working. If it were a binary format this simply would not have been an option, as this was a remote environment with limited tooling available to it.
> Being able to type out messages is hugely helpful while debugging and developing
You are only a small part of the entire life of the protocol. Protocol designs largely impact hardware design and requirements. A binary protocol will safely work on a microcontroller without too much effort, while a text-based protocol requires some serious CPU juice, code storage space, and volatile memory to get it off the ground.
Sure, text will work on most devices, but the parsers for those text-based protocols become excessively complex. There are some ideas floating around in the field of "green" computing, where it's becoming increasingly imperative to do more processing per watt. Text parsers will certainly not fit that bill.
> new fields myself without having to write code to do it.
Then what's the point of that new field if there's no code to handle it? That said, CBOR and TLV are similar in that you can add new fields without any code to handle them. But what good is it?
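Mechanically, the "new fields without code" property of TLV comes from the length octet letting a receiver skip records it doesn't recognize. A minimal sketch, with made-up one-byte type and length fields:

```c
#include <stddef.h>
#include <stdint.h>

/* Walk a buffer of [type][length][value...] records, summing the
 * values of the one field type we understand (0x01 = one-byte count)
 * and skipping everything else via the length octet. */
static unsigned tlv_sum_counts(const uint8_t *buf, size_t n) {
    unsigned sum = 0;
    size_t i = 0;
    while (i + 2 <= n) {
        uint8_t type = buf[i];
        uint8_t len  = buf[i + 1];
        if (i + 2 + len > n)
            break;                  /* truncated record: stop */
        if (type == 0x01 && len == 1)
            sum += buf[i + 2];      /* known field */
        /* unknown types fall through here: skipped, not fatal */
        i += 2 + (size_t)len;
    }
    return sum;
}
```

An old receiver simply never looks inside records with types it doesn't know, which is exactly the extensibility being claimed - and also exactly why the new field is useless until someone ships code for it.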
> An unexpected field value/type, and a generally malformed payload.
Your parser will break out the key-value pair and hand it off to a callback or something that makes sense of what the key means, perhaps a semantic analyzer? And when it reaches that point, you have already irreversibly wasted enough CPU time only to discover that the whole message is invalid. Not that I am unaware of the difference, just that differentiating them is often pointless. For example:

    Content-Length: bad-string

Continuing the hilarity...
> That's hard to read so lets use specialized
Is it CR? Is it LF? Is it CRLF? Did I configure my text editor to use the correct line terminator? Does my clipboard reset the CRLF to CR or LF? Oh, wait is that a space (0x20) or tab (0x09) there? Hmm.. never mind.
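A parser that wants to survive all of those answers ends up with tolerant line splitting along these lines (a sketch, not any particular protocol's rules):

```c
#include <stddef.h>

/* Return the length of the line at *p (excluding the terminator) and
 * advance *p past whichever terminator is present: CRLF, bare LF, or
 * bare CR. The caller gets the line start via *line. */
static size_t next_line(const char **p, const char **line) {
    const char *s = *p;
    size_t len = 0;
    *line = s;
    while (s[len] != '\0' && s[len] != '\n' && s[len] != '\r')
        len++;
    if (s[len] == '\r' && s[len + 1] == '\n')
        *p = s + len + 2;   /* CRLF */
    else if (s[len] != '\0')
        *p = s + len + 1;   /* bare LF or bare CR */
    else
        *p = s + len;       /* end of buffer, no terminator */
    return len;
}
```

Every branch in there is a decision the spec may or may not have made for you, and every implementation that guesses differently is a future interop bug.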
Also,
    From: 1234<1234@example.org>;branch=abcd1234

vs

    From: 1234<1234@example.org;branch=abcd1234>

Which is the right way? Should my parser expect `branch` in the URI, or only when parsing the `From` and `To` addresses? Should I make it part of the URI sub-parser, or part of the top-level parser for endpoint addresses? This was a real interop problem between two big vendors.
Another argument: text-based protocols often admit too many degrees of freedom in constructing messages, the handling of which is left underspecified and completely overlooked during implementation. (What happens if you separate lines with LF instead of CRLF in HTTP headers? What if the opening and closing HTML tags don't match? I know this should not usually happen, but how should I handle it when it does anyway?)
It's not by any means exclusive to text-based protocols, but there's this tendency to assume everything about a text-based protocol is ‘obvious’ and ‘self-documenting’ and doesn't need specifying, and to think that just because the individual elements of the protocol are human-readable, this will somehow magically make the computers using the protocol follow the Gricean maxims (if it doesn't make sense, nobody will ever say that, therefore I don't need to think about it).
> the handling of which is left underspecified
I used to see Postel's Law ("be conservative in what you send, be liberal in what you accept") quoted as some sort of antidote, but it seems to have fallen out of fashion -- I think enough people saw how that ideal played out in reality. Nowadays a JSON library feels justified throwing a fit if it sees a comment string instead of playing along with such shenanigans.
> It's not by any means exclusive to text-based protocols
Plus, I would argue text-based greatly increases the surface area for ambiguity, whereas, for instance, there are only a few ways a sane person would send an integer as bytes.
I’d say it’s played out ... ambiguously. Conservative HTTP is unworkable, liberal TLS is dangerous. JSON for internal APIs and data succeeded because it’s conservative; XHTML and XML+XSLT on the open web failed for the same reason. Postel’s law is less of a universal principle than it initially seemed to be, sure, but it appears to me that part of the reason for its increasing irrelevance is our moving away from open ecosystems, rather than deficiencies it had in its original context.
Integer encoding (as opposed to e.g. encoding of opaque binary strings) actually appears to be a bad example to me: various universal binary encoding protocols, self-describing or not, have an astounding number of unsigned and signed integer encodings among them. It’s like inventing a new one is a rite of passage or something.
I see your point, but I don't agree about why XHTML failed. For starters, see: https://en.wikipedia.org/wiki/WHATWG (Basically, XHTML failed because it was a pointless boondoggle, whereas HTML5 very much wasn't.)
Regarding binary integers, having written code for a few common binary protocols and file formats I've never had to think very hard about it (just: How long? Which endian? Signed?) but maybe it's different for older or more esoteric stuff.
Re integers, it’s not the esoteric stuff, it’s the flexible, supposedly universal stuff: there’s like half a dozen varieties of varints across MessagePack, CBOR, Protobufs, ASN.1 *ER, etc.; even UTF-8 is just a (limited-range) varint encoding from a certain point of view. “Zigzag encoding” (using the least significant bit as the sign bit) is particularly insidious. And note that the (integer) exponent in IEEE floating-point formats is signed but not two’s complement: it’s in a biased representation instead.
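For concreteness, here's what the zigzag mapping looks like (a sketch; this is the scheme Protobuf uses for its sint32/sint64 types):

```c
#include <stdint.h>

/* Zigzag: interleave signed values as 0, -1, 1, -2, 2, ... so that
 * small magnitudes of either sign become small unsigned values (and
 * thus short varints). The encoded least significant bit ends up
 * acting as the sign bit. */
static uint64_t zigzag_encode(int64_t n) {
    /* Written with a branch to avoid the implementation-defined
     * right shift of a negative value. */
    return ((uint64_t)n << 1) ^ (n < 0 ? UINT64_MAX : 0);
}

static int64_t zigzag_decode(uint64_t z) {
    return (int64_t)(z >> 1) ^ -(int64_t)(z & 1);
}
```

Perfectly sensible once you've seen it, and completely baffling the first time you hit it in a hex dump, which is rather the point being made above.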
Er, no, that’s not what I was referring to. The XHTML 2 story was stupid, yes (though I think the RDF / “Linked Data” tooling could’ve been really nice had it not been a fantasy), but lots and lots of people were willing to give XHTML 1.1 a chance during the XML craze and the original web standards push; except the HTML 4.01 Strict rules which XHTML 1.1 enforced were complicated enough that nobody ended up willing to tolerate showing the user literally nothing for every fumble in a server-side script. (Part of the problem was that people were routinely generating markup from textual templates.)
This is all 100% correct but he missed probably the biggest reason!
It's really really hard to write an unambiguous text protocol specification and equally hard to write something that implements it properly. Think about all the extra ambiguities text adds: where is whitespace allowed? How are newlines encoded? Are windows line endings ok? Does case matter? Which bits are ASCII and which are UTF-8? How are values quoted?
It's just insanely more complicated, and there are many subtle differences that seem reasonable to different people. "Of course whitespace is allowed there!"
It makes everything way less robust and way more prone to quirks-mode style degradation.
> How are newlines encoded? Are windows line endings ok?
*sigh* Those are not "Windows line endings". CRLF has been used as a line delimiter in every single ASCII-based network protocol since 1971 to this day. UNIX is not the sole progenitor of everything in modern computing, not by a long shot.
> It is as simple as dumping struct ntp_packet on wire and reading it off it -- no parsing involved except for calling ntohX()/htonX() on all fields except li, vn and mode.
Nope, you may still need to call ntoh/hton, depending on how the compiler you use orders the bitfields inside an int. Plus you need "__attribute__((packed))" or whatever the compiler you use supports to make that C struct definition mean what it looks like it means: even then I am not sure those three bitfields are required to occupy exactly 8 bits.
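A portable way out is to drop the bitfields entirely and pack that first octet by hand; a sketch following RFC 5905's layout (LI: 2 bits, VN: 3 bits, Mode: 3 bits, most significant first):

```c
#include <stdint.h>

/* NTP's first octet: leap indicator (2 bits), version number (3),
 * mode (3), packed MSB-first per RFC 5905. Plain shifts and masks
 * are fully defined by the standard; the compiler's bitfield layout
 * never enters the picture. */
static uint8_t ntp_pack_flags(unsigned li, unsigned vn, unsigned mode) {
    return (uint8_t)(((li & 0x3) << 6) | ((vn & 0x7) << 3) | (mode & 0x7));
}

static void ntp_unpack_flags(uint8_t octet,
                             unsigned *li, unsigned *vn, unsigned *mode) {
    *li   = octet >> 6;
    *vn   = (octet >> 3) & 0x7;
    *mode = octet & 0x7;
}
```

This trades a little typing for code whose wire layout you can verify by reading it, with no packing pragmas and no guessing about bitfield order.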
Better yet is using wrappers for char arrays paired with helper functions:

    typedef struct { unsigned char data[2]; } u16be_t;

    inline static uint16_t get_u16be(const u16be_t *p) {
        uint16_t result = 0;
        result |= (uint16_t) p->data[0] << 8;
        result |= (uint16_t) p->data[1];
        return result;
    }

    inline static void put_u16be(u16be_t *p, uint16_t value) {
        p->data[0] = value >> 8;
        p->data[1] = value;
    }

Perfectly portable (as long as CHAR_BIT == 8), type-safe, and on modern compilers it generates no overhead over direct memory accesses.
Back when HTTP and SMTP were designed, the Internet was mostly old Unix machines talking to each other. Everything was a file full of `char`, piped to an 80-column terminal. Text-based made sense. Decades later, when the computer world was bigger, faster, and more unified, other systems kind of cargo-culted off of those earlier successes. And isn't it neat that you can telnet into port 80?
I think another big reason text-based protocols are seductive is that they're an engineering path of least resistance. When you start off text-based, how to debug and analyze and interoperate with other implementations can be put off for later, or be Someone Else's Problem. Whereas if you design the same protocol but in binary, these tricky considerations are harder to ignore -- even though text-based protocols will still run into the same problems eventually, because there's no way I'm typing in a Cookie header by hand or decoding Base64 in my head.
And yet I have definitely copy-pasted cookie headers from the database or log files into the browser - something I’ve never done, or could imagine doing, with binary protocols.
You could use something like Wireshark to do it with binary protocols. Also, lots of binary protocols/formats support a text format to make them readable, e.g. Wasm and Cap'n Proto.
> Internet was mostly old Unix machines talking to each other.
Was it? By my reading there was way more weird stuff than there is today but I certainly wasn't alive back then.
In the context of popular things like HTTP and SMTP, yes, it was mostly Unix/BSD/NeXT[0][1]. Obviously the entire history of the Internet is a different story.
[0] https://en.wikipedia.org/wiki/World_Wide_Web#/media/File:NeX...
[1] https://en.wikipedia.org/wiki/Simple_Mail_Transfer_Protocol#...
There is one thing that is often neglected in the text-vs-binary protocols debate, and that is self-terminating vs. prior-length framing. Although it is not strictly connected, text protocols are usually self-terminating (e.g. closing tags), while binary protocols are usually prior-length (e.g. the type-length-value approach). The first approach leads to escaping and all its associated problems.
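To make that trade-off concrete, here is a sketch of both framings (the delimiter, escape character, and length-prefix width are made up for illustration): the self-terminating one has to escape the delimiter, the prior-length one passes payload bytes through untouched.

```c
#include <stddef.h>
#include <string.h>

/* Self-terminating framing: '\n' ends the message, so a literal '\n'
 * (and the escape character itself) must be escaped on the way out.
 * Every consumer now needs a matching unescaper, forever. */
static size_t frame_escaped(const char *msg, size_t n, char *out) {
    size_t o = 0;
    for (size_t i = 0; i < n; i++) {
        if (msg[i] == '\n' || msg[i] == '\\')
            out[o++] = '\\';          /* escape delimiter / escape char */
        out[o++] = msg[i];
    }
    out[o++] = '\n';                  /* terminator */
    return o;
}

/* Prior-length framing: a 2-byte big-endian length prefix; payload
 * bytes go through verbatim, so no escaping and no escaping bugs. */
static size_t frame_prefixed(const char *msg, size_t n, unsigned char *out) {
    out[0] = (unsigned char)(n >> 8);
    out[1] = (unsigned char)(n & 0xFF);
    memcpy(out + 2, msg, n);
    return n + 2;
}
```

The escaped form stays human-typeable, which is the whole argument upthread; the prefixed form stays byte-exact, which is why binary protocols keep choosing it.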