Designing Network Protocols

journal.paul.querna.org

66 points by tomazmuraus 14 years ago · 23 comments

trout 14 years ago

As someone who troubleshoots networks for a living, I can't overstate the value of easily understood information. Normal factors such as lack of understanding of a system, miscommunication, false assumptions, false information, all types of bugs, and operator error already plague troubleshooting. Anything you can do to simplify is extremely important. Little things like being able to read debug output directly in Wireshark can make the difference between solving something in minutes rather than hours, or days rather than weeks.

Unfortunately serviceability is not normally high in the initial product requirements, but I've seen a direct correlation between customer satisfaction in support and products/protocols/designs with good serviceability.

jacques_chester 14 years ago

I think the short version is: in the past 5 years, simple wire-oriented serialisation formats have become much more common. At the time you had your pick of ASN.1 (humongous) or Thrift (brand new).

tptacek 14 years ago

Not particularly fair. You also had XDR (ubiquitous on Unix systems), IIOP, TLV, ICE... not to mention what every protocol designer for the past 20 years has used: network byte order integers and ASCII/UTF-8 strings.
Some people just like ASCII, human readable protocols. There's nothing wrong with that, but it's a little silly to suggest that the options for a packed binary encoding in 2007 were limited because Thrift and Protocol Buffers were too new.
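
For readers who haven't seen the "network byte order integers plus strings" approach, here is a minimal sketch in Python of a tag-length-value (TLV) layout. The tag numbers, field names, and sizes are assumptions made up for illustration; they are not taken from the article or from any real protocol.

```python
import struct

# Hypothetical field tags, chosen only for this illustration.
TAG_HOSTNAME = 1
TAG_CPU_LOAD = 2


def encode_field(tag: int, value: bytes) -> bytes:
    """Encode one field as tag (1 byte), length (2 bytes, big-endian), value."""
    return struct.pack("!BH", tag, len(value)) + value


def decode_fields(data: bytes) -> dict:
    """Walk the buffer and return {tag: raw value bytes}.

    Unknown tags are kept rather than rejected, so a newer sender can
    add fields without breaking an older receiver.
    """
    fields = {}
    offset = 0
    while offset < len(data):
        tag, length = struct.unpack_from("!BH", data, offset)
        offset += 3  # size of the tag + length header
        fields[tag] = data[offset:offset + length]
        offset += length
    return fields


# A UTF-8 string field followed by a 32-bit network-byte-order integer.
message = (
    encode_field(TAG_HOSTNAME, "web01".encode("utf-8"))
    + encode_field(TAG_CPU_LOAD, struct.pack("!I", 42))
)
decoded = decode_fields(message)
```

The `!` prefix in the format strings selects network (big-endian) byte order with no padding. The trade-off under discussion is visible here: the bytes are compact, but nobody can read them in a packet capture without a decoder.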
- jacques_chester 14 years ago
  
  You're right, to my shame. I should've thought of XDR myself, after suffering through a lecture on it at uni.
  I'd be interested in the original designer's remarks on using existing wire serialisation formats.
- pquerna 14 years ago
  
  I mentioned ASN.1 DER, but I honestly didn't think I should go into a history of XDR or other encodings. I guess I can't skip any history in a blog post....
  - tptacek 14 years ago
    
    I was responding to a comment, not your post. I don't think you really need to justify using an ASCII protocol (though, again, I think HTTP query arguments are a poor choice).

fabricode 14 years ago

I don't believe a tl;dr is necessary for this short article.
I enjoyed reading through his thought process for designing a simple protocol which is:
```
  * Easy to use (requiring little/no additional libraries)
  * Easy to extend (simple keyword/value extensions)
  * Immune to changes in technology
  * (above all) easy to understand
```
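
As a rough illustration of the "simple keyword/value extensions" point, the sketch below uses Python's standard `urllib.parse` to build and read an HTTP-query-style message. The keys (`v`, `host`, `load`, `region`) are hypothetical; the thread does not list the real protocol's fields.

```python
from urllib.parse import parse_qsl, urlencode

# Hypothetical keys this (older) receiver knows about.
KNOWN_KEYS = {"v", "host", "load"}


def parse_message(line: str) -> dict:
    """Return the fields this receiver understands and ignore the rest."""
    return {key: value for key, value in parse_qsl(line) if key in KNOWN_KEYS}


# Build a message; all values travel as plain text.
message = urlencode({"v": "1", "host": "web01", "load": "0.42"})

# A newer sender appends a field the old receiver has never heard of.
extended = message + "&region=eu"
fields = parse_message(extended)
```

The old receiver still parses the extended message because unknown keys are simply skipped, and the whole message stays readable in a packet capture.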
- jacques_chester 14 years ago
  
  Yet he himself says he considered and rejected a binary protocol for the reason I gave: "I considered using a binary format, but the immediate problem was having extendable fields.", going on to point out that he rejected Thrift because it was too new and ASN.1 DER because it was too big.
  That said, I think he didn't want a binary format in any case -- his "doing it again today" remarks point to JSON.
groby_b 14 years ago

Actually, the short version is: I wanted it easily debuggable by a network admin. (It's a point he made repeatedly). All the other arguments were moot.
- huhtenberg 14 years ago
  
  So the network protocol must be in text so that a network admin could debug it? This is absurd.
  - tptacek 14 years ago
    
    How does a network admin debug a binary protocol for which no dissector has been implemented/merged into core for Wireshark, and no decoder has been written for tcpdump?
    It's obviously doable, but it's very painful.
    
    marshray 14 years ago
    
    Isn't Wireshark extensible in Lua?
    I can see both sides of the argument here, but basing a protocol on text just for the ease of eyeballing it on-the-wire seems like optimizing for the uncommon case.
    Heck, almost any decent protocol should only have ciphertext on-the-wire anyway.
    
    tptacek 14 years ago
    
    That's more or less like saying "well they can just write the decode". They're network administrators. If you use an ASCII protocol, they don't have to do anything.
    
    marshray 14 years ago
    
    I'm saying someone can write the decode and share it on their blog post or Github and your admin can start using it without having to recompile Wireshark. (I think, haven't actually tried it myself).
    But even still, this only matters if:
    A. The protocol is so new that Wireshark isn't shipping a parser,
    B. the admin's stuff isn't working,
    C. the admin can't get his stuff working by normal troubleshooting and must resort to observing the protocol,
    D. the admin can't get his stuff working by observing the binary representation of the protocol, and
    E. the admin actually can get his stuff working with a transliterated ASCII representation of the protocol.
    Certainly I would probably find it easier to troubleshoot a text-based protocol too. I just think it's a relatively minor case in the grand scheme of things.
    
    huhtenberg 14 years ago
    
    How does a sysadmin debug a binary application for which he doesn't have any symbols?
    
    jacques_chester 14 years ago
    
    On the other hand, are Wireshark and tcpdump now the gatekeepers for new protocols?
    
    tptacek 14 years ago
    
    What's your point? I'm not making a value judgement.
    
    jacques_chester 14 years ago
    
    You say that to me a lot.
    My point is that I imagine a network designer shouldn't focus on Wireshark or tcpdump integration over other non-functional requirements such as, well, network performance.
    Network performance isn't as visible as the non-functional requirement of inspectability because it is amortised over potentially millions of machines, whereas inspectability is an immediately visible issue to the select few who "pop the hood" to fix an issue or simply to have a look.
    For example: in terms of network capacity, I wonder how much HTTP headers cost all of us collectively. Probably a lot more than the cost of making a Wireshark plugin and having sysadmins install it as necessary.
    Edit: put another way, I think designers should prioritise the needs of the people who pay the cost of network operation over the convenience of the operators.
    There's a feedback loop here -- if it's too hard and thus very expensive to operate a system, then optimising for performance was a false win. But I don't think this is such a case, especially since as you pointed out elsewhere there are a number of very mature binary wire formats that were extant in 2007.
    
    tptacek 14 years ago
    
    See: http://cr.yp.to/sarcasm/modest-proposal.txt
    "I implore [you] to remember Dave and Virginia, preying on the drug addicts of the next generation and the sexually dissatisfied men of the previous generation. How different their careers could have been if their parents had not downloaded so many terabytes of data! We must not abandon our children to such a fate."
  - groby_b 14 years ago
    
    Not as absurd as you'd think. You don't want to debug the protocol itself, but you want to be able to easily read what messages were exchanged.
    I get the rationale. But I think it's weak, and this entire post is lots of fluff around that core rationale. (I was writing extensible binary protocols back in 1988 - and it never struck me as particularly difficult even back then.)

alfiejohn_ 14 years ago

There's a complete chapter in The Art of Unix Programming about the importance of being textual:

  http://catb.org/~esr/writings/taoup/html/textualitychapter.html

tptacek 14 years ago

HTTP-style query strings are a horrible format, whether you like ASCII or not.

alexchamberlain 14 years ago

I would accept the ease of debugging argument if these messages weren't so small and so common. 1000 servers constructing strings and sending them over the network once a second is a nontrivial waste of resources.
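
To put rough numbers on that, here is a back-of-the-envelope sketch in Python comparing one small message encoded as text and as packed binary. The field names and values are invented for illustration; only the "1000 servers, once a second" figures come from the comment above. It counts payload bytes only, so it ignores TCP/IP header overhead, which for messages this small may well be larger than the payload itself.

```python
import struct

# Hypothetical heartbeat fields, not taken from the real protocol.
host_id = 1234
load_percent = 42
uptime_seconds = 86400

# Text form: HTTP-query-style key/value pairs.
text_message = (
    f"host_id={host_id}&load={load_percent}&uptime={uptime_seconds}"
).encode("ascii")

# Binary form: 32-bit id, 8-bit load, 32-bit uptime, network byte order.
binary_message = struct.pack("!IBI", host_id, load_percent, uptime_seconds)

SERVERS = 1000            # from the comment above
MESSAGES_PER_SECOND = 1   # from the comment above
SECONDS_PER_DAY = 86400


def payload_bytes_per_day(message: bytes) -> int:
    """Total payload bytes sent per day across the whole fleet."""
    return len(message) * SERVERS * MESSAGES_PER_SECOND * SECONDS_PER_DAY


text_total = payload_bytes_per_day(text_message)
binary_total = payload_bytes_per_day(binary_message)
```

Comparing `text_total` with `binary_total` gives the payload difference for this made-up message. Whether that difference matters depends on the real message contents and on how much per-packet overhead and CPU time the rest of the stack adds.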
