Zero Trust Revisited - Systems Approach

One of our reviewers recently pointed out that we hadn’t covered Zero Trust in our draft book on network security, in spite of having promised we would do so in an early chapter. Another reviewer asked why we weren’t covering “mesh VPNs”—which I admittedly had to look up—but this reminded me that VPNs (mesh or otherwise) were not really covered in any detail. While these two topics might seem only loosely connected, they are linked in my mind; VPN compromises have been a main vector of attacks on corporate networks that motivate the move to zero trust security. I’m not the only person making this connection: a zero trust white paper from mesh networking provider Tailscale just landed in my inbox with the provocative headline “Zero trust is dead”. So I set about addressing these two gaps in our book, which forced me to go and update my knowledge on the current state of zero trust and revisit its history.

My first exposure to the concept of zero trust security came about when I was working on the network virtualization team at VMware. Having set out to virtualize as much of the networking stack as possible, we rather stumbled onto the dominant use case for network virtualization when we introduced the distributed firewall (DFW) around 2013. Network virtualization makes it easy to create small, precisely scoped virtual networks connecting a handful of computing resources (initially, virtual machines); the addition of a DFW provided the necessary ingredients to define exactly how a group of VMs could communicate with each other. Instead of the traditional approach to firewalling in which a zone contains a set of machines with unfettered access to each other, we could create “microsegments”—minimal virtual networks with precise policies for how the machines in the segment may communicate. A simple example of the power of this approach applies to a three-tier application with a set of front-end machines that are exposed to the Internet. Microsegmentation supports the policy that the front-end machines may talk to the machines in the application tier but may not talk to each other, a policy that is hard to enforce if they are placed on the same traditional network segment.

Distributed Firewalls allow precisely scoped communication

With microsegmentation in our toolkit, and after some bright person coined the term “microsegmentation”, we set out to explain the value of this approach to customers. Part of our explanation leveraged the term “zero trust”. The term apparently had been around since the 1990s, and the analyst firm Forrester had popularized it a few years previously. Because microsegmentation enables an enterprise to break away from the old way of putting machines into a zone where every device is free to talk to every other device, it conforms to one of the key principles of zero trust: don’t trust a device just because of where it is located. To quote the NIST specification on zero trust:

Zero trust…became the term used to describe various cybersecurity solutions that moved security away from implied trust based on network location and instead focused on evaluating trust on a per-transaction basis.

You can certainly argue that microsegmentation did not really achieve per-transaction evaluation of trust, but it does at least allow a default policy of “trust nothing” when describing how a set of VMs can communicate. You can then explicitly allow specific communication pathways such as “let the web tier machines talk to application tier machines on a specific port” without opening up a free-for-all of communication among those devices.
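
To make the "trust nothing, then allow specific pathways" idea concrete, here is a minimal sketch in Python. It is a toy model rather than the configuration of any real distributed firewall: the tier names, the port numbers, and the `is_allowed` helper are all assumptions made for illustration.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """One explicitly allowed pathway: source tier -> destination tier on a port."""
    src_tier: str
    dst_tier: str
    port: int


# Hypothetical policy for a three-tier application.
# Tier names and port numbers are illustrative assumptions.
ALLOW_RULES = frozenset({
    Rule("web", "app", 8443),
    Rule("app", "db", 5432),
})


def is_allowed(src_tier: str, dst_tier: str, port: int) -> bool:
    """Default deny: a flow is permitted only if a rule explicitly allows it."""
    return Rule(src_tier, dst_tier, port) in ALLOW_RULES


if __name__ == "__main__":
    print(is_allowed("web", "app", 8443))  # explicitly allowed pathway
    print(is_allowed("web", "web", 8443))  # web-to-web: no rule, so denied
    print(is_allowed("web", "db", 5432))   # skipping the app tier: denied
```

Note that there is no "deny" rule anywhere: web-tier machines cannot talk to each other simply because nobody wrote a rule saying they can.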

Bringing Zero Trust to VPNs

What I found most interesting about my recent foray into zero trust was the way that zero trust and VPNs seem much more closely coupled now than when I was focused on microsegmentation. This is typified by the fact that Tailscale, a company that builds mesh VPN technology, is publishing reports on zero trust. As I was working on my explanation of mesh VPNs, I was a bit surprised to find this definition of zero trust in the How Tailscale Works blog:

Tailscale node connections are end-to-end encrypted (a concept called “zero trust networking”).

Now, I like Tailscale, who have built a clever product that I enjoy using (it’s great for accessing a Pi-hole remotely, for example), but the sentence above is pretty far from the NIST definition of zero trust, or Forrester’s. Tailscale does support an approach to zero trust in my view, but there is rather more to it than end-to-end encryption. (Their definition here is much more accurate.)

It’s worth spelling out why traditional approaches to VPNs have been the exact opposite of zero trust. The classic VPN model is one where a corporate network is protected by a perimeter firewall, and there are relatively few controls on traffic flows between devices “inside” the firewall. A traditional VPN allows a user, once authenticated, to tunnel their traffic to a gateway (sometimes called a VPN concentrator) that allows access to whatever resources are inside the firewall. In other words, the ability to authenticate to the VPN leads to the user and their device being assigned broad trust, as if they were inside the corporate network. The list of breaches that have occurred as a result of a VPN user’s credentials being compromised and then used to access a broad set of “inside the firewall” resources is depressingly long (Medibank and Colonial Pipeline, among others).

What Tailscale enables is VPNs that are built as a mesh of tunnels (end-to-end encrypted) between a precisely specified set of devices—very much analogous to what microsegmentation does, where we specify the precise communications that are allowed between VMs. Tailscale has solved a few problems that made this hard in the past, such as building tunnels that work even when both endpoints are behind NATs. They also have adopted a very SDN-like architecture in which a centralized controller is responsible for coordinating all the control traffic that needs to flow so that certificates and keys can be put where they are needed to establish the encrypted tunnels. In this regard there are clear analogies between mesh VPNs and Software-Defined WANs, which brought site-to-site tunnel meshes to the corporate VPN market. Unlike SD-WANs, mesh VPNs extend the tunnels all the way to the endpoint, rather than just to the edge router at a corporate site; for this reason they meet my understanding of zero trust.
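
A rough sketch of the controller's role may help here. The Python below is a toy model, assuming a central coordinator that knows each device's public key and a list of device pairs that are allowed to communicate. It is not Tailscale's actual API or data format: the device names, the placeholder key strings, and the `peers_for` helper are all hypothetical.

```python
from itertools import combinations

# Hypothetical device names and placeholder public keys (not real key material).
PUBLIC_KEYS = {
    "laptop": "pubkey-laptop",
    "phone": "pubkey-phone",
    "pi-hole": "pubkey-pihole",
    "build-server": "pubkey-build",
}

# The precisely specified set of device pairs that may communicate.
ALLOWED_PAIRS = {
    frozenset({"laptop", "pi-hole"}),
    frozenset({"phone", "pi-hole"}),
    frozenset({"laptop", "build-server"}),
}


def peers_for(node: str) -> dict[str, str]:
    """Return the peers (name -> public key) the controller would hand to `node`.

    A node learns keys only for devices it is allowed to reach, so it can
    build encrypted tunnels to those devices and no others.
    """
    peers = {}
    for pair in ALLOWED_PAIRS:
        if node in pair:
            (other,) = pair - {node}
            peers[other] = PUBLIC_KEYS[other]
    return peers


def full_mesh_size(nodes) -> int:
    """Number of tunnels if every device could reach every other device."""
    return len(list(combinations(nodes, 2)))


if __name__ == "__main__":
    print(peers_for("phone"))
    print(len(ALLOWED_PAIRS), "tunnels allowed out of a possible",
          full_mesh_size(PUBLIC_KEYS))
```

In this model the phone is only ever told about the Pi-hole, so it has no key material with which to reach the build server. That is the same scoping that microsegmentation gives VMs, applied to endpoints.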

Finally, it’s worth noting that Google’s BeyondCorp also solves the zero trust problem in a rather different way, by dispensing with the notion of the trusted corporate network (hence the name). Every user and device has to be authenticated before accessing any service, no matter where they are located, meeting the zero trust definition. It would take another post to discuss BeyondCorp in detail, but a salient difference I observe between Tailscale and BeyondCorp is that I could easily set up Tailscale on my own. BeyondCorp was developed with the resources of Google and is offered as an enterprise product. (Tailscale’s founders have some BeyondCorp history, and further differences show up in the documentation.)
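
The contrast between a perimeter model and a location-independent one fits in a few lines of Python. This is only an illustration of the principle, not a description of how BeyondCorp is actually built: the `Request` fields, the network labels, and both functions are assumptions of mine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """A hypothetical access request to some internal service."""
    user_authenticated: bool
    device_authenticated: bool
    source_network: str  # illustrative labels such as "office-lan"


def perimeter_allows(req: Request) -> bool:
    """Perimeter model: anything arriving from the inside network is trusted."""
    return req.source_network == "office-lan"


def zero_trust_allows(req: Request) -> bool:
    """Location-independent model: user and device must both be authenticated.

    The source network is deliberately ignored.
    """
    return req.user_authenticated and req.device_authenticated


if __name__ == "__main__":
    intruder_inside = Request(False, False, "office-lan")
    employee_at_cafe = Request(True, True, "coffee-shop-wifi")
    for req in (intruder_inside, employee_at_cafe):
        print(req, perimeter_allows(req), zero_trust_allows(req))
```

The two models disagree on both sample requests: the perimeter check admits the unauthenticated device on the office LAN and rejects the authenticated employee at the cafe, while the location-independent check does the reverse.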

A few years back I wrote a post entitled “Is Zero Trust Living Up to Expectations”. I concluded that progress was being made, notably with microsegmentation and Service Meshes. The rising popularity of mesh VPNs and the availability of BeyondCorp beyond Google both strike me as positive developments. But as Tailscale’s State of Zero Trust report points out, zero trust implementation remains an aspiration for many enterprises; there is plenty of work to be done. At least our book now takes a stab at covering the topic.

We’re excited to announce that we have reached an agreement to publish a Japanese translation of our latest book “What We Talk About When We Talk About Systems”. This is particularly appropriate given that our title was inspired in part by Haruki Murakami’s book.

We noticed that the Open Policy Agent team has been picked up by Apple and continues work on OPA.

This New Yorker article suggests a bit less hubris might be in order regarding AI, a position we share. Signs of the AI bubble reaching its limit are starting to be picked up in mainstream media.

Privacy-preserving age verification falls apart on contact with reality.

Preview image by Beth Gallant on Unsplash.