Linkerd: Twitter-Style Operability for Microservices (blog.buoyant.io)

Suppose I designed something from scratch at a former company, and then decided to reimplement the same project after leaving (perhaps turning it into an open source project), with mostly new code but a similar concept. Would that be considered copyright infringement?
I doubt it would be "copyright infringement", since the code was rewritten. It might fall under patent infringement if something was patented.
The more likely scenario is that it would constitute misuse of a trade secret, either under the law or under the contract signed at employment. I am not a lawyer, and the law is different in every place, but it is likely that such a use would be covered by the protections given to professionals to retain expertise gained while practicing their profession. I don't remember what that protection is called.
Your employment contract or a related document you've executed may include a post-employment restraint/non-compete. These vary a lot by level of the position and type of work. They are difficult/expensive to enforce and are open to interpretation about "reasonableness". These are generally the former employer's best means to prevent you from immediately reproducing something. However they also can't prevent you from working in your profession and using what you've learned as a professional.
Are there a lot of companies in the practice of copyrighting their internal architecture and patterns?
In the USA, everything that is copyrightable becomes copyrighted at the moment it is created. For instance, that sentence, and this one, are protected works owned by me. The only reason HN can legally use them is because I gave them a license, buried somewhere in the EULA/site terms of use.
Also, an architecture is not copyrightable; maybe patentable. Copyright protects specific expressions, not a general idea, plan, or architecture. At least not yet; thankfully, I haven't seen anyone try to argue that unrelated software doing the same thing is a derivative work.
But Twitter has a patent pledge with its engineers [1]:
[1]: https://blog.twitter.com/2012/introducing-the-innovator-s-pa...

"The IPA is a new way to do patent assignment that keeps control in the hands of engineers and designers. It is a commitment from Twitter to our employees that patents can only be used for defensive purposes. We will not use the patents from employees’ inventions in offensive litigation without their permission. What’s more, this control flows with the patents, so if we sold them to others, they could only use them as the inventor intended."
Many big companies will patent their internal architectures. I've been involved in the creation of quite a few patent applications for exactly that. My name is even on one of them.
It's silly and ridiculous but necessary because of the way patents are enforced.
I can promise you that the existence of the patent system did not motivate us to build any of that software, but it certainly motivated us to patent it after the fact, so that someone else couldn't patent it and then sue us for having independently built the same thing.
[Edit: Apparently this is language-agnostic, which wasn't clear from the blog post, so please ignore the complaints below. Will leave them here rather than deleting.]
I loved the pitch, but then I discovered that this is Scala only, which was disappointing.
Sure, if your entire organization runs on the JVM (like Twitter presumably does), then something like this is going to be fine. But many/most organizations use multiple languages, for various reasons. At my company we are currently looking into replacing our current microservice RPC (JSON over HTTP) with something better, and we do need to support Ruby, Go, and Node.js, as well as plain HTTP from browsers.
The only viable cross-platform RPC technologies right now are gRPC and Thrift, both of which are rather heavy-handed (lots of IDL + code generation + client/server setup code), and neither of which solves the really hard problems (discoverability, load balancing, fault tolerance, etc.). It's also doubtful that gRPC is really in a usable state yet. Thrift is by far the most mature solution in this space.
Maybe we'll be able to take some inspiration from this project when building our upcoming solution, whatever it will be.
No no, you're missing the point: it's an RPC proxy which is language-agnostic. It sits alongside your application, as opposed to being a library that you bake in.
I see. My fault for going straight to the GitHub repo rather than to linkerd.io.
We can definitely improve the docs a bit on this point.
But FWIW we totally agree with you. Finagle itself is a JVM library, and that worked well enough at Twitter [insert caveats here], but a big part of the reason we built linkerd is extending that model to non-JVM / polyglot services. There's SO much good stuff in Finagle... it would be a pity to confine it just to the JVM.
Yup. I do wish there were more work done on the actual client/server side (the P and C of "RPC"), too, something Linkerd can't solve for you, and which I believe Finagle does.
So far all our work has been JSON-over-HTTP, which is not performant. We've looked at gRPC and Thrift, but we've been dreading the prospect of having to pre-declare IDL, generate language glue and write client/server setup glue for every app.
At the moment, we're closing in on using NATS as a routing proxy, fulfilling a role very similar to Linkerd. Turns out it's fast enough that you can do RPC with it, and it seems very reliable. Language bindings are good to the point where you have to write almost no glue code. You have to pick a serialization format, and I think Msgpack might work well here.
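For anyone curious what that looks like, here is a minimal sketch of request/reply RPC over NATS, using the jnats (Java) client from Scala. The subject name and plain UTF-8 payloads are made up for illustration; for real payloads we'd swap in something like MsgPack:

    import io.nats.client.{Message, Nats}
    import java.nio.charset.StandardCharsets.UTF_8
    import java.time.Duration

    val nc = Nats.connect("nats://localhost:4222")

    // "Server" side: subscribe to a subject and reply to each request.
    val dispatcher = nc.createDispatcher { msg: Message =>
      nc.publish(msg.getReplyTo, "pong".getBytes(UTF_8))
    }
    dispatcher.subscribe("svc.echo")

    // "Client" side: a blocking request with a timeout stands in for an RPC call.
    val reply = nc.request("svc.echo", "ping".getBytes(UTF_8), Duration.ofSeconds(1))
    println(new String(reply.getData, UTF_8))

The reply-subject ("inbox") plumbing is handled by NATS itself, which is a big part of why there's so little glue code to write.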
I'm curious to hear what your experience with NATS will be. We haven't played with it. I do know that a ton of the work that went into Finagle was around resilience at scale: things like backpressure, circuit breaking, and generally tolerating slow, bad, or flapping servers. Personally I would want to see that sort of thing in any RPC system before using it at scale.
We're a small shop; each of our clusters is currently a couple dozen static servers at most. So we don't really do anything at what would be considered "scale". For us, developer ease of use and performance are higher priorities.
But we're slowly moving towards a setup where we'll likely rely on autoscaling, and where we will likely need a more robust architecture.
This is actually closer to Google-Style operability than it is to Twitter-Style operability. :) Twitter doesn't have an equivalent of GSLB (software load balancer), which is essentially what this is.
Have you seen Google's GSLB? It's not a proxy; it's a rather complex and powerful system. It aggregates and coordinates traffic flow but doesn't directly do any kind of forwarding. You could maybe think of Linkerd like GFE, but even then, GFE acts as an edge gateway rather than an internal RPC system. This is more like a wrapper around Stubby or gRPC.
Do you have any public resources about Google's GSLB? Searching for it results in general articles about GSLB.
I've only ever found two things publicly available on this. The patent related to DNS GSLB http://www.google.com/patents/US7581009 and the video by Simon Newton which describes the global footprint and some of the services https://www.youtube.com/watch?v=DWpBNm6lBU4. Most of what I know about GSLB is from my time working at Google.
Is GSLB different from Google Seesaw, the load balancer that was open-sourced recently?
Yeah, it's a completely different use case. Internally, Google has something called Maglev, which looks a little more like Seesaw, but even then it's quite different.
I haven't actually seen anything like GSLB anywhere else. With classic load balancing we've always been taught to proxy traffic through a single point of entry. GSLB flips that on its head and instead provides routing information which clients can then use for the next set of requests. I'm attempting to develop something similar within https://github.com/micro but it's going to be a long time before I even get marginally close to something as powerful as what Google has.
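To make the lookup-based model concrete, here's a minimal sketch of that control-plane/data-plane split. All names are hypothetical (this is not Google's actual GSLB API): the client asks a lookup service for weighted backend assignments, caches them, and then dials backends directly rather than routing traffic through a proxy:

    import scala.collection.concurrent.TrieMap

    final case class Backend(host: String, port: Int, weight: Double)

    trait LookupService {
      // The control plane hands out weighted backends for a logical
      // service name; it never sees the request traffic itself.
      def assignments(service: String): Seq[Backend]
    }

    final class RoutingClient(lookup: LookupService) {
      private val cache = TrieMap.empty[String, Seq[Backend]]

      // Resolve once (or on a refresh interval), then connect directly.
      def pick(service: String): Backend = {
        val backends = cache.getOrElseUpdate(service, lookup.assignments(service))
        backends.maxBy(_.weight) // real systems weigh load, locality, and health
      }
    }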
The "sidecar" proxy model reminds me a lot of https://github.com/airbnb/synapse
We definitely took some inspiration from projects like synapse.
However, Finagle, the core tech behind linkerd, provides some extremely powerful tools to do things like:
- per-request routing to support things like "when I browse the site, use the staging version of the users service and production versions of all other services". (https://twitter.github.io/finagle/guide/Names.html#interpret...)
- request cancellation, so that when a user request times out, downstream work can be reclaimed
- budget-based timeout management (https://twitter.github.io/finagle/guide/Servers.html#request...)
- circuit breaking (https://twitter.github.io/finagle/guide/Clients.html#failure...)
- etc, etc.
We think that offering these sorts of features in a sidecar model will be extremely powerful.
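To make the first of those concrete, here's a minimal sketch of per-request routing through a linkerd sidecar using Finagle's HTTP client from Scala. The port, paths, and service names are assumptions for illustration; linkerd reads per-request dtab overrides from a request header (l5d-dtab in its docs):

    import com.twitter.finagle.{Http, Service}
    import com.twitter.finagle.http.{Request, Response}
    import com.twitter.util.Future

    // Point the client at the local linkerd sidecar instead of the
    // destination service. Port 4140 is an assumed linkerd HTTP port.
    val viaLinkerd: Service[Request, Response] =
      Http.client.newService("localhost:4140")

    val req = Request("/users/123")
    req.host = "users" // logical destination; linkerd resolves it via its dtab

    // Per-request override: for this request only, send "users" traffic
    // to a hypothetical staging instance; everything else stays as-is.
    req.headerMap.set("l5d-dtab", "/svc/users => /svc/users-staging")

    val rsp: Future[Response] = viaLinkerd(req)

The same request sent without the header falls through to the router's default routing rules, which is what makes this safe to use for ad-hoc staging tests.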
Netflix has something similar (https://github.com/Netflix/Prana), and I've got a sidecar in Micro too (https://github.com/micro/micro/tree/master/car). It's a solid model for integrating applications for which you don't necessarily have client libraries, or which don't speak the same protocol.
This is a good step forward for Finagle as it eliminates the anti-pattern of encapsulating the communications functionality into a library. This inevitably turns any collection of services into a distributed monolith, killing the loose coupling that is the point.
Totally unrelated: the effect/animation when you hover over the avatars (Safari) at https://buoyant.io/#team is really weird. Not sure if it's intentional.
It's fine in Chrome, but in Safari it's very jittery and animates the wrong images.
This was not intentional. Thanks for pointing this out.
Looks like the supported protocols right now are HTTP, Thrift (framed transport), and something called Mux. Is this intended to be pluggable?
Yup, absolutely. Any requests? :)
gRPC. It's really the only practical choice if you have Go, Node.js, and PHP apps that need to consume one central service. It also has a lot of the benefits that Mux brings.
Luckily, gRPC can downgrade to HTTP/1.1, so if you want some of linkerd's features it isn't a total non-starter, but this is obviously less than desirable. We're working with the Finagle team to complete Finagle's Netty 4 integration, which will enable us to transparently introduce Netty's HTTP/2 codec into linkerd. This is a high-priority feature.
That's a fair answer.
It seems like there's overlap beyond the lack of HTTP/2 support, where some of Finagle's features are handled by gRPC in different ways. I'd quite like to use the gRPC way, since Google is maintaining the bindings for many languages and doing a decent job.
Cool! Is HTTP/2 already there, or planned? I think HTTP/2 is important to get proper pipelining of asynchronous calls.
We agree 100% that multiplexing is vital.
HTTP/2 is absolutely on the roadmap. We're actively working with some of the folks at Twitter to get it integrated and tested before introducing it into linkerd.
That said, linkerd is able to provision multiple downstream connections to multiplex requests to other services, but we need to extend this to the server side to get the best application integration.
You may also be interested in Twitter's mux protocol (https://twitter.github.io/finagle/guide/Protocols.html#mux), which provides this featureset internally at Twitter, primarily for Thrift.
"It can be done, but it takes years of thought and work to make everything work well in practice."
Really wish they expressed the cost in man-years. It's a few calendar years, and thousands of man-years.