Choosing Go to implement the new GOV.UK router
gdstechnology.blog.gov.uk
Not impressed.
I have nothing against Go (in fact I like it) but they seem to have built GOV.UK with every damn bit of technology there is available. I've seen Varnish, Go, Ruby, Python, Scala, Java, Mongo, MySQL, nginx, rails, sinatra, Django and PILES of Not Invented Here.
Also I'm not sure it's a great use of public money to build stuff like this when they should have nginx/apache up front and some configuration.
In fact this whole thing stinks of crappy information architecture resulting in a massive front end router hack that they did in Scala, binned and moved to Varnish, then binned and moved to Go.
This is not one "cohesive" web front end. It's a rat's nest and an increasingly expensive one.
We did have some configuration - it was stored in Varnish and Nginx. The time cost of shipping new redirects when content is moved or new routing configuration for new tools coming online is non-negligible, particularly as we're in the middle of moving 300+ government agencies onto the GOV.UK platform. Also having all those redirects, friendly URLs, and application mappings in static configuration makes self-service publishing tools very difficult to build.
This is all outlined in the blog post - perhaps not as clearly as I would have liked, though!
They're incredibly easy to build. I notice you use puppet. That's enough tooling to push configuration out and restart services for nginx or apache+mod_proxy. You can break configuration out into separate files per agency if you need to.
Varnish is different, as reloading its configuration requires a restart.
We do the same with commercial kit (Riverbed) with over 140 HTTP application endpoints, and that is higher friction than the equivalent setup, yet we manage it with just one person...
Deployment is a solved problem. No offense but you're not Google!
There is a need for dynamic routing, which you're overlooking. So updating configuration files and restarting services would not work.
and you are not a government agency delivering digital services to 60+ million people
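To make "dynamic routing" concrete, here is a minimal Go sketch of the general idea (my own illustration, not the actual GOV.UK router code; the backend hostname and paths are made up): the route table lives in memory and can be swapped at any time by whatever the publishing tools write to, while the process keeps serving - no config files, no restarts.

    package main

    import (
        "log"
        "net/http"
        "net/http/httputil"
        "net/url"
        "sync"
    )

    // routeTable maps exact request paths to backend URLs.
    type routeTable map[string]*url.URL

    // Router answers requests from an in-memory table that can be
    // replaced at any time without restarting the process.
    type Router struct {
        mu    sync.RWMutex
        table routeTable
    }

    func (r *Router) ServeHTTP(w http.ResponseWriter, req *http.Request) {
        r.mu.RLock()
        backend, ok := r.table[req.URL.Path]
        r.mu.RUnlock()
        if !ok {
            http.NotFound(w, req)
            return
        }
        httputil.NewSingleHostReverseProxy(backend).ServeHTTP(w, req)
    }

    // Update swaps in a new table, e.g. after re-reading routes from
    // the database that self-service publishing tools write to.
    func (r *Router) Update(t routeTable) {
        r.mu.Lock()
        r.table = t
        r.mu.Unlock()
    }

    func main() {
        backend, _ := url.Parse("http://backend.internal:8080") // assumed backend
        router := &Router{}
        router.Update(routeTable{"/vat-rates": backend})
        log.Fatal(http.ListenAndServe(":3000", router))
    }

New routes and redirects go live the moment Update is called - no fleet-wide config regeneration, no reloads, no pool draining.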
"OMG we're the government and everything we do is huge and unprecedented so it is impossible for us to use off-the-shelf anything and we can learn nothing from best practises" is step 0 of every failed big government project.
There's been significant thawing of that attitude (and gov.uk, for all their imperfections, is definitely a very positive part of the trend) but the idea is alive and well.
Yes, because a real-world organisation with IT systems going back 50+ years is exactly like some me-too MacBook-and-MBA startup with two men and a pantomime horse in Old Street.
Yes because a hipster startup in Old Street is the only kind of non-government entity that uses computers for anything.
Well, not everyone is good at it. BT is/was - well, it was before that fuckwit salesman got over-promoted and fucked over Global Services good and proper.
And at least they have avoided any healthcare.gov or RBS fiascos - though I suspect that Universal Credit is going to be a real CF.
No, we're just handling 60 million people's mortgages, insurance, financial status, financial history, legal and personal data instead...
Are you a price comparison site? Or Experian or Equifax perhaps?
Unless you clearly say you're not just some enterprise business - the kind that isn't exactly known for technical competence - this comment is totally ambiguous.
Do you just think you know what you're talking about, or do you actually?
None of them, although close.
You don't have to take my opinion seriously. I could work in Tescos and be splurging false information out on the Internet. I could be an elaborate hoax!
However, please don't write off people as "just some enterprise business that isn't exactly known for its technical competence" because we all know that startups get it right all the time as well...
As for "do I know" versus "do I think", there is a third option - "do others know" - and that is all that matters when it comes to getting paid...
So you are handling a small set of related services to a large number of people. That's hardly comparable to the task of migrating 300+ (distinct) government agencies/services to the new system, along with whatever old systems they currently use.
I tend to agree. The post makes a point of saying "we had a router built in scala, but no one on the team knew how to maintain it".
It seems they're allowing team members to play with "language of the week" (lotw): build something that's then used in production without the rest of the team knowing the lotw, that member moving on, and then rebuilding it in a new lotw 6 months later. No doubt this Go implementation will go the same way at a later date.
I'm not opposed to using a lotw in production with other languages and such, but it needs to be done in a thought-out manner, with the whole "what if the project lead gets hit by a big red double-decker" mentality.
I wonder if the technology churn is more of an issue with the structure and organisation of GOV.UK than a technical one. In the post he states that the previous router was unmaintainable, as there was no one with a deep knowledge of Scala on the team, and that this was the main reason for the rewrite. Why was this allowed to happen? Is there a rapid/high turnover of staff, or are teams moved around without regard for skill balance?
Possible. Some good thoughts there.
And yet it works better than most traditional Government IT projects. Healthcare.gov, I'm looking at you.
It does not look that bad, actually. Your list lacks Oracle and Microsoft technologies, which you'd need to really complain about inefficient use of public money. I would love it if my country used open source more.
Sadly UK local and national government projects are very much rotten with Oracle and MS solutions.
source: I used to work for local government
New legislation means local councils have to justify their requirement for Oracle/MS licensing costs over open-source/free software. They can still buy the licences if they can show they really need them, but loads of councils are now properly evaluating solutions like Canonical's OS/cloud stack.
Sadly it's easy to justify because they just play the "this is a critical system and Oracle / MS provide round the clock support and quick resolution milestones" card.
I've lost count of the number of times I've seen Oracle products bought off the back of Oracle's keen sales people overstating the value of their post-sales support.
Plus, since there are so many redundancies happening in local government, they don't train and retain skilled in-house staff who would have been able to offer the same level of support. So third-party support is seen as more reliable than in-house.
Lastly, many project managers push for such contracts because it means they can deny responsibility (ie if Oracle fucks up, they can blame Oracle, but if the system goes down and they're responsible for their own support, then the buck stops with the project manager). So you often get project managers pushing for such contracts just out of laziness.
There might be a case that this legislation will see more contracts signed with open-source vendors who provide support, but local governments will always have the argument that MS/Oracle are big trusted names, whereas a less well-known (outside the world of IT) open-source vendor may not be able to provide the scale of service that the government wants (complete BS of course, but all too often these decisions are made by project managers who know jack shit about IT).
Agree entirely there!
However it doesn't excuse people working on a web site funded by the taxpayer using it as a technology playground.
Having worked alongside a number of government projects, I can confidently say that all the projects based on "off the shelf" products end up becoming vastly more expensive, because you usually end up with hugely expensive support contracts with BT/IBM/MS/Oracle/etc, and often the solutions need some bespoke tuning to work the way needed, so you add expensive consultancy costs on top to have the software adapted to fit.
Whereas developing stuff in house means you're employing significantly cheaper labour (namely, the rubbish salaries that most get in the public sector vs the private sector), and as a bonus the cost is more likely to stay within the UK - ie you're not paying multinational companies or their overseas consultants. Oracle was particularly bad for this, as they dumped our stuff in a US data centre and our support contacts were all living somewhere in eastern Europe, so very little - if anything - provided was UK-based.
Sadly our government doesn't seem to understand doing anything in between - or at least not that I observed when I used to work there. It was either entirely bespoke, or entirely bought, developed and set up by some overpriced conglomerate (or worse yet, outsourced completely). So, going by the trends that I've witnessed, I'd rather have the technology playgrounds, just so long as they remain developed in house.
In their defence, gov.uk is miles better than just about any other government site/application I've ever used.
Oh definitely agree. I'm not dissing the site itself which is marvellous so far, but the technology churn is very concerning.
Allowing staff to use "language of the week" can be an excellent way of retaining staff when you can't pay them market rates.
Maybe. I don't know how huge that project is. Maybe the technologies that have been chosen are the best for the tasks at hand. Maybe some of those technologies are only a minor part, not in a critical area; maybe it is some kind of technology research project. Maybe it is a kind of managerial pride: "let's write about how many lines our source code contains". As well, I guess, you don't assume that software developers should learn new technologies on their own time (while I personally love to do that).
You'd rather £900-£1000-per-diem consultants from Capita were doing a worse job?
That's a silly argument. I'd rather not be pissing an estimated £6400 plus maintenance up the wall on a project which could have been realised with off-the-shelf technology.
Just because the organisation uses open-source software doesn't excuse them from public and professional scrutiny.
An MP spending £1645 on a duck house under expenses got a lot more attention than this little bit of waste. Just redressing the balance.
£6400 wouldn't even pay for the coffee and biscuits on the typical UK government IT project.
Nearly 4 duck houses though!
And not even a full week's billing for a Big Four management consultant.
Anything can be realised with off-the-shelf technology plus time. Your posited solution further down the page would also have involved work in setting up the pool configuration and switching, so let's not pretend that it comes at zero cost.
It will come at proportionately less cost, initially and over time, though - which is the issue.
Efficiency is a major problem in government. Government should have a low financial impact on society where possible. A government entering a market with an already solved problem is wasteful at best. If they'd contributed time to Nginx or mod_proxy then they would have a net positive social effect.
But they didn't. They built an inferior product at great cost to the taxpayer.
> A government entering a market with an already solved problem is wasteful at best. If they'd contributed time to Nginx or mod_proxy then they would have a net positive social effect.
Is this some sort of Markov-chained free-market/El-Reg-at-its-most-prolix/libertarian experiment in satire? Well played!
That made me laugh and I've upvoted you for that, but no, it's serious.
Various governments are already throwing money at The Document Foundation (France and Germany come to mind) to support LibreOffice, so why shouldn't we throw a few quid at nginx/apache?
Or should we go and write our own office suite?
Depends on whether they have open-source support contracts or not. They can cost just as much as the commercial equivalent.
Presumably, by having a list of different technologies, they are taking the approach of the right tool for the job. The obvious side effect is that engineers who work on this need to be polyglots rather than specialists in any one tech. I certainly categorise this as a good thing, as a developer and as a stakeholder (UK resident and user of the site).
Actually they're using the wrong tools for the job because they're having to rewrite bits regularly in different tech. There is obviously no evaluation taking place.
Establishing technical standards and homogeneity is important on projects of this scale.
I think they established the reasons for it fairly well in the blog post, and even better in the comments elsewhere. I'd say they are the right tools for the job.
Maybe. Maybe not. The proof of the pudding is in the eating: it's immeasurably better than before and a lot cheaper to run. Until shown otherwise, I am of the opinion that they are making the right decisions.
In their defence: having worked on sorting out problems on old, complex, large-scale publishers' sites, if you're trying to make a collection of services hang together in a seamless way, you do need some sort of front end to tie it all together, and it needs to be usable by non-gurus.
Though they do seem to be using all the new toys - maybe a bit more thought should go into which technology to use, rather than jumping from technology to technology.
Yep I understand that.
We publish several huge legacy applications, several huge new applications, several integrations, public web site, documentation, online support. A mere 80-100 million HTTP transactions a day.
Not once have we built something to do it all.
Prove it. Who do you work for?
I don't need to add leverage to the conversation by mentioning my employer. My arguments have merit on their own.
Not impressed. Seriously, this is just awful.
People make changes to their infrastructure. The govuk team are at least brave enough to blog about it, and offer the code up for inspection. Can you post links to similar public contributions you have made?
I'm sure if you joined the team you'd wave your magic wand and everyone would fall in line, and things would be beautiful. Right.
Oh, and I'm sure the golang community are really, really pleased that "in fact, [you] like it" - very charitable of you!
I can't wait to see what you've been up to.
The UK government is a 19th Century 'filing cabinet bandwidth' information architecture. The IT systems as always reflect the organisation structure that produces them. The data redundancy and unnecessary domain separation is frustrating and archaic.
Uk.gov is just the cherry on the top of an old and creaky enterprise design that hasn't kept pace with changes to its mission or the world around it.
Actually the stuff GDS is doing starts from a clean slate and does away with much of what you are complaining about, in fact it is intended to fix a lot of it!
If GDS was the IT department for the whole of government and the departments became a team of policy analysts and ministerial support then you'd be right.
Overwhelmingly, the core information management and processing is on the local IT infrastructure of each department. GDS is unifying the web presence and some of the citizen identity systems; it's not fundamentally taking over the core information systems of state.
EDIT: To be clear, I think something like GDS should take a long hard look at the core information systems of state and start again with a clean slate breaking the technology and the data out of the policy silos.
I don't disagree with you, but change takes time.
Getting publications together in one place will be a great start (which is being helped by the public bodies transition).
I'm not sure why you have such a negative attitude towards gov.uk; the site seems refreshingly good at its job. It's simple, clear and easy to use. As long as they can execute fast, what does the tech stack matter?
Try doing something with it (eg apply for a tax disc). At that point it always hands back control to the old site(s) it "replaced".
I think it's fair to say it's going to take a little while to migrate all of the systems over. Give credit where it's due, they're doing a good job with the new site and I'm sure that when it arrives the replacement for the tax disc application will be of a similar standard.
I think the site is good. I have no problem with the technology ultimately but to put a fine point on it, I'm not paying the government lots of money to invent their own infrastructure components when perfectly good ones exist already.
The same applies to people working under me.
Yes, but this time they have really found the silver bullet (honestly!).
... something something blub something something agile rockstar ...
In this case NIH resulted in a space-optimized algorithm, rather than a time-optimized one. So well done there, kids.
A lot of the comments here seem to be missing something quite important: Scala and Go are not similar languages.
Scala is a large and complex language, encompassing multiple programming idioms. Teams programming Scala (as with C++ and many other "large" languages) often have difficulty nailing down what portion of the language they are going to use - anything from "Java without semicolons" all the way through to idiomatic, functional Scala.
Go is a relatively small, simple language. You could easily learn Go's basic syntax and semantics in an afternoon or two. Simplicity and suitability for programmers is an explicit design goal.
Switching away from Scala due to lack of experience while simultaneously picking up Go is not as illogical as everyone seems to be implying.
This, whilst likely true, is not a self-evident fact.
I think the argument in the blog post is weak because it does not quantify what makes Go more suitable for learning than say, Erlang or Clojure, which are dismissed outright, before moving on to discuss some of the advantages of Go in depth.
You're right, but I think the point you're missing (and I don't mean this in a rude way) is that they had a system built in Scala, somehow, but now no one on the team knows Scala.
They lack architectural oversight; they should really be using one(ish) language rather than a big mix of languages, so that this doesn't occur again when no one on the team knows Go because the project lead's taken a new job.
Retention. Also, I'd suggest that your average dev will learn enough Go to work on this router quicker than they'll learn Scala; I prefer Scala myself, though! It's a balance between the proper tech choice, correct management choices moving forward, and dealing with past management mistakes (like letting someone build the router in Scala when the rest of the team didn't know it).
A crappy PHP dev will learn Go a hell of a lot quicker than Scala, IMO. A safer choice, for their team and position, IMO.
If Scala and Go were the only two technologies mentioned you would have a point. But this is a monstrous, fad-bloated lineup of technologies they have there. Integrating this mess is way, way more complex than any individual language.
There are good arguments for trying new technologies, but this particular instance seems to be a very good example of how such a strategy can go terribly wrong.
[Edit] On second thought, I might be jumping to conclusions, as I don't know anything about how successful they are in actually achieving their goals, doing it on budget, or how fun it is to work there. Their actual approach may be less confused than the blog post makes it sound.
"Big is beautiful" versus "small is beautiful". Some people like Ruby, others like Python.
Hell, some people like Perl! :p
That seems weird to me:
* No one having a deep knowledge of Scala is mentioned as a problem, but they will learn Go.
* They rule out a language because of its syntax.
I like Erlang, but in shops I've worked with that have your "average" dev in them, its syntax and semantics would be something they'd struggle with (if it ain't PHP, they'll struggle with it...). Unfortunately I don't blame them. That said, I think if they truly needed to build a somewhat stateful concurrent router, Erlang is a better choice (solely because it's battle-tested in those highly concurrent workloads). Picking tech for big things like this, when you have to balance a team's knowledge with the right tool, is a hard problem in my experience.
I agree, but if you want to learn something today then Go would fit the task at hand better and be easier to learn. The cost involved is really good value too, given the training needed to learn Scala over Go.
That, and I suspect there were some wishlists of what people wanted to do in the mix. Like COBOL: still used, but people just do not want to learn it, as it is deemed so last-century in comparison to, say, Java, though they are altogether different fish. A bit like the difference between a proxy and a router - it's about what sounds better on a CV.
I suspect that from a maintenance perspective the Go code will be much easier to support and grow as and when needed, as well as to scale.
Still, it is most encouraging to see a language as young as Go get so much respect and use so quickly in an environment that has, to many in the past and present, been slow to adopt new technology. But there again, many government departments still have XP, albeit used as a fancy terminal to some old mainframe application that just ticks along.
"Go" is not an acronym.
I know, sorry got carried away with that caps key.
On the GitHub page it says the router sits between nginx and Varnish and does:
* Reverse proxy, forwarding requests to and serving responses from multiple backend servers on a single domain.
* Redirector, serving HTTP 301 and 302 redirects to new URLs.
* Gone responder, serving HTTP 410 responses for resources that used to but no longer exist.
Does anyone know why they have this as a separate piece of software instead of getting nginx to do it?
You should read the first article in the series - https://gdstechnology.blog.gov.uk/2013/12/05/building-a-new-...
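For the curious, those three behaviours fit in a very small handler. A rough Go sketch (my own illustration, not the actual router code - the real thing is on GitHub; the hostnames and paths here are made up):

    package main

    import (
        "log"
        "net/http"
        "net/http/httputil"
        "net/url"
    )

    // A route is one of: proxy to a backend, redirect, or gone.
    type route struct {
        kind     string   // "backend", "redirect" or "gone"
        backend  *url.URL // target, when kind == "backend"
        location string   // new URL, when kind == "redirect"
        status   int      // 301 or 302, when kind == "redirect"
    }

    func handler(routes map[string]route) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
            rt, ok := routes[req.URL.Path]
            if !ok {
                http.NotFound(w, req)
                return
            }
            switch rt.kind {
            case "backend": // reverse proxy to one of several backends
                httputil.NewSingleHostReverseProxy(rt.backend).ServeHTTP(w, req)
            case "redirect": // serve a 301/302 to the new URL
                http.Redirect(w, req, rt.location, rt.status)
            case "gone": // 410 for content that used to exist
                http.Error(w, "410 Gone", http.StatusGone)
            }
        })
    }

    func main() {
        backend, _ := url.Parse("http://frontend.internal:8080")
        routes := map[string]route{
            "/vat-rates": {kind: "backend", backend: backend},
            "/old-page":  {kind: "redirect", location: "/new-page", status: http.StatusMovedPermanently},
            "/dead-page": {kind: "gone"},
        }
        log.Fatal(http.ListenAndServe(":3000", handler(routes)))
    }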
I actually read that and thought nginx/mod_proxy.
Looks like they didn't actually try it or know about it which is worrying.
Either that, or it was more interesting to do it in Go - which is not a valid reason on a taxpayer-funded site.
Dynamic route updating with nginx is not fun.
No but updating static routes via puppet/ansible and reloading is a piece of piss.
If your site is volatile enough that the routing needs to change that often then there is something wrong either with your information architecture or your development process.
Or maybe you're trying to deal with a huge legacy content migration of several hundred different government agencies and want to deploy several times a day without waiting for 10 minutes for nginx to load 1MB of config each time?
2 front end nginx boxes s01-s02:
1. take s01 out of pool.
2. migrate config on s01 to new config.
3. put s01 back in pool.
4. take s02 out of pool.
5. migrate config on s02 to new config.
6. put s02 back in pool.
Ansible can handle this quite happily. Scales up to any number of boxes. For an (n-1)/n capacity reduction during deployment.
I've got a 450k apache config somewhere that takes <1 second to reload so I don't think that's a major issue.
Also if you have THAT much config, something is wrong with your information architecture (see my other points).
Or the fact that having multi-megabyte configuration files in the first place is in itself somewhat of a horror...
The back end to GOV.UK is designed to be very configurable by non-technical users, and many of its routes are dynamic. While Puppet could be used, the process of automating those edits, rolling them out to all the servers in batches so that no downtime is experienced during restarts, configuring load balancers to be aware of this, etc, etc, has lots of points that could fail.
It really sounds like a dynamic front-end that is aware of the routes would be a much better idea, far less that can go wrong.
Allowing non-technical users access to routes is a disaster waiting to happen. The whole thing should have a QA process around it. I'm genuinely surprised it's being run like this.
The biggest point of failure is always humans and you're basically handing them a gun there.
Something is wrong with this statement:
"Scala is great for performance, but quite bad at resource usage"
Maybe it's directly related to the next statement: "No-one in the core GOV.UK team had a deep knowledge of Scala, and particularly how the old router worked"
But the first statement, literally, does not make sense on its own.
> and particularly how the old router worked
Oh, then it's OK. It's always reasonable and a safe bet to rewrite something instead of understanding how something works, because after all, your predecessors were a dumb bunch and you (using Go! how cool is that!?) obviously know much more.
> But the first statement, literally, does not make sense on its own.
Why? There are other resources than CPU time, e.g. memory.
"Computer performance is characterized by the amount of useful work accomplished by a computer system compared to the time and resources used." --wikipedia
Is there a different definition I'm missing? I think a car can be "fast" and take too much fuel, but (in my book) a programming language cannot be "great for performance" while being bad at resource usage.
So they are ditching Scala because no one on the team is an expert, while replacing it with a language that everyone has to learn?!?
I asked on Twitter, and got a reasonable response: investing in learning Scala is a much longer-term thing than learning Go (which is, in their experience, a much quicker thing to do).
Because you always need to use all the features the language provides. Simpler -> limited.
C is much simpler than Scala, is it more limited?
In its expressive ability - definitely. In terms of implementing stuff, anything that is Turing-complete could be used.
I'd far, far rather relearn Go than relearn Scala...
Given that Scala is far more complex, learning Go from scratch is probably a lesser task.
How did they come to choose Scalatra if nobody knows how to use Scala?
As they are using nginx already, I'm interested to know whether they ruled out OpenResty or just didn't consider it.
I'm a big fan of Go and we use it internally, but I too wonder why nginx + Lua wouldn't be the chosen path.
If it's good enough for Cloudflare it probably is good enough here.
Besides, Varnish also would have excelled (yes, I read the first post in the series) if they had just stored the configuration in a saner format, then output the required VCL whenever the config changed and reloaded it on the fly - a rough sketch of what I mean is below.
Both of the technologies I would've picked (nginx and Varnish) were in their stack.
I love working in Go, but reading this feels like a reinvention of the wheel. Go would be my choice only if these other things didn't exist already and you had to start from scratch.
The biggest pain identified was the writing, maintaining and loading of the router config. OpenResty could've solved that.
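To flesh out the Varnish suggestion above - purely illustrative, not anything GDS ran, and the file path and VCL details are my assumptions - you'd keep redirects in a sane store, render VCL from a template whenever they change, and hot-load it with varnishadm (vcl.load/vcl.use need no daemon restart). This sketch uses the old Varnish 3 idiom of a custom error status that a vcl_error block (not shown) turns into a 301:

    package main

    import (
        "fmt"
        "os"
        "os/exec"
        "text/template"
        "time"
    )

    // Redirect as stored in whatever saner format you prefer
    // (database row, JSON, CSV...).
    type Redirect struct {
        From, To string
    }

    // Render each redirect as a VCL condition; error 751 is a marker
    // that vcl_error would turn into a 301 to req.http.Location.
    var vclTmpl = template.Must(template.New("vcl").Parse(`sub vcl_recv {
    {{range .}}    if (req.url == "{{.From}}") {
            set req.http.Location = "{{.To}}";
            error 751;
        }
    {{end}}}
    `))

    // regenerate writes fresh VCL and tells the running Varnish to
    // compile and switch to it, without restarting the daemon.
    func regenerate(redirects []Redirect) error {
        path := "/etc/varnish/redirects.vcl" // assumed location
        f, err := os.Create(path)
        if err != nil {
            return err
        }
        if err := vclTmpl.Execute(f, redirects); err != nil {
            f.Close()
            return err
        }
        f.Close()
        // vcl.load needs a unique name per load, hence the timestamp.
        name := fmt.Sprintf("redirects_%d", time.Now().Unix())
        if err := exec.Command("varnishadm", "vcl.load", name, path).Run(); err != nil {
            return err
        }
        return exec.Command("varnishadm", "vcl.use", name).Run()
    }

    func main() {
        if err := regenerate([]Redirect{{From: "/old", To: "/new"}}); err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
    }

Wire that up to whatever the publishing tools write to and you get self-service redirects on the kit they already had.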
OpenResty and Lua are great, but Go is a more generally applicable language for building web services. So, taken in that context, Go makes more sense if we look at it as a learning exercise as well as a new part of the stack.
Scala wasn't a learning exercise then?
"Erlang, Go, Scala, and Clojure are all good fits and represent the current state of the art." - why is the choice so limited? It seems to be quite JVM-focused as well, I would consider F#/Nemerle as well. Erlang was invented in 1986 by the way - so a safe bet rather than "state of the art".
F# and Nemerle are both based on the CLR, and unfortunately the performance of Mono is pretty bad, and not really suitable yet for high performance use-cases like this.
Also, I'd say Erlang is state of the art, even though it is so old (Haskell too), but I wouldn't call it an entirely safe bet, because it requires quite a different style of programming and experience, so it might not work well for certain developers.
Isn't Erlang just a (semi-)functional language with Actor Model baked in? I might be wrong but it seems that you just need any dev familiar with message queues and functional style.