StateService: Automating recovery of third-party services after a major outage
code.fb.com
Huh? I feel like this article is missing an example of what kind of operation a Chef run would do that requires coordination between hosts. It's incredibly vague right now; I don't feel like I have enough information to determine whether it's even a good idea for this to happen in Chef.
Even showing what the two curl commands would be in the small example there would be helpful.
They mention it's used for third-party software, so I can take an educated guess.
It might be, for example, that some third-party app doesn't retry database connections. So you have to start the database server first and wait until it's accepting connections before you start the third-party app.
Just a guess, though. You're right... it should have been less vague.
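To make that guess concrete, here is a minimal Python sketch of the ordering constraint: poll until the dependency accepts TCP connections, and only then start the app. This is not how StateService or Chef does it; `wait_for_port` and `start_after_dependency` are hypothetical names made up for illustration, and the host, port, and timeouts are assumptions.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: it only checks that something is listening, not that
    the database is actually ready to serve queries.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if the port refuses or times out.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


def start_after_dependency(host, port, start_app, timeout=30.0):
    """Call start_app() only after the dependency at host:port accepts connections."""
    if not wait_for_port(host, port, timeout=timeout):
        raise TimeoutError(
            f"{host}:{port} did not accept connections within {timeout}s"
        )
    return start_app()
```

On a single host this is just a retry loop. The coordination problem the article seems to be about starts when the database and the app live on different hosts that each run their own Chef, which is where a shared state store would come in.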
Nice. A few years ago we built something along these lines at Technicolor: https://medium.com/@rad_g/the-ultimate-reactive-infrastructu... I can imagine that our solution wouldn't work at Facebook scale, but for us it was a massive timesaver, in 300 lines of Ruby code.