Software Architecture Draft with Request Duplicator and Failover

2 points by Exadra37 10 months ago · 0 comments · 4 min read

I am looking for opinions on my approach in this architecture draft. Please read carefully through to the end, and don't forget to look at the diagram linked at the end of this post.

For context, I am building the BEAM Devs app for the Elixir, Erlang and Gleam community, and instead of building tech debt from day one to ship faster, I want to start with a solid base skeleton that enables me to progressively build a resilient and robust architecture from the start. It's much easier to begin this way than to add it as an afterthought.

In my first two UK roles, the software architecture always included a failover system: an independent, exact copy of production running in another cloud or in an on-premises datacenter. This differs from redundancy within the same provider via availability zones or similar mechanisms.

In this approach, the switch from production to the failover happens by manually changing the server IP in the DNS record, which has a very short TTL. If the cloud provider is having an outage, or there is a catastrophic production incident that cannot be solved or rolled back quickly, we can switch the DNS to point at the failover system. Alternatively, clients can switch automatically to the failover when production doesn't respond within a certain timeout.
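A minimal sketch of the client-side variant, assuming the client knows an ordered list of endpoints (production first, failover second). The names `call_with_failover`, `send` and `AllEndpointsFailed` are hypothetical, and `send` stands in for whatever HTTP client is actually used:

```python
from typing import Callable, List, Sequence, Tuple


class AllEndpointsFailed(Exception):
    """Raised when no endpoint answered within the timeout."""


def call_with_failover(
    endpoints: Sequence[str],
    send: Callable[[str, float], str],
    timeout_s: float = 2.0,
) -> str:
    """Try each endpoint in order and return the first successful response.

    `send(endpoint, timeout_s)` is an assumed transport function that raises
    TimeoutError or ConnectionError when the endpoint does not respond.
    """
    errors: List[Tuple[str, Exception]] = []
    for endpoint in endpoints:
        try:
            return send(endpoint, timeout_s)
        except (TimeoutError, ConnectionError) as exc:
            # Remember the failure and fall through to the next endpoint.
            errors.append((endpoint, exc))
    raise AllEndpointsFailed(errors)
```

Keeping the endpoint order explicit means production is always tried first, so the failover only sees traffic when production is unreachable or too slow.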

Failover isn't the same as blue-green deployments. While a blue-green deployment replaces an older version with a new one, failover runs continuously alongside production. Ideally, both strategies should be used together when possible. Failover also isn't the same as using availability zones for redundancy within a single deployment of your application. A failover is a separate deployment of your application running in parallel on another hosting/cloud provider and datacenter. If you can afford to run both, then it's ideal. We did it at Approov, one of my previous roles. If you run the failover in the same cloud provider, you immediately defeat its purpose. For example, AWS has already suffered some severe incidents that affected its availability worldwide.

In my second role, we had a request duplicator. This tool allowed stress testing of new releases by amplifying live requests (e.g., x2, x4) to find breaking points. It also helped validate major architecture changes before going live by running them in parallel with production.
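The amplification idea can be sketched roughly as follows. This is not how the tool from that role was built; it is an illustration under assumptions, with `duplicate_request` and `send_to_candidate` as hypothetical names and requests modelled as plain dictionaries:

```python
from typing import Callable, Dict, List


def duplicate_request(
    request: Dict[str, object],
    factor: int,
    send_to_candidate: Callable[[Dict[str, object]], None],
) -> List[Exception]:
    """Replay one live request `factor` times against the system under test.

    Failures are collected instead of raised, so a broken candidate release
    can never affect the response that production gives to the real user.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    failures: List[Exception] = []
    for replay_index in range(factor):
        # Tag each copy so the candidate's logs can tell the replays apart.
        replayed = {**request, "replay_index": replay_index}
        try:
            send_to_candidate(replayed)
        except Exception as exc:
            failures.append(exc)
    return failures
```

With `factor=2` or `factor=4` this corresponds to the x2 and x4 amplification mentioned above; the returned list of failures is one simple signal for where the breaking point is.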

In that role, the request duplicator relied only on production responses, but in my case it could be coded to return the first response to arrive from either production or failover. For strong consistency guarantees, it could instead wait for both before returning a response, backed by a timeout (TTL) and a request failure-handling strategy.
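Both response strategies can be sketched with `asyncio`. This is only an illustration under assumptions: the backends are modelled as async callables, and the names `first_response`, `both_responses`, `NoBackendResponded` and `InconsistentResponses` are hypothetical. Comparing raw responses for equality is also a simplification of what "same state" would mean in a real system:

```python
import asyncio
from typing import Awaitable, Callable, Sequence

Backend = Callable[[dict], Awaitable[str]]


class NoBackendResponded(Exception):
    """No backend returned a successful response before the deadline."""


class InconsistentResponses(Exception):
    """The backends answered, but with different responses."""


async def first_response(request: dict, backends: Sequence[Backend], timeout_s: float) -> str:
    """Return the first successful response from any backend."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout_s
    pending = {asyncio.ensure_future(backend(request)) for backend in backends}
    try:
        while pending:
            remaining = deadline - loop.time()
            if remaining <= 0:
                break
            done, pending = await asyncio.wait(
                pending, timeout=remaining, return_when=asyncio.FIRST_COMPLETED
            )
            for task in done:
                # A failed backend is skipped; keep waiting for the others.
                if task.exception() is None:
                    return task.result()
        raise NoBackendResponded()
    finally:
        for task in pending:
            task.cancel()


async def both_responses(request: dict, backends: Sequence[Backend], timeout_s: float) -> str:
    """Wait for every backend and only answer if they all agree."""
    results = await asyncio.wait_for(
        asyncio.gather(*(backend(request) for backend in backends)),
        timeout=timeout_s,
    )
    if len(set(results)) != 1:
        raise InconsistentResponses(results)
    return results[0]
```

The first function favours latency and availability, the second favours consistency. What to do when `both_responses` times out or detects a mismatch is exactly the failure-handling strategy that still has to be decided.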

Key Consideration: Applications using this approach must ensure side effects (e.g., emails, billing) only occur in production. A flag-based system is required to enforce this.
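One possible shape for such a flag, as a sketch only. The environment variable `APP_ROLE` and the helpers `is_production` and `run_side_effect` are hypothetical names, not part of any existing system:

```python
import os
from typing import Callable, List, Mapping, Optional


def is_production(env: Optional[Mapping[str, str]] = None) -> bool:
    """Read the role flag; anything other than "production" is treated as a copy.

    Defaulting to non-production means a misconfigured node stays silent
    rather than sending duplicate emails or charges.
    """
    env = os.environ if env is None else env
    return env.get("APP_ROLE", "shadow") == "production"


def run_side_effect(
    name: str,
    action: Callable[[], None],
    *,
    production: bool,
    skipped: List[str],
) -> bool:
    """Run `action` only in production; otherwise record that it was skipped."""
    if production:
        action()
        return True
    skipped.append(name)
    return False
```

When the failover is promoted during an incident, the flag has to be flipped as part of the switch, otherwise users on the failover would never receive emails or be billed.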

Bear in mind that I wasn't in the DevOps team, nor did I have input on the architecture. Thus, the diagram is trying to reflect what I was aware of and can recall.

I am thinking of also using this approach for BEAM Devs, as shown in the diagram. However, in my case, I have a CRUD application from the user's perspective, whereas in my previous roles the applications were read-only for external users, and CRUD only internally, driven by background jobs or by request metadata collection and analytics.

As with everything in software architecture, it's about trade-offs. This approach has some too, such as the added complexity of ensuring that no side effects occur in the non-production systems and of guaranteeing that production and failover are always in the same state (strong consistency).

So, my challenge is to be able to use the failover and request duplicator approach in conjunction with blue-green deployments and keep strong consistency guarantees for my CRUD application.

What would you do differently? Do you have any questions?

Link to the architecture diagram draft: https://beamdevs.com/images/architecture-draft.jpg

> NOTE: If this project resonates with you then you may want to visit beamdevs(.)com to subscribe for early access.
