Ask HN: How to deal with multi-tenant SaaS database structure?
Hello HN, I'm building a SaaS product. Each tenant can have their own database structure/fields while many tables are common between the tenants. For example table A is common for all of them while table B might have different fields upon organization/tenant's need, in other words we have table B for all orgs with different fields.
I'm thinking of shared-codebase and shared-database paradigm for it's simplicity(no synchronization needed). Have you encountered this situation? Is it possible/and how to handle this objective with single database for all tenants? Further, I'm using PostgreSQL database.
Thanks in advance. You're heading the right way. A correct design is not simple, but it can be done. You might be able to buy a boilerplate that'll work as a starting point. Separate database will only work if you'll have a handful of customers, and even then a pain. Possible to do with one database, yes - Add JSON columns to your tables to hold tenant-specific metadata. Deal with it in your app layer. But I'd re-think whether a single database is truly the right answer. What is driving you to that solution? The main reason: getting rid of synchronization/migration process for many databases. What do you have to keep synchronized across tenants? And is consistency across tenants really a hard requirement? Would an eventual consistency model work? How does migration factor in? Do you mean migrating one database vs multiple databases? What exactly would you be migrating? And if it’s written in code, migrating individual tenants one at a time, at your pace or when a particular tenant is ready to migrate, might be a feature and not a bug. If one tenant misbehaves and takes down your database, how will your other tenants feel? What misbehaving do you mean? If traffic is what you mean, it's okay since at the first stage of the software we have time to test/scale accordingly. Therefore in that stage it might make sense to change the architecture to per-tenant database. One tenant sends you a flood of traffic, that you might not be scaled to handle. Let’s say your application servers are scaled for this traffic, but your single, shared database is not. All that traffic could cause an outage on your database. If this happens, the impact won’t necessarily be on one tenants but all tenants. There are ways you can mitigate this. Strong isolation is one. Throttling tenant calls could be another. We don’t know your specific use case enough, but it’s something to think about now. This is generally referred to as the noisy neighbor problem. https://docs.aws.amazon.com/wellarchitected/latest/saas-lens... As you mentioned there are easy workaround for this issue on both application and infrastructure level:
implementing rate limiters, monitoring/alerting systems for DB resource usages, putting quotas for tenants, etc.
After all a postgreSQL instance in 16GB ram and 8core cpu can handle pretty much of the traffic for most SaaS products at their first stage. How many tenants and how many clients per tenant will you support? Also how much data will each tenant consume? It differs for each tenants. Preferably there should be no limit for the number of orgs/tenants. The traffic for each tenant is also varying but it's safe to consider 500req/s. postgresql row level security excels on this right