We Are Saving Costs by Dumping AWS Cloud
Amazon is using the same trick with AWS pricing as cellphone providers: they rely on bad math capabilities of their clients. The cost of a single item, in the case of AWS one hour of computing, is incredibly low, a fraction of a cent, but most people out there are not able to do simple math like multiplication (by 720, the number of hours in a month) or summation (with the costs of the other services you implicitly have to use).
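To make the multiplication and summation concrete, here is a minimal sketch in Python. All rates and service names below are hypothetical placeholders chosen for illustration, not actual AWS prices; only the 720 hours per month comes from the text above.

```python
HOURS_PER_MONTH = 720  # the figure used in the text above

# Hypothetical hourly rates in USD (placeholders, NOT real AWS prices).
hourly_rates = {
    "managed_kubernetes_control_plane": 0.10,
    "compute_node": 0.08,
    "load_balancer": 0.025,
}

# Hypothetical flat monthly fees in USD (also placeholders).
monthly_fees = {
    "storage": 10.00,
}


def monthly_cost(hourly_rates, monthly_fees, hours=HOURS_PER_MONTH):
    """Multiply every hourly rate by the hours in a month, then add flat fees."""
    hourly_total = sum(rate * hours for rate in hourly_rates.values())
    return hourly_total + sum(monthly_fees.values())


total = monthly_cost(hourly_rates, monthly_fees)
print(f"Estimated monthly total: ${total:.2f}")
```

With these made-up numbers, each individual rate is cents per hour, yet the monthly sum lands in three figures. Swap in the rates from your own bill or the provider's price list to get a real estimate.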
When hosting a simple one-node Kubernetes cluster the costs easily reach three figures per month (the fee for EKS, the fee for the EC2 node, the fee for the inbound load balancer, the fee for storage, ...), compared to a root server for $10 that easily outperforms a $60 EC2 instance that's really overpriced.

If you've got funding (or immediate profitability due to traction) and a hockey stick growth rate, the cloud makes sense. It's just the cost of doing business to support rapid scaling and developer velocity while you try to capture as much of the market as fast as possible. As long as your profit stays ahead of your cloud costs, mission accomplished. If the above does not apply, of course you're going to be better off using a combination of Cloudflare (CDN, networking), Backblaze (object store, also has an S3 compatibility layer), and either dedicated server providers (OVH) or VPS providers (DigitalOcean). Perhaps even colocation if you've got the experience in-house (Stack Overflow and Wikipedia, for example). AWS and other cloud providers are designed for the price insensitive, who prefer having a single vendor, more abstraction away from the metal, or require support for bursty workloads. Shop around and model your run rate based on expected workloads. There is no best or worst solution, only a scale of solutions for your use case(s), ranging from suboptimal to optimal.

I've talked* to a number of bootstrapped and non-profit companies who are all-in on cloud and I think there are a few use cases you're missing beyond just "we value dev velocity over cost savings." The biggest one is ease of scaling vs something like colocation. I talked to a non-profit with incredibly spiky traffic based around whenever they get mentioned in the news. Since every dollar matters for them, being able to scale down to a minimal infrastructure between spikes is key to their survival.
Another company I talked to has traffic that's reliably 8x larger during US business hours vs night time and uses both autoscaling and on-demand services (DynamoDB, Aurora Serverless) to pay ~1/3 of what they'd have to if they needed to keep that 8x capacity online all the time. While I agree that the velocity/cost tradeoff is one of the better reasons to go cloud, it's far from the only one, and it's certainly not the case that cloud is only for the price insensitive. If nothing else, the proliferation of cloud cost management tools shows plenty of companies care about their cloud spend.

* I work on an AWS cost management tool and am doing a lot of customer research interviews this month.

> I talked to a non-profit with incredibly spiky traffic based around whenever they get mentioned in the news. Since every dollar matters for them, being able to scale down to a minimal infrastructure between spikes is key to their survival.

Frankly, I don't understand this point. For €40 a month with Hetzner you'll get an i7 with 64GB RAM and 2x512 GB SSD that will easily handle any traffic spike (we're talking about a website, right?). €40 is not a lot of money, even for a non-profit, and especially for one experiencing a huge spike after being mentioned in the news. On the contrary, I would advise against AWS here as the costs are completely unpredictable. You can be charged any amount, and the costs only partly depend on you. Developers have been demanding hard caps for over a decade to no avail.

My comment is not meant to be a comprehensive analysis of cost/benefit for cloud use cases, but only examples for a thread with a half-life of ~24 hours. Please consider publishing your research findings in the future if it doesn't put you at a competitive disadvantage. A rising tide lifts all boats, and cloud spend tracking and modeling is a pain in the ass (as your success demonstrates).

This advice holds if you have stock-based compensation or a bonus based on profitability.
Here, wasting money on cloud directly impacts profitability (and by extension, the worth of your stock). If you are a dev who's paid to churn out features, and your compensation isn't stock-based, just use whatever allows you the greater flexibility and velocity.

Cloud vendors also help by keeping all the network chatter internal, so you are not paying ingress/egress fees to connect to your external object store, to your external managed database, to your ... The bandwidth costs can be a significant chunk of hosting. Thankfully, I've never had a service popular enough to make bandwidth a meaningful cost.

I assume you're using Backblaze to host static assets you're rendering from your compute, and using Cloudflare to front and cache those static assets. You'd also use Backblaze for backups. Most network chatter would remain within your compute, and you'd want Redis, Mongo, Elastic, or Postgres as close to the compute (maybe k8s, maybe VMs, up to you) as possible, while shipping the snapshots to your object store. To your point about bandwidth costs, I recommend Backblaze and Cloudflare because they have an arrangement in that regard [1]. Reliably decouple whenever possible.
[1] https://www.cloudflare.com/bandwidth-alliance/ (Control-F "partners")

For a couple of projects I use the B2/Cloudflare "free bandwidth", but I have to pay egress from my droplet to B2 on user uploads. It's well within the 1TB that you get for a $5 droplet, and smaller than a rounding error against my total bandwidth bucket of 9TB. So as I said, I'm fine.

Thanks for sharing your use case. I will submit a feature request with Backblaze to support request signing, so that users can upload directly to the B2 object store, similar to what S3 supports [1].
[1] https://docs.aws.amazon.com/AmazonS3/latest/dev/PresignedUrl...

Not a big AWS fan or anything, but...

> they rely on bad math capabilities of their clients

No, they don't. Showing hourly prices makes sense for hourly services.
They also provide a detailed cost estimator, because the arithmetic gets pretty detailed: https://calculator.aws

> compared to a root server for $10 that easily outperforms a $60 EC2 instance that's really overpriced.

Let's say a developer costs $75/hr. If AWS saves my dev team 10 hrs/mo., then I'm willing to pay a $750 premium for it. Developer costs are also unpredictable. Turnover will cause spikes in my costs, for example. Server costs are predictable. If I need to pay AWS more to save my developers time, I can always do that. I can't always just throw another developer onto my team when I need one. The only people who are optimizing for three-figure costs every month are either: 1) paying very, very low salaries to their developers, or 2) not thinking about their biggest cost, which is developer time.

> When hosting a simple one-node Kubernetes cluster

Who would do this and why? Why would you have a load balancer in front of a single server?

> The only people who are optimizing for three-figure costs every month are either:

In our case, we are doing the same for around 50 clients, so it adds up :-)

> Who would do this and why? Why would you have a load balancer in front of a single server?

AFAIK, for publishing something to the outside world from EKS, that's the only way, even for single-node clusters.

> In our case, we are doing the same for around 50 clients, so it adds up :-)

Wow, so AWS is actually a better value for you than for most people. Being able to easily script and deploy your infra is hugely valuable when you have so much overhead. We heavily use Elastic Beanstalk for dozens of running services, and it's amazing. We don't think about infra at all.

> for publishing something to the outside world from EKS

But why are you using Kubernetes at all? What problem is it solving for you?
See also: https://endler.dev/2019/maybe-you-dont-need-kubernetes/

We built software (Botium Box - https://botium.ai) mainly for on-premise use, and we delivered it for Kubernetes, OpenShift, and Docker. We added a hosted plan later and thought it would be a good idea to just use managed Kubernetes for this offer, as it didn't require many code changes. We have to support multiple clouds (Azure and AWS), but with Rancher that is really easy: setting up new clusters, deploying new services, restarting, logging etc. But now that we have built up container technology know-how, we are transitioning every service where we don't need the scaling capabilities of Kubernetes to plain old docker-compose on bare metal. Thanks for the interesting article, didn't know about Nomad and will try it for sure.

What I think most people are not getting about cloud providers is that you can surely get a better deal for your money if you go to bare metal, but what you will miss out on is a lot of services that come with AWS and that you would end up provisioning and maintaining yourself (e.g. S3, CloudFront). If you are a large business that can afford the upfront costs of setting up those services, the cloud might not be the best place, but for pretty much everyone else I think you get a very simplified environment to start your business.

Fully agree, that's how we started. Getting some servers up and running is quick and easy, but as soon as you can foresee what computing power you will need in the next months and years, bare metal is surely the better choice. If you can deal with the technical stuff, of course.

During the COVID-19 pandemic I found myself doing nothing interesting during the weekends, so I decided to offer my AWS knowledge to medium-sized companies to help with their monthly cloud compute expenses. I thought I would help them save a couple of thousand dollars on their annual bills, but so far I have already shaved more than $100k off the bills of three different companies.
I am quite excited about this side gig, and I hope to be able to continue helping other companies this year too.

It's how the whole 'cloud' operates, and it only adds up. The cloud is a complete scam in costs, especially if you're using it for very high compute tasks, deep learning, source control management or CI/CD. Another example of this scam is Firebase (aka Google Cloud). It is popular with mobile devs, who not only risk getting locked in but are also offered limited functionality (Firestore) that charges for reads, which optimises for a huge bill if one is not careful. For servers, as soon as you add Kubernetes, just remember to not show the 'operating costs' section of the balance sheet to the investors, unless you have a healthy recurring revenue stream.

More important figures to watch out for are IOPS and burst vCPU. Things are running fine, and when you get a small traffic spike, suddenly everything hangs (IO and CPU time). Then you reluctantly upgrade to solve the problem, leaving you over-resourced for most of the day. Next bump, next upgrade, until you stair-step your spending into 6 figures.

Yeah, but you can spin up everything you need in a day. Then you can shut it all down. If you're a startup, it can make a ton of sense.

Sure, that's a good point. And scaling a one-node cluster to a multi-node cluster is also nearly no effort on AWS. But for the topics where we can predict the computing power for the next months, we migrated everything to bare metal.

Thank you for the Twitter post.

Thanks for this valuable input.