AWS Fargate Price Reduction (aws.amazon.com)
Awesome to see Firecracker[1] already bearing fruit:
> At re:Invent 2018 we announced Firecracker, an open source virtualization technology that is purpose-built for creating and managing secure, multi-tenant containers and functions-based services. Firecracker enables you to deploy workloads in lightweight virtual machines called microVMs. These microVMs can initiate code faster, with less overhead. Innovations such as these allow us to improve the efficiency of Fargate and help us pass on cost savings to customers.
Also seeing interesting Firecracker developments around OSv (7ms boot times)[2] and Kata Containers[3]
1. https://firecracker-microvm.github.io/
I played around with Fargate, and one of the things I couldn't work out is scaling quickly (or quickly enough). I think the problem wasn't purely Fargate but actually the load balancer. Even though containers were launched and responsive, the load balancer needed something like 3 liveness responses to bring them into the pool, and the interval between probes was something like 10 or 30 seconds and not very flexible (sorry, my memory is fuzzy). So this felt like it only really fits loads that aren't very spiky, and the potential saving from scaling down is somewhat reduced.
Did anyone experience something similar? Or maybe I did something wrong?
If one of the benefits of Firecracker is quick spin-up time, then this only works if the load balancer also responds quickly, doesn't it?
Granted, it was a while ago so things might have changed.
Fargate definitely takes some time to figure out. It took a while for us to realize that we needed to bump up our instance sizes because the default instance was a t2.micro.
However, now that we have it configured properly (it took about 6 hours over the span of 3 days to catch the issues), we serve 11M API requests/day without a problem. We were running these on DO boxes, moved over to Elastic Beanstalk, which caused more problems than it was worth, and finally landed on Fargate.
Tried EKS, but it was a bit more cumbersome than we would have liked for a K8s service. (We run another product of similar scale on K8s via GKE).
If you're looking for something closer to Heroku than K8s, then Fargate is a decent option.
That's configurable - https://docs.aws.amazon.com/elasticloadbalancing/latest/appl... - default `HealthyThresholdCount` is 5 and `HealthCheckIntervalSeconds` is 30 seconds.
We adjust those down - somewhere around 10-15 seconds for `HealthCheckIntervalSeconds` and 3 for `HealthyThresholdCount` works pretty well.
The fastest you can go on the Application Load Balancer is a health check every 5 seconds, with 2 successes being enough to put the machine in service, which means a minimum 10 second lag.
The Network Load Balancer is technically more scalable (able to accept more connections per second from the outside), but has a longer minimum inclusion time - 2 checks at a 10 second interval, so 20 seconds.
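The minimum lag quoted above is just the check interval multiplied by the number of consecutive successes required (plus container boot time, which isn't counted here). A quick sketch of that arithmetic:

```python
# Rough lower bound on time before a freshly registered target is marked
# healthy: health-check interval * healthy-threshold count.
# Container boot time is not included.

def min_in_service_seconds(interval_s: int, healthy_threshold: int) -> int:
    """Approximate minimum time-to-in-service for a load balancer target."""
    return interval_s * healthy_threshold

# ALB floor: 5s interval, 2 consecutive successes
alb = min_in_service_seconds(5, 2)        # 10 seconds
# NLB floor: 10s interval, 2 consecutive successes
nlb = min_in_service_seconds(10, 2)       # 20 seconds
# The defaults mentioned earlier: 30s interval, 5 successes
default = min_in_service_seconds(30, 5)   # 150 seconds

print(alb, nlb, default)
```

Which makes it clear why the defaults feel slow: two and a half minutes before a healthy container takes traffic, versus 10 seconds at the ALB floor.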
So yeah, you want slightly beefier containers if you're scaling up and down heavily. But all this is pretty moot - whatever autoscaling parameters you set, the reaction time of the CPU/RAM usage analysis is still going to be minutes. It seems like this is okay for now.
If you really want super fast scaling, use a Go function on Lambda (outside a VPC). With Firecracker improvements the cold start time should be barely noticeable, and you'll ramp up pretty quickly.
That's still at least 30 seconds (plus boot time for the container), so it felt too slow for some use cases in my opinion.
You get into some fundamental signal-processing type issues in terms of how quickly you respond to increases in a given value (incoming requests in this case, but it's a general issue) vs. spinning up too many things and overcharging the customer. There's a limit to how reactive Amazon can be here, even in theory. You may have to do some pre-sizing if your needs are that great, and choose between taking a possible over-provisioning hit and a possible under-provisioning hit. I think it's pretty obvious why Amazon would choose to bias in the under-provisioning direction in this case.
(There's some really good stuff in the signal processing field for anyone responsible for high-scale systems. An underrated branch of math for computer programmers. Believe it or not, the "fundamental limits" I'm referring to are the same ones involved in the Heisenberg Uncertainty Principle, when you get down into it.)
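As a toy illustration of that trade-off (my own sketch, not anything Fargate actually does): smoothing the request-rate signal with an exponentially weighted moving average reacts faster with a larger smoothing factor, but also chases noise harder, which is exactly the over- vs. under-provisioning tension described above.

```python
# Toy illustration: smoothing a request-rate signal with an EWMA.
# High alpha tracks a step change quickly (but also amplifies noise);
# low alpha is stable but lags real load increases.

def ewma(samples, alpha):
    """Exponentially weighted moving average of a sequence of samples."""
    avg = samples[0]
    out = [avg]
    for x in samples[1:]:
        avg = alpha * x + (1 - alpha) * avg
        out.append(avg)
    return out

# A step increase in request rate: 100 req/s jumping to 500 req/s.
signal = [100] * 5 + [500] * 5

fast = ewma(signal, alpha=0.8)   # within a few samples of 500
slow = ewma(signal, alpha=0.1)   # still far below 500 after five samples
```

An autoscaler tuned like `fast` would over-provision on every noisy spike; one tuned like `slow` would still be scaling up minutes after a real surge began.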
Glad to see this, Fargate was a pretty steep premium. Comparing an on-demand m5.large to an equivalent Fargate task, it looks to be about a 20% premium now, which seems reasonable.
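For reference, here's the arithmetic behind that estimate, assuming the January 2019 post-reduction us-east-1 Fargate rates (~$0.04048 per vCPU-hour, ~$0.004445 per GB-hour) and the $0.096/hr on-demand m5.large price; check the current pricing pages, since these numbers change:

```python
# Rough premium of Fargate over an on-demand m5.large (2 vCPU, 8 GB RAM).
# Rates are the assumed Jan 2019 post-reduction us-east-1 prices;
# verify against the current pricing pages before relying on them.

FARGATE_VCPU_HR = 0.04048   # USD per vCPU-hour
FARGATE_GB_HR = 0.004445    # USD per GB-hour
M5_LARGE_HR = 0.096         # USD per hour, on-demand, us-east-1

fargate_hr = 2 * FARGATE_VCPU_HR + 8 * FARGATE_GB_HR   # ~0.1165/hr
premium = fargate_hr / M5_LARGE_HR - 1                  # ~0.21, i.e. ~21%

print(f"Fargate: ${fargate_hr:.4f}/hr, premium over m5.large: {premium:.0%}")
```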
Next I'd like to see an equivalent to EC2's reserved instance pricing.
FWIW, using m3.medium instances at spot prices with AWS Batch, the per-vCPU rate works out to $0.0067/hr. Fargate is still roughly 6x the price.
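To put that in numbers (the $0.0067/hr spot figure is from the comment; the ~$0.04048 per vCPU-hour Fargate rate is my assumption of the post-reduction us-east-1 price):

```python
# Per-vCPU cost of Fargate vs. an m3.medium (1 vCPU) at the quoted spot rate.
FARGATE_VCPU_HR = 0.04048   # assumed post-reduction USD per vCPU-hour
M3_MEDIUM_SPOT_HR = 0.0067  # quoted spot USD/hr for the instance's 1 vCPU

ratio = FARGATE_VCPU_HR / M3_MEDIUM_SPOT_HR   # ~6x
print(f"Fargate vCPU rate is about {ratio:.1f}x the spot rate")
```

Note this compares vCPU rates only; Fargate also bills per GB of memory, so the true gap is somewhat larger.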
There is still a one-minute minimum.