Four Million Lambda Invokes Across Python, Java, Rust, and Go

Direct Lambda invoke, no framework, no gateway. Four runtimes, 30 forced-cold samples each, 5,000 warm invokes per environment, ramped load to 500 VUs.


TL;DR

Cold-start medians were not close: Rust 16.26 ms, Go 53.25 ms, Python 76.30 ms, Java 367.76 ms.
Under a p99 <= 100 ms, zero-error, zero-throttle rule, the highest tested stable point for all four runtimes was the same: 100 VUs.
At that 100 VU point, Python appeared to need about 40 in-flight requests to sustain throughput that Rust held with about 17.
Warm cost was nearly identical across all four, but all-cold cost was not: Java came out around $1.16 per 1M invokes versus roughly $0.23 per 1M for Rust.
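
One way to read the in-flight figures above is through Little's Law: mean concurrency equals throughput times mean time in system. The sketch below is a minimal illustration of that relationship. The function names and the 1,000 requests/s throughput are hypothetical, not measurements from this test, and the summary above does not say that the in-flight estimates were derived this way.

```python
def in_flight(throughput_rps: float, mean_latency_s: float) -> float:
    """Little's Law: mean requests in flight = arrival rate x mean time in system."""
    return throughput_rps * mean_latency_s


def implied_latency_s(in_flight_requests: float, throughput_rps: float) -> float:
    """Little's Law rearranged: mean time in system = in-flight / arrival rate."""
    return in_flight_requests / throughput_rps


# Hypothetical throughput, chosen only to make the arithmetic concrete.
ASSUMED_THROUGHPUT_RPS = 1_000.0

# In-flight counts taken from the TL;DR (about 40 for Python, about 17 for Rust).
python_latency_s = implied_latency_s(40, ASSUMED_THROUGHPUT_RPS)
rust_latency_s = implied_latency_s(17, ASSUMED_THROUGHPUT_RPS)

# At equal throughput the assumed rate cancels out of the ratio.
latency_ratio = python_latency_s / rust_latency_s
```

If both runtimes sustained the same throughput, the ratio of in-flight requests (40 / 17, roughly 2.35) equals the ratio of mean time in system, whatever the actual throughput was.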

This post analyses how four AWS Lambda runtimes — Python, Java, Rust, and Go — behave under the simplest possible workload: an echo handler that returns its input unchanged. The point is not the handler. The point is that with application code stripped to nothing, every measured difference between runtimes can be attributed to the runtime itself or to the Lambda platform. There is nowhere else for the difference to live.
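
For concreteness, an echo handler in the Python runtime can be as small as the sketch below. It illustrates the workload described and is not the code used in the test. The function name `handler` is an assumption; the `(event, context)` signature is the standard one for Python Lambda functions.

```python
def handler(event, context):
    """Echo handler: return the invocation payload unchanged.

    `event` is the deserialized request payload; `context` carries
    runtime metadata and is unused here.
    """
    return event
```

With nothing else in the function body, what remains in any measured duration is runtime startup, payload serialization, and platform overhead.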

The findings are operationally consequential. Cold start separates the four runtimes by more than an order of magnitude…