Performance metrics across cloud providers as of 2025-10-22
| Provider | Model | Capabilities | Avg Latency (ms) | Best (ms) | Worst (ms) | Pass Rate* |
|---|---|---|---|---|---|---|
Methodology
- Measures time-to-first-token for serverless LLM providers
- Measured from EU Central; similar results were observed from US Central
- Pass rate: A simple URL classification task was used with a small `max_tokens` limit across multiple samples. (Some reasoning models failed by initially outputting `<think>` tokens.)
- Discussion on Reddit.
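The measurements above could be reproduced roughly as follows. This is a minimal sketch, not the benchmark's actual harness: `open_stream` is a hypothetical callable standing in for whatever sends a streaming request to a provider, `fake_stream` simulates one locally, and the labels `news` and `shop` are made-up classification outputs.

```python
import time
from statistics import mean
from typing import Callable, Iterable, Iterator, Sequence


def time_to_first_token_ms(open_stream: Callable[[], Iterator[str]]) -> float:
    """Milliseconds from issuing the request until the first token arrives."""
    start = time.perf_counter()
    stream = open_stream()  # hypothetical: sends the request, returns a token iterator
    next(stream)            # blocks until the first token is available
    return (time.perf_counter() - start) * 1000.0


def summarize(latencies_ms: Sequence[float]) -> dict:
    """Average, best (lowest), and worst (highest) latency over all samples."""
    return {
        "avg_ms": mean(latencies_ms),
        "best_ms": min(latencies_ms),
        "worst_ms": max(latencies_ms),
    }


def pass_rate(outputs: Iterable[str], expected: Iterable[str]) -> float:
    """Fraction of samples whose output matches the expected label exactly.

    Assumption: a response that starts with <think> and is cut off by the
    small max_tokens limit never reaches the label, so it counts as a failure.
    """
    pairs = list(zip(outputs, expected))
    passed = sum(1 for out, want in pairs if out.strip().lower() == want.lower())
    return passed / len(pairs)


def fake_stream(delay_s: float) -> Callable[[], Iterator[str]]:
    """Stand-in for a provider: yields one token after a fixed delay."""
    def open_stream() -> Iterator[str]:
        def tokens() -> Iterator[str]:
            time.sleep(delay_s)
            yield "news"
        return tokens()
    return open_stream


if __name__ == "__main__":
    samples = [time_to_first_token_ms(fake_stream(0.01)) for _ in range(5)]
    print(summarize(samples))
    print(pass_rate(["news", "<think>The URL", "shop"], ["news", "news", "shop"]))
```

Timing with `time.perf_counter` around the first `next()` call captures network and queueing delay as well as model latency, which is presumably why the measurement region is stated in the methodology.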
The information on this page is freely available under CC-BY-SA 4.0