LLM Latency Leaderboard

Performance metrics across cloud providers as of 2025-10-22

Provider	Model	Capabilities	Avg Latency (ms)	Best (ms)	Worst (ms)	Pass Rate*

Measures time-to-first-token (TTFT) for serverless LLM providers.
Measured from EU Central; results from US Central were similar.
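
As a rough illustration of how a time-to-first-token measurement and the Avg/Best/Worst columns could be computed, here is a minimal Python sketch. It is not the benchmark's actual harness: `time_to_first_token_ms`, `summarize`, and `fake_stream` are hypothetical names, and `fake_stream` stands in for a real provider's streaming response so the example runs without network access.

```python
import time
from statistics import mean
from typing import Callable, Iterable, Iterator


def time_to_first_token_ms(
    start_stream: Callable[[], Iterable[str]],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Return milliseconds from issuing the request to the first non-empty chunk."""
    started = clock()
    for chunk in start_stream():
        if chunk:  # skip empty keep-alive chunks
            return (clock() - started) * 1000.0
    raise RuntimeError("stream ended without producing a token")


def summarize(samples_ms: list[float]) -> dict[str, float]:
    """Collapse repeated measurements into the three latency columns."""
    return {
        "avg_ms": mean(samples_ms),
        "best_ms": min(samples_ms),
        "worst_ms": max(samples_ms),
    }


def fake_stream() -> Iterator[str]:
    """Hypothetical stand-in for a provider's streaming response."""
    time.sleep(0.01)  # simulated network and queueing delay
    yield ""          # empty chunk, ignored by the timer
    yield "news"
    yield " site"


if __name__ == "__main__":
    samples = [time_to_first_token_ms(fake_stream) for _ in range(5)]
    print(summarize(samples))
```

To measure a real provider, `start_stream` would be replaced by a callable that sends the request and returns the streamed text chunks. The clock is injectable so the timing logic can be checked deterministically.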
* Pass rate: A simple URL classification task was used with a small max_tokens limit across multiple samples. (Some reasoning models failed by initially outputting <think> tokens.)
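
The pass-rate idea can be sketched in a few lines of Python. The page does not say how responses were graded, so the exact-label match below is an assumption, and the `LABELS` set, `passes`, and `pass_rate` are hypothetical. The sketch only shows why a response that starts with `<think>` and is cut off by a small max_tokens limit would count as a failure.

```python
# Hypothetical label set; the actual task's categories are not given on the page.
LABELS = {"news", "shop", "social", "other"}


def passes(response_text: str) -> bool:
    """Assumed grading rule: the trimmed, lowercased reply is exactly one known label."""
    return response_text.strip().lower() in LABELS


def pass_rate(responses: list[str]) -> float:
    """Fraction of sampled responses that pass the grading rule."""
    if not responses:
        raise ValueError("need at least one response")
    return sum(passes(r) for r in responses) / len(responses)


if __name__ == "__main__":
    samples = [
        "news",
        " Shop\n",
        "<think>The URL points to",  # reasoning preamble truncated by max_tokens
        "other",
    ]
    print(pass_rate(samples))
```

Under this rule, the third sample fails because the token budget is spent on the reasoning preamble before any label is produced.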

The information on this page is freely available under CC-BY-SA 4.0