LLM Latency Leaderboard


Performance metrics across cloud providers as of 2025-10-22

| Provider | Model | Capabilities | Avg Latency (ms) | Best (ms) | Worst (ms) | Pass Rate* |
| --- | --- | --- | --- | --- | --- | --- |

(Benchmark data is loaded dynamically on the original page.)

Methodology

  • Measures time-to-first-token (TTFT) for serverless LLM providers.
  • Measured from EU Central; similar results were observed from US Central.
  • Pass rate: the fraction of samples on which the model completed a simple URL-classification task within a small max_tokens limit. (Some reasoning models failed because they began by emitting <think> tokens, which the small max_tokens limit cut off before an answer appeared.)
  • Discussion on Reddit.
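As a rough illustration of the TTFT metric above, the sketch below times how long a streaming response takes to yield its first token. It is not the benchmark harness itself; the function names and the fake stream are illustrative assumptions, standing in for the chunk iterator a real streaming completions API would return.

```python
import time
from typing import Iterable, Iterator, Tuple

def time_to_first_token(stream: Iterable[str]) -> Tuple[float, str]:
    """Return (latency_seconds, first_token) for a token stream.

    `stream` is any iterable that yields tokens as they arrive,
    e.g. the chunk iterator of a streaming chat-completions call.
    """
    start = time.perf_counter()
    for token in stream:
        # First iteration completes only when the first token arrives.
        return time.perf_counter() - start, token
    raise RuntimeError("stream produced no tokens")

# Demo: a fake provider stream that waits 50 ms before its first token.
def fake_stream(delay_s: float) -> Iterator[str]:
    time.sleep(delay_s)  # simulated network / queueing delay
    yield "Hello"
    yield " world"

latency, first = time_to_first_token(fake_stream(0.05))
print(f"TTFT: {latency * 1000:.0f} ms, first token: {first!r}")
```

In a real run, `fake_stream` would be replaced by the provider's streaming response iterator; because generators are lazy, the delay is only incurred inside the timed loop, so the measurement captures the full wait for the first token.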

The information on this page is freely available under CC-BY-SA 4.0