240 tokens/s achieved by @GroqInc's custom chips on Llama 2 Chat (70B). Artificial Analysis has independently benchmarked Groq’s API and now showcases Groq’s latency, throughput & pricing on https://t.co/jq2TzJMrHT. This represents a milestone for the application of custom silicon https://t.co/yDwLGdaE4B


240 tokens/s achieved by @GroqInc's custom chips on Llama 2 Chat (70B).

Artificial Analysis has independently benchmarked Groq’s API and now showcases Groq’s latency, throughput & pricing on ArtificialAnalysis.ai.

This represents a milestone for the application of custom silicon to large language models and AI.

Groq are serving a full-quality FP16 version of Llama 2 Chat (70B) with the model’s full 4k context window.

See full results here: artificialanalysis.ai/models/llama-2…
