Supported Models - GroqDocs

Explore all available models on GroqCloud.

Note: Production models are intended for use in your production environments. They meet or exceed our high standards for speed, quality, and reliability. Read more here.

| PROVIDER | MODEL | MODEL ID | SPEED (T/SEC) | PRICE PER 1M TOKENS | RATE LIMITS (DEVELOPER PLAN) | CONTEXT WINDOW (TOKENS) | MAX COMPLETION TOKENS | MAX FILE SIZE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Meta | Llama 3.1 8B | llama-3.1-8b-instant | 560 | $0.05 input, $0.08 output | 250K TPM, 1K RPM | 131,072 | 131,072 | - |
| Meta | Llama 3.3 70B | llama-3.3-70b-versatile | 280 | $0.59 input, $0.79 output | 300K TPM, 1K RPM | 131,072 | 32,768 | - |
| OpenAI | GPT OSS 120B | openai/gpt-oss-120b | 500 | $0.15 input, $0.60 output | 250K TPM, 1K RPM | 131,072 | 65,536 | - |
| OpenAI | GPT OSS 20B | openai/gpt-oss-20b | 1000 | $0.075 input, $0.30 output | 250K TPM, 1K RPM | 131,072 | 65,536 | - |
| OpenAI | Whisper | whisper-large-v3 | - | $0.111 per hour | 200K ASH, 300 RPM | - | - | 100 MB |
| OpenAI | Whisper Large V3 Turbo | whisper-large-v3-turbo | - | $0.04 per hour | 400K ASH, 400 RPM | - | - | - |
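As a rough illustration of how the per-token prices in the table translate into a bill, the sketch below estimates the cost of a workload from its input and output token counts. The `PRICES_PER_MILLION` dictionary and the `estimate_cost` helper are hypothetical names introduced here, not part of any Groq SDK. The prices are copied from the table above and may change, and the estimate assumes billing is a simple linear function of token counts.

```python
# Prices in USD per 1M tokens as (input, output), copied from the table above.
# Hypothetical helper data: these values are a snapshot and may be out of date.
PRICES_PER_MILLION = {
    "llama-3.1-8b-instant": (0.05, 0.08),
    "llama-3.3-70b-versatile": (0.59, 0.79),
    "openai/gpt-oss-120b": (0.15, 0.60),
    "openai/gpt-oss-20b": (0.075, 0.30),
}


def estimate_cost(model_id: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a workload, assuming purely linear token pricing."""
    input_price, output_price = PRICES_PER_MILLION[model_id]
    total = input_tokens * input_price + output_tokens * output_price
    return total / 1_000_000


if __name__ == "__main__":
    # 2M input tokens and 500K output tokens on Llama 3.3 70B.
    cost = estimate_cost("llama-3.3-70b-versatile", 2_000_000, 500_000)
    print(f"Estimated cost: ${cost:.3f}")
```

Under these assumptions, 2M input tokens cost 2 × $0.59 and 500K output tokens cost 0.5 × $0.79, for a total of about $1.575.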

A system is a collection of models and tools that work together to answer a user query.


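| PROVIDER | MODEL | MODEL ID | SPEED (T/SEC) | PRICE PER 1M TOKENS | RATE LIMITS (DEVELOPER PLAN) | CONTEXT WINDOW (TOKENS) | MAX COMPLETION TOKENS | MAX FILE SIZE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Groq | Compound | groq/compound | 450 | - | 200K TPM, 200 RPM | 131,072 | 8,192 | - |
| Groq | Compound Mini | groq/compound-mini | 450 | - | 200K TPM, 200 RPM | 131,072 | 8,192 | - |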



Note: Preview models are intended for evaluation purposes only and should not be used in production environments as they may be discontinued at short notice. Read more about deprecations here.

| PROVIDER | MODEL | MODEL ID | SPEED (T/SEC) | PRICE PER 1M TOKENS | RATE LIMITS (DEVELOPER PLAN) | CONTEXT WINDOW (TOKENS) | MAX COMPLETION TOKENS | MAX FILE SIZE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Canopy Labs | Canopy Labs Orpheus Arabic Saudi | canopylabs/orpheus-arabic-saudi | - | $40.00 per 1M characters | 50K TPM, 250 RPM | 4,000 | 50,000 | - |
| Canopy Labs | Canopy Labs Orpheus V1 English | canopylabs/orpheus-v1-english | - | $22.00 per 1M characters | 50K TPM, 250 RPM | 4,000 | 50,000 | - |
| Meta | Llama 4 Scout 17B 16E | meta-llama/llama-4-scout-17b-16e-instruct | 750 | $0.11 input, $0.34 output | 300K TPM, 1K RPM | 131,072 | 8,192 | 20 MB |
| Meta | Llama Prompt Guard 2 22M | meta-llama/llama-prompt-guard-2-22m | - | $0.03 input, $0.03 output | 30K TPM, 100 RPM | 512 | 512 | - |
| Meta | Prompt Guard 2 86M | meta-llama/llama-prompt-guard-2-86m | - | $0.04 input, $0.04 output | 30K TPM, 100 RPM | 512 | 512 | - |
| OpenAI | Safety GPT OSS 20B | openai/gpt-oss-safeguard-20b | 1000 | $0.075 input, $0.30 output | 150K TPM, 1K RPM | 131,072 | 65,536 | - |
| Alibaba Cloud | Qwen3-32B | qwen/qwen3-32b | 400 | $0.29 input, $0.59 output | 300K TPM, 1K RPM | 131,072 | 40,960 | - |

Deprecated models are no longer supported or are scheduled to lose support. See our deprecation guidelines and deprecated models here.

Hosted models are directly accessible through the GroqCloud Models API endpoint using the model IDs mentioned above. You can use the https://api.groq.com/openai/v1/models endpoint to return a JSON list of all active models:

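import os

import requests

# Read the API key from the environment and fail early if it is missing.
api_key = os.environ.get("GROQ_API_KEY")
if not api_key:
    raise RuntimeError("Set the GROQ_API_KEY environment variable first.")

url = "https://api.groq.com/openai/v1/models"

headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json",
}

# Request the list of active models and raise on any HTTP error status.
response = requests.get(url, headers=headers, timeout=30)
response.raise_for_status()

print(response.json())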
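The raw JSON can be hard to scan, so here is a minimal sketch of how the response might be reduced to a sorted list of model IDs. It assumes the payload follows the OpenAI-compatible list shape, with a top-level `data` array whose entries carry an `id` field. The `context_window` field is also an assumption and is read defensively. `summarize_models` and the `sample` payload are hypothetical, made up for illustration and not real API output.

```python
from typing import Any


def summarize_models(payload: dict[str, Any]) -> list[tuple[str, Any]]:
    """Return (model ID, context window) pairs sorted by model ID.

    Assumes an OpenAI-style list response: {"object": "list", "data": [...]}.
    Entries without an "id" are skipped; a missing context window becomes None.
    """
    models = payload.get("data", [])
    rows = [(m["id"], m.get("context_window")) for m in models if "id" in m]
    # Sort on the ID only, so None context windows never get compared.
    return sorted(rows, key=lambda row: row[0])


# Hypothetical sample payload standing in for response.json().
sample = {
    "object": "list",
    "data": [
        {"id": "whisper-large-v3", "object": "model"},
        {"id": "llama-3.1-8b-instant", "object": "model", "context_window": 131072},
    ],
}

for model_id, context_window in summarize_models(sample):
    print(model_id, context_window)
```

In practice you would pass `response.json()` from the request above in place of `sample`.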
