![]()
For audio inputs and outputs with Chat Completions API
For audio inputs and outputs with Chat Completions API
The gpt-audio model is our first generally available audio model. It accepts audio inputs and outputs, and can be used in the Chat Completions REST API.
Oct 01, 2023 knowledge cutoff
Pricing
Pricing is based on the number of tokens used, or other metrics based on the model type. For tool-specific models, like search and computer use, there’s a fee per tool call. See details in the
Endpoints
Chat Completions
v1/chat/completions
Fine-tuning
v1/fine-tuning
Image generation
v1/images/generations
Image edit
v1/images/edits
Speech generation
v1/audio/speech
Transcription
v1/audio/transcriptions
Translation
v1/audio/translations
Completions (legacy)
v1/completions
Features
Function calling
Supported
Structured outputs
Not supported
Distillation
Not supported
Predicted outputs
Not supported
Snapshots
Snapshots let you lock in a specific version of the model so that performance and behavior remain consistent. Below is a list of all available snapshots and aliases for
gpt-audio
.
![]()
Rate limits
Rate limits ensure fair and reliable access to the API by placing specific caps on requests or tokens used within a given time period. Your usage tier determines how high these limits are set and automatically increases as you send more requests and spend more on the API.
| Tier | RPM | TPM | Batch queue limit |
|---|---|---|---|
| Free | Not supported | ||
| Tier 1 | 500 | 30,000 | 90,000 |
| Tier 2 | 5,000 | 450,000 | 1,350,000 |
| Tier 3 | 5,000 | 800,000 | 50,000,000 |
| Tier 4 | 10,000 | 2,000,000 | 2,000,000 |
| Tier 5 | 10,000 | 30,000,000 | 5,000,000,000 |