Previous full o-series reasoning model

The o1 series of models is trained with reinforcement learning to perform complex reasoning. o1 models think before they answer, producing a long internal chain of thought before responding to the user.

100,000 max output tokens
Oct 01, 2023 knowledge cutoff

Pricing

Pricing is based on the number of tokens used, or other metrics based on the model type. For tool-specific models, like search and computer use, there’s a fee per tool call.

Endpoints

Chat Completions: v1/chat/completions
Fine-tuning: v1/fine-tuning
Image generation: v1/images/generations
Image edit: v1/images/edits
Speech generation: v1/audio/speech
Transcription: v1/audio/transcriptions
Translation: v1/audio/translations
Completions (legacy): v1/completions

Features

Function calling: Supported
Structured outputs: Supported
Distillation: Not supported
Predicted outputs: Not supported

Snapshots

Snapshots let you lock in a specific version of the model so that performance and behavior remain consistent. Below is a list of all available snapshots and aliases for o1.

Rate limits

Rate limits ensure fair and reliable access to the API by placing specific caps on requests or tokens used within a given time period. Your usage tier determines how high these limits are set and automatically increases as you send more requests and spend more on the API.
| Tier | RPM | TPM | Batch queue limit |
| --- | --- | --- | --- |
| Free | Not supported | | |
| Tier 1 | 500 | 30,000 | 90,000 |
| Tier 2 | 5,000 | 450,000 | 1,350,000 |
| Tier 3 | 5,000 | 800,000 | 50,000,000 |
| Tier 4 | 10,000 | 2,000,000 | 200,000,000 |
| Tier 5 | 10,000 | 30,000,000 | 5,000,000,000 |
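
As a rough illustration of how the tier limits above might be used when planning a workload, the following sketch encodes the table and checks whether a steady request rate stays within a tier's RPM (requests per minute) and TPM (tokens per minute) caps. The names `TierLimits`, `O1_LIMITS`, `fits_tier`, and `lowest_tier_for` are hypothetical helpers, not part of any API, and the check assumes every request uses the same number of tokens, which real traffic rarely does.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TierLimits:
    rpm: int          # requests per minute
    tpm: int          # tokens per minute
    batch_queue: int  # batch queue limit


# Values copied from the rate limit table; the Free tier is omitted
# because o1 is listed as not supported there.
O1_LIMITS = {
    "Tier 1": TierLimits(500, 30_000, 90_000),
    "Tier 2": TierLimits(5_000, 450_000, 1_350_000),
    "Tier 3": TierLimits(5_000, 800_000, 50_000_000),
    "Tier 4": TierLimits(10_000, 2_000_000, 200_000_000),
    "Tier 5": TierLimits(10_000, 30_000_000, 5_000_000_000),
}


def fits_tier(tier: str, requests_per_minute: int, tokens_per_request: int) -> bool:
    """Return True if a steady workload stays within both RPM and TPM for the tier."""
    limits = O1_LIMITS.get(tier)
    if limits is None:
        # Unknown or unsupported tier (for example "Free").
        return False
    tokens_per_minute = requests_per_minute * tokens_per_request
    return requests_per_minute <= limits.rpm and tokens_per_minute <= limits.tpm


def lowest_tier_for(requests_per_minute: int, tokens_per_request: int) -> Optional[str]:
    """Return the first tier (in table order) that fits the workload, or None."""
    for tier in O1_LIMITS:  # dicts preserve insertion order
        if fits_tier(tier, requests_per_minute, tokens_per_request):
            return tier
    return None


if __name__ == "__main__":
    # 10 requests per minute at 4,000 tokens each is 40,000 TPM.
    print(lowest_tier_for(10, 4_000))
```

In this sketch, 10 requests per minute at 4,000 tokens each exceeds the Tier 1 TPM cap of 30,000 even though it is well under the RPM cap, so the token limit is often the binding constraint for long reasoning outputs.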