OpenAI: GPT-4o Audio
OpenAI: GPT-4o Audio is an audio-capable multimodal model. It combines multimodal input handling and audio processing with a 128K-token context window and a premium pricing profile. Use it for audio understanding and multimodal input when quality, speed, and cost all matter. It is a practical choice for teams that need reliable output, flexible deployment, and room to scale.
Pricing (USD per 1M tokens)
Input: $2.50
Output: $10.00
Cached: $0.25
Batch: $1.25
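
As a rough illustration of how the listed rates translate into a bill, the sketch below estimates cost from token counts. It assumes "/1M" means per 1M tokens and that each rate applies independently to its own token category; the listing does not say which token type the Batch rate covers, so that category is kept separate here. The names PRICES_PER_MILLION and estimate_cost are hypothetical helpers, not part of any OpenAI SDK.

```python
# Rates in USD per 1M tokens, copied from the listing.
# Assumption: "/1M" means per one million tokens.
PRICES_PER_MILLION = {
    "input": 2.50,
    "output": 10.00,
    "cached": 0.25,
    "batch": 1.25,
}


def estimate_cost(input_tokens=0, output_tokens=0, cached_tokens=0, batch_tokens=0):
    """Return an estimated cost in USD for the given token counts.

    Assumption: each category is billed at its own listed rate and the
    categories do not overlap (hypothetical billing model).
    """
    counts = {
        "input": input_tokens,
        "output": output_tokens,
        "cached": cached_tokens,
        "batch": batch_tokens,
    }
    for name, count in counts.items():
        if count < 0:
            raise ValueError(f"{name} token count must be non-negative")
    total = sum(
        counts[name] * PRICES_PER_MILLION[name] / 1_000_000 for name in counts
    )
    # Round to avoid floating-point noise in the displayed figure.
    return round(total, 6)


if __name__ == "__main__":
    # 200K input tokens plus 50K output tokens.
    print(f"${estimate_cost(input_tokens=200_000, output_tokens=50_000):.2f}")
```

Under these assumptions, output tokens cost four times as much as input tokens, so response length tends to dominate the bill for generation-heavy workloads.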