OpenAI Unveils Three Audio Models for Real-Time Voice Tasks
The models include GPT-Realtime-2 with GPT-5-class reasoning, live translation from more than 70 input languages, and streaming transcription.
- On Thursday, OpenAI added three new voice models to its Realtime API: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. All three are designed to help developers create more conversational software agents.
- The launch follows viral criticism from TikTok user Husk regarding previous models' inability to accurately 'start a timer,' prompting OpenAI to build systems that better handle interruptions and complex user requests.
- GPT-Realtime-2, the flagship model featuring 'GPT-5-class reasoning,' is priced at $32 per 1M audio input tokens and $64 per 1M output tokens; the translation model supports more than 70 input languages and 13 output languages.
- Companies including Zillow, Priceline, and Deutsche Telekom are currently testing these models to develop voice assistants that can 'keep pace' with users, enabling actions like scheduling tours during live conversations.
- Moving beyond simple chat, these tools enable 'voice-to-action' capabilities: developers can build agents that listen, reason, and carry out complex workflows in real time, marking a broader shift in computing interfaces.
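To put the GPT-Realtime-2 pricing in concrete terms, here is a minimal sketch of the per-session arithmetic. The two prices come from the summary above. The function name and the token counts in the usage line are hypothetical, chosen only for illustration, and the sketch assumes billing is strictly linear in tokens, with no other charges.

```python
# Prices quoted in the summary for GPT-Realtime-2 (USD per 1M tokens).
INPUT_PRICE_PER_M = 32.0   # audio input tokens
OUTPUT_PRICE_PER_M = 64.0  # output tokens


def estimate_audio_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one session (hypothetical helper).

    Assumes cost scales linearly with token counts; the summary does not
    describe any other billing rules.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical session: 50,000 input tokens and 25,000 output tokens.
cost = estimate_audio_cost(50_000, 25_000)
print(f"${cost:.2f}")
```

Because output tokens cost twice as much as input tokens, a session in which the agent speaks half as many tokens as it hears splits the bill evenly between the two sides.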
31 Articles
OpenAI launches GPT-Realtime-2 and two new voice API models
GPT-Realtime-2 brings GPT-5-class reasoning to live voice. A separate translation model covers 70+ input languages. A streaming Whisper variant handles transcription. The pricing is aggressive enough to make the comparison unavoidable. OpenAI released three new voice models in its API, broadening the range of surfaces where developers can plug GPT-class reasoning into live audio. The […] This story continues at The Next Web
OpenAI launches GPT-Realtime-2 for smarter live voice AI interactions
OpenAI has introduced three new audio models through its API, expanding its push into real-time voice AI for developers. The launch includes GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper, each targeting a different part of live voice interaction. The company said the new models aim to make voice software more useful in everyday situations. That includes handling conversations while driving, navigating airports, or getting cust…
OpenAI unveils three audio models for real-time voice tasks
Coverage Details
Bias Distribution
- 67% of the sources are Center















