OpenAI has 3 new AI voice models that the ChatGPT maker says will ‘unlock a new class of voice apps for developers’
The models split reasoning, translation and transcription into separate tools, with translation covering more than 70 languages and transcription available through the Realtime API.
- OpenAI announced three new voice models available via its Realtime API, designed to help developers build specialized artificial intelligence applications for reasoning, translation, and transcription.
- Enterprises previously struggled with expensive, difficult-to-orchestrate voice agents that required complex session resets; the new models aim to reduce that overhead by exposing reasoning, translation, and transcription as discrete orchestration primitives.
- GPT-Realtime-2 features "GPT-5 class reasoning" for complex requests; GPT-Realtime-Translate translates from 70+ input languages into 13 output languages; and GPT-Realtime-Whisper performs streaming transcription intended to make products feel "faster, more responsive, and more natural."
- Developers can access the models via the OpenAI Playground, with GPT-Realtime-2 priced at $32 per one million input tokens and $64 per one million output tokens, while GPT-Realtime-Translate costs $0.034 per minute and GPT-Realtime-Whisper costs $0.017 per minute.
- Organizations must now consider their orchestration architecture, specifically whether their stack can manage state across a 128K-token context window. The new tools compete against Mistral's Voxtral models, which similarly separate transcription tasks.
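To put the quoted prices in perspective, here is a rough cost sketch in Python. It uses only the rates listed in the summary above; the function names and the example token and minute counts are hypothetical, and it assumes every token is billed at the flat quoted rate with no discounts or other fees, which the source does not confirm.

```python
# Rates as quoted in the summary above (USD). Everything else in this
# sketch (function names, example workloads) is an illustrative assumption.
REALTIME_2_INPUT_PER_MILLION = 32.0    # GPT-Realtime-2, per 1M input tokens
REALTIME_2_OUTPUT_PER_MILLION = 64.0   # GPT-Realtime-2, per 1M output tokens
TRANSLATE_PER_MINUTE = 0.034           # GPT-Realtime-Translate, per minute
WHISPER_PER_MINUTE = 0.017             # GPT-Realtime-Whisper, per minute


def realtime_2_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate GPT-Realtime-2 cost, assuming flat per-token billing."""
    input_cost = input_tokens / 1_000_000 * REALTIME_2_INPUT_PER_MILLION
    output_cost = output_tokens / 1_000_000 * REALTIME_2_OUTPUT_PER_MILLION
    return input_cost + output_cost


def per_minute_cost(minutes: float, rate_per_minute: float) -> float:
    """Estimate cost for the per-minute models (translation, transcription)."""
    return minutes * rate_per_minute


if __name__ == "__main__":
    # Hypothetical workload: one full 128K-token context sent as input,
    # plus 2,000 output tokens.
    print(f"GPT-Realtime-2: ${realtime_2_cost(128_000, 2_000):.3f}")
    # Hypothetical one-hour call, translated and transcribed.
    print(f"Translate, 60 min: ${per_minute_cost(60, TRANSLATE_PER_MINUTE):.2f}")
    print(f"Whisper, 60 min: ${per_minute_cost(60, WHISPER_PER_MINUTE):.2f}")
```

Under these assumptions, an hour of audio costs about $2.04 to translate and $1.02 to transcribe, while sending a full 128K-token context to GPT-Realtime-2 once costs roughly $4.10 in input tokens alone.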
11 Articles
OpenAI voice models get GPT-5-class reasoning
Voice agents have been expensive to run and painful to orchestrate, not because the models can't handle conversation, but because context ceilings forced enterprises to build session resets, state compression, and reconstruction layers into every deployment. OpenAI's three new voice models are designed to reduce that overhead, and they change how engineers can think about building voice into a larger agent stack. GPT-Realtime-2, GPT-Realtime-Tran…
OpenAI Launches New Real-Time Voice Models for Translation, Live Conversations
OpenAI has officially unveiled a suite of new real-time voice models designed to revolutionize live translation and conversational intelligence. The update introduces three specialized models to the company’s Realtime API: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. The rollout signals a strategic shift from simple "command-and-response" bots to sophisticated AI agents capable of reasoning, transcribing, and translating sim…
OpenAI brings GPT-5-class reasoning to real-time voice — and it changes what voice agents can actually orchestrate
Voice agents have been expensive to run and painful to orchestrate, not because the models can't handle conversation, but because context ceilings forced enterprises to build session resets, state compression, and reconstruction layers into every deployment. OpenAI's three new voice models are designed to reduce that overhead, and they change how engineers can think about building voice into a larger agent stack. GPT-Realtime-2, GPT-Realtime-Tra…
OpenAI unveils GPT-5-class voice models for real-time orchestration
OpenAI's modular voice AI tools could reshape crypto markets and decentralized computing, driving innovation and investment in AI infrastructure. The post OpenAI unveils GPT-5-class voice models for real-time orchestration appeared first on Crypto Briefing.
Coverage Details
Bias Distribution
- 67% of the sources are Center