Amazon plays catch-up with new Nova AI models to generate voices and video

Amazon is showing off new AI technology this week: its take on a more conversational voice model, meant to better compete with offerings like Gemini Live and OpenAI’s Advanced Voice Mode, and an update to its video generation model.
The new Nova Sonic voice model handles real-time speech processing and AI voice generation for conversational applications, Amazon says. Nova Sonic uses a “unified model architecture” that Amazon claims is better than approaches that chain together separate models for speech recognition, speech-to-text conversion, response generation, and text-to-audio conversion. Amazon says Nova Sonic can also better detect a speaker’s tone and deliver more natural responses.
Nova Sonic is available to try through Amazon’s Bedrock developer platform. The company says it can be used to build customer service bots or AI agents for travel, education, healthcare, and a variety of other industries. “Components” of Nova Sonic are already being used in Amazon’s new Alexa Plus assistant, Rohit Prasad, Amazon’s SVP and head scientist of AGI, told TechCrunch.
As for video, Amazon announced Nova Reel 1.1, which the company says provides quality and latency improvements over version 1.0. It can also now keep a consistent style across multiple 6-second scenes that are cut together into a full video of up to two minutes.