Google DeepMind’s New AI Tool Can Generate Audio, Dialogue For Silent Videos

  • Google’s new tool can create audio clips for videos that do not have any sound.
  • The new technology can be paired with Veo, Google’s in-house video generation model.

Google has embraced the AI era through its developments in Gemini and other tools. The company has already showcased VideoPoet and Veo, which can generate videos from text input. Its DeepMind AI unit has now unveiled a new video-to-audio (V2A) technology that creates contextual audio for silent videos. In simple terms, the technology can create dialogue and a soundtrack for a video based on the scene. Here are the details.

Google’s V2A Technology Explained

Google DeepMind’s video-to-audio technology combines the pixels in a video with natural-language text prompts, which helps the tool understand the video content better. Using this data and Google’s in-house AI models, the V2A tool creates high-quality sound effects that match the video.

V2A can also be paired with Veo, Google’s video generation tool, to add realistic sound effects to generated videos. It also tries to match the tone of specific subjects in the video. The new tool can create audio for animated and stock footage as well as for content featuring human subjects. The company has shared several examples on its website showcasing the potential of this technology.

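V2A by Google uses a diffusion-based technique, similar to most multimedia-related AI tools. It creates the final audio file by pairing a series of encoders and decoders with a trained diffusion model: the video is encoded into a compressed representation, the diffusion model refines the audio step by step starting from random noise, and the result is decoded into an audio waveform. The tool’s effectiveness is expected to improve as it is trained on additional data, similar to AI chatbots like Gemini and ChatGPT.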
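To make the encode, refine and decode idea concrete, here is a toy sketch in Python. It is not Google’s implementation: every function name is hypothetical, the "video" is a short list of pixel brightness values, and the `toy_denoiser` function is a hand-written stand-in for the trained diffusion model that the real system would use. It only illustrates the shape of the loop, in which audio starts as random noise and is nudged towards something that matches the video over many small steps.

```python
import random


def encode_video(frames):
    """Compress each frame (a list of brightness values) to a single number."""
    return [sum(frame) / len(frame) for frame in frames]


def toy_denoiser(latent, conditioning):
    """Hypothetical stand-in for a trained network.

    It "predicts the noise" as the gap between the current audio latent
    and the video conditioning. A real model would learn this from data.
    """
    return [x - c for x, c in zip(latent, conditioning)]


def generate_audio_latent(conditioning, steps=50, step_size=0.2, seed=0):
    """Start from random noise and remove a little predicted noise each step."""
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in conditioning]
    for _ in range(steps):
        predicted_noise = toy_denoiser(latent, conditioning)
        latent = [x - step_size * n for x, n in zip(latent, predicted_noise)]
    return latent


def decode_audio(latent, samples_per_frame=4):
    """Expand each latent value into a short run of waveform samples."""
    waveform = []
    for value in latent:
        waveform.extend([value] * samples_per_frame)
    return waveform


# Three tiny "frames": dark, bright, medium.
frames = [[0.1, 0.1, 0.2], [0.8, 0.9, 1.0], [0.4, 0.5, 0.6]]
conditioning = encode_video(frames)
latent = generate_audio_latent(conditioning)
waveform = decode_audio(latent)
print(len(waveform), [round(v, 3) for v in latent])
```

In this sketch the brighter frame ends up with a "louder" audio value, purely because the stand-in denoiser pulls the noise towards the frame brightness. In a real system the conditioning, the denoiser and the decoder would all be learned neural networks.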

V2A is currently a concept project and is not publicly available. Google says that further research is still underway to improve the V2A technology and achieve realistic results. Several text-to-video and text-to-audio generators are already available on the Internet, but Google’s V2A stands out because it creates audio directly for a video provided by the user.

Google has not shared any timelines for the public rollout of V2A. Veo appears to be the company’s priority project in rivalling Sora, an AI video generator by OpenAI.