Microsoft’s VASA-1 Can Generate Realistic Human Videos From Images

  • Microsoft has showcased a new AI video generator.
  • The tool mimics human expressions accurately through movements of the head, facial muscles, and other facial features.

Microsoft has unveiled VASA-1, an AI tool that creates videos of human faces directly from still images. When an audio clip is provided, it also synchronizes the facial expressions with the speech. The company has showcased several VASA-1 samples on its official website, and the results have impressed AI enthusiasts.

Microsoft VASA-1 AI Video Generator

Microsoft’s VASA-1, whose name comes from “visual affective skills” (VAS), is a research model built specifically around human facial expressions. It can convey a wide spectrum of feelings and emotions through facial dynamics, including movements of the facial muscles, lips, and nose, as well as head tilts, among other cues.

Currently, VASA-1 generates videos at a resolution of 512×512 pixels and at up to 40fps. The company says the tool is designed to create videos that are as close as possible to real life.

It is important to note that Microsoft has presented VASA-1 only as a research demonstration. The company has clarified that it has no plans to release a product or any APIs based on VASA-1, citing the vast potential for misuse of the technology.

VASA-1 is similar in concept to OpenAI’s Sora. Both tools generate realistic-looking videos using AI. While VASA-1 focuses on human expressions, Sora can create complex videos with contextual backgrounds and artefacts.

However, neither tool has yet been released to the public. The official announcements from Microsoft and OpenAI highlight the capabilities of VASA-1 and Sora and their potential applications in CGI and realistic AI-generated human avatars.

Google is also working on its own AI video generator, VideoPoet. Although the initial samples from VideoPoet are not as good as those from VASA-1 or Sora, they show that even Google is trying to join the AI video generator bandwagon.