Microsoft's New AI Can Make Photographs Sing and Talk — and It Already Has the Mona Lisa Lip-Syncing

Microsoft published a research paper this week highlighting a new AI model called VASA-1 that can transform a single picture and audio clip of a person into a realistic video of them lip-syncing — with facial expressions, head movements, and all.

The demo videos use AI-generated portraits from generators like DALL·E-3, which the researchers then paired with audio clips. The results are still images turned into videos of talking faces.

The researchers built on technology from competitors such as Runway and Nvidia, but state in the paper that their method produces higher-quality, more realistic results and “significantly outperforms” existing methods.

→ Continue reading at Entrepreneur
