Audio2Face is part of Nvidia’s new Omniverse platform. If that name rings a bell, it’s because it sounds like “metaverse,” the concept of an interconnected digital world touted by Microsoft and Meta (formerly Facebook). Nvidia’s Omniverse is in fact being pitched as a cornerstone of the metaverse: the platform, which is now shifting from beta testing to general availability, is used to build virtual worlds and let people collaborate within them.
The characters in the videos (see below) may look a little uncanny, but their performances can be refined with various post-processing parameters. Nvidia boasts that the tool will eventually work with all languages, and it supports importing from and exporting to Unreal Engine’s MetaHuman Creator tool for building virtual beings.
The usefulness of software like Audio2Face to the metaverse is clear: just last week, Microsoft announced that its Teams meetings software will soon incorporate digital avatars, which will be animated in real time according to users’ speech.
But real-time facial animation increasingly has applications elsewhere, from video game characters to virtual beings and the pipelines of traditionally animated shows. Lip-syncing is a time-consuming part of animation, and Nvidia will be hoping that studios adopt Audio2Face as a time-saving tool.
When it comes to real-time animation, Nvidia doesn’t just talk the talk — it’s now animating the talk, too.