Facial Animation: Using Speech Recognition for Lip Synch


Sponsor Irfan Essa
206 CCB
http://www.cc.gatech.edu/~irfan
Area GVU / Intelligent Systems (and Computational Perception Lab / FCE)

Problem
In facial animation, lip synch is the process of synchronizing a pre-recorded sound track with the lip and facial movements of a synthetic character. A successful animation requires accurate time alignment between the acoustic signal and the visual rendering. The animated film "Alien Song" by Victor Navone is a good example of lip synch. The traditional approach in computer graphics consists of labeling the sound track with visual targets that correspond to the pronounced phonemes (the sound units of speech). Speech recognition software can speed up this time-consuming step, but the available lip synch systems that use speech recognition still require some "art work" to correct the automatic labeling. The goal of this project is to reduce the need for that manual correction by using powerful speech recognition software.
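
As a rough illustration of the labeling step described above, the following Python sketch turns a list of timed phoneme labels into a track of visual targets. The phoneme symbols, the target names, the PHONEME_TO_TARGET table, and the Segment and to_visual_track names are all assumptions made for this example; they are not taken from any existing lip synch system.

```python
from dataclasses import dataclass

# Hypothetical, heavily simplified phoneme-to-visual-target table.
# A real system would cover the full phoneme set of its recognizer.
PHONEME_TO_TARGET = {
    "p": "closed", "b": "closed", "m": "closed",
    "f": "lip_teeth", "v": "lip_teeth",
    "aa": "open", "ae": "open",
    "uw": "rounded", "ow": "rounded",
    "sil": "rest",
}


@dataclass
class Segment:
    start: float  # seconds
    end: float    # seconds
    label: str    # phoneme or visual target name


def to_visual_track(phonemes):
    """Map timed phonemes to visual targets, merging contiguous repeats."""
    track = []
    for seg in phonemes:
        # Phonemes missing from the table fall back to a neutral mouth shape.
        target = PHONEME_TO_TARGET.get(seg.label, "neutral")
        previous = track[-1] if track else None
        if (previous is not None
                and previous.label == target
                and abs(previous.end - seg.start) < 1e-9):
            # Same target with no gap in between: extend the previous segment.
            previous.end = seg.end
        else:
            track.append(Segment(seg.start, seg.end, target))
    return track
```

With such a track, the timing of each visual target comes directly from the phoneme boundaries, so any error in the automatic labeling shows up as a mistimed mouth shape. That is the part that currently needs manual correction.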

A facial animation system is available at CPL (see L. Reveret): it generates a visual animation of a talking head model from text input. The work in this project is to align a text with any recorded sound track of a speaker, using the speech recognition software available at the CPL (ViaVoice, HTK, ...). The result will allow any speaker to produce a visual animation from their own voice.
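
To give an idea of what an alignment result could look like in practice, here is a minimal Python sketch that reads time-aligned labels into (start, end, label) triples in seconds. It assumes the alignment is written in the HTK label format, where each line holds a start time, an end time, and a label, with times given as integers in units of 100 ns. The function name parse_htk_labels and the sample phoneme symbols are made up for the example, and output from other recognizers such as ViaVoice is not covered.

```python
HTK_TICKS_PER_SECOND = 10_000_000  # HTK label times are in units of 100 ns


def parse_htk_labels(text):
    """Return a list of (start_seconds, end_seconds, label) triples."""
    segments = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            # Skip blank lines and lines without explicit start and end times.
            continue
        try:
            start, end = int(fields[0]), int(fields[1])
        except ValueError:
            # Skip header or separator lines that do not begin with two integers.
            continue
        segments.append((start / HTK_TICKS_PER_SECOND,
                         end / HTK_TICKS_PER_SECOND,
                         fields[2]))
    return segments


if __name__ == "__main__":
    sample = "0 1500000 sil\n1500000 2300000 hh\n2300000 3100000 ax\n"
    for start, end, label in parse_htk_labels(sample):
        print(f"{start:.2f} {end:.2f} {label}")
```

The resulting triples carry the timing information that the animation system needs in addition to the text itself.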
 

Here is what you need to do (and will learn)

Background

Deliverables

Evaluation
Evaluation will be based on the report submitted to the project sponsor by the due date.