| Sponsor | Irfan Essa, 206 CCB, http://www.cc.gatech.edu/~irfan |
| Area | GVU / Intelligent Systems (and Computational Perception Lab / FCE) |
Problem
In facial animation, lip synch is the process of synchronizing
a pre-recorded sound track with the lip and facial movements of a synthetic
character. A successful animation requires accurate time alignment between
the acoustic signal and the visual rendering. The animated film "Alien
Song" by Victor Navone is a good example of lip synch. The traditional
approach in computer graphics consists of labeling the sound track with
visual targets that correspond to the pronounced phonemes
(the sound units of speech). Speech recognition can speed up this
time-consuming step, but the available lip synch systems that use
speech recognition still require some manual "art work" to correct the
automatic labeling. An alternative approach is to estimate the facial
geometry directly from the acoustic signal. The goal of this project is
to investigate an example of this approach.
For this project, the visual rendering will be handled by the facial
animation system available at the CPL (ask Lionel Reveret). This system
characterizes facial movements in speech production with a set of visual
speech parameters (see the article by Reveret et al. below).
The work of this project is to implement a predictor of these visual
speech parameters from the acoustic signal (for example, multilinear
regression, neural networks, or hidden Markov models).
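As a rough illustration of the simplest option, multilinear regression, the sketch below fits a least-squares linear map from per-frame acoustic feature vectors to per-frame visual speech parameters by solving the normal equations. Everything in it is an assumption made for illustration: the function names (`solve`, `fit_linear_map`, `predict`), the choice of two acoustic features and two visual parameters, and the toy data. It says nothing about the actual parameters or interface of the CPL animation system.

```python
def solve(A, B):
    """Solve A @ W = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    # Augmented matrix [A | B]; rows are copied so the inputs stay unchanged.
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: features are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def fit_linear_map(acoustic, visual):
    """Least-squares fit of visual parameters from acoustic features plus a bias."""
    A = [list(x) + [1.0] for x in acoustic]  # append a constant bias input
    d, k = len(A[0]), len(visual[0])
    # Normal equations: (A^T A) W = A^T Y
    AtA = [[sum(a[i] * a[j] for a in A) for j in range(d)] for i in range(d)]
    AtY = [[sum(a[i] * y[j] for a, y in zip(A, visual)) for j in range(k)]
           for i in range(d)]
    return solve(AtA, AtY)  # weight matrix with d rows and k columns


def predict(W, x):
    """Predict the visual parameters for one frame of acoustic features."""
    a = list(x) + [1.0]
    return [sum(a[i] * W[i][j] for i in range(len(a))) for j in range(len(W[0]))]


# Hypothetical toy training set: one row per time-aligned frame.
acoustic_frames = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
visual_frames = [[2 * a - b + 0.5, 0.3 * a + 0.1] for a, b in acoustic_frames]

W = fit_linear_map(acoustic_frames, visual_frames)
prediction = predict(W, [3.0, 2.0])
```

In a real experiment the training frames would come from time-aligned recordings of audio and measured facial parameters, and a neural network or HMM could replace the linear map without changing the frame-by-frame prediction interface.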
Here is what you will need to do.