Facial animation

The objective of this work is to develop acoustic-to-visual mappings that animate 3D facial models directly from acoustic features of speech. Our approach does not require a phonetic transcript of the utterance; instead, low-level acoustic features (e.g., energy, pitch, spectral envelope) are mapped directly onto facial configurations. The acoustic-to-visual mapping is performed by a hybrid model consisting of a discriminative stage (Fisher's Linear Discriminant Analysis), which selects the most informative acoustic features, and a generative stage (Input-Output Hidden Markov Models), which models the joint dynamics of acoustic and visual features.
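As a minimal sketch of the discriminative stage only, the following illustrates Fisher's Linear Discriminant Analysis on acoustic feature frames. The class labels, feature dimensions, and data are hypothetical placeholders (e.g., frames labeled by target visual configuration); the generative IOHMM stage described in the publications is not reproduced here.

```python
import numpy as np

def fisher_lda(X, y, n_components=2):
    """Fisher's LDA: find projections maximizing between-class scatter
    relative to within-class scatter.

    X: (n_frames, n_features) acoustic feature matrix (hypothetical data)
    y: (n_frames,) integer class labels (e.g., visual target classes)
    Returns a (n_features, n_components) projection matrix.
    """
    classes = np.unique(y)
    mean_total = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_total).reshape(-1, 1)
        Sb += len(Xc) * (diff @ diff.T)
    # Solve the generalized eigenproblem Sb w = lambda Sw w
    # (pseudo-inverse guards against a singular within-class scatter).
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(eigvals.real)[::-1]
    return eigvecs.real[:, order[:n_components]]
```

In a pipeline like the one described above, the projected features `X @ W` would then serve as the observation inputs to the generative stage.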

Relevant publications

P. Kakumanu, A. Esposito, O.N. Garcia, R. Gutierrez-Osuna

A comparison of acoustic coding models for speech-driven facial animation (Article)

Speech Communication, vol. 48, no. 6, pp. 598–615, 2006.


R. Gutierrez-Osuna, P. Kakumanu, A. Esposito, O.N. Garcia, A. Bojorquez, J.L. Castillo, I. Rudomin

Speech-driven facial animation with realistic dynamics (Article)

IEEE Transactions on Multimedia, vol. 7, no. 1, pp. 33–42, 2005.


S. Fu, R. Gutierrez-Osuna, A. Esposito, P. Kakumanu, O.N. Garcia

Audio/visual mapping with cross-modal hidden Markov models (Article)

IEEE Transactions on Multimedia, vol. 7, no. 2, pp. 243–252, 2005.


P. Kakumanu, R. Gutierrez-Osuna, A. Esposito, R. Bryll, A. Goshtasby, O.N. Garcia

Speech-driven facial animation (Conference)

Proceedings of the 2001 Workshop on Perceptive User Interfaces, 2001.
