Text-to-Visual Speech Synthesis

Authors: Sahandi, R., Vine, D.S.G. and Longster, J.A.

Journal: Informatica, Special issue: NLP & Multi-Agent Systems

Volume: 22

Pages: 445-450

Abstract:

The development of interactive multimedia systems, coupled with advances in computer technology and high-speed communication systems, has made it possible to present information to users more effectively and efficiently. Man-machine communication can be enhanced through the use of synthesised speech and computer animation in multimedia systems. Whilst synthetic speech is potentially a more natural communication medium, it can be further improved by the addition of an animated human face synchronised with the speech. The facial display provides a number of visual cues to what the speaker is saying and to the speaker's emotional state; these cues increase intelligibility when the synthetic speech is degraded by noise, and support lip-reading by the hearing-impaired. This paper provides an overview of existing speech synthesis and facial animation techniques and discusses the limitations of each. It concludes with a description of a visual speech synthesis system developed at Bournemouth University and a discussion of the associated audio-visual synchronisation issues.
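
Note: the abstract only names the audio-visual synchronisation problem; the Python sketch below is a rough, hypothetical illustration (not drawn from the paper) of one common arrangement, in which timed phonemes from a text-to-speech front end are mapped to visemes (mouth shapes) and emitted as keyframes for the facial animation. The mapping table and function names are invented for illustration only.

    # Hypothetical sketch: align facial-animation keyframes with TTS phoneme timings.
    # Illustrative phoneme-to-viseme groups; real systems use larger,
    # language-specific tables.
    PHONEME_TO_VISEME = {
        "p": "bilabial", "b": "bilabial", "m": "bilabial",
        "f": "labiodental", "v": "labiodental",
        "aa": "open", "ae": "open",
        "iy": "spread", "ih": "spread",
        "uw": "rounded", "ow": "rounded",
        "sil": "neutral",
    }

    def viseme_keyframes(timed_phonemes):
        """Convert (phoneme, start_sec, duration_sec) tuples into
        (time_sec, viseme) keyframes for a face-animation renderer."""
        keyframes = []
        for phoneme, start, duration in timed_phonemes:
            viseme = PHONEME_TO_VISEME.get(phoneme, "neutral")
            # Keyframe at the phoneme onset; the renderer would interpolate
            # mouth shapes between successive keyframes.
            keyframes.append((round(start, 3), viseme))
        # Return the mouth to neutral after the last phoneme ends.
        if timed_phonemes:
            _, last_start, last_dur = timed_phonemes[-1]
            keyframes.append((round(last_start + last_dur, 3), "neutral"))
        return keyframes

    if __name__ == "__main__":
        # Example timings as a TTS engine might report them for the word "beam".
        phonemes = [("b", 0.00, 0.08), ("iy", 0.08, 0.20), ("m", 0.28, 0.10)]
        for t, v in viseme_keyframes(phonemes):
            print(f"{t:.3f}s -> {v}")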
