This invention relates to systems and methods for creating visual images responsive to analyzed speech data so as to produce a graphical representation of known type, such as the lip movements corresponding to the speech sounds. The invention is particularly suitable for creating visual images of lip movements in films, video tapes, and on other recorded media.
In the production of many types of audio-visual media the speech sounds and the visual images are recorded simultaneously. For example, in the making of motion pictures or like audio-visual recordings, the voice of the actor is recorded on the sound track at the same time that the actor is emitting speech sounds. Where the film is intended to be played as originally produced, the speech sounds of the sound track correspond to the lip movements shown. However, it frequently happens that the audio portion or sound track is to be in a language other than the original one spoken by the actor. Under such circumstances a new sound track in another language is "dubbed". When this is done the speech sounds do not correspond to the lip movements, resulting in an audio-visual presentation that looks unreal or inferior.
In animated cartoons it is also a problem to provide lip movements which correspond to the speech sounds. This may be done, however, by utilizing individual art work or drawings for the lip movements, sometimes as many as several per second. Because of the necessity of making numerous drawings by hand or other laborious art techniques, animated cartoons tend to be expensive to produce.