1. Field of Invention
This invention relates generally to animation producing methods and apparatuses, and more particularly is directed to a method for automatically animating lip synchronization and facial expression for three dimensional characters.
2. Description of the Related Art
Various methods have been proposed for animating lip synchronization and facial expressions of animated characters in animated products such as movies, videos, cartoons, CDs, and the like. Prior methods in this area have long lacked an economical means of animating lip synchronization and character expression in the production of animated products, owing to the extremely laborious and lengthy protocols of prior traditional and computer animation techniques. These shortcomings have significantly limited all prior lip synchronization and facial expression methods and apparatuses used for the production of animated products. Indeed, the cost and time required to produce adequate lip synchronization or facial expression in an animated product, together with the inherent inability of prior methods and apparatuses to satisfactorily provide lip synchronization or express character feelings and emotion, leave a significant gap in the potential of animation methods and apparatuses in the current state of the art.
Time aligned phonetic transcriptions (TAPTS) are phonetic transcriptions of a recorded text or soundtrack in which the time of occurrence of each phoneme is also recorded. A "phoneme" is defined as the smallest unit of speech, and corresponds to a single sound. There are several standard phonetic "alphabets", such as the International Phonetic Alphabet, and TIMIT, created by Texas Instruments, Inc. and MIT. Such transcriptions can be created by hand, as they currently are in the traditional animation industry, where they are called "x" sheets or "gray sheets" in the trade. Alternatively, such transcriptions can be created by automatic speech recognition programs, or the like.
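A time aligned phonetic transcription can be pictured as a list of phonemes, each carrying its start and end time. The following is a minimal sketch only; the names `TimedPhoneme` and `phoneme_at`, the phoneme labels, and the timings are illustrative assumptions and are not taken from any particular transcription format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedPhoneme:
    """One entry of a time aligned phonetic transcription (hypothetical layout)."""
    phoneme: str   # symbol from whichever phonetic alphabet is in use
    start: float   # time at which the phoneme begins, in seconds
    end: float     # time at which the phoneme ends, in seconds


def phoneme_at(transcription, t):
    """Return the phoneme sounding at time t, or None if no entry covers t."""
    for entry in transcription:
        if entry.start <= t < entry.end:
            return entry.phoneme
    return None


# Invented transcription of the word "hello"; labels and timings are made up.
tapt = [
    TimedPhoneme("hh", 0.00, 0.08),
    TimedPhoneme("eh", 0.08, 0.20),
    TimedPhoneme("l", 0.20, 0.30),
    TimedPhoneme("ow", 0.30, 0.55),
]
```

Looking up a time such as 0.10 seconds in this list yields the phoneme active at that instant, which is the information an animator reads off an "x" sheet by hand.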
The current practice for three dimensional computer generated speech animation is by manual techniques, commonly using a "morph target" approach. In this practice, a reference model of a neutral mouth position is used together with several other mouth positions, each corresponding to a different phoneme or set of phonemes. These models are called "morph targets". Each morph target has the same topology as the neutral model and the same number of vertices, and each vertex on each model logically corresponds to a vertex on each other model. For example, vertex #n on all models represents the left corner of the mouth; although this is the typical case, such rigid correspondence may not be necessary.
The deltas of each vertex on each morph target relative to the neutral are computed as a vector from each vertex n on the reference to each vertex n on each morph target. These are called the delta sets. There is one delta set for each morph target.
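The delta set computation described above can be sketched as follows. This is an illustrative sketch, assuming each model is a list of (x, y, z) tuples with matching vertex order; the function name `compute_delta_set` and the sample coordinates are hypothetical.

```python
def compute_delta_set(neutral, target):
    """Return one vector per vertex, pointing from the neutral model to the target."""
    if len(neutral) != len(target):
        raise ValueError("models must have the same number of vertices")
    return [
        tuple(t - n for n, t in zip(neutral_vertex, target_vertex))
        for neutral_vertex, target_vertex in zip(neutral, target)
    ]


# Two-vertex toy models; the coordinates are invented for illustration.
neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
oh_target = [(0.0, -0.5, 0.0), (1.0, 0.5, 0.25)]

# One delta set per morph target.
oh_delta = compute_delta_set(neutral, oh_target)
```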
In producing animation products, a value usually from 0 to 1 is assigned to each delta set by the animator and the value is called the "morph weight". From these morph weights, the neutral's geometry is modified as follows: Each vertex n on the neutral has the corresponding delta set's vertex multiplied by the scalar morph weight added to it. This is repeated for each morph target, and the result summed. For each vertex v in the neutral model:

$$|\text{result}| = |\text{neutral}| + \sum_{x} |\text{delta set}_x| \cdot \text{morph weight}_x$$

where the sum runs over all delta sets x and the symbol |xxx| is used to indicate the corresponding vector in each referenced set. For example, |result| is the resultant vertex corresponding to vertex v in the neutral model |neutral|, and |delta set_x| is the corresponding vector for delta set x.
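The blending rule described above (each neutral vertex plus the weighted sum of its delta vectors) can be sketched as follows. This is an illustration under assumptions: the function name `blend`, the dictionary layout keyed by morph target name, and the sample coordinates are all hypothetical.

```python
def blend(neutral, delta_sets, weights):
    """Apply result = neutral + sum over x of (delta set x * morph weight x), per vertex.

    neutral:    list of (x, y, z) vertices
    delta_sets: dict mapping a morph target name to its list of delta vectors
    weights:    dict mapping a morph target name to its scalar morph weight
    """
    result = []
    for i, (x, y, z) in enumerate(neutral):
        for name, weight in weights.items():
            dx, dy, dz = delta_sets[name][i]
            x += dx * weight
            y += dy * weight
            z += dz * weight
        result.append((x, y, z))
    return result


# One-vertex toy model with two delta sets; the values are invented.
neutral = [(1.0, 2.0, 0.0)]
delta_sets = {"oh": [(0.0, -1.0, 0.5)], "ee": [(2.0, 0.0, 0.0)]}

full_oh = blend(neutral, delta_sets, {"oh": 1.0, "ee": 0.0})
half_oh = blend(neutral, delta_sets, {"oh": 0.5, "ee": 0.0})
```

With the "oh" weight at 1 and all others at 0, the result equals the neutral offset by the full "oh" delta; at 0.5 it lies half way between the two.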
If the morph weight of the delta set corresponding to the morph target of the character saying, for example, the "oh" sound is set to 1, and all others are set to 0, the neutral is modified to look like the "oh" target. If the situation is the same, except that the "oh" morph weight is 0.5, the neutral's geometry is modified to lie half way between the neutral and the "oh" morph target.
Similarly, if the situation is as described above, except that the "oh" morph weight is 0.3 and the "ee" morph weight is 0.7, the neutral geometry is modified to have some of the "oh" model characteristics and more of the "ee" model characteristics. There are also prior blending methods that include averaging the delta sets according to their weights.
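The text does not spell out how the delta sets are averaged according to their weights. One possible reading, offered here only as an assumption, is a weight-normalized average in which the weighted sum of delta vectors is divided by the sum of the weights. The function name `average_blend_vertex` and the sample vectors are hypothetical.

```python
def average_blend_vertex(neutral_vertex, deltas, weights):
    """Offset one vertex by the weight-normalized average of its delta vectors.

    Assumed formula (not stated in the text):
        result = neutral + sum(delta_x * w_x) / sum(w_x)
    """
    total = sum(weights)
    if total == 0:
        return tuple(neutral_vertex)  # no active morph targets: stay neutral
    return tuple(
        n + sum(d[axis] * w for d, w in zip(deltas, weights)) / total
        for axis, n in enumerate(neutral_vertex)
    )


# Invented delta vectors for a single vertex.
neutral_vertex = (0.0, 0.0, 0.0)
oh_delta = (0.0, -1.0, 0.0)
ee_delta = (1.0, 0.0, 0.0)

# "oh" at 0.3 and "ee" at 0.7: some "oh" character, more "ee" character.
blended = average_blend_vertex(neutral_vertex, [oh_delta, ee_delta], [0.3, 0.7])
```

Under this reading, scaling every weight by the same factor leaves the result unchanged, whereas the purely additive rule would move the vertex further. When the weights sum to 1, the two rules give the same result.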
Accordingly, to animate speech, the artist needs to set all of these weights at each frame to an appropriate value. Usually this is assisted by using a "keyframe" approach, where the artist sets the appropriate weights at certain important times ("keyframes") and a program interpolates each of the channels at each frame. Such a keyframe approach is very tedious and time consuming, as well as inaccurate, due to the large number of keyframes necessary to depict speech.
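The keyframe approach described above can be sketched for a single morph weight channel as follows. The text does not say which interpolation is used; linear interpolation is assumed here for simplicity, and the function name `interpolate_channel` and the keyframe values are hypothetical.

```python
def interpolate_channel(keyframes, frame):
    """Linearly interpolate one morph weight channel at the given frame.

    keyframes: list of (frame_number, weight) pairs with strictly
               increasing frame numbers, as set by the artist.
    Frames before the first or after the last keyframe hold the end value.
    """
    if not keyframes:
        raise ValueError("channel has no keyframes")
    if frame <= keyframes[0][0]:
        return keyframes[0][1]
    if frame >= keyframes[-1][0]:
        return keyframes[-1][1]
    for (f0, w0), (f1, w1) in zip(keyframes, keyframes[1:]):
        if f0 <= frame <= f1:
            t = (frame - f0) / (f1 - f0)  # position between the two keyframes
            return w0 + (w1 - w0) * t


# Invented "oh" channel: closed at frame 0, fully open at 4, closed again at 12.
oh_channel = [(0, 0.0), (4, 1.0), (12, 0.0)]
weights_per_frame = [interpolate_channel(oh_channel, f) for f in range(13)]
```

Even this short mouth movement needs three keyframes on one channel; a full line of dialogue needs keyframes on every channel for nearly every phoneme, which is the source of the tedium noted above.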
The present invention overcomes many of the deficiencies of the prior art and obtains its objectives by providing an integrated method embodied in computer software for use with a computer for the rapid, efficient lip synchronization and manipulation of character facial expressions, thereby allowing for rapid, creative, and expressive animation products to be produced in a very cost effective manner.
Accordingly, it is the primary object of this invention to provide a method for automatically animating lip synchronization and facial expression of three dimensional characters, which is integrated with computer means for producing accurate and realistic lip synchronization and facial expressions in animated characters. The method of the present invention further provides an extremely rapid and cost effective means to automatically create lip synchronization and facial expression in three dimensional animated characters.
Additional objects and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objects and advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out in the appended claims.