1. Field of the Invention
This invention relates to the field of parameters for describing object animation, particularly to methods of reducing the number of graphic articulation parameters (GAPs) that must be conveyed to render an object.
2. Description of the Related Art
The existing and developing Moving Picture Experts Group (MPEG) standards provide techniques for coding digital video signals over band-limited channels. Included within the MPEG standards are parameter definitions which are to be used to describe the animation of 3-D objects, i.e., the movement of selected points of an image of a 3-D object from one video frame to the next. For example, graphic articulation parameters (GAPs) known as "body animation parameters" (BAPs) have been defined for describing the animation of the human body, and a set of "facial animation parameters" (FAPs) has been developed for describing the movements of a human face.
The MPEG-4 standard under development will include the capability to generate and transmit synthetic "talking head" video for use in multimedia communication systems, and will use the FAP set to convey facial animation information. The FAP set enables model-based coding of natural or synthetic talking head sequences and allows intelligible reproduction of facial expressions, emotions and speech pronunciations at the receiver. Currently, the FAP set contains 68 parameters that define the shape deformation or movements of a face. For example, the parameter open_jaw defines the displacement of the jaw in the vertical direction, while the parameter head_yaw specifies the rotational yaw angle of the head from the top of the spine. All the FAPs are defined with respect to a neutral face and expressed in a local coordinate system fixed on the face.
The digitizing of video information typically produces very large amounts of data, which require vast amounts of storage capacity if the data is to be stored on a hard drive, CD-ROM, or DVD, for example. Similarly, transmitting the video data over a distance via some type of communications link requires a considerable amount of bandwidth. For example, the 68 parameters of the FAP set are defined as having 10 bits each. State-of-the-art modems provide 56 kbits/sec downstream capability from a central location to a home. Since the 68 FAPs, represented by 10 bits each at a 30 Hz video rate, require only 20.4 kbits/sec, it is possible to transmit them uncoded and thus preserve their visual quality. However, this approach neither contemplates nor supports the simultaneous transmission of multiple talking heads as part of a single video signal, as may occur in a virtual meeting, for example, or the transmission of the FAPs as part of larger synthetic objects, for example, full-body animation.
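The arithmetic behind the 20.4 kbits/sec figure can be checked with a short calculation. The sketch below is illustrative only: the function name and the `MODEM_KBITS` constant are hypothetical, and the estimate ignores protocol overhead, audio, and any other data sharing the link.

```python
def uncoded_bitrate_kbits(num_params: int, bits_per_param: int, frame_rate_hz: float) -> float:
    """Bit rate, in kbits/sec, needed to send every parameter uncoded in every frame."""
    return num_params * bits_per_param * frame_rate_hz / 1000.0


# Figures taken from the text: 68 FAPs, 10 bits each, 30 Hz video rate.
single_head = uncoded_bitrate_kbits(68, 10, 30)

# Downstream modem capacity cited in the text (kbits/sec).
MODEM_KBITS = 56.0

# Number of uncoded talking heads that fit in the link under these assumptions.
max_heads = int(MODEM_KBITS // single_head)

print(f"one head: {single_head} kbits/sec, heads per link: {max_heads}")
```

Under these assumptions, a third uncoded head would already exceed the link capacity, which illustrates why multiple talking heads or full-body animation call for data reduction.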
To conserve storage space and to permit the use of currently available communications links, methods of reducing the amount of data required to animate an object are needed.
A data reduction and representation method is presented which reduces the number of parameters that must be stored or transmitted to animate an object, and provides a representation scheme which enables the animated object to be reconstructed from the reduced number of conveyed parameters.
An interpolation process is used to identify a number of parameters that can be derived from other parameters. A data structure, preferably a directed graph, is then created which depicts the identities of the "derived" parameters, the "defining" parameters from which derived parameters can be interpolated, and the relationship between them. The parameters reside at nodes on the graph which are interconnected with directed links that indicate the "parent" to "child" relationship of a defining parameter to a derived parameter. Each directed link represents one or more interpolation functions, each of which defines how a derived parameter is to be interpolated from its respective defining parameters.
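A directed graph of this kind can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the representation used by the method itself: the class names, the parameter names (`right_corner`, `left_corner`, `mid_lip`), and the two interpolation functions are all hypothetical, and each link here carries exactly one interpolation function.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence


@dataclass
class Link:
    """Directed link from one or more defining (parent) parameters to a derived (child) parameter."""
    parents: List[str]
    child: str
    interpolate: Callable[[Sequence[float]], float]


@dataclass
class InterpolationGraph:
    links: List[Link] = field(default_factory=list)

    def add_link(self, parents: List[str], child: str,
                 interpolate: Callable[[Sequence[float]], float]) -> None:
        self.links.append(Link(parents, child, interpolate))

    def reconstruct(self, frame: Dict[str, float]) -> Dict[str, float]:
        """Fill in every derived parameter whose parents are known, repeating until nothing changes."""
        values = dict(frame)
        progress = True
        while progress:
            progress = False
            for link in self.links:
                if link.child in values:
                    continue  # conveyed directly or already interpolated
                if all(p in values for p in link.parents):
                    values[link.child] = link.interpolate([values[p] for p in link.parents])
                    progress = True
        return values


graph = InterpolationGraph()
# Hypothetical relationships: the midpoint depends on both corners,
# and the left corner mirrors the right corner.
graph.add_link(["left_corner", "right_corner"], "mid_lip", lambda v: 0.5 * (v[0] + v[1]))
graph.add_link(["right_corner"], "left_corner", lambda v: v[0])

# Only the defining parameter is conveyed; the rest are reconstructed.
result = graph.reconstruct({"right_corner": 4.0})
print(result)
```

Because `mid_lip` depends on `left_corner`, which is itself derived, the reconstruction loops over the links until no further parameter can be filled in. A parameter that is conveyed in the frame is never overwritten by interpolation.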
An extended rational polynomial is preferably used to specify the interpolation functions defined for respective derived parameters. For an interpolation to be carried out, the polynomial must be supplied with a set of values (e.g., number of terms, coefficients, exponent values); supplying such a set of values to the polynomial enables a derived parameter to be interpolated from its defining parameters. Sets of values are determined for each of the derived parameters when creating the directed graph, and are stored or transmitted along with the graph. The graph and the sets of values are retrieved from storage or received over a communications link to set up a decoder. Frames containing defining parameters are then sent to the decoder, which performs the interpolations as directed by the graph, using the interpolation functions (by supplying the sets of values to the polynomial) to reconstruct the unstored or untransmitted parameters.
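To make the role of the "set of values" concrete, the sketch below evaluates a rational polynomial from a list of terms. The exact form is an assumption made for illustration: the function is taken to be a ratio of two sums, where each term is a coefficient multiplied by the defining parameters raised to per-parameter exponents. The names `Term` and `rational_interpolate` are hypothetical, and any extensions beyond this basic ratio are not modeled.

```python
from dataclasses import dataclass
from math import prod
from typing import List, Sequence


@dataclass
class Term:
    """One polynomial term: coefficient * product of parents[j] ** exponents[j]."""
    coefficient: float
    exponents: List[int]  # one exponent per defining parameter


def _sum_terms(terms: Sequence[Term], parents: Sequence[float]) -> float:
    return sum(
        t.coefficient * prod(p ** e for p, e in zip(parents, t.exponents))
        for t in terms
    )


def rational_interpolate(numerator: Sequence[Term],
                         denominator: Sequence[Term],
                         parents: Sequence[float]) -> float:
    """Interpolate a derived parameter as numerator(parents) / denominator(parents)."""
    denom = _sum_terms(denominator, parents)
    if denom == 0:
        raise ZeroDivisionError("denominator polynomial evaluates to zero")
    return _sum_terms(numerator, parents) / denom


# Hypothetical set of values: derived = 0.5 * a + 0.5 * b, with a constant denominator of 1.
numerator = [Term(0.5, [1, 0]), Term(0.5, [0, 1])]
denominator = [Term(1.0, [0, 0])]

value = rational_interpolate(numerator, denominator, [2.0, 6.0])
print(value)
```

In this sketch the number of terms, the coefficients, and the exponents are the only data that need to accompany the graph; the decoder applies them to the defining parameters received in each frame.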
The data reduction and representation method described herein can be utilized for a number of purposes. As noted above, interpolating some parameters from other parameters reduces the amount of data that must be stored or transmitted. Interpolation also enables the creation of views of an object when information on such a view is not available. For example, a decoder could interpolate the left side of a face based on FAPs describing the animation of its right side. The method also permits a hierarchical relationship to be defined between "high level" parameters like the "expression" and "viseme" FAPs, and the lower level FAPs which can be interpolated from them. Many applications would benefit from the method's data reduction capabilities, such as interactive 3-D games, computer kiosks and talking agents.