The invention relates to computer-generated graphics and, more specifically, to compression of time-dependent geometric data.
In three-dimensional (3D) graphics, moving objects are modeled using 3D geometric models. These models are typically represented as sets of 3D coordinates that define the position of a mesh of surface elements representing the surface of an object in a 3D space. To render a scene containing 3D object models, the graphics rendering pipeline first performs a series of geometric transformations to transform models from their local coordinate space to global or “world” coordinates of the scene and then to viewing or “camera” coordinates of a 2D view space. It then converts the transformed geometry and its attributes (color, shading) to an array of pixel values representing an output image. This process is typically repeated for each image frame in an animation sequence as the object models move about the scene.
A moving graphical object is expressed in terms of time-dependent geometry. The mesh representing the 3D positions of a model moves and deforms over time to simulate the motion of a 3D object. There are a number of motion models used to describe the motion of 3D geometry in animation. Relatively simple objects can be modeled using geometric transformations on rigid bodies. However, increased computing capacity and the desire for more realistic animation have increased the demand for applications involving real-time playback of complex animated models.
Sophisticated authoring tools and modeling programs, such as the Softimage modeling tool from Avid Technology, Inc., are capable of creating extremely complex time-dependent geometry. Free-form deformation lattices, joint envelopes, physical simulation, and other manipulations can create complex moving geometry sequences. As real-time applications demand more than simple rigid models with animated transformations, it becomes more critical to develop ways to efficiently store and play back complex animated models with real-time performance.
In addition to increasingly sophisticated authoring tools, advances in 3D capture systems are also likely to increase the complexity of time-dependent geometry used in 3D graphics applications. The term “3D capture” refers to the process of generating a digitized 3D model of a real object. Range scanners currently produce static geometry sets. However, as range-scanner accuracy and speed improve, there will be more sources of large time-dependent geometric meshes. Simulation is another source of rich animated geometry. Finite-element methods produce realistic and complex animations that are too expensive to compute in real time.
As the sources for complex time-dependent geometry become more prevalent, there is an increasing need for more efficient ways to store and transmit time-dependent geometry to reduce memory and bandwidth requirements. Researchers have studied ways to compress static geometry. See “Geometric Compression,” Michael F. Deering, pp. 13-20, SIGGRAPH '95; “Optimized Geometry Compression for Real-Time Rendering,” Mike M. Chow, pp. 347-354, Proceedings of IEEE Visualization '97; “Real Time Compression of Triangle Mesh Connectivity,” Stefan Gumhold and Wolfgang Straßer, pp. 133-140, SIGGRAPH '98; “Geometric Compression Through Topological Surgery,” Gabriel Taubin and Jarek Rossignac, ACM Transactions on Graphics, Vol. 17, No. 2, April 1998, pp. 84-115; “Progressive Forest Split Compression,” Gabriel Taubin, Andre Gueziec, William Horn, and Francis Lazarus, pp. 123-132, SIGGRAPH '98; “Triangle Mesh Compression,” Costa Touma and Craig Gotsman, Proceedings of Graphics Interface '98, pp. 26-34; and “Description of Core Experiments on 3D Model Coding,” Frank Bossen (editor), ISO/IEC JTC1/SC29/WG11 MPEG98/N244rev1, Atlantic City, October 1998. While this research addresses compression of static geometry, more work needs to be done to develop ways to compress a moving 3D geometry stream.
In contrast to compression of 3D geometry, the fields of still image and moving image compression are well developed. A variety of techniques can be used to compress still images, such as run-length encoding, JPEG coding, etc. There are also many techniques for compressing image sequences such as MPEG, AVI, etc. Researchers have even presented techniques to use 3D geometry to assist in movie compression. See “Motion Compensated Compression of Computer Animated Frames,” Brian K. Guenter, Hee Cheol Yun, and Russell M. Mersereau, pp. 297-304, SIGGRAPH '93; “Polygon-Assisted JPEG and MPEG Compression of Synthetic Images,” Marc Levoy, pp. 21-28, SIGGRAPH '95; and “Accelerated MPEG Compression of Dynamic Polygonal Scenes,” Dan S. Wallach, Sharma Kunapalli, and Michael F. Cohen, pp. 193-197, SIGGRAPH '94.
In one respect, the traditional graphics rendering pipeline provides a form of compression of animated geometry in the case where an animated object is represented as a static, rigid body that is transformed using a series of animated transformation matrices. In this case, the time-dependent geometric model is reduced to a single mesh representing the rigid body and a series of animated transformation matrices that describe the rigid body's motion over time. This simple separation into coherent parts allows the encoding of a large family of time-dependent animations because moving objects can be constructed as hierarchies of rigid objects. While this is an effective way to compress a limited class of time-dependent geometry, it does not fully address the need for a more general and flexible approach for compressing more complex animated models. Some forms of complex motion are not well simulated using a hierarchy of rigid bodies and associated transformation matrices. In addition, some models are not constructed from rigid bodies, but instead, originate from a geometry source such as an authoring tool or 3D capture tool where the geometry is not expressed in terms of rigid bodies.
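As a rough illustration of the rigid-body case described above, the following Python sketch stores one mesh and one 3x4 transformation matrix per frame, and regenerates each frame by applying its matrix to the mesh. The function names, the choice of a rotation about the z axis, and the vertex and frame counts are assumptions made for this example only; they do not come from the text.

```python
import math

def rigid_transform(angle_z, translation):
    """Build a 3x4 matrix: rotation about the z axis followed by a translation."""
    c, s = math.cos(angle_z), math.sin(angle_z)
    tx, ty, tz = translation
    return [[c, -s, 0.0, tx],
            [s, c, 0.0, ty],
            [0.0, 0.0, 1.0, tz]]

def apply(matrix, mesh):
    """Apply a 3x4 transformation matrix to every (x, y, z) vertex of the mesh."""
    return [tuple(row[0] * x + row[1] * y + row[2] * z + row[3] for row in matrix)
            for (x, y, z) in mesh]

def storage_counts(num_vertices, num_frames):
    """Numbers stored without and with the mesh/matrix separation."""
    raw = num_frames * num_vertices * 3            # every vertex in every frame
    compact = num_vertices * 3 + num_frames * 12   # one mesh + one 3x4 matrix per frame
    return raw, compact

# Hypothetical rigid mesh (three vertices) and four animated transforms.
mesh = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
transforms = [rigid_transform(math.pi / 2 * k / 3, (float(k), 0.0, 0.0))
              for k in range(4)]

# Each frame is regenerated from the single mesh and that frame's matrix.
frames = [apply(m, mesh) for m in transforms]
```

For a hypothetical model with 1000 vertices and 100 frames, `storage_counts(1000, 100)` gives 300000 numbers for the raw frames against 4200 for the mesh plus matrices. The saving only applies when the motion really is rigid, which is the limitation the paragraph above points out.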
The invention provides methods for coding time-dependent geometry and animation. Aspects of these methods can be implemented in encoders and decoders of time-dependent meshes representing animated 3D objects as well as 3D animation that varies over a dimension other than time. These techniques can be used to store and transfer a 3D geometry stream more efficiently. This is useful within a computer system to reduce bandwidth between a host processor or storage device and a graphics rendering engine. It is also useful for reducing transmission bandwidth between computers on a local or wide area network. In addition, these techniques are useful in dynamic compression contexts, where a geometry stream is encoded within time constraints, such as applications where the geometry stream is generated, coded and then decoded for immediate playback.
In general, the compression methods of the invention code a geometry stream by solving for low-parameter models of the stream and encoding the residual. A compressor operates on a time-dependent geometry structure representing 3D positions of an object at selected time samples. In particular, the coders described below focus on compressing a matrix of vertex positions that represents the 3D positions of a mesh (the columns of the matrix) for a series of time samples in an animation sequence (the rows in the matrix represent meshes at selected time samples). The compressor approximates the mesh for each time sample and encodes the residual between the approximated mesh and the actual mesh from a row in the matrix. The compressor encodes a coherent portion of the geometry or base mesh, the residual, and parameters used to approximate the mesh. The decompressor decodes the compressed geometry stream and reconstructs the mesh for selected time samples from the coherent portion, the residual and the parameters used to approximate each mesh.
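The encode/decode cycle described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the low-parameter model is taken to be a single per-frame translation of a base mesh, and positions are assumed to be quantized to integers so that the round trip is exact. The text does not prescribe either choice, and the function names are hypothetical.

```python
def fit_translation(base, frame):
    """Assumed low-parameter model: one integer translation per frame,
    taken as the rounded mean offset between the frame and the base mesh."""
    n = len(base)
    return tuple(round(sum(f[a] - b[a] for b, f in zip(base, frame)) / n)
                 for a in range(3))

def encode_frame(base, frame):
    """Return the model parameters and the residual left after approximation."""
    t = fit_translation(base, frame)
    residual = [tuple(f[a] - (b[a] + t[a]) for a in range(3))
                for b, f in zip(base, frame)]
    return t, residual

def decode_frame(base, t, residual):
    """Rebuild the frame from the base mesh, the parameters, and the residual."""
    return [tuple(b[a] + t[a] + r[a] for a in range(3))
            for b, r in zip(base, residual)]

# Hypothetical quantized data: a base mesh and one later time sample.
base = [(0, 0, 0), (10, 0, 0), (0, 10, 0)]
frame = [(5, 2, 0), (15, 2, 0), (5, 13, 0)]

t, residual = encode_frame(base, frame)
restored = decode_frame(base, t, residual)
```

When the model fits well, most residual entries are zero or small, so they cost fewer bits to encode than the raw positions would. Here only one residual coordinate is nonzero.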
One form of coder is a basis decomposition coder. Using principal component analysis, a matrix of time-dependent geometry can be decomposed into basis vectors and weights. The basis vectors and weights can be used to compute a residual by reconstructing an approximation of the original matrix from the basis vectors and weights, and then encoding the difference between the original matrix and the approximate matrix. In this form of coder, the residual is encoded along with the basis vectors and weights. An alternative basis decomposition coder decomposes a matrix of time-dependent geometry into basis vectors and weights and then uses column/row prediction on the weights to encode the weights.
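A basis decomposition of this kind can be sketched in pure Python as shown below. The sketch assumes power iteration with deflation as a stand-in for a full principal component analysis, skips mean-centering, and uses hypothetical function names; a production coder would more likely call a library singular value decomposition.

```python
import math

def top_component(M, iters=200):
    """Power iteration on M^T M: returns a unit-length basis vector.
    Assumes the starting vector is not orthogonal to the dominant direction."""
    rows, cols = len(M), len(M[0])
    b = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iters):
        w = [sum(M[i][j] * b[j] for j in range(cols)) for i in range(rows)]   # M b
        nb = [sum(M[i][j] * w[i] for i in range(rows)) for j in range(cols)]  # M^T (M b)
        norm = math.sqrt(sum(x * x for x in nb))
        if norm == 0.0:
            break
        b = [x / norm for x in nb]
    return b

def decompose(M, k):
    """Split M into k basis vectors, per-row weights, and a residual matrix."""
    R = [[float(x) for x in row] for row in M]
    bases, weights = [], []
    for _ in range(k):
        b = top_component(R)
        w = [sum(row[j] * b[j] for j in range(len(b))) for row in R]
        bases.append(b)
        weights.append(w)
        # Deflate: subtract this component so the next pass finds the next basis.
        R = [[row[j] - w[i] * b[j] for j in range(len(b))]
             for i, row in enumerate(R)]
    return bases, weights, R

def reconstruct(bases, weights, residual):
    """Approximation from bases and weights, plus the residual."""
    return [[sum(weights[c][i] * bases[c][j] for c in range(len(bases))) + residual[i][j]
             for j in range(len(residual[0]))]
            for i in range(len(residual))]

# Hypothetical matrix: 3 time samples (rows) x 4 coordinates (columns).
# Every row is a multiple of the first, so one basis vector captures it.
M = [[1, 2, 3, 4], [2, 4, 6, 8], [3, 6, 9, 12]]
bases, weights, residual = decompose(M, 1)
restored = reconstruct(bases, weights, residual)
```

Because each row of this example is a scaled copy of the same shape, the residual after one basis vector is numerically zero. Real animation data would leave a nonzero residual to be encoded alongside the basis vectors and weights.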
Another form of coder is a column/row predictor. This form of coder exploits coherence in a matrix representing time-dependent geometry. For example, the temporal coherence of the matrix of vertex positions can be exploited by encoding the difference between neighboring rows in the matrix. The spatial coherence can be exploited by encoding the difference between neighboring columns in the matrix. To improve coding efficiency, the rows and columns can be sorted to minimize the differences between neighboring rows/columns. Column and row prediction applies to a matrix representing vertex positions, including the input matrix, the matrix of residuals, and the matrix of weights from a principal component analysis on either of these matrices.
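Row and column prediction can be illustrated with simple difference coding. The Python sketch below assumes integer-quantized positions and predicts each row (or column) from its immediate predecessor; the function names are hypothetical, and the sorting step mentioned above is omitted.

```python
def encode_rows(matrix):
    """Temporal prediction: keep the first row, then store each row's
    difference from the row before it."""
    deltas = [list(matrix[0])]
    for prev, cur in zip(matrix, matrix[1:]):
        deltas.append([c - p for c, p in zip(cur, prev)])
    return deltas

def decode_rows(deltas):
    """Invert encode_rows by accumulating the differences row by row."""
    rows = [list(deltas[0])]
    for d in deltas[1:]:
        rows.append([p + x for p, x in zip(rows[-1], d)])
    return rows

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def encode_columns(matrix):
    """Spatial prediction: the same scheme applied across neighboring columns."""
    return transpose(encode_rows(transpose(matrix)))

def decode_columns(deltas):
    return transpose(decode_rows(transpose(deltas)))

# Hypothetical quantized positions: 3 time samples (rows) x 4 coordinates (columns).
V = [[100, 200, 300, 400],
     [101, 202, 303, 404],
     [102, 204, 306, 408]]

row_deltas = encode_rows(V)
col_deltas = encode_columns(V)
```

In this example the row differences after the first row are all `[1, 2, 3, 4]`, which are much smaller in magnitude than the positions themselves and would therefore suit an entropy coder better.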
The above coding techniques may be extended to other forms of geometric data used in animation. For example, they apply to coding of texture coordinates.