1. Field of Invention
This invention relates to computer animation methods, systems and program products. More particularly, the invention relates to methods, systems and program products for encoding and decoding animated 3-D wireframe models for Internet streaming applications.
2. Description of Prior Art
Image, video, audio and computer graphics represent a major source of multimedia at the present time. Increasing demand for modern applications such as audio and video conferencing or IP telephony has made audio, video and still image media particularly popular and has fuelled further research and development in the processing of their signals, as well as in multimedia communications. As a result, international standardization efforts have taken place.
Applications such as digital TV broadcasting, interactive 3-D games and e-shopping, combined with the high popularity of the Internet, are rapidly changing the landscape and demanding richer interactive multimedia services. Existing media gain new importance; 3-D graphics and animation are among them. Despite the existence of a plethora of file formats and encodings for 3-D data, e.g., Virtual Reality Modeling Language (VRML 2.0), an efficient compression and animation framework is sought in the context of the Moving Picture Experts Group (MPEG-4) standard.
MPEG-4 attempts to provide a state-of-the-art standard that covers, among others, the aforementioned areas of 3-D scene compression and delivery through a tool set called Binary Format for Scenes (BIFS). BIFS is the compressed binary format in which 3-D scenes are defined, modified (BIFS-Command) and animated (BIFS-Anim). BIFS is derived from VRML, which it extends while preserving backward compatibility. In addition to BIFS, the Synthetic/Natural Hybrid Coding (SNHC) tools yield reasonably high compression for still textures, face and body animation, and 3-D mesh coding. In MPEG-4, the two main animation tools, BIFS-Anim and ‘face-anim’, are based on a Differential Pulse Code Modulation (DPCM) system and arithmetic encoding that allow for low delay coding.
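The DPCM principle mentioned above can be illustrated with a minimal sketch. It is not the normative BIFS-Anim or face-anim algorithm: the uniform quantizer, the previous-sample predictor, and the names `dpcm_encode`, `dpcm_decode` and `step` are assumptions chosen for illustration, and the arithmetic coder that would compress the resulting symbols is omitted.

```python
def dpcm_encode(samples, step):
    """Closed-loop DPCM with a uniform quantizer (illustrative assumption).

    Each sample is predicted by the decoder-side reconstruction of the
    previous sample, so quantization error does not accumulate.
    """
    symbols = []
    prediction = 0.0  # reconstruction the decoder will also hold
    for x in samples:
        q = round((x - prediction) / step)  # quantized prediction error
        symbols.append(q)                   # would be fed to an entropy coder
        prediction += q * step              # track the decoder's state
    return symbols


def dpcm_decode(symbols, step):
    """Invert dpcm_encode by accumulating the dequantized differences."""
    reconstructed = []
    prediction = 0.0
    for q in symbols:
        prediction += q * step
        reconstructed.append(prediction)
    return reconstructed


if __name__ == "__main__":
    track = [0.0, 0.5, 1.1, 1.0, 0.2]  # hypothetical animation parameter track
    codes = dpcm_encode(track, step=0.25)
    print(codes)
    print(dpcm_decode(codes, step=0.25))
```

Because the encoder predicts from the reconstructed value rather than from the original one, each decoded sample stays within half a quantization step of its source value, and each sample can be coded as soon as it arrives, which is consistent with the low delay noted above.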
Designing a codec suitable for the best-effort Internet requires, besides the signal-processing domain, special consideration of the channel characteristics. These refer to packet loss, reordering and duplication, delay, delay variation (jitter) and even fragmentation. Traditional packet audio/video tools include the Robust Audio Tool (RAT), described in “Successful Multiparty Audio Communication over the Internet” by V. Hardman, M. A. Sasse, I. Kouvelas, published in Communications of the ACM, Vol. 41, No. 5, 1998, and the Video Conference Tool (VIC), described in “vic: A Flexible Framework for Packet Video” by S. McCanne, V. Jacobson, published in ACM Multimedia '95, San Francisco, Calif. November 1998. Both tools use the Real-Time Transport Protocol (RTP), described in “RTP: A Transport Protocol for Real-Time Applications” by H. Schulzrinne, S. Casner, R. Frederick, V. Jacobson, published in RFC 1889, Internet Engineering Task Force, January 1996, and include adaptive playout buffering algorithms to cope with variable delay. Loss cannot be predicted, whether it appears in the short term as individual packets being independently dropped or in the longer term as loss bursts or network outages. In such cases, an error resilience scheme is desirable along with a good payload format design. Best common practice guidelines for writers of RTP payload formats are provided in “Guidelines for Writers of RTP Payload Format Specifications” by M. Handley, C. Perkins, published in RFC 2736, Internet Engineering Task Force, December 1999.
What is needed in the art is a coding technique for 3-D animated wireframe models that is suitable for Internet streaming applications and resilient to short-term packet loss and to loss bursts of short average length, at loss rates of up to 30%.
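The adaptive playout buffering mentioned above can be sketched as follows. This is not taken from RAT or VIC: it assumes a commonly used exponentially weighted moving average of the network delay and of its variation, and the function name `playout_delays` and the parameters `alpha` and `safety` are hypothetical.

```python
def playout_delays(network_delays, alpha=0.9, safety=4.0):
    """Estimate a playout delay after each packet arrival.

    d tracks the smoothed network delay and v the smoothed deviation from
    it; the playout delay is d plus a safety multiple of v, so the buffer
    grows when jitter rises and shrinks when the path is stable.
    """
    d = network_delays[0]  # initialise the estimate with the first sample
    v = 0.0
    estimates = []
    for n in network_delays:
        d = alpha * d + (1 - alpha) * n
        v = alpha * v + (1 - alpha) * abs(d - n)
        estimates.append(d + safety * v)
    return estimates


if __name__ == "__main__":
    # Hypothetical one-way delays in milliseconds, with one late packet.
    print(playout_delays([50, 50, 90, 50]))
```

A larger `safety` value trades extra latency for fewer late packets; such a buffer only absorbs delay variation and does nothing for packets that are lost outright, which is why a separate error resilience scheme is still desirable.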
Prior art related to computer animation includes:
U.S. Pat. No. 5,818,463 entitled “Data Compression For Animated Three Dimensional Objects”, issued Oct. 6, 1998, discloses that data representing an animation sequence of a three dimensional object at a series of discrete time frames is compressed by identifying characteristic features of the object; generating a quadrangular mesh representation of the object, whereby the object is mathematically defined by dividing it into one or more regions and hierarchically representing each region by a mesh, each mesh including three coordinate matrices which define the positions of nodes within the mesh; selecting from the mesh representation of the features a set of animation parameters which are capable of specifying changes in the mesh corresponding to the animation of the object; compressing each region mesh by applying pyramid progressive coding to the coordinate matrices for the mesh; storing the initial values for the animation parameters at the beginning of the animation sequence; and, at each time frame after the beginning of the animation sequence, estimating the current values of the parameters and compressing each parameter by estimating the change in the value of the parameter by subtracting its stored value for the previous time frame from its current value, quantizing the estimated difference, applying entropy coding to the quantized difference and updating the stored value with the decoded value.
U.S. Pat. No. 6,047,088 entitled “2D Mesh Geometry And Motion Vector Compression”, issued Apr. 4, 2000, discloses video data coding that permits coding of video information with optional, enhanced functionalities. Video data is coded as base layer data and enhancement layer data. The base layer data includes conventional motion compensated transform encoded texture and motion vector data. Optional enhancement layer data contains mesh node vector data. Mesh node vector data of the enhancement layer may be predicted based on motion vectors of the base layer. A back channel permits a decoder to affect how mesh node coding is performed in the encoder. The decoder may command the encoder to reduce or eliminate encoding of mesh node motion vectors. The back channel finds application in single layer systems and two layer systems.
U.S. Pat. No. 6,339,618 entitled “Mesh Node Motion Coding To Enable Object Based Functionality's Within A Motion Compensated Transform Video Coder”, issued Jan. 15, 2002, discloses single and progressive-resolution coding algorithms for the compression of 3-D polyhedral meshes. In the single-resolution mode, the mesh topology (or connectivity) is encoded by a constructive traversing approach applied to the dual graph of the original mesh while the mesh geometry is encoded by successive quantization and bit-plane coding (achieved by context arithmetic coding). In the progressive-resolution mode, the mesh is represented by a coarse approximation (i.e., the base mesh) and a sequence of refinements. Both the base mesh and the refinement operations are entropy coded so that a series of mesh models of continuously varying resolutions can be constructed from the coded bit stream. Topological and geometrical data of a 3-D mesh are encoded separately according to their importance and then integrated into a single bit stream. In decoding, the decoder finds from the bit stream the most important information and gradually adds finer detailed information to provide a more complete 3-D graphic model. The decoder can stop at any point while giving a reasonable reconstruction of the original model.
U.S. Patent Application Publication No. 20010028744 entitled “Method For Processing Nodes In 3D Scene An Apparatus Thereof”, published Oct. 11, 2001, discloses a method and apparatus for processing nodes in a 3-dimensional (3D) scene. The method includes the steps of identifying a 3D mesh node having 3D mesh information representing a 3D shape which is formed by constructing faces from vertices among nodes contained in a 3D scene to be processed; and encoding or decoding the identified 3D mesh node. Also, the method includes the step of transmitting or storing the 3D mesh information of the encoded 3D mesh node through an independent stream separate from the 3D scene description stream. According to the method, a node representing 3D mesh information having a huge volume of information in a 3D scene can be efficiently encoded and decoded so that the 3D scene can be efficiently transmitted and stored. By transmitting and storing the 3D mesh information of a node representing encoded 3D mesh information through an independent stream separate from the 3D scene description information, the entire 3D scene is not affected even though the encoded 3D mesh information has a huge volume.
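The progressive property described for the mesh coding reference above, in which a decoder can stop at any point and still obtain a reasonable reconstruction, can be illustrated with a minimal bit-plane sketch. It is not the cited patent's method: it assumes vertex coordinates already quantized to non-negative integers, omits the context arithmetic coding of each plane, and uses the hypothetical names `to_bit_planes` and `from_bit_planes`.

```python
def to_bit_planes(values, bits):
    """Split non-negative integers into bit planes, most significant first."""
    return [[(v >> b) & 1 for v in values] for b in range(bits - 1, -1, -1)]


def from_bit_planes(planes, bits):
    """Rebuild values from however many leading planes were received.

    Planes that have not arrived are treated as zero, so decoding a prefix
    of the stream yields a coarser version of every value.
    """
    values = [0] * len(planes[0]) if planes else []
    for i, plane in enumerate(planes):
        shift = bits - 1 - i
        values = [v | (bit << shift) for v, bit in zip(values, plane)]
    return values


if __name__ == "__main__":
    coords = [5, 12, 9]  # hypothetical 4-bit quantized coordinates
    planes = to_bit_planes(coords, bits=4)
    print(from_bit_planes(planes[:2], bits=4))  # coarse reconstruction
    print(from_bit_planes(planes, bits=4))      # exact reconstruction
```

Sending the most significant planes first orders the data by importance, so truncating the stream degrades precision uniformly rather than dropping whole vertices.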
None of the prior art discloses or suggests a coding scheme for 3-D wireframe models suitable for Internet streaming applications in which the compression scheme efficiently codes IndexedFaceSet nodes, generates animation bitmasks similar to those in BIFS-Anim, and copes with delay and jitter in a robust manner to handle higher-level animations. The resulting bitstream has a flexible format for direct RTP packetization, for use in IP streaming and in real-time applications such as videoconferencing and e-commerce avatars.