1. Field of the Invention
The present invention relates to a method and apparatus for encoding and decoding data, and more particularly, to a method and apparatus for encoding and decoding three-dimensional object data comprised of one of point texture data, voxel data, and octree data.
2. Description of the Related Art
One of the most fundamental goals of research on three-dimensional graphics has always been to generate graphic screens that are as realistic as real images. Accordingly, research has been carried out on rendering technology using a polygonal model, and as a result, a variety of modeling and rendering techniques capable of representing a very realistic three-dimensional environment have been developed. However, considerable effort and time are always required to build complicated three-dimensional graphic models. In addition, a considerable amount of data is needed to represent a very realistic and sophisticated environment, which may result in very low data storage and transmission efficiency.
Currently, a polygonal model is generally adopted to represent three-dimensional objects in computer graphics. In this technique, an arbitrary shape can be approximated by a set of colored polygons, for example, a set of triangles. Recent remarkable developments in software algorithms and graphics hardware have enabled a complicated object or scene to be visualized as a very realistic still (or moving) polygonal model in real time.
For the past few years, research has been vigorously carried out on techniques for representing three-dimensional objects, mostly because of the difficulty of constructing polygonal models for the variety of objects in the real world, the complexity of conventional rendering techniques, and the limits on representing images as realistically as possible.
An application program needs a considerable number of polygons to represent three-dimensional objects. For example, a model representing the human body in detail needs several million triangles, which is too many to handle. Even though recent developments in three-dimensional measurement technology, such as three-dimensional scanning, make it possible to obtain sophisticated three-dimensional data whose errors are within an allowable bound, it is still very difficult and very expensive to obtain a polygonal model that perfectly matches an object. In addition, a rendering algorithm that provides image representations as realistic as photographs is too complicated to support real-time rendering.
In the meantime, there is a relatively new method of representing or rendering an object having a complex geometrical structure, namely depth image-based representation (DIBR), which has been adopted in MPEG-4 Animation Framework extension (AFX). While polygonal meshes are generally used to represent objects in computer graphics, DIBR represents a three-dimensional object with a set of reference images that cover its visible surface. Each of the reference images is represented by a depth map, which is an array of distances between pixels on an image plane and the surface of the three-dimensional object. One of the biggest advantages of DIBR is that objects can be represented with high quality by simply using reference images, without using polygonal models. In addition, the complexity of rendering a DIBR view depends solely on the number of pixels constituting the DIBR view (i.e., the resolution of the DIBR view), irrespective of the complexity of the scene.
DIBR includes SimpleTexture, PointTexture, and OctreeImage. PointTexture represents an object with PointTexture pixels seen from a predetermined camera position. Each of the PointTexture pixels is represented by its color, its depth (the distance between the pixel and the predetermined camera position), and other properties that help PointTexture rendering. A plurality of pixels may be provided along each line of sight, so a PointTexture image is generally comprised of a plurality of layers. FIG. 1 illustrates a simple example of a one-dimensional PointTexture image. PointTexture requires a considerable amount of data to represent objects realistically. In general, the more realistic the image, the higher the sampling density and the more data that must be processed. Therefore, efficient compression of PointTexture images is strongly required. FIG. 2 illustrates the node specification of PointTexture. In FIG. 2, the ‘depth’ and ‘color’ fields are the ones to be compressed.
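The layered structure of a PointTexture image can be illustrated with a minimal sketch. It is not the MPEG-4 AFX node definition of FIG. 2; the class names, the 8-bit RGB color type, and the nearest-first ordering are assumptions made only for illustration. The sketch shows that each image-plane position may hold several pixels along its line of sight, each with a depth and a color, and that the number of layers equals the largest number of pixels on any one line of sight.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Color = Tuple[int, int, int]  # assumption: 8-bit RGB triple


@dataclass
class PointTexturePixel:
    """Hypothetical container for one PointTexture pixel."""
    depth: int    # distance from the camera position along the line of sight
    color: Color


@dataclass
class PointTextureImage:
    """Hypothetical layered image: one list of pixels per line of sight."""
    width: int
    height: int
    # rays[y * width + x] holds every pixel on the line of sight through (x, y)
    rays: List[List[PointTexturePixel]] = field(init=False)

    def __post_init__(self) -> None:
        self.rays = [[] for _ in range(self.width * self.height)]

    def add(self, x: int, y: int, depth: int, color: Color) -> None:
        ray = self.rays[y * self.width + x]
        ray.append(PointTexturePixel(depth, color))
        ray.sort(key=lambda p: p.depth)  # assumption: keep nearest-first order

    def num_layers(self) -> int:
        # The deepest stack of pixels on any line of sight sets the layer count.
        return max((len(ray) for ray in self.rays), default=0)

    def point_count(self) -> int:
        # Total number of depth/color pairs that would need to be coded.
        return sum(len(ray) for ray in self.rays)


if __name__ == "__main__":
    # A one-dimensional image (height 1), in the spirit of FIG. 1.
    image = PointTextureImage(width=4, height=1)
    image.add(0, 0, depth=5, color=(255, 0, 0))
    image.add(0, 0, depth=2, color=(0, 255, 0))
    image.add(2, 0, depth=7, color=(0, 0, 255))
    print(image.num_layers(), image.point_count())
```

Because every additional pixel along a line of sight adds another depth and color pair, the amount of data grows with both the image resolution and the sampling density, which is why the ‘depth’ and ‘color’ fields are the natural targets for compression.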
Until now, little research has been carried out on PointTexture, and thus there are only a few conventional PointTexture-based methods. One of them is a layered depth image (LDI) compression method disclosed by Duan and Li in “Compression of the Layered Depth Image”, IEEE Trans. Image Processing, Vol. 12, No. 3, pp. 365-372, March 2003.
In the prior art, JPEG-LS algorithms have been adopted to compress depth information, while color information is compressed using existing coding standards. However, such JPEG-LS algorithms do not support progressive data compression and progressive data transmission.
In the meantime, a method of compressing a three-dimensional voxel surface model using pattern code representation (PCR) has been disclosed by C. S. Kim and S. U. Lee in “Compact Encoding of 3D Voxel Surface Based on Pattern Code Representation”, IEEE Trans. Image Processing, Vol. 11, No. 8, pp. 932-943, 2002. However, this method does not use a hierarchical octree structure and cannot facilitate progressive compression schemes. In addition, an octree-based data compression method [ISO/IEC JTC1/SC29/WG11 14496-16: 2003, Information Technology - Coding of Audio-Visual Objects - Part 16: Animation Framework extension] has been developed for MPEG-4 AFX. However, this method, like the above-mentioned conventional methods, cannot provide progressive bitstreams.