1. Field of the Invention
Embodiments of the present invention relate to encoding and decoding three-dimensional (3-D) data, and more particularly, to a method, medium, and apparatus for encoding and decoding 3-D data having any one of PointTexture, voxel, and octree data.
2. Description of the Related Art
3-D graphics typically require a huge amount of data. PointTexture data can be converted into voxel data and octree data, which also require a huge amount of data. Handling such a huge amount of data requires high-capacity memory and high-speed data processing capabilities, which makes the data processing expensive. Accordingly, efficient data compression becomes necessary.
Techniques for creating realistic graphic images are currently under study in the 3-D graphics field. One of these is an image-based rendering method. This method has the advantage of making it possible to reduce the required amount of data and processing time compared with a polygonal mesh model used in conventional 3-D data modeling methods. In addition, the image-based rendering method can provide more realistic images.
3-D objects are currently represented mainly using polygonal mesh models. Certain 3D forms can be represented by using triangles, rectangles, or other polygons. Recent developments in 3D graphics software and hardware techniques have brought real-time visualization of complicated objects and scenes by use of polygonal models in still or moving images.
Meanwhile, active research on other 3D representation methods has been in progress, with the goal of overcoming the difficulty of representing actual objects with polygonal models, the long processing times caused by the high complexity of rendering, and the difficulty of generating images as realistic as photographs.
Certain applications require a huge number of polygons. For instance, detailed models for a human body typically need millions of polygons, and it is thus difficult to handle these polygons. While recent advances in 3D measurement systems, like 3D scanners, allow high-density 3D data to be obtained with permissible error, acquiring consecutive and perfect polygonal models is still difficult and expensive. In addition, rendering techniques for attaining high resolutions equal to that of a photograph require complicated processes, which makes real-time rendering difficult.
Depth image-based representation (DIBR) is a new method for representing and rendering 3D objects with complex geometries and has been adopted into MPEG-4 Animation Framework eXtension (AFX). Instead of representing objects with polygonal meshes, as is typically done in computer graphics, DIBR represents a 3D object with a set of reference images covering its visible surface. Each reference image is represented by a corresponding depth map, which is an array of distances from the pixels in the image plane to the object surface. One of the advantages of DIBR is that reference images can provide high quality visualization of the object without using complex polygonal models. In addition, the complexity of rendering a DIBR view is only related to the number of pixels in the view (i.e., the resolution of the view) regardless of the scene complexity.
DIBR has three major formats: SimpleTexture, PointTexture, and OctreeImage. PointTexture represents an object with an array of pixels viewed from a single camera location. Each PointTexture pixel is represented by its color, depth (the distance from the pixel to the camera), and a few other properties supplementing PointTexture rendering. There can be multiple pixels along each line of sight, and thus a PointTexture usually includes multiple layers. FIG. 1 shows a simple example of a one-dimensional PointTexture. PointTexture typically requires a massive amount of data. Realistic images require higher sampling density and a tremendous amount of data. Therefore, the compression of PointTexture images should be performed efficiently. FIG. 2 shows a PointTexture node specification. Depth and color fields should be compressed in the node specification of FIG. 2.
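The layered structure described above can be illustrated with a minimal sketch. The names used here (Sample, make_point_texture, num_layers, to_voxels) are hypothetical and do not reproduce the node specification of FIG. 2; the sketch only assumes that each pixel holds a list of (depth, color) samples along its line of sight and that depth values are integers smaller than the chosen depth resolution.

```python
from dataclasses import dataclass
from typing import List, Tuple

Color = Tuple[int, int, int]  # assumed RGB triple, for illustration only


@dataclass
class Sample:
    """One surface point seen along a pixel's line of sight (hypothetical type)."""
    depth: int    # distance from the pixel to the camera, quantized to an integer
    color: Color


# A PointTexture is modeled as pixels[y][x] -> list of samples (the "layers").
PointTexture = List[List[List[Sample]]]


def make_point_texture(width: int, height: int) -> PointTexture:
    """Create an empty PointTexture with no samples in any pixel."""
    return [[[] for _ in range(width)] for _ in range(height)]


def num_layers(pt: PointTexture) -> int:
    """Number of layers = largest sample count found along any line of sight."""
    return max((len(samples) for row in pt for samples in row), default=0)


def to_voxels(pt: PointTexture, depth_resolution: int) -> List[List[List[bool]]]:
    """Mark voxel (y, x, depth) as occupied for every sample in the PointTexture."""
    height = len(pt)
    width = len(pt[0]) if height else 0
    voxels = [[[False] * depth_resolution for _ in range(width)]
              for _ in range(height)]
    for y, row in enumerate(pt):
        for x, samples in enumerate(row):
            for s in samples:
                if not 0 <= s.depth < depth_resolution:
                    raise ValueError("depth outside the assumed depth resolution")
                voxels[y][x][s.depth] = True
    return voxels
```

In this sketch, a pixel with two samples contributes two occupied voxels along the depth axis, which is why a PointTexture with many layers produces a large volume of voxel data.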
There has been limited research on PointTexture. Duan and Li proposed an algorithm for compressing PointTexture images, i.e., layered depth images (LDIs), in J. Duan and J. Li, “Compression of the Layered Depth Image”, IEEE Trans. Image Processing, vol. 12, no. 3, pp. 365-372, March 2003. This algorithm uses the JPEG-LS algorithm to compress depth data, and compresses color data by using existing coding standards. However, this algorithm does not support progressive compression and transmission.
An algorithm for compressing 3D voxel surface models based on pattern code representation (PCR) was proposed by C. S. Kim and S. U. Lee in “Compact Encoding of 3D Voxel Surface Based on Pattern Code Representation”, IEEE Trans. Image Processing, vol. 11, no. 8, pp. 932-943, 2002. However, this algorithm does not utilize a hierarchical octree structure, and also does not support progressive compression.
In MPEG-4 AFX, an algorithm for compressing an octree based on the prediction by partial matching (PPM) scheme was proposed in ISO/IEC JTC1/SC29/WG11 14496-16:2003, Information Technology—Coding of Audio-Visual Objects—Part 16: Animation Framework eXtension (AFX). However, this algorithm does not create progressive bitstreams. Also, the octree-compression algorithm it uses can compress only volume data with a fixed resolution, i.e., an equal number of pixels in width, height, and depth. In other words, this algorithm cannot compress data with an arbitrary resolution, i.e., a different number of pixels in width, height, and depth.
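To illustrate why the equal-dimension restriction matters, the following sketch estimates how much empty space would have to be added if a non-cubic volume were forced into a cube before octree coding. This is an assumption made for illustration only: the source does not describe padding, and the function names and the power-of-two cube side are hypothetical choices typical of octree subdivision.

```python
def enclosing_cube_side(width: int, height: int, depth: int) -> int:
    """Smallest power-of-two side length of a cube that contains the volume."""
    if min(width, height, depth) < 1:
        raise ValueError("dimensions must be positive")
    side = 1
    while side < max(width, height, depth):
        side *= 2
    return side


def padding_overhead(width: int, height: int, depth: int) -> float:
    """Ratio of voxels in the enclosing cube to voxels in the original volume."""
    side = enclosing_cube_side(width, height, depth)
    return side ** 3 / (width * height * depth)
```

Under these assumptions, a 256 x 256 x 64 volume would be embedded in a 256 x 256 x 256 cube, so the coder would have to traverse four times as many voxels as the data actually contains.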