The present invention relates generally to compressing three-dimensional graphics data, and more particularly to methods and apparatuses that provide lossy high compression ratios for three-dimensional geometry compression.
Modern three-dimensional computer graphics use geometry extensively to describe three-dimensional objects, using a variety of graphical representation techniques. Computer graphics find wide use in applications ranging from computer assisted design (“CAD”) programs to virtual reality video games. Complex smooth surfaces of objects can be succinctly represented by high level abstractions such as trimmed non-uniform rational splines (“NURBS”), and often detailed surface geometry can be rendered using texture maps. But adding more realism requires raw geometry, usually in the form of triangles. Position, color, and normal components of these triangles are typically represented as floating point numbers, and describing an isolated triangle can require upwards of 100 bytes of storage space.
Understandably, substantial space is necessary for three-dimensional computer graphics objects to be stored, e.g., on a computer hard disk or compact disk read-only memory (“CD-ROM”). Similarly, considerable time is necessary for such objects to be transmitted, e.g., over a network, or from disk to main memory.
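The storage figure quoted above can be checked with a short calculation. The following is a minimal sketch only; the per-vertex layout (three position, three color, and three normal components, each a 32-bit float) is an assumption made for illustration, and the function name is hypothetical.

```python
import struct

# Size of one IEEE single-precision float, in bytes.
FLOAT_BYTES = struct.calcsize("<f")

# Assumed layout: xyz position + rgb color + xyz normal per vertex.
COMPONENTS_PER_VERTEX = 3 + 3 + 3
VERTICES_PER_TRIANGLE = 3


def isolated_triangle_bytes():
    """Bytes needed for one isolated triangle under the assumed layout."""
    return VERTICES_PER_TRIANGLE * COMPONENTS_PER_VERTEX * FLOAT_BYTES


print(isolated_triangle_bytes())
```

Under these assumptions an isolated triangle occupies 108 bytes, consistent with the "upwards of 100 bytes" estimate.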
Geometry compression is a general space-time trade-off, and offers advantages at every level of a memory/interconnect hierarchy. A similar systems problem exists for storage and transmission of two-dimensional pixel images. A variety of lossy and lossless compression and decompression techniques have been developed for two-dimensional pixel images, with resultant decrease in storage space and transmission time. Unfortunately, the prior art does not include compression/decompression techniques appropriate for three-dimensional geometry, beyond polygon reduction techniques. However, the Ph.D. thesis entitled Compressing the X Graphics Protocol by John Danskin, Princeton University, 1994 describes compression for two-dimensional geometry.
Suitable compression can greatly increase the amount of geometry that can be cached, or stored, in the fast main memory of a computer system. In distributed networked applications, compression can help make shared virtual reality (“VR”) display environments feasible, by greatly reducing transmission time.
Most major mechanical computer aided design (“MCAD”) software packages, and many animation modeling packages, use constructive solid geometry (“CSG”) and free-form NURBS to construct and represent geometry. Using such techniques, regions of smooth surfaces are represented at a high level by the resulting trimmed polynomial surfaces. For hardware rendering, these surfaces typically are pre-tessellated into triangles using software before transmission to rendering hardware. Such software pre-tessellation is done even on hardware that supports some form of hardware NURBS rendering.
However, many advantages associated with NURBS geometric representation are for tasks other than real-time rendering. These non-rendering tasks include representation for machining, interchange, and physical analysis such as simulation of turbulent flow. Accurately representing trimming curves for NURBS is very data intensive, and as a compression technique, trimmed NURBS cannot be much more compact than pre-tessellated triangles, at least at typical rendering tessellation densities. Finally, not all objects are compactly represented by NURBS. Although many mechanical objects such as automobile hoods and jet turbine blades have large, smooth areas where NURBS representations can be advantageous, many objects do not have such areas and do not lend themselves to such representation. Thus, while NURBS will have many applications in modelling objects, compressed triangles will be far more compact for many classes of application objects.
Photo-realistic batch rendering has long made extensive use of texture map techniques to compactly represent fine geometric detail. Such techniques can include color texture maps, normal bump maps, and displacement maps. Texture mapping works quite well for large objects in the far background, e.g., clouds in the sky, buildings in the distance. At closer distances, textures work best for three-dimensional objects that are mostly flat, e.g., billboards, paintings, carpets, marble walls, and the like. More recently, rendering hardware has begun to support texture mapping, and real-time rendering engines can also apply these techniques.
However, texture mapping results in a noticeable loss of quality for nearby objects that are not flat. One partial solution is the “signboard”, in which a textured polygon always swivels to face the observer. But when viewed in stereo, especially head-tracked VR stereo, nearby textures are plainly perceived as flat. In these instances, even a lower detail but fully three-dimensional polygonal representation of a nearby object would be much more realistic.
Polyhedral representation of geometry has long been supported in the field of three-dimensional raster computer graphics. In such representation, arbitrary geometry is expressed and specified typically by a list of vertices, edges, and faces. As noted by J. Foley, et al. in Computer Graphics: Principles and Practice, 2nd ed., Addison-Wesley, 1990, such representations as winged-edge data structures were designed as much to support editing of the geometry as display. Vestiges of these representations survive today as interchange formats, e.g., Wavefront OBJ. While theoretically compact, some compaction is sacrificed for readability by using ASCII data representation in interchange files. Unfortunately, few if any of these formats can be directly passed as drawing instructions to rendering hardware.
Another historical vestige in such formats is the support of N-sided polygons, a general primitive form that early rendering hardware could accept. However, present day faster rendering hardware mandates that all polygon geometry be reduced to triangles before being submitted to hardware. Polygons with more than three sides cannot in general be guaranteed to be either planar or convex. If quadrilaterals are accepted as rendering primitives, it is to be accepted that they will be arbitrarily split into a pair of triangles before rendering.
Modern graphics languages typically specify binary formats for the representation of collections of three-dimensional triangles, usually as arrays of vertex data structures. Thus, PHIGS PLUS, PEX, XGL, and proposed extensions to OpenGL take this form, and will define the storage space taken by executable geometry.
It is known in the art to isolate or chain triangles in “zigzag” or “star” strips. For example, Iris-GL, XGL, and PEX 5.2 define a form of generalized triangle strip that can switch from a zigzag to star-like vertex chaining on a vertex-by-vertex basis, but at the expense of an extra header word per vertex in XGL and PEX. A restart code allows multiple disconnected strips of triangles to be specified within one array of vertices.
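The vertex-by-vertex switching between zigzag and star chaining can be illustrated with a small decoder. This is a minimal sketch under stated assumptions, not the encoding used by Iris-GL, XGL, or PEX: the code names `RESTART`, `REPLACE_OLDEST`, and `REPLACE_MIDDLE` and the function `decode_generalized_strip` are hypothetical, vertices are plain integer indices, and triangle winding order is ignored.

```python
# Hypothetical per-vertex header codes (names are assumptions for illustration).
RESTART = "restart"                # begin a new, disconnected strip
REPLACE_OLDEST = "replace_oldest"  # zigzag chaining
REPLACE_MIDDLE = "replace_middle"  # star (fan-like) chaining


def decode_generalized_strip(stream):
    """Expand (code, vertex) pairs into a list of triangles.

    The decoder keeps the three most recent live vertices as
    [oldest, middle, newest]. Winding order is not tracked here.
    """
    triangles = []
    window = []
    for code, vertex in stream:
        if code == RESTART:
            window = [vertex]
            continue
        if len(window) < 3:
            # Still filling the first triangle of the current strip.
            window.append(vertex)
        elif code == REPLACE_OLDEST:
            window = [window[1], window[2], vertex]
        elif code == REPLACE_MIDDLE:
            window = [window[0], window[2], vertex]
        else:
            raise ValueError(f"unknown code: {code!r}")
        if len(window) == 3:
            triangles.append(tuple(window))
    return triangles


stream = [
    (RESTART, 0), (REPLACE_OLDEST, 1), (REPLACE_OLDEST, 2),
    (REPLACE_OLDEST, 3),   # zigzag step
    (REPLACE_MIDDLE, 4),   # star step: the oldest vertex is retained
    (RESTART, 5), (REPLACE_OLDEST, 6), (REPLACE_OLDEST, 7),
]
print(decode_generalized_strip(stream))
```

In this sketch eight vertices yield four triangles in two disconnected strips, with the restart code separating them inside a single vertex array.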
In these languages, all vertex components (positions, colors, normals) may be specified by 32-bit single precision IEEE floating point numbers, or 64-bit double precision numbers. The XGL, IrisGL, and OpenGL formats also provide some 32-bit integer support. The IrisGL and OpenGL formats support vertex position component inputs as 16-bit integers, and normals and colors can be any of these as well as 8-bit components. In practice, positions, colors, and normals can be quantized to significantly fewer than 32 bits (single precision IEEE floating point) with little loss in visual quality. Such bit-shaving may be utilized in commercial three-dimensional graphics hardware, provided there is appropriate numerical analysis support.
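The effect of such bit-shaving can be illustrated with a uniform quantizer. This is a minimal sketch, assuming components have been normalized to the range [-1, 1]; the function names, the range, and the choice of uniform rounding are assumptions for illustration, not details taken from any of the formats named above.

```python
def quantize(value, bits, lo=-1.0, hi=1.0):
    """Map a float in [lo, hi] to an unsigned integer of the given bit width."""
    levels = (1 << bits) - 1
    clamped = min(max(value, lo), hi)
    return round((clamped - lo) / (hi - lo) * levels)


def dequantize(q, bits, lo=-1.0, hi=1.0):
    """Recover an approximate float from a quantized integer."""
    levels = (1 << bits) - 1
    return lo + q / levels * (hi - lo)


x = 0.3
for bits in (8, 12, 16):
    error = abs(dequantize(quantize(x, bits), bits) - x)
    print(bits, error)
```

With uniform rounding the reconstruction error is bounded by half a quantization step, so each additional bit halves the worst-case error.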
In summation, there is a need for graphics compression that can compress three-dimensional triangles, and whose format may be directly passed as drawing instructions to rendering hardware. Preferably such compression should be readily implementable using real-time hardware, and should permit decompression using software or hardware.
The present invention discloses such compression.
According to the present invention, geometry is first represented as a generalized triangle mesh, which structure allows each instance of a vertex in a linear stream preferably to specify an average of between ⅓ triangle and 2 triangles. Individual positions, colors, and normals are quantized, and a variable length compression is applied to each. Quantized values are delta-compression encoded between neighbors to provide vertex traversal orders, and mesh buffer references are created. Histograms of delta-positions, delta-normals and delta-colors are created, after which variable length Huffman tag codes, as well as delta-positions, delta-normals and delta-colors are created. The compressed output binary stream includes the output Huffman table initializations, ordered vertex traversals, output tags, and the delta-positions, delta-normals, and delta-colors.
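The delta-encoding and Huffman steps summarized above can be sketched for a single quantized component. This is an illustrative sketch only, not the disclosed bit-stream format: the function names are hypothetical, the first value is sent as a delta from zero, and the symbols histogrammed in the usage lines are the bit lengths of the deltas, which is an assumption about what the tags represent.

```python
import heapq
from collections import Counter


def delta_encode(values):
    """Replace each quantized value by its difference from the previous one."""
    deltas, prev = [], 0
    for v in values:
        deltas.append(v - prev)
        prev = v
    return deltas


def delta_decode(deltas):
    """Invert delta_encode by accumulating the differences."""
    values, acc = [], 0
    for d in deltas:
        acc += d
        values.append(acc)
    return values


def huffman_code_lengths(histogram):
    """Return {symbol: code length in bits} for a {symbol: count} histogram."""
    if len(histogram) == 1:
        return {sym: 1 for sym in histogram}
    # Heap entries: (count, unique tie-breaker, {symbol: depth so far}).
    heap = [(count, i, {sym: 0})
            for i, (sym, count) in enumerate(histogram.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, a = heapq.heappop(heap)
        c2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every contained symbol one level deeper.
        merged = {s: depth + 1 for s, depth in {**a, **b}.items()}
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]


xs = [100, 102, 103, 103, 101, 110]          # quantized x coordinates
deltas = delta_encode(xs)
# Assumed tag symbol: number of magnitude bits needed for each delta.
tags = Counter(abs(d).bit_length() for d in deltas[1:])
print(deltas, huffman_code_lengths(tags))
```

Because neighboring vertices tend to be close together, most deltas are small, so frequent short bit lengths receive short tags and the overall stream shrinks.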
Decompression reverses this process. The decompressed stream of triangle data may then be passed to a traditional rendering pipeline, where it can be processed in full floating point accuracy, and thereafter displayed or otherwise used.