As technology develops rapidly, the complexity of 3-dimensional computer generated images increases at the same pace. One can easily build computer models of very complicated 3D objects and motion, such as human movement, using vertices and triangle meshes. This kind of 3D model can then be sent to a 3D computer graphics system where animated 3D images can be generated on a computer screen. Computer generated 3D animated images are widely used in 3D computer games, navigation tools and computer aided engineering design tools.
3D computer graphics systems have to cope with constant demands for more complex graphics and faster display speeds. As the detail in the display model increases, more and more graphics primitives and vertices are used. Also, as texturing and shading techniques have evolved, especially with the use of programmable shader languages, more and more information is associated with each vertex (vertex parameter data). In some cases the vertex parameter data size can be around 100 32-bit words per vertex, and there may be a million vertices in a render of an image. The memory space for the vertex parameter data in a 3D render can therefore easily reach hundreds of MB.
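As a rough illustration of the figures quoted above, the following sketch multiplies them out. The constant and function names are hypothetical and chosen only for this example; the numbers are the upper-end values mentioned in the text, not measurements.

```python
# Back-of-envelope estimate using the figures quoted in the text.
WORDS_PER_VERTEX = 100           # 32-bit words of parameter data per vertex
BYTES_PER_WORD = 4               # a 32-bit word occupies 4 bytes
VERTICES_PER_RENDER = 1_000_000  # vertices in one render of an image


def vertex_parameter_bytes(vertices, words_per_vertex, bytes_per_word=BYTES_PER_WORD):
    """Total uncompressed size of the vertex parameter data in bytes."""
    return vertices * words_per_vertex * bytes_per_word


total = vertex_parameter_bytes(VERTICES_PER_RENDER, WORDS_PER_VERTEX)
print(total / 1e6, "MB")  # 400.0 MB
```

This counts a single uncompressed copy of the data only.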
Because of the amount of vertex parameter data a 3D computer graphics system needs to process, the performance of the system is often limited by vertex parameter data memory bandwidth. This is especially true for tile based 3D computer graphics systems, in which vertex parameter data written to internal memory may be read multiple times for the different tiles where the vertices from the primitives are needed to perform a render. It would be very beneficial for the performance of the 3D computer graphics systems to reduce the vertex parameter data bandwidth by compressing the vertex parameter data used in 3D rendering.
As is well known to those skilled in the art, tile based 3D computer graphics systems divide a render surface into a plurality of n×m pixel tiles. A primitive such as a triangle, line or point is only processed for tiles which overlap the primitive. The main steps performed for tiling in a tile based 3D computer graphics system are shown in FIG. 1.
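The tile selection described above can be sketched as follows. This is a minimal illustration, assuming 32×32 pixel tiles, a 640×480 render surface and a conservative bounding box test; these values and the function name `tiles_in_bounding_box` are assumptions made for the example. A bounding box may include tiles that the primitive itself does not touch, so a real system may apply a more exact overlap test.

```python
import math


def tiles_in_bounding_box(vertices, tile_w=32, tile_h=32, screen_w=640, screen_h=480):
    """Return (tile_x, tile_y) pairs covered by the primitive's bounding box.

    vertices: list of (x, y) screen coordinates in pixels.
    The result is conservative: every tile the primitive overlaps is
    included, but some included tiles may not be overlapped.
    """
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    tiles_x = math.ceil(screen_w / tile_w)
    tiles_y = math.ceil(screen_h / tile_h)
    # Convert the pixel bounding box to tile indices and clamp to the surface.
    tx0 = max(0, int(min(xs) // tile_w))
    tx1 = min(tiles_x - 1, int(max(xs) // tile_w))
    ty0 = max(0, int(min(ys) // tile_h))
    ty1 = min(tiles_y - 1, int(max(ys) // tile_h))
    return [(tx, ty) for ty in range(ty0, ty1 + 1) for tx in range(tx0, tx1 + 1)]


triangle = [(10.0, 10.0), (70.0, 20.0), (40.0, 50.0)]
print(tiles_in_bounding_box(triangle))
```

For this triangle the bounding box spans three tile columns and two tile rows, so six tiles are returned. A primitive lying entirely off the render surface yields an empty list.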
In a 3D render, primitives often share vertices, and primitives in similar locations may arrive sequentially in time. To make memory access for the vertex parameter data more efficient, a tile based 3D computer graphics system can define a bounding box of tiles around a primitive and restrict the number of incoming primitives in dependence on the tiles in the bounding box and the primitives they contain. This allows the vertex parameter data from primitives which overlap these tiles to be grouped together into primitive blocks. The primitives are constructed from indices which index into these primitive blocks. To control the buffer size of vertex parameter data there is normally a limit on the maximum number of vertices and primitives contained within a primitive block, for example 32 vertices and 64 primitives. The data structure of a primitive block is shown in FIG. 2. At the start there are Primitive Block Header Words, shown as 20 in FIG. 2, which define the vertex parameter data in the primitive block, such as the number of vertices and the number of primitives. The Primitive Block Header Words are followed by the vertex parameter data for the vertices in the primitive block, shown as 21 in FIG. 2.
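The grouping described above might be modelled as in the sketch below. The class `PrimitiveBlock`, its field names and the `add_triangle` method are hypothetical and are not taken from FIG. 2; only the example limits of 32 vertices and 64 primitives come from the text. Each vertex is stored once and each primitive is stored as a triple of indices into the block.

```python
from dataclasses import dataclass, field

MAX_VERTICES = 32    # example limit quoted in the text
MAX_PRIMITIVES = 64  # example limit quoted in the text


@dataclass
class PrimitiveBlock:
    vertices: list = field(default_factory=list)    # one parameter record per vertex
    primitives: list = field(default_factory=list)  # index triples into `vertices`

    def header(self):
        """Counts that a header would carry (hypothetical field names)."""
        return {"num_vertices": len(self.vertices),
                "num_primitives": len(self.primitives)}

    def add_triangle(self, tri):
        """Add a triangle given as three hashable vertex records.

        Returns False, leaving the block unchanged, if either limit
        would be exceeded.
        """
        new = [v for v in dict.fromkeys(tri) if v not in self.vertices]
        if (len(self.vertices) + len(new) > MAX_VERTICES
                or len(self.primitives) >= MAX_PRIMITIVES):
            return False
        self.vertices.extend(new)
        self.primitives.append(tuple(self.vertices.index(v) for v in tri))
        return True


a, b, c, d = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)
block = PrimitiveBlock()
block.add_triangle((a, b, c))
block.add_triangle((b, d, c))  # shares vertices b and c with the first triangle
print(block.header(), block.primitives)
```

Two triangles sharing an edge occupy four vertex records rather than six, which is the saving that indexing into the block provides.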
In this scheme some of the primitives from a primitive block may be referenced by some tiles and the other primitives may be referenced by other tiles during the 3D render. Access to the vertex parameter data in a primitive block therefore requires random access to the primitive block within the data stream. Also, the vertex parameter data in a primitive block may be needed for rendering different tiles, so the vertex parameter data is written once and may be read multiple times.
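To make the random access requirement concrete, the sketch below assumes an uncompressed block in which every vertex occupies the same number of words, so that any vertex can be located by simple arithmetic. The function names, the two-word header and the three-word vertex records are assumptions made for this example only.

```python
def vertex_word_offset(vertex_index, header_words, words_per_vertex):
    """Word offset of a vertex record within an uncompressed primitive block."""
    return header_words + vertex_index * words_per_vertex


def read_vertex(block_words, vertex_index, header_words, words_per_vertex):
    """Fetch one vertex record without touching any other vertex."""
    start = vertex_word_offset(vertex_index, header_words, words_per_vertex)
    return block_words[start:start + words_per_vertex]


# Hypothetical block: 2 header words followed by 3 vertices of 3 words each.
block_words = [3, 2,
               10, 11, 12,
               20, 21, 22,
               30, 31, 32]
print(read_vertex(block_words, 2, header_words=2, words_per_vertex=3))  # [30, 31, 32]
```

A compression scheme suitable for tile based rendering needs to preserve this property, so that a tile referencing only a few vertices does not have to read the whole block.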
The general requirements for a 3D vertex parameter data compression algorithm are high speed, lossless compression, and minimal memory use by the compression and decompression algorithms themselves. This is because fast, high quality 3D computer graphics systems need to be implemented in a small silicon area in an integrated circuit.
For tile based 3D computer graphics systems, the additional requirements for vertex parameter data compression algorithms are the ability to access data randomly from a compressed data stream, and fast and simple decompression.
Some general lossless compression algorithms, such as Huffman coding/decoding, need a sizeable data buffer to perform the compression. This is not suitable for a 3D computer graphics system with a limited silicon area. Run Length encoding does not need the extra data buffer for compression but, like the other entropy encoding algorithms, it operates on sequentially accessed data streams such as a colour data stream in a video display. If used in a tile based 3D computer graphics system, the whole vertex parameter data stream for a primitive block needs to be decompressed before any vertex data can be accessed. This is extremely inefficient for tile based rendering, especially if the primitive blocks contain large triangles covering many tiles, in which case the whole vertex parameter data stream is decompressed many times even when only a few vertices from the primitive blocks are used.
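The sequential nature of Run Length encoding can be seen in the minimal sketch below. The functions `rle_encode` and `rle_read` are hypothetical helpers written for this illustration. Reading one element requires walking the runs from the start of the stream, because the position of an element depends on the lengths of all runs before it.

```python
def rle_encode(values):
    """Encode a sequence as a list of [value, run_length] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs


def rle_read(runs, index):
    """Return (value, runs_visited) for the element at `index`.

    The runs must be scanned from the beginning; there is no way to
    jump directly to the run that holds the requested element.
    """
    seen = 0
    for visited, (value, count) in enumerate(runs, start=1):
        seen += count
        if index < seen:
            return value, visited
    raise IndexError(index)


data = [7, 7, 7, 3, 3, 9, 9, 9, 9, 1]
runs = rle_encode(data)
print(runs)               # [[7, 3], [3, 2], [9, 4], [1, 1]]
print(rle_read(runs, 9))  # (1, 4): all four runs were visited
```

Fetching the last element touches every run, which mirrors the cost of decompressing a whole primitive block to reach a few vertices.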
Normally vertex parameter data values are stored as 32-bit floating point values in a 3D computer graphics system. Using a fixed point representation for the floating point vertex data values can compress the vertex data in a primitive block well. In the fixed point format a floating point value is represented by an integer together with a fixed number of fractional bits. This method reduces accuracy but may work well on the X and Y coordinate data from vertices. Because the display resolution on a computer graphics screen is fixed to a fraction of a pixel unit, the X and Y coordinates of primitives rendered on screen are converted from the original floating point values into screen values which have limited resolution.
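The fixed point conversion described above can be sketched as follows. The choice of 4 fractional bits (a resolution of 1/16 of a pixel) and the helper names `to_fixed` and `from_fixed` are assumptions made for this example, not values taken from any particular system.

```python
def to_fixed(value, frac_bits):
    """Quantise a floating point value to an integer with `frac_bits` fractional bits."""
    return round(value * (1 << frac_bits))


def from_fixed(fixed, frac_bits):
    """Recover the (approximate) floating point value from its fixed point form."""
    return fixed / (1 << frac_bits)


FRAC_BITS = 4  # assumed sub-pixel resolution of 1/16 pixel
x = 123.4567
x_fixed = to_fixed(x, FRAC_BITS)
x_back = from_fixed(x_fixed, FRAC_BITS)
print(x_fixed, x_back)  # 1975 123.4375

# Rounding to the nearest step bounds the error by half a step.
assert abs(x - x_back) <= 1 / (1 << (FRAC_BITS + 1))
```

The round trip is lossy, with an error of at most half of one fixed point step. That is acceptable where the value is going to be snapped to a sub-pixel grid in any case, but not for parameters that need their full floating point accuracy.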
For other vertex parameter data, such as Z for depth, RHW and texture coordinate sets, high accuracy of the data needs to be maintained through the 3D display pipeline. Reduced accuracy in the representation of these vertex parameter data may cause artifacts in the rendered images.
Some vertex data compression algorithms compress vertex parameter data values according to the geometrical location of the vertices. For example, a vertex is chosen as the origin in a triangle mesh, and the difference values (delta values) between the vertex parameter data and the parameter data from the origin vertex are stored instead of the full vertex parameter data values. The delta values can be represented by integers or fixed point values with a reduced range to compress the data stream. This kind of algorithm works well for the vertices from a triangle mesh where the vertex parameter data values among the vertices are in a limited range. The compression ratio is related to the number of bits required to represent the delta values. Very often triangle meshes such as long triangle strips may contain vertices for which the range of the vertex data values in a primitive block is large. In this case compression will not be possible because too many bits are needed to store the delta values.
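The dependence of the delta width on the value range can be illustrated with the sketch below. It assumes the parameter values are already integers (for example after a fixed point conversion), takes the first vertex as the origin, and stores deltas in two's complement form; the helper names and sample values are hypothetical.

```python
def signed_bits(n):
    """Smallest two's-complement width, in bits, that can hold the integer n."""
    magnitude_bits = n.bit_length() if n >= 0 else (~n).bit_length()
    return magnitude_bits + 1  # one extra bit for the sign


def delta_bits(values):
    """Bits per delta when every value is stored relative to the first one."""
    origin = values[0]
    return max(signed_bits(v - origin) for v in values)


compact_mesh = [1000, 1003, 998, 1005]  # values clustered near the origin vertex
long_strip = [0, 20000, 40000, 65000]   # values spread over a wide range

print(delta_bits(compact_mesh))  # 4
print(delta_bits(long_strip))    # 17
```

For the clustered values each delta fits in 4 bits, whereas the widely spread values need 17 bits per delta, more than the 16 bits that would store the sample values directly as unsigned integers.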
To reduce the vertex parameter data memory bandwidth in a tile based 3D computer graphics system, all primitives from an input stream are pre-processed to remove any primitives which are off screen, back facing, clipped or too small to be displayed. After pre-processing, the remaining primitives are merged into primitive blocks with a fixed number of vertices and written into internal parameter memory for 3D processing. The vertices in a primitive block are therefore not guaranteed to belong to a single triangle mesh, and the ranges of vertex parameter data values in a primitive block may be too large to be compressed with delta values from vertex origins.