The real-time rendering of three-dimensional graphics has a number of appealing applications on mobile terminals, including games, man-machine interfaces, messaging and m-commerce. Since three-dimensional rendering is a computationally expensive task, dedicated hardware must often be built to reach sufficient performance. Innovative ways of lowering the complexity and bandwidth usage of this hardware architecture are thus of great importance. The main bottleneck, especially for mobile terminals, is memory bandwidth. A common technique for reducing memory bandwidth usage is depth buffer compression.
Primitives, such as triangles, are usually drawn in a non-sorted order. To make sure that, for each pixel, only the triangle closest to the eye remains visible, a depth buffer is generally used. The depth buffer holds, for each pixel, the depth, i.e. the distance to the eye, of the surface currently drawn at that pixel. Before writing a new pixel, the corresponding depth is first read from the depth buffer. The new pixel is written only if the new depth is smaller than the previously written depth, in which case the new depth value must also be written to the depth buffer. This reading and writing of depth values generates a large number of memory accesses, which limits performance.
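The depth test described above can be sketched as follows. This is a minimal illustration, assuming plain nested lists as buffers and a "smaller is closer" depth convention; the names draw_fragment and make_buffers are hypothetical and do not come from any particular graphics API.

```python
import math


def make_buffers(width, height):
    """Create a depth buffer cleared to 'infinitely far' and an empty color buffer."""
    depth = [[math.inf] * width for _ in range(height)]
    color = [[None] * width for _ in range(height)]
    return depth, color


def draw_fragment(depth_buffer, color_buffer, x, y, depth, color):
    """Write the fragment only if it is closer than the stored depth.

    Every call costs one depth read; a passing fragment also costs
    one depth write and one color write.
    """
    stored = depth_buffer[y][x]          # read the previously written depth
    if depth < stored:                   # new fragment is closer to the eye
        depth_buffer[y][x] = depth       # the new depth must be written back
        color_buffer[y][x] = color
        return True
    return False                         # fragment is hidden, nothing is written
```

Each fragment thus touches the depth buffer at least once, which is the memory traffic that depth buffer compression aims to reduce.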
Depth buffer compression works by dividing the depth buffer into tiles or pixel blocks, and storing the tiles in a compressed format. When reading a depth buffer value, the entire depth tile is read and decompressed. The depth values are then modified, and before writing the tile to memory again, it is compressed. Since rasterization is usually done on a per-tile basis, it is often not a problem to read and write an entire tile at once instead of reading and writing on a per-pixel basis.
Since this decompression and compression might happen several times for a particular tile, it is important that the compression is lossless, i.e. non-destructive.
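The read, decompress, modify, compress, write cycle can be sketched as below. As a loudly stated assumption, the general-purpose zlib codec stands in for a real depth codec purely because it is lossless; actual depth buffer hardware does not use zlib. The 32-bit depth format and the helper names are likewise hypothetical.

```python
import struct
import zlib


def compress_tile(depths):
    """Pack the tile's depths as little-endian 32-bit integers and compress them."""
    raw = struct.pack(f"<{len(depths)}I", *depths)
    return zlib.compress(raw)            # stand-in lossless codec (assumption)


def decompress_tile(blob):
    """Recover exactly the depths that were passed to compress_tile."""
    raw = zlib.decompress(blob)
    return list(struct.unpack(f"<{len(raw) // 4}I", raw))


def update_tile(blob, fragments):
    """Apply depth-tested fragments, given as (index_in_tile, depth), to a stored tile."""
    depths = decompress_tile(blob)       # the entire tile is read and decompressed
    for index, depth in fragments:
        if depth < depths[index]:        # per-pixel depth test
            depths[index] = depth
    return compress_tile(depths)         # the tile is compressed again before writing
```

Because a tile may pass through update_tile many times, any loss introduced by the codec would accumulate, which is why the round trip has to be exact.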
The depth buffer contains the depth of each pixel, and since the scene is made up of planar triangles, the depth values of all pixels in a tile that stem from the same triangle will be coplanar. In particular, if all pixels in a tile come from the same triangle, all depth values in the tile will be coplanar. In such a case, it is possible to obtain a lossless representation of the tile by storing just the plane equation of the triangle in question, instead of storing the individual pixel depths. Many depth buffer compression algorithms work this way. Hasselgren and Akenine-Möller provide an extensive review of the known depth buffer compression schemes in their paper [1].
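A minimal sketch of plane-based tile compression follows. It assumes, for illustration only, that the stored depths are integers that lie exactly on a plane with integer slopes; real depth values are rounded after interpolation, which is why practical schemes add per-pixel correction terms. The function names are hypothetical.

```python
def compress_plane(tile):
    """Return (z0, dzdx, dzdy) if every depth lies exactly on one plane, else None.

    `tile` is a square list of rows with an edge of at least two pixels.
    """
    z0 = tile[0][0]                      # anchor depth at the tile origin
    dzdx = tile[0][1] - z0               # depth step per pixel in x
    dzdy = tile[1][0] - z0               # depth step per pixel in y
    for y, row in enumerate(tile):
        for x, z in enumerate(row):
            if z != z0 + x * dzdx + y * dzdy:
                return None              # a second surface is present: not compressible here
    return z0, dzdx, dzdy


def decompress_plane(params, size):
    """Rebuild a size-by-size tile from the three plane parameters."""
    z0, dzdx, dzdy = params
    return [[z0 + x * dzdx + y * dzdy for x in range(size)] for y in range(size)]
```

Under these assumptions a 4×4 tile is represented by three numbers instead of sixteen, and decompression reproduces the original depths exactly.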
Hasselgren and Akenine-Möller also present, in the paper [1], an improvement of the prior art schemes that use differential differential pulse code modulation (DDPCM). Their key contribution is that, instead of a correction value drawn from {−1, 0, 1}, a correction of only one bit per pixel suffices. Their observation is that, since the slope in the x-direction alternates between two values, it is always possible to store the smaller slope and then use a 1-bit correction value per pixel to indicate which of the two slopes applies.
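The one-bit idea can be illustrated in one dimension. The sketch below is a simplification under stated assumptions and is not the scheme of [1]: it handles a single row of integer depths, stores the first depth and the smaller of the two slopes, and spends one bit per remaining pixel. The names encode_row and decode_row are hypothetical.

```python
def encode_row(row):
    """Return (first_depth, smaller_slope, bits), or None if one bit is not enough.

    `row` must hold at least two integer depths.
    """
    diffs = [b - a for a, b in zip(row, row[1:])]   # slope between neighboring pixels
    base = min(diffs)                               # the smaller of the two slopes
    bits = [d - base for d in diffs]                # 0: smaller slope, 1: smaller slope + 1
    if any(bit not in (0, 1) for bit in bits):
        return None                                 # slope takes more than two values
    return row[0], base, bits


def decode_row(first, base, bits):
    """Rebuild the row by accumulating the smaller slope plus the 1-bit correction."""
    row = [first]
    for bit in bits:
        row.append(row[-1] + base + bit)
    return row
```

For example, a surface with a slope of 2.5 depth units per pixel, rounded down at each pixel, gives the row 0, 2, 5, 7, 10, 12, 15, 17, whose neighbor differences alternate between 2 and 3 and are therefore captured by the smaller slope 2 and one bit per pixel.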
Previous DDPCM schemes draw each correction value from {−1, 0, 1} and therefore need two bits per pixel to encode it. Hasselgren and Akenine-Möller use a one-bit correction value instead, saving one bit for each pixel that carries a correction value. This translates to 13 bits saved in a tile of 4×4 pixels, or 61 bits saved for an 8×8 tile. However, even with these bit savings, Hasselgren and Akenine-Möller are only able to compress about 93% of the possible depth range in a lossless way. If an object comes too close to the camera or viewer, the depth buffer compression algorithm will fail and the depth tiles must be stored uncompressed.
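The quoted savings can be checked with a short calculation. The figures 13 and 61 are consistent with a correction value being stored for every pixel except three reference values per tile; that count of three (for instance an anchor depth and two slopes) is an assumption made here to reproduce the numbers, not something stated above.

```python
def bits_saved(tile_edge, reference_values=3):
    """Bits saved by 1-bit instead of 2-bit corrections in a square tile.

    reference_values is the assumed number of pixels stored without a correction.
    """
    pixels = tile_edge * tile_edge
    corrected = pixels - reference_values   # pixels that carry a correction value
    two_bit_cost = 2 * corrected            # corrections drawn from {-1, 0, 1}
    one_bit_cost = 1 * corrected            # one-bit corrections
    return two_bit_cost - one_bit_cost
```

Under this assumption bits_saved(4) gives 13 and bits_saved(8) gives 61, matching the figures in the text.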