Technology is described herein for compressing image data for use in graphics applications, such as three dimensional (3D) game applications, and the like. The disclosed technology offers at least specific advantages, such as a low overhead and low latency implementation; therefore, the disclosed technology may be used for compressing, in real time, images sent to the frame buffer and/or for compressing image data residing in the frame buffer.
Compression of image data residing in the frame buffer may be used for creating compressed textures, for example, during render-to-texture operations as supported by common graphics APIs, such as OpenGL® and DirectX®. In another aspect of the disclosed technology, compression of image data residing in the frame buffer may be used for reducing the memory bandwidth between the frame buffer and the display. In yet another aspect of the disclosed technology, compressing images residing in the frame buffer with a fixed rate compression reduces the size of the frame buffer while still allowing the frame buffer to be accessed with random accesses.
The disclosed compression technologies are block based, that is, compression is applied to non-overlapping or disjoint portions of a source image, called blocks hereafter. In block based compression, a source image is divided into blocks, and the blocks can be of any desired and suitable size or shape. In the disclosed technology, each block is preferably rectangular and the preferred size and shape is four-by-four pixels. However, different block sizes/shapes/arrangements are also possible. For example, an arrangement in which blocks comprise two-by-eight or eight-by-two pixels can be used.
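By way of a non-limiting illustration, the division of a source image into disjoint blocks can be sketched as follows. The function name `split_into_blocks`, the representation of the image as a list of pixel rows, and the requirement that the image dimensions be multiples of the block size are assumptions made for this sketch only and are not taken from the disclosure.

```python
def split_into_blocks(image, block_w=4, block_h=4):
    """Split a 2D list of pixels into non-overlapping block_h x block_w blocks.

    Assumption for this sketch: the image height and width are exact
    multiples of the block dimensions. Blocks are returned in row-major
    block order (left to right, then top to bottom).
    """
    height = len(image)
    width = len(image[0])
    if height % block_h or width % block_w:
        raise ValueError("image dimensions must be multiples of the block size")
    blocks = []
    for by in range(0, height, block_h):
        for bx in range(0, width, block_w):
            # Take block_h rows and, from each, block_w consecutive pixels.
            block = [row[bx:bx + block_w] for row in image[by:by + block_h]]
            blocks.append(block)
    return blocks


# An 8x8 test image in which each pixel stores its own (y, x) coordinate.
image = [[(y, x) for x in range(8)] for y in range(8)]
blocks_4x4 = split_into_blocks(image)                        # four 4x4 blocks
blocks_8x2 = split_into_blocks(image, block_w=8, block_h=2)  # four 8x2 blocks
```

The same routine covers the four-by-four arrangement and the alternative eight-by-two arrangement simply by changing the block dimensions.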
Those skilled in the art will recognize that a conventional block based compression process (e.g., S3TC texture compression) includes the parsing of the source image blocks so as to extract respective components. The components may be main color information, color index information, main alpha information, and alpha index information. Also, additional control information may be extracted e.g., if the source image block is monochrome.
Block based image compression with fixed compression ratios is extensively employed to compress textures. Compressing textures in a block based way with a fixed compression ratio is a preferable technique because the decompression process typically requires a small number of cycles, e.g., 1 cycle, and thus can be performed in real time. A major limitation of conventional block based image compression techniques is that the associated compression process requires multiple cycles to complete; thus, it is not suitable for compressing “rendered data” in real time. Another limitation of typical block based image compression techniques is that the implementation of the image compression process is complex and has a significant associated hardware cost. Upon a complete reading of the present disclosure, those skilled in the art will recognize that in the disclosed technology, at least the above limitations, among other limitations not described thus far, have been overcome.
The term “rendered data” is used herein to identify computer generated graphics data that are in a displayable format e.g., in RGB (red-green-blue) format. Rendered data are generated typically by a graphics processing hardware or software system and they are typically located in a frame buffer. The format of rendered data will be detailed in the Detailed Description part of the present disclosure.
Those skilled in graphics technology know that a texture could be image data (either photographic or computer generated), transparency data, smoothness data, etc. Generating realistic computer graphics typically requires multiple textures of high quality. Providing the textures to the rendering unit requires tremendous computer memory and bandwidth typically not present in mobile and handheld devices. Texture compression may significantly decrease memory and bandwidth requirements.
Therefore, texture compression has been widely employed in graphics hardware. However, as known in the art, image compression in general and texture compression in particular have proven to be complex, and several different approaches have been proposed. Among the various compression schemes described in the art, the most suitable ones are the block based schemes with a fixed compression ratio. This is because these compression schemes simplify the memory address generation process, especially for memory accesses to contiguous memory blocks (known as burst accesses), since the address generation unit (AGU) of the graphics hardware can be implemented with simple arithmetic operations, e.g., multiplications by a power of two.
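The simplification of address generation under a fixed compression ratio can be illustrated with a minimal sketch. The sketch assumes four-by-four pixel blocks, a 64-bit (8-byte) compressed block, a row-major arrangement of blocks in memory, and an image width that is a multiple of four; the names `block_address`, `BLOCK_DIM_LOG2`, and `BYTES_PER_BLOCK_LOG2` are hypothetical. Under these assumptions, every multiplication and division by a power of two reduces to a bit shift.

```python
BLOCK_DIM_LOG2 = 2        # assumed 4x4 pixel blocks: divide by 4 == shift right by 2
BYTES_PER_BLOCK_LOG2 = 3  # assumed 8-byte compressed block: multiply by 8 == shift left by 3


def block_address(x, y, width_in_pixels, base=0):
    """Return the byte address of the compressed block containing pixel (x, y).

    Because every block compresses to the same number of bytes, the address
    depends only on the block index, not on the content of other blocks.
    """
    blocks_per_row = width_in_pixels >> BLOCK_DIM_LOG2
    block_index = (y >> BLOCK_DIM_LOG2) * blocks_per_row + (x >> BLOCK_DIM_LOG2)
    return base + (block_index << BYTES_PER_BLOCK_LOG2)


# Pixel (5, 9) of a 64-pixel-wide image lies in block column 1, block row 2.
address = block_address(5, 9, 64)
```

With a variable rate scheme, by contrast, locating a block would generally require knowledge of the sizes of the preceding blocks, which is why fixed rate schemes lend themselves to random access.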
Some block based fixed rate image compression schemes are described briefly below.
Vector Quantization, referred to as VQ hereafter, is amongst the oldest types of image compression. VQ operates by identifying a limited set of representative pixels or representative groups of pixels among all the pixels of the source image block. The set of representative pixels or groups of pixels is usually termed a dictionary or codebook; the term codebook will be used hereafter. For each pixel of the source block, an index to the most closely matching codebook entry is calculated. VQ is able to achieve high compression ratios while retaining acceptable quality, but it is not uncommon for it to generate significant visual artifacts, e.g., in smooth gradients. Generally, it can be considered that the per-bit quality achieved by VQ compression is relatively low; thus, the technique has been superseded by more efficient techniques.
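The index calculation of VQ can be sketched as follows for the per-pixel case. The sketch assumes that a codebook is already available (codebook construction is not shown), that pixels are RGB triples, and that closeness is measured by squared Euclidean distance; the function names are hypothetical and the distance measure is an assumption, not a requirement of VQ.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two colors given as tuples."""
    return sum((ca - cb) ** 2 for ca, cb in zip(a, b))


def vq_encode(pixels, codebook):
    """Map each pixel to the index of its closest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda i: squared_distance(p, codebook[i]))
        for p in pixels
    ]


def vq_decode(indices, codebook):
    """Reconstruct (approximate) pixels by looking up each index."""
    return [codebook[i] for i in indices]


# Hypothetical three-entry codebook and a few source pixels.
codebook = [(0, 0, 0), (255, 255, 255), (255, 0, 0)]
pixels = [(10, 10, 10), (250, 240, 245), (200, 30, 20), (0, 0, 0)]
indices = vq_encode(pixels, codebook)
reconstructed = vq_decode(indices, codebook)
```

Only the indices (and the codebook) need to be stored; the reconstruction replaces every pixel by its codebook entry, which is why fine variations such as smooth gradients can be lost.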
ETC is another block based, fixed rate encoding scheme for compressing image data. The acronym ETC stands for Ericsson Texture Compression. In ETC, the source image block is divided into two subblocks, typically referred to as chunks. For example, an uncompressed four-by-four pixel block is split into two four-by-two or two-by-four pixel chunks. For each chunk, a representative base color is calculated. Apart from the two base colors, the remaining bits are used as indices that select specific, pre-calculated numerical values. The values are used as offsets to the base color of a chunk, that is, the selected value is added to the base color and the result of the addition is used as the final color of a pixel. The process is repeated for all the pixels of a source image block.
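The reconstruction of a chunk from a base color and per-pixel indices can be sketched as below. The offset table `ILLUSTRATIVE_OFFSETS` is a made-up example and is not presented as an actual ETC table; applying the same offset to every color channel and clamping the result to the 8-bit range are likewise assumptions of this sketch. The actual tables and bit layout are defined by the format specification.

```python
# Illustrative offsets only; NOT claimed to be the tables of the real format.
ILLUSTRATIVE_OFFSETS = (-8, -2, 2, 8)


def clamp8(value):
    """Clamp an integer to the 8-bit range [0, 255]."""
    return max(0, min(255, value))


def decode_chunk(base_color, indices, offsets=ILLUSTRATIVE_OFFSETS):
    """Rebuild the pixels of one chunk.

    Each per-pixel index selects an offset, which is added to every channel
    of the chunk's base color (an assumption of this sketch).
    """
    return [tuple(clamp8(c + offsets[i]) for c in base_color) for i in indices]


# One base color and eight per-pixel indices for a four-by-two chunk.
chunk_pixels = decode_chunk((100, 150, 250), [0, 1, 2, 3, 3, 2, 1, 0])
```

Because a single base color is shared by all pixels of the chunk, the decoded pixels differ from each other only by the selected offsets.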
The ETC2 scheme extends the original ETC scheme (also known as ETC1) in a backwards compatible way to provide compression of higher quality by introducing three additional modes. However, as is known in the art, encoding only one base color for a given chunk may result in relatively poor image quality, especially for chunks with diverse colors. In addition, ETC2 has very limited support for encoding images with transparency information.
DXTn (also referred to as S3TC or DXTC), as suggested in U.S. Pat. No. 5,956,431, is a block based, fixed rate image compression scheme that has been widely adopted by graphics standards, e.g., by the OpenGL standard of the Khronos Group. In DXT1 (an embodiment of the DXTn group of compression schemes), two 16-bit RGB representative base colors are calculated for each four-by-four pixel image block. The representative base colors will be referred to as endpoints hereafter. Based on the mode of operation, apart from the two endpoints, one or two additional colors are generated as the result of a linear interpolation between the two endpoints. The linear interpolation is performed using predefined weights. Finally, a 2-bit index is produced for each pixel to choose among the colors, i.e., the two endpoints and the one or two interpolated colors.
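A minimal decoding sketch of a DXT1-style block is given below for illustration. It assumes 5:6:5 endpoints, selection of the mode by comparing the two 16-bit endpoint values, weights of 1/3 and 2/3 (or 1/2 in the three-color mode), and index bits packed with pixel 0 in the least significant position. The expansion to 8 bits by bit replication and the truncating integer division are simplifications; rounding details differ between implementations. All function names are hypothetical.

```python
def rgb565_to_rgb888(c):
    """Expand a 16-bit 5:6:5 color to 8 bits per channel by bit replication."""
    r = (c >> 11) & 0x1F
    g = (c >> 5) & 0x3F
    b = c & 0x1F
    return ((r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2))


def dxt1_palette(c0, c1):
    """Build the four-entry palette from the two 16-bit endpoints."""
    e0, e1 = rgb565_to_rgb888(c0), rgb565_to_rgb888(c1)
    if c0 > c1:
        # Four-color mode: two colors interpolated at 1/3 and 2/3.
        p2 = tuple((2 * a + b) // 3 for a, b in zip(e0, e1))
        p3 = tuple((a + 2 * b) // 3 for a, b in zip(e0, e1))
    else:
        # Three-color mode: one midpoint color; the fourth entry is
        # reserved for fully transparent pixels (shown here as black).
        p2 = tuple((a + b) // 2 for a, b in zip(e0, e1))
        p3 = (0, 0, 0)
    return [e0, e1, p2, p3]


def dxt1_decode_block(c0, c1, index_bits):
    """Decode 16 pixels; pixel i uses bits 2*i and 2*i+1 of index_bits."""
    palette = dxt1_palette(c0, c1)
    return [palette[(index_bits >> (2 * i)) & 0x3] for i in range(16)]


# White and black endpoints; the first four pixels use indices 0, 1, 2, 3.
pixels = dxt1_decode_block(0xFFFF, 0x0000, 0b11100100)
```

The whole block is thus described by two 16-bit endpoints and 32 bits of indices, i.e., 64 bits in total.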
The DXTn family of compression algorithms includes DXT1, DXT2, DXT3, DXT4, and DXT5. The compressed format of all variations except DXT1 is 128 bits long, while the DXT1 compressed format is 64 bits long. Also, in all the schemes of DXTn, the source image block is a four-by-four pixel block. DXT1 is primarily used to compress RGB color data; however, there is a specific arrangement in DXT1 to indicate whether one or more pixels are fully transparent.
In the DXT2, DXT3, DXT4, and DXT5 arrangements, 64 of the 128 bits are used to compress the RGB color data of a block in a way very similar to DXT1. The remaining 64 bits are used to encode transparency (i.e., alpha) information. As will be recognized by those skilled in the art of computer graphics, DXT2 and DXT3 are suitable for compressing blocks in which the changes in alpha values across the block are considered “sharp”. To the contrary, DXT4 and DXT5 are suitable for compressing blocks in which the changes in alpha values across the block are considered “gradient”.
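The block sizes stated above translate into bit rates and compression ratios as sketched below. The uncompressed formats used for comparison (24 bits per pixel for RGB and 32 bits per pixel for RGBA, i.e., 8 bits per channel) are assumptions made for this example; the helper name `block_stats` is hypothetical.

```python
from fractions import Fraction


def block_stats(compressed_bits, source_bits_per_pixel, block_w=4, block_h=4):
    """Return the bit rate and compression ratio of a fixed rate block format."""
    pixels = block_w * block_h
    return {
        "bits_per_pixel": Fraction(compressed_bits, pixels),
        "ratio": Fraction(source_bits_per_pixel * pixels, compressed_bits),
    }


# 64-bit block versus assumed 24-bit RGB source pixels.
dxt1_stats = block_stats(64, 24)
# 128-bit block versus assumed 32-bit RGBA source pixels.
dxt5_stats = block_stats(128, 32)
```

Under these assumptions, a 64-bit block corresponds to 4 bits per pixel and a 6:1 ratio, and a 128-bit block corresponds to 8 bits per pixel and a 4:1 ratio.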
The applicants want to acknowledge that the terms “sharp” and “gradient” are intentionally not explicitly defined herein. An explicit definition of the said terms is not considered necessary for the complete understanding of the present technology. Both terms are used to describe two different arrangements in the relation between the arithmetic values of the alpha channel within the pixels belonging to a source image block.
One exemplary deficiency in the art is the lack of a method/technique for compressing high quality image data including alpha values of different arrangements, e.g., for generating high quality compressed image blocks in which the changes in alpha values across an image block are considered either gradient or sharp.
Apart from the encoding of transparency values, it is also known by those skilled in image compression that DXTn represents the RGB color data quite well in the majority of the cases. However, there are specific image block arrangements in which DXTn compression results in poor image quality. Those specific example cases are briefly explained below.
First, DXTn results in poor image quality in image blocks having many different color tints. By way of a non-limiting example, this may occur in image blocks in which the colors include near black, near white, and some other, more saturated colors. In this particular example, the two encoded colors, along with implied or interpolated colors, may not accurately represent the colors of a source image block. This is because blending two of the three colors may not produce the third color.
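The example above can be checked numerically with a short sketch. It assumes a four-color palette formed by two endpoints and two colors interpolated at 1/3 and 2/3, and it selects the palette entry with the smallest squared Euclidean distance; the function names and the specific colors are chosen for illustration only.

```python
def four_color_palette(e0, e1):
    """Two endpoints plus two colors interpolated at 1/3 and 2/3 between them."""
    one_third = tuple((2 * a + b) // 3 for a, b in zip(e0, e1))
    two_thirds = tuple((a + 2 * b) // 3 for a, b in zip(e0, e1))
    return [e0, one_third, two_thirds, e1]


def nearest(color, palette):
    """Return the palette entry closest to color (squared Euclidean distance)."""
    return min(palette, key=lambda p: sum((c - q) ** 2 for c, q in zip(color, p)))


BLACK, WHITE, RED = (0, 0, 0), (255, 255, 255), (255, 0, 0)

# Endpoints black and white: every palette entry is a shade of gray,
# so the saturated red pixel can only be mapped to a gray.
gray_palette = four_color_palette(BLACK, WHITE)
red_becomes = nearest(RED, gray_palette)

# Endpoints black and red: the palette contains no light color,
# so the white pixel is mapped to red.
red_palette = four_color_palette(BLACK, RED)
white_becomes = nearest(WHITE, red_palette)
```

Whichever pair of endpoints is chosen, all palette entries lie on one line segment in the color space, so a third color lying off that segment cannot be represented accurately.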
Second, the low precision of the endpoints in DXTn and the small number of interpolants can generate undesirable noise on color gradients; this is usually termed the blocking effect. The effect is more pronounced when the color gradients are oriented diagonally within the block.
Third, DXTn results in low image quality in image blocks that have multiple separate color gradients at different orientations in the color space. This is because one or more of the color gradients must be ignored during the encoding process. This case happens frequently in images known as bump maps.
FXT1 is another block based, fixed rate compression scheme. In essence, FXT1 is similar to DXTn with some additional block types; FXT1 also contains a 4 bits-per-pixel (bpp) compression mode for encoding images with alpha values. However, as is known in the art, in many cases FXT1 may suffer from the same problems appearing in DXTn and, overall, the gains in image quality of FXT1 over DXTn were never conclusive.
PVRTC and its extension PVRTC2 are further block based, fixed rate compression schemes. The idea behind the schemes is to scale an image down to a fraction of its initial size and then scale it up so as to get a good approximation of the initial image. Under this scenario, the actual compression is achieved by storing the downscaled version together with some additional data, so as to end up with an accurate representation of the source image.
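The downscale-then-upscale idea can be illustrated, in a deliberately simplified form, on a single row of grayscale samples. The sketch below is not the PVRTC algorithm: the box-filter downscaling, the linear interpolation used for upscaling, the factor of four, and the function names are all assumptions chosen to show the principle, and the additional data mentioned above is not modeled.

```python
import math


def downscale(row, factor=4):
    """Average each group of `factor` consecutive samples (box filter)."""
    return [sum(row[i:i + factor]) / factor for i in range(0, len(row), factor)]


def upscale(low, factor=4):
    """Linearly interpolate between low resolution samples, clamping at the edges."""
    n = len(low)
    out = []
    for i in range(n * factor):
        # Position of output sample i expressed in low resolution coordinates.
        pos = (i + 0.5) / factor - 0.5
        left = math.floor(pos)
        t = pos - left
        a = low[max(0, min(n - 1, left))]
        b = low[max(0, min(n - 1, left + 1))]
        out.append(a * (1 - t) + b * t)
    return out


# A smooth ramp survives the round trip well away from the borders...
ramp = list(range(16))
ramp_back = upscale(downscale(ramp))

# ...whereas a sharp edge is smeared over several samples.
edge = [0] * 8 + [255] * 8
edge_back = upscale(downscale(edge))
```

In this simplified setting, smooth content is approximated closely by the upscaled signal, while sharp transitions are blurred unless corrected by further stored data.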
As is known in the art, the PVRTC and PVRTC2 schemes work very well for some specific arrangements of image data (e.g., smooth gradients) and they can scale well to very low bit rates (e.g., 2 bpp). To the contrary, for some other types of image data, the schemes may lead to blurred images, may miss some high frequency details, and/or may introduce specific visual artifacts, such as high frequency modulation noise and image ringing noise. In addition, as will be recognized by those skilled in the art, the fact that PVRTC and PVRTC2 require in some cases random accesses to three neighboring blocks in order to process a given input block significantly complicates the memory management of a system that uses the compression schemes.
Yet another block based, fixed rate compression scheme is ASTC. ASTC stands for Adaptive Scalable Texture Compression. In essence, ASTC shares many features with the formats described so far. In ASTC, the blocks of the source image are encoded to 128-bit vectors. However, ASTC supports input image blocks of different sizes and shapes.
ASTC is considered to be the block based, fixed rate compression scheme that results in higher image quality than the compression schemes mentioned so far. However, the hardware implementation of an ASTC compressor is complex and has a significant hardware cost, and the high latency of the compression process may require substantial additional hardware for buffering throughout the system to compensate for the latency. Therefore, the applicants believe that ASTC is an impractical scheme for real time compression, e.g., for compressing image data sent to the frame buffer and/or image data residing in the frame buffer.
A review of the current block based, fixed rate image compression schemes and their limitations, as presented herein, reveals that there remains scope for improvement in compressing, in real time, image data sent to the frame buffer and/or in compressing image data residing in the frame buffer in graphics processing systems. Therefore, there is a need for methods that maximize the accuracy of compressed images, for both color and transparency data, while minimizing storage, memory bandwidth requirements, and encoding hardware complexity, and while also compressing image data blocks into convenient sizes that maintain alignment for random accesses to one or more pixels.