Field
This invention relates to a method and apparatus for compressing image data and a method and apparatus for decompressing image data. The invention is useful in computer graphics systems, and in particular, in computer graphics systems that generate displays of three dimensional images on a two dimensional display and apply texture image data to surfaces in the 3D image, using stored compressed texture data.
Related Art
In 3D computer graphics, surface detail on objects is commonly added through the use of image-based textures, as first introduced in 1975 by Ed Catmull (“Computer Display of Curved Surfaces”, Proc. IEEE Comp. Graphics, Pattern Recognition and Data Structures. May 1975). For example, a 2D bitmap image of a brick wall may be applied, using texture mapping, to a set of polygons representing a 3D model of a building to give the 3D rendering of that object the appearance that it is made from bricks.
Since a complex scene may contain very many such textures, accessing this data can result in two related problems. The first is simply the cost of storing these textures in memory. Consumer 3D systems, in particular, only have a relatively small amount of memory available for the storage of textures and this can rapidly become filled, especially if 32 bit per texel—eight bits for each of the Red, Green, Blue and Alpha (translucency) components—textures are used.
The second, and often more critical, problem is that of bandwidth. During the rendering of the 3D scene, a considerable amount of texture data must be accessed. In a real-time system, this can soon become a significant performance bottleneck.
Finding solutions to these two problems has given rise to a special class of image compression techniques commonly known as texture compression. A review of some existing systems can be found in “Texture Compression using Low-Frequency Signal Modulation” (S. Fenney, Graphics Hardware 2003) or the related patent, GB2417384. Some more recent developments are documented in “iPACKMAN: High-Quality, Low-Complexity Texture Compression for Mobile Phones” (Ström and Akenine-Möller, Graphics Hardware 2005) and the follow-up work, “ETC2: Texture Compression using Invalid Combinations” (Ström and Pettersson, Graphics Hardware 2007).
One system for compressing and decompressing image data that is particularly well suited to texture data is described in GB2417384, the contents of which are incorporated herein by reference. In the system of GB2417384 image data is stored in a compressed form comprising two or more low resolution images together with a modulation data set. The modulation data set describes how to combine the low resolution images to provide the decompressed image data.
The decompression process of GB2417384 will now be briefly described with reference to FIG. 1. The process is normally applied to colour data but is shown here in monochrome for reproduction reasons. The compressed data includes two low-resolution colour images, 100 and 101, and a full resolution, but low precision, scalar image 102 forming a modulation data set. The data of the low resolution images are upscaled, preferably using bilinear, biquadratic or bicubic interpolation, to produce two corresponding virtual images, 110 and 111. Note that the upscaled virtual images lack much of the detail of the final image.
Pixels, 112 and 113, from their respective virtual images, 110 and 111, and the corresponding scalar value, 120, from the full resolution, low precision scalar data, 102, are sent to a blending/selection unit, 130, which blends/selects, on a per-texel basis, the data from 112 and 113 in response to 120, to produce the decompressed data, 141 of the image, 140. The mode by which the combination is done is chosen on a region-by-region basis.
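The decompression concept of FIG. 1 can be illustrated with a minimal sketch, shown in monochrome as in the figure. It is not the implementation of GB2417384: the function names, the placement of low resolution samples at block centres, the clamping at image edges, and the representation of the modulation data as a per-texel weight between 0 (select image A) and 1 (select image B) are all assumptions made for illustration.

```python
import math
from typing import List

Image = List[List[float]]  # monochrome image stored as rows of values


def bilinear_upscale(low: Image, factor: int) -> Image:
    """Upscale `low` by an integer factor using bilinear interpolation.

    Assumptions (not taken from the source): low resolution samples sit
    at block centres, and coordinates are clamped at the image edges.
    """
    h, w = len(low), len(low[0])
    out = []
    for y in range(h * factor):
        fy = (y + 0.5) / factor - 0.5          # texel position in low-res space
        y0 = math.floor(fy)
        ty = fy - y0
        ya = min(max(y0, 0), h - 1)
        yb = min(max(y0 + 1, 0), h - 1)
        row = []
        for x in range(w * factor):
            fx = (x + 0.5) / factor - 0.5
            x0 = math.floor(fx)
            tx = fx - x0
            xa = min(max(x0, 0), w - 1)
            xb = min(max(x0 + 1, 0), w - 1)
            # At most four low resolution samples contribute to one value.
            top = low[ya][xa] * (1 - tx) + low[ya][xb] * tx
            bottom = low[yb][xa] * (1 - tx) + low[yb][xb] * tx
            row.append(top * (1 - ty) + bottom * ty)
        out.append(row)
    return out


def decompress(low_a: Image, low_b: Image, modulation: Image,
               factor: int = 4) -> Image:
    """Blend the two upscaled images per texel using the modulation weights."""
    up_a = bilinear_upscale(low_a, factor)
    up_b = bilinear_upscale(low_b, factor)
    return [
        [(1 - m) * a + m * b for a, b, m in zip(row_a, row_b, row_m)]
        for row_a, row_b, row_m in zip(up_a, up_b, modulation)
    ]
```

For clarity the sketch builds both upscaled images in full before blending; a practical decoder would more likely produce only the small neighbourhood needed for each requested texel.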
For reference purposes, the storage format for the data of the preferred embodiment of GB2417384 is given in FIG. 2. Data is organised in 64-bit blocks, 200, at the rate of one 64-bit block per 4×4 group of texels for the 4bpp embodiment, or one per group of 8×4 texels for the 2bpp embodiment. Two ‘base colours’, 201 and 202, correspond to the two representative colours or, equivalently, a single pixel from each of the low resolution colour images 100 and 101 of FIG. 1. Each single pixel from the low resolution colour images corresponds to an (overlapping) region of texels in the decompressed image. Each such region is approximately centred on a 4×4 (or 8×4) block of texels in the decompressed image, but is larger than 4×4 (8×4) due to the upscale function. The upscaling of the low resolution images is preferably performed using a bilinear, biquadratic or bicubic interpolation with the appropriate number of near neighbour pixels in the low resolution images. The following embodiments will assume bilinear is the primary interpolation method, as this requires at most 4 samples from each of the low resolution images to produce one upscaled A or B value, but alternative upscaling methods, using potentially more samples from the A and B images, could be employed.
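The storage rates quoted above follow directly from the block size, and can be checked with a short calculation. The sketch below is illustrative only; the function name is hypothetical, and it assumes texture dimensions that are exact multiples of the block footprint.

```python
def compressed_size_bytes(width: int, height: int, bits_per_texel: int) -> int:
    """Size of a texture stored as one 64-bit block per group of texels.

    Assumption: width and height are exact multiples of the block
    footprint (4x4 texels at 4bpp, 8x4 texels at 2bpp).
    """
    block_w = {4: 4, 2: 8}[bits_per_texel]
    block_h = 4
    if width % block_w or height % block_h:
        raise ValueError("dimensions must be multiples of the block footprint")
    blocks = (width // block_w) * (height // block_h)
    return blocks * 8  # each block occupies 64 bits, i.e. 8 bytes


# A 256x256 texture at 32 bits per texel occupies 256 * 256 * 4 bytes.
uncompressed = 256 * 256 * 4
size_4bpp = compressed_size_bytes(256, 256, 4)
size_2bpp = compressed_size_bytes(256, 256, 2)
```

Under these assumptions the 4bpp form is one eighth, and the 2bpp form one sixteenth, of the size of the 32 bit per texel original.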
A single bit flag, 203, then controls how the modulation data, 204, is interpreted for the 4×4 (or 8×4) set of texels. The sets of texels that are controlled by each flag are shown in FIG. 7.
The 4bpp preferred embodiment of GB2417384 has two modulation modes per region, where each region is a 4×4 set of texels. The first mode allows each texel to select one of (a) the colour of the texel from 100, (b) the colour of the texel from 101, (c) a 3:5 blend of 100 and 101, or (d) a 5:3 blend of 100 and 101. The second mode replaces the 3:5 and 5:3 blends with a pair of 1:1 blends, one of which uses the blended alpha value, and the other of which has the same RGB values but with the alpha component set to 0, i.e. fully transparent.
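The two modulation modes can be expressed as small tables of blend weights, as in the sketch below. Several details are assumptions rather than facts from GB2417384: the assignment of the 2-bit codes 0 to 3 to the four options (here simply the order in which they are listed above), the reading of a "3:5 blend of 100 and 101" as three parts of the first colour to five parts of the second, and the truncating integer division.

```python
from typing import Tuple

RGBA = Tuple[int, int, int, int]

# Weights (for colour A, for colour B) in eighths, indexed by a 2-bit code.
# The code-to-option ordering is hypothetical.
MODE_WEIGHTS = {
    0: [(8, 0), (0, 8), (3, 5), (5, 3)],   # first mode: A, B, 3:5, 5:3
    1: [(8, 0), (0, 8), (4, 4), (4, 4)],   # second mode: A, B, 1:1, 1:1
}


def blend_texel(a: RGBA, b: RGBA, mode: int, code: int) -> RGBA:
    """Combine the two upscaled colours for one texel."""
    wa, wb = MODE_WEIGHTS[mode][code]
    r, g, bl, al = ((wa * ca + wb * cb) // 8 for ca, cb in zip(a, b))
    if mode == 1 and code == 3:
        al = 0  # second 1:1 blend: same RGB, but fully transparent
    return (r, g, bl, al)
```

For example, with colour A equal to (0, 0, 0, 255) and colour B equal to (80, 160, 240, 255), the two 1:1 options of the second mode both give RGB values of (40, 80, 120) and differ only in alpha.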
It should be noted that FIG. 1 is merely describing the concept of the decompression process and that the upscaled virtual images, 110 and 111, are unlikely to be produced and stored in their entirety in a practical embodiment. Instead, small sections of the virtual images, preferably 2×2 pixel groups, may be produced and discarded, ‘on the fly’, in order to produce the requested final texels.
In GB2417384, the base colour data may be in one of two possible formats as shown in FIG. 2. Each of the colours 201, 202 may independently be either completely opaque, in which case format 210 is used, or partially or fully translucent, in which case format 211 is used. A one-bit flag 212 that is present in both colours 201 and 202 determines the choice of format for each representative colour. If this bit is ‘1’, then the opaque mode is chosen, in which case the red 213, green 214, and blue 215 channels are represented by five, five and five bits respectively for base colour B, and five, five, and four bits respectively for base colour A. Note that the reduction in bits for base colour A is simply due to reasons of space. Alternatively, if flag 212 is ‘0’, the corresponding colour is partially transparent and the colour contains a three bit alpha channel, 216, as well as a four bit Red, 217, four bit Green, 218, and a Blue field, 219, which is either four or three bits for colour 201 or 202 respectively. Since a fully opaque colour is implied by field 212, the alpha field, 216, does not need to encode a fully opaque value.
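The two base colour formats can be unpacked as sketched below. The field widths are those given above; the ordering of the fields within each colour (flag in the most significant bit, followed by alpha where present, then red, green and blue) is an assumption, as are the function and key names. The sketch returns the raw field values and does not expand them to eight bits, since that step is not described here.

```python
def decode_base_colour(field: int, is_colour_b: bool) -> dict:
    """Split one base colour field into its flag and raw channel values.

    Colour B is taken to occupy 16 bits and colour A 15 bits, so the blue
    channel of colour A is one bit narrower in either format.
    Assumed bit order, most significant first: flag, [alpha], R, G, B.
    """
    total_bits = 16 if is_colour_b else 15
    opaque = (field >> (total_bits - 1)) & 1
    if opaque:
        widths = [("r", 5), ("g", 5), ("b", 5 if is_colour_b else 4)]
    else:
        widths = [("a", 3), ("r", 4), ("g", 4), ("b", 4 if is_colour_b else 3)]

    result = {"opaque": bool(opaque)}
    shift = total_bits - 1  # bits remaining below the flag
    for name, width in widths:
        shift -= width
        result[name] = (field >> shift) & ((1 << width) - 1)
    return result
```

In both formats the channel widths plus the flag account for every bit of the field: 1 + 5 + 5 + 5 or 1 + 3 + 4 + 4 + 4 for colour B, and one bit fewer in blue for colour A.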
Improvements to the system described in GB2417384 are presented in our International Patent application publication number WO2009/056815. This describes how the system of GB2417384 can be improved to accommodate certain types of images, or sections of images not handled particularly well by GB2417384, such as textures which include large discontinuities at certain boundaries and those where a few quite distinct colours are used in a localised region.
This later application takes advantage of the fact that the level of flexibility offered by the scheme of GB2417384 as shown in FIG. 2 is in excess of what is needed in the vast majority of situations. Therefore, the ability to determine independently whether each of the representative colours A and B is fully opaque or partially transparent has been determined not to be needed, and the encoding scheme is replaced with that shown in FIG. 4. In this, data is again encoded in 64-bit units with two representative colours B 301, and A 302, a modulation mode bit, 303, and the modulation data, 304. Two additional 1 bit fields, alpha mode, 305 and hard flag, 306 are also included. To accommodate these additional fields the colour fields 301 and 302 are both reduced by 1-bit in size relative to the encoding of GB2417384.
A single opacity or alpha flag is used and is set to 1 if both the colours are fully opaque. If it is 0 then both are potentially partially transparent. Therefore, the flag that determined the opacity of base colour A now becomes a ‘hard transition flag’, which is used to create additional modes. When colours are in the translucent mode then the stored 3-bit alpha channel for base colour B is expanded to 4-bits.
The modulation mode bit and the hard transition flag combine to produce four different modes for interpreting the per pixel modulation bits available from the modulation data.
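The way two one-bit fields yield four interpretation modes can be sketched as follows. The bit positions within the 64-bit word are hypothetical and chosen only so that the fields add up to 64 bits: 32 bits of modulation data (inferred from the bits remaining in the block), a 14-bit colour A and a 15-bit colour B (each one bit smaller than in GB2417384), and three one-bit flags. The numbering of the resulting modes is likewise arbitrary.

```python
def unpack_block(word: int) -> dict:
    """Split a 64-bit block into its fields (bit positions are assumed)."""
    return {
        "modulation_data": word & 0xFFFFFFFF,   # bits 0-31
        "modulation_mode": (word >> 32) & 1,    # bit 32
        "colour_a": (word >> 33) & 0x3FFF,      # bits 33-46, 14 bits
        "hard_flag": (word >> 47) & 1,          # bit 47
        "colour_b": (word >> 48) & 0x7FFF,      # bits 48-62, 15 bits
        "alpha_mode": (word >> 63) & 1,         # bit 63
    }


def interpretation_mode(block: dict) -> int:
    """Combine the two flags into a mode index in the range 0 to 3."""
    return (block["hard_flag"] << 1) | block["modulation_mode"]


example = unpack_block(
    (1 << 63) | (0x1234 << 48) | (1 << 47) | (0x0ABC << 33) | 0xDEADBEEF
)
example_mode = interpretation_mode(example)
```

Because the two flags are independent, every block selects exactly one of the four modes without any additional storage.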
This system allows for a first improvement to the method of GB2417384 in situations where there are large colour discontinuities in the images such as long horizontal and/or vertical boundaries between texels. These can occur naturally but they are more frequent when multiple smaller textures are assembled into a single larger texture atlas for efficiency reasons as shown in FIG. 4. This texture 180 is composed of numerous smaller textures such as 181. Although such texture atlases may be assembled prior to compression, it is also useful to be able to compress the subtextures separately and then assemble the compressed pieces. This requires the ability to force discontinuities at certain boundaries, which are assumed to lie on multiples of 4 or 8 pixels, to stop the unrelated but adjoining image data from ‘interfering’ when later assembled into an atlas texture.
Further improvements on GB2417384 arise in the case of regions of texels which have more than two distinctly different colours, such as the situation shown in FIG. 3. This contains adjacent red, blue and green strips. Such extreme rates of colour change are relatively rare in natural images but can be more frequent in artist drawn images or diagrams. In practice certain colour combinations of the pairs of representative colours described in GB2417384 are not used. Each representative colour has a single bit indicating if it is fully opaque or partially transparent. This level of flexibility has been determined not to be necessary in practice and so one of the two opacity flags has been re-assigned to designate additional compression modes for regions of the texture.
One new mode for regions of texels is used when a strong colour discontinuity is assumed to occur between certain texels in that region. This simplifies the assembly of texture atlases, especially from pre-compressed sub-textures, as well as enhancing a small number of cases in normal images.
A second new mode extends the discontinuity concept by allowing texels in indicated regions to arbitrarily choose colours from a subset of the nearest four neighbouring pairs of representative colours. This has the advantage of providing a larger palette of colours while avoiding the use of a palette memory.
The contents of GB2417384 and WO2009/056815 are hereby incorporated by reference.
Although the additional modes introduced by WO2009/056815 improve the quality of the compressed result relative to GB2417384, there are situations where it is desirable for each pixel to be able to specify its modulation value with greater precision or, say, to select from a larger palette of colours. In either case, this requires a greater number of bits to encode such data, but increasing the overall storage is undesirable.