The presentation and rendering of images and graphics on data processing systems and user terminals, such as computers, and in particular on mobile terminals, has increased tremendously in recent years. For example, graphics and images have a number of appealing applications on such terminals, including games, 3D maps and messaging, screen savers and man-machine interfaces.
However, rendering of graphics, and of textures in particular, is a computationally expensive task in terms of the memory bandwidth and processing power required of the graphics system. For example, textures are costly both in terms of memory, since the textures must be placed in fast on-chip memory, and in terms of memory bandwidth, since a texture can be accessed several times to draw a single pixel.
In order to reduce the bandwidth and processing power requirements, an image (also referred to as texture) encoding method or system is typically employed. Such an encoding system should result in more efficient usage of expensive on-chip memory and lower memory bandwidth during rendering and, thus, in lower power consumption and/or faster rendering. This reduction in bandwidth and processing power requirements is particularly important for thin clients, such as mobile units and telephones, with a small amount of memory, little memory bandwidth and limited power (powered by batteries).
One texture encoding method is referred to as ETC1 (Ericsson Texture Compression, version 1) which is further described in “iPACKMAN: High-Quality, Low-Complexity Texture Compression for Mobile Phones” by Jacob Strom and Tomas Akenine-Moller, Graphics Hardware (2005), ACM Press, pp. 63-70.
Today, ETC1 is available on many devices. For instance, Android supports ETC1 from version 2.2 (Froyo), meaning that millions of devices are running ETC1.
ETC1 was originally developed to be an asymmetric codec; decompression had to be fast, but compression was supposed to be done off-line and could take longer. However, recent developments have made it important to be able to compress an image to ETC1-format very quickly.
Another texture compression format is DXT1 (DirectX Texture compression, codec 1). However, platforms built for OpenGL ES may support ETC1 but not DXT1. It is therefore desirable to be able to transcode DXT1 textures to ETC1. This way, after transcoding, rendering can be done from ETC1 instead, for which there is hardware support. However, the transcoding has to be fast enough for the user not to notice. It has been estimated that 30 seconds is the upper limit for transcoding 20 Megapixels. Ideally, it should be faster than that, perhaps 5 seconds.
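To make these targets concrete, the following minimal sketch converts them into a time budget per 4×4 block, which is the unit on which both DXT1 and ETC1 operate. It assumes that one Megapixel is 10^6 pixels and that the work is spread evenly over all blocks; the function and variable names are illustrative only.

```python
# Rough per-block time budget implied by the transcoding targets in the text.
# Assumption: 1 Megapixel = 1,000,000 pixels, work spread evenly over blocks.
PIXELS = 20_000_000          # 20 Megapixels
PIXELS_PER_BLOCK = 4 * 4     # DXT1 and ETC1 both encode 4x4 pixel blocks


def per_block_budget_us(total_seconds: float) -> float:
    """Return the time available per 4x4 block, in microseconds."""
    blocks = PIXELS / PIXELS_PER_BLOCK      # 1,250,000 blocks
    return total_seconds / blocks * 1e6


upper_limit_us = per_block_budget_us(30.0)  # budget at the 30 second limit
ideal_us = per_block_budget_us(5.0)         # budget at the 5 second target
print(f"upper limit: {upper_limit_us:.1f} us/block, ideal: {ideal_us:.1f} us/block")
```

Under these assumptions the 30 second limit leaves roughly 24 microseconds per block, and the 5 second target roughly 4 microseconds per block.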
To that end, it is desired to be able to transcode DXT1 textures to ETC1 textures quickly.
It should be noted that the ETC1 encoding is beneficial under many circumstances, not only when transcoding from DXT1 data.
A problem is that current methods for transcoding DXT1 textures to ETC1 are not fast enough. Also, image quality has been sacrificed in order to obtain faster encoding.
As an example, the software package “etcpack” that Ericsson provides to Khronos users has three modes: “fast”, “medium” and “slow”. Even the “fast” mode takes around 640 seconds to encode 20 Megapixels of RGB8 data on a mobile device (exemplified by a Sony Ericsson Xperia X10 mini). This is more than 20 times the stipulated 30 seconds. A more stripped-down version of the same code, called “average”, takes about 264 seconds, but this is still more than eight times slower than necessary.
One reason the encoding takes time is that the ETC1 codec has two modes, “flipped” and “non-flipped”. To understand how this works, the ETC1 codec is described in somewhat more detail:
ETC1 compresses 4×4 blocks by treating each of them as two half-blocks. Each half-block gets a “base color”, and then the luminance (intensity) can be modified per pixel within the half-block. This is illustrated in FIG. 1.
The left image in FIG. 1 is divided into blocks that are further divided into half-blocks that are either lying or standing. Only one base color per half-block is used. In the middle image of FIG. 1, per-pixel luminance is added. The resulting image is shown in the right image of FIG. 1.
The two half-blocks can either be 2×4 pixels each, referred to as “standing” or “non-flipped” half-blocks, or they can be 4×2 pixels each, referred to as “lying” or “flipped” half-blocks.
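The two half-block layouts can be sketched as follows. This is a minimal illustration rather than the ETC1 bit format: the helper names are hypothetical, the base color is simply taken as the rounded mean of the half-block (one plausible choice, not necessarily what a given encoder does), and the quantization of the base color and the per-pixel luminance modification are omitted.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each 0..255
Block = List[List[Pixel]]      # block[row][col], 4 rows x 4 columns


def split_half_blocks(block: Block, flipped: bool) -> Tuple[List[Pixel], List[Pixel]]:
    """Split a 4x4 block into its two half-blocks.

    flipped=False: two "standing" half-blocks, 2 wide x 4 high (left, right).
    flipped=True:  two "lying" half-blocks, 4 wide x 2 high (top, bottom).
    """
    first: List[Pixel] = []
    second: List[Pixel] = []
    for row in range(4):
        for col in range(4):
            in_first = (row < 2) if flipped else (col < 2)
            (first if in_first else second).append(block[row][col])
    return first, second


def average_color(pixels: List[Pixel]) -> Pixel:
    """Illustrative base color: the rounded per-channel mean (an assumption)."""
    n = len(pixels)
    r, g, b = ((sum(p[c] for p in pixels) + n // 2) // n for c in range(3))
    return (r, g, b)


# Example block: left two columns red, right two columns blue.
block = [[(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 0, 255)] for _ in range(4)]
standing = [average_color(h) for h in split_half_blocks(block, flipped=False)]
lying = [average_color(h) for h in split_half_blocks(block, flipped=True)]
```

For this example block, the standing layout yields one pure red and one pure blue base color, whereas the lying layout mixes red and blue in both half-blocks, so each base color becomes a purple that matches neither side of the edge.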
Typically, an encoder would try both of these configurations and select the one, flipped or non-flipped, that results in the smallest error between the decoded 4×4 block and the original. However, when encoding time is restricted, there is no time to try both configurations. Instead, it has been determined to always use the “flipped” configuration. This method is called “fixed flip”, since the flip bit is fixed to 1. Compared to the “average” configuration, this cuts the compression time roughly in half, meaning that the time is reduced from 264 seconds to 132 seconds. This is still more than 4 times the desired 30 seconds. Unfortunately, for many blocks, the flipped configuration is not the optimal choice. This means that artifacts appear in the image that are clearly visible to the end user. The artifacts are very disturbing in areas where a standing, non-flipped block would have been much better than a flipped one. An example can be seen in FIG. 2.
A full original/DXT1 image is shown in FIG. 2(a). A zoomed-in view of the DXT1 image is shown in FIG. 2(b). The result of “average” compression is shown in FIG. 2(c) and the result of “fixed flip” compression is shown in FIG. 2(d). The artifacts are much larger in (d) than in (c).
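The difference between trying both configurations and using “fixed flip” can be sketched as follows. This is a simplified illustration under stated assumptions, not the etcpack implementation: the error of a configuration is approximated by the sum of squared differences between each pixel and the mean color of its half-block, whereas a real encoder would compare the fully decoded block, including quantized base colors and per-pixel luminance modifiers, against the original. All function names are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each 0..255
Block = List[List[Pixel]]      # block[row][col], 4 rows x 4 columns


def half_block_error(pixels: List[Pixel]) -> float:
    """Proxy error: sum of squared differences to the half-block mean color."""
    n = len(pixels)
    mean = [sum(p[c] for p in pixels) / n for c in range(3)]
    return sum((p[c] - mean[c]) ** 2 for p in pixels for c in range(3))


def configuration_error(block: Block, flipped: bool) -> float:
    """Total proxy error of both half-blocks for one flip setting."""
    first, second = [], []
    for row in range(4):
        for col in range(4):
            in_first = (row < 2) if flipped else (col < 2)
            (first if in_first else second).append(block[row][col])
    return half_block_error(first) + half_block_error(second)


def choose_flip(block: Block) -> bool:
    """Try both configurations; return True if "flipped" has the lower error."""
    return configuration_error(block, True) < configuration_error(block, False)


def fixed_flip(block: Block) -> bool:
    """The "fixed flip" shortcut: always use the flipped configuration."""
    return True


red, blue = (255, 0, 0), (0, 0, 255)
vertical_edge = [[red, red, blue, blue] for _ in range(4)]    # left/right split
horizontal_edge = [[red] * 4] * 2 + [[blue] * 4] * 2          # top/bottom split
```

With this proxy, `choose_flip(vertical_edge)` selects the non-flipped configuration, whose two standing half-blocks are each uniform in color, while `fixed_flip` forces the flipped configuration and so has to represent a red-to-blue edge inside each lying half-block with a single base color. Evaluating both configurations roughly doubles the per-block work, which is consistent with the halving of encoding time reported for “fixed flip”.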