1. Field of the Invention
The present invention relates primarily to the field of software, and in particular to a method and apparatus for a dynamic bandwidth adaptive image compression and de-compression scheme.
Portions of the disclosure of this patent document contain material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office file or records, but otherwise reserves all rights whatsoever.
2. Background Art
Computers are often used to process, play back, and display data, especially data that contains images, commonly termed video data. This video data may come from sources such as storage devices, on-line services, VCRs, cable systems, broadcast television tuners, etc. Video data is memory intensive; that is, it requires large amounts of memory for storage and use by a computer system. CD-ROMs and DVD-ROMs provide one solution to the problem of storing large amounts of data. However, even the storage capacity of a CD-ROM or a DVD-ROM can be exceeded when storing motion-picture-length video data.
To reduce the transmission bandwidth and memory requirements when working with video data, various compression schemes have been developed so that less storage space is needed to store video information and a smaller bandwidth is needed to transmit it. Prior art video compression schemes include Motion JPEG, MPEG-4, QuickTime, etc.
Compression
Compression is a scheme for reducing the amount of information required to represent data, and is mainly applied to data that contains images, sounds, and graphics, or to files that are too large. Data compression schemes are used, for example, to reduce the size of a data file so that it can be stored in a smaller memory space. Data compression schemes may also be used to compress data prior to its transmission from one site to another, reducing the amount of time required to transmit the data. This second use is particularly important in packet-switched networks like the Internet, where bandwidth is limited.
To access the compressed data, it is first decompressed into its original form. A compressor/de-compressor, commonly known as a codec, is typically used to perform the compression and decompression of data. Some common codecs are the 2D Run Length Encoding scheme, Entropy Encoding scheme (which covers compression schemes like gzip and LZW, etc.), and Discrete Cosine Transform (DCT). One measure of the performance or efficiency of a codec is its “compression ratio”. Compression ratio refers to the ratio of the number of bits of uncompressed data to the number of bits of compressed data. Compression ratios may be 2:1, 3:1, etc.
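As an illustration of the lossless run length encoding mentioned above (a sketch for exposition, not part of any claimed invention), a minimal run-length codec and its compression ratio might look like the following; the input data and run-length cap are assumptions chosen for clarity:

```python
# Minimal run-length encoding (RLE) sketch: a simple lossless codec.
# Real codecs (gzip, LZW, etc.) are far more elaborate.

def rle_encode(data):
    """Encode a sequence as (count, value) pairs, runs capped at 255."""
    encoded = []
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        encoded.append((run, data[i]))
        i += run
    return encoded

def rle_decode(pairs):
    """Expand (count, value) pairs back into the original sequence."""
    out = []
    for count, value in pairs:
        out.extend([value] * count)
    return out

original = [7] * 100 + [3] * 50          # a highly repetitive "image row"
packed = rle_encode(original)
assert rle_decode(packed) == original    # lossless: exact round trip
# Compression ratio = uncompressed size / compressed size
# (here each pair is counted as two values)
ratio = len(original) / (2 * len(packed))
print(f"compression ratio {ratio:.1f}:1")  # 37.5:1 for this input
```

Note that the achieved ratio depends entirely on how repetitive the input is; on data with no runs, this scheme expands rather than compresses.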
Data compression may also be required when the input/output rate of a particular data receiver is less than the data rate of the transmitted data. This can occur when providing video data to computer systems. Video data with a frame size of 320×240 pixels is provided at rates approaching 7 megabytes per second. This rate is greater than the rate of commonly used I/O subsystems of personal computers. Some approximate representative rates of common I/O subsystems found on personal computers are:
Serial Communications: 1–2 kilobytes/sec
ISDN: 8–16 kilobytes/sec
Ethernet: 1–10 megabytes/sec
CD-ROM: 0.15–4.8 megabytes/sec
SCSI Disk: 0.5–40 megabytes/sec
Another measure of video codec compression ratio is the average compressed bits-per-pixel. This measure is useful in describing video compression because different conventions are used for calculating the size of uncompressed video, i.e., some use 24 bits-per-pixel RGB (Red-Green-Blue), and others use 4:2:2 sub-sampled 16 bits-per-pixel YUV (one luminance and two chrominance components). The averaging accounts for potentially different strategies employed for frames in a sequence. The bandwidth requirement for a sequence of frames is calculated by multiplying the average compressed bits-per-pixel by the number of pixels in each encoded frame and by the number of frames per second.
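The bandwidth arithmetic can be sketched as follows (illustrative figures only, assuming bandwidth = average compressed bits-per-pixel × pixels per frame × frames per second; the 0.5 bit-per-pixel average is a hypothetical value, not a measurement):

```python
# Bandwidth estimate for a compressed video stream:
# average compressed bits-per-pixel x pixels per frame x frames per second.

def bandwidth_bits_per_sec(avg_bits_per_pixel, width, height, fps):
    """Bits per second needed to carry the compressed sequence."""
    return avg_bits_per_pixel * width * height * fps

# Hypothetical example: 0.5 bit/pixel average, 320x240 frames, 30 frames/sec
bps = bandwidth_bits_per_sec(0.5, 320, 240, 30)
print(bps, "bits/sec")            # 1,152,000 bits/sec
print(bps / 8 / 1e6, "megabytes/sec")  # 0.144 megabytes/sec
```

At this assumed rate the stream would fit comfortably within the ISDN-to-Ethernet range listed above, which is the point of compressing in the first place.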
Limitations of Compression Schemes and Algorithms
There are two types of conventional compression schemes: lossy (irreversible) and lossless (reversible). Lossless compression schemes, also called coding schemes, compress and decompress each image frame without the loss of any pixel data: the coding algorithms yield decompressed images identical to the original digitized images, so the image frame can be decompressed without degrading its visual quality. These schemes are generally required in applications where the pictures are subjected to further processing, e.g. for the purpose of extracting specific information. Lossy schemes, in contrast, suffer a loss of image information and result in a decrease in the quality of the image on decompression (image reproduction).
Certain conventional lossy image compression schemes achieve either better reproduction quality or better compression ratios on images having certain visual attributes. For example, one lossy compression scheme, the vector quantization scheme, works best when images have limited color palettes, or have regions with limited color palettes. Another lossy compression scheme, motion compensation compression, achieves better compression ratios and reproduction quality when portions of the image on a frame are translated portions of a previous frame.
Thus, some portions of a video frame have visual attributes that are better suited for one type of lossy compression as compared to another. Therefore, if a single compression scheme is applied to each image frame in a video, some portions of the image may have a degraded reproduction quality as compared to other portions better suited to the applied compression scheme.
Nearly all video compression schemes are lossy, i.e., information is inevitably discarded in the compression process. A measure of quality is how much of this lost information is noticed by a human observer. However, there is not a consistent, objective model of human perception that can be applied. A simple, concrete, quality metric that is frequently used is the Mean-Squared-Error (MSE) that measures the error on a per-pixel basis from the uncompressed original.
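The per-pixel MSE metric described above can be sketched as follows (a pure-Python illustration on flattened pixel lists; a practical system would operate on full image arrays):

```python
# Mean-Squared-Error (MSE) between an original image and its lossy
# reconstruction, computed per pixel as described in the text.

def mse(original, decompressed):
    """Average of squared per-pixel differences."""
    assert len(original) == len(decompressed)
    return sum((a - b) ** 2 for a, b in zip(original, decompressed)) / len(original)

original      = [10, 20, 30, 40]
reconstructed = [10, 22, 29, 40]   # lossy: two pixels were altered
print(mse(original, reconstructed))  # (0 + 4 + 1 + 0) / 4 = 1.25
```

As the text notes, MSE is concrete and objective but only loosely tracks what a human observer actually perceives; two reconstructions with equal MSE can look very different.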
Most lossy compression schemes are designed for the human visual system and may destroy some of the information required during processing. Thus, images from digital radiology in medicine or from satellites in space are usually compressed by reversible methods. Lossless compression is generally the choice also for images obtained at great cost, for which it may be unwise to discard any information that later may be found to be necessary, or in applications where the desired quality of the rendered image is unknown at the time of acquisition, as may be the case in digital photography. In addition, lossless may be preferred over lossy in applications where intensive editing or repeated compression/decompression is required: the accumulation of error due to a lossy iteration may become unacceptable.
Most compression algorithms are computationally complex, which limits their application since very complex algorithms often require expensive hardware to assist in the compression. A useful number for measuring the computational complexity of software-based compression algorithms is MIPS per megapixel/sec, i.e., essentially instructions/pixel. For example, an algorithm just capable of compressing 320×240 pixels per frame at 30 frames per second on a 40 MIPS machine has a computational complexity of 40,000,000/(320×240×30) ≅ 17 instructions/pixel.
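The instructions-per-pixel figure works out as follows (a direct restatement of the worked example above):

```python
# Computational complexity of a software codec, expressed as
# instructions per pixel: MIPS budget divided by pixel throughput.

def instructions_per_pixel(mips, width, height, fps):
    """Instructions available per pixel at the given frame rate."""
    return (mips * 1_000_000) / (width * height * fps)

# 40 MIPS machine, 320x240 frames, 30 frames/sec
ipp = instructions_per_pixel(40, 320, 240, 30)
print(round(ipp))  # approximately 17 instructions/pixel
```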
It is desirable to use a compression and decompression scheme which yields a homogeneous image after decompression. A homogeneous image is one having a consistent reproduction quality across the image. When an image is not homogeneous, the areas that have a particularly degraded reproduction quality attract an observer's attention more readily than the better reproduced areas. Thus, there is a need for an efficient image compression system and method that will compress images and produce a homogeneous effect across each frame on decompression.
Another disadvantage of existing prior art compression schemes is their inability to provide adequate quality of playback in terms of format (spatial resolution), frame rate (temporal resolution) and color fidelity. In addition, existing prior art schemes do not adequately compensate for the low data output rate of CD-ROMs or DVD-ROMs.
With respect to spatial resolution, many prior art schemes do not provide a “full screen” of video output. Here, full screen is defined as 640×480 color pixels. Many prior art compression schemes provide a small “box” that displays video data. Such small displays are difficult to view, and do not provide adequate playback of video data. With respect to temporal resolution, many of the prior art schemes provide “choppy” playback of video data, with jerky motion, and pauses in playback while new frame data is being generated.
Many source images include high resolution color information. For example, the source image may have a color resolution of 15, 24, or 32 bits per pixel. Many computer systems are only capable of providing 8 bit per pixel color output. This requires that the large number of colors of the source image be mapped to a smaller number of colors that can be displayed by the computer system. This step involves the use of a color look-up table (LUT). Prior art compression schemes typically rely on the host computer system to provide a color LUT. These color LUTs are generally not optimized for the particular source image, resulting in unsatisfactory color display.
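The color LUT mapping described above can be sketched as follows; the five-entry palette, the image pixels, and the nearest-match rule (squared RGB distance) are all assumptions chosen for illustration, and a real system would use a 256-entry palette and a faster search:

```python
# Mapping high-resolution 24-bit RGB pixels through a color look-up
# table (LUT): each pixel is replaced by the index of the nearest
# palette entry, measured by squared distance in RGB space.

def nearest_index(pixel, palette):
    """Index of the palette color closest to pixel."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(palette)), key=lambda i: dist2(pixel, palette[i]))

# Toy palette: black, red, green, blue, white
palette = [(0, 0, 0), (255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)]
image = [(250, 10, 5), (12, 240, 30), (200, 200, 210)]
indexed = [nearest_index(p, palette) for p in image]
print(indexed)  # [1, 2, 4] -- each pixel mapped to a palette index
```

The quality complaint in the text corresponds to the palette itself: a generic host-supplied palette forces large distances for colors common in the particular source image, whereas a palette optimized for that image keeps those distances small.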
Another disadvantage of prior art compression schemes is that they are either “symmetrical” or “asymmetrical”. Symmetry refers to the ratio of the computational complexity of compression to that of decompression. Codecs are frequently designed with a greater computational load on the compressor than the de-compressor, i.e., they are asymmetric. While this may be a reasonable strategy for “create-once, play-many” video sequences, it limits the range of applications for the codec. Asymmetric compression schemes are not suitable for teleconferencing, for example, since teleconferencing requires essentially real-time processing and substantially equivalent compression and decompression rates. On the other hand, symmetrical compression schemes attempt to compress the data in the same time it takes to display the data. Typically, symmetrical compression schemes compress the data in a single pass, in real time, or as close to it as possible. This limits the performance of the scheme, especially on a network like the Internet, where not only the total number of users is unknown prior to the transfer of data, but that number constantly changes. If there are more users than the network can handle, the network can stall in order to accommodate the requests of all the users. This increases the time to transfer data and makes a symmetrical compression scheme undesirable.
Compression Scheme: Wavelet Transform
Sub-band encoding schemes like the Wavelet Transform rely upon the observation that the signals representing the lower-frequency regions of an image carry more power than those representing the higher-frequency regions. Since the lower-frequency regions contain more visual information, a larger number of bits is assigned to represent them. One advantage of the Wavelet Transform scheme is that it almost eliminates 'blocking': rather than operating on a predetermined number of image blocks, as some compression schemes do, it operates on a contiguous image, which also reduces the computational load of compression. Another advantage of the Wavelet Transform is that, to compress an image, each portion can be assigned a predetermined number of bits based on the power of the CPU. A further advantage of the Wavelet Transform is its use of a discrete and orthogonal frequency filter of a predetermined characteristic. Using this filter, the scheme provides multi-resolution expressions and a zonally variable basis, which is yet another advantage: as the scheme is recursively repeated, the resolution is reduced by half. In addition, the band-dividing characteristic of the filters used in the scheme allows octave divisions.
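The halving of resolution and the octave band division can be illustrated with one level of a one-dimensional Haar wavelet transform, the simplest member of the wavelet family (an expository sketch using the unnormalized average/difference variant; practical codecs use longer filters and operate in two dimensions):

```python
# One level of a 1-D Haar wavelet transform: splits a signal into a
# half-resolution low-frequency band (pairwise averages) and a
# high-frequency detail band (pairwise differences). Recursing on the
# low band halves the resolution each time -- the octave divisions
# referred to in the text.

def haar_level(signal):
    """Return (low band, high band) for an even-length signal."""
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high

def haar_inverse(low, high):
    """Reconstruct the original signal exactly from the two bands."""
    out = []
    for a, d in zip(low, high):
        out.extend([a + d, a - d])
    return out

row = [9, 7, 3, 5, 6, 10, 2, 6]          # one "image row"
low, high = haar_level(row)
assert haar_inverse(low, high) == row    # the transform is invertible
print(low)   # [8.0, 4.0, 8.0, 4.0]  -- half-resolution version of the row
print(high)  # [1.0, -1.0, -2.0, -2.0] -- detail coefficients
```

Compression enters when bits are allocated: the low band, which carries most of the power, is coded finely, while the detail coefficients are coded coarsely or discarded.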
Despite the above-described advantages of the Wavelet Transform compression scheme, its compression ratio leaves room for improvement, especially for large image data. One reason improvement is needed is the increased use of computer graphics that can be easily downloaded or transferred via packet-switched networks like the Internet. Even though prior art schemes, like Wavelets, adjust the compression factor dynamically to match the available bandwidth, none of the prior art schemes use techniques whereby static sections of an image are compressed differently than dynamic sections, or use the least amount of CPU time to compress and de-compress the image. In other words, prior art schemes use a static approach to compress and de-compress an image, which results not only in unnecessary use of CPU time, but also in unacceptable delays when transmitting the image over a network like the Internet.