Image compression reduces the amount of data necessary to represent a digital image by eliminating spatial and/or temporal redundancies in the image information. Compression is necessary in order to efficiently store and transmit still and video image information. Without compression, most applications in which image information is stored and/or transmitted would be rendered impractical or impossible.
Generally speaking, there are two types of compression: lossless and lossy. Lossless compression reduces the amount of image data stored and transmitted without any information loss, i.e., without any loss in the quality of the image. Lossy compression reduces the amount of image data stored and transmitted with at least some information loss, i.e., with at least some loss of quality of the image.
Lossy compression is performed with a view to meeting a given available storage and/or transmission capacity. In other words, external constraints for a given system may define a limited storage space available for storing the image information, or a limited bandwidth (data rate) available for transmitting the image information. Lossy compression sacrifices image quality in order to fit the image information within the constraints of the given available storage or transmission capacity. It follows that, in any given system, lossy compression would be unnecessary if sufficiently high lossless compression ratios could be achieved, because a sufficiently high lossless compression ratio would enable the image information to fit within the constraints of the given available storage or transmission capacity without information loss.
The vast majority of compression standards in existence today relate to lossy compression. These techniques typically rely on transforms such as the discrete cosine transform (DCT) or wavelet transforms, and they tend to lose high frequency information when bandwidth is limited. The “edges” of images typically contain very high frequency components because they have drastic gray level changes, i.e., their dynamic range is very large. Edges also have high resolution. Loss of edge information is undesirable because resolution is lost along with the high frequency information. Furthermore, human cognition of an image is primarily dependent upon edges or contours. If this information is eliminated in the compression process, the human ability to recognize the image decreases.
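The link between discarded high frequency coefficients and softened edges can be illustrated with a minimal sketch. It assumes a one-dimensional step edge and a hand-written orthonormal DCT; the helper names (`dct`, `idct`, `low_pass`, `sharpest_step`) are hypothetical, and zeroing coefficients outright is a simplification of the quantization that real codecs apply.

```python
import math

def dct(signal):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(signal)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(signal)))
    return out

def idct(coeffs):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = coeffs[0] * math.sqrt(1.0 / n)
        total += sum(coeffs[k] * math.sqrt(2.0 / n)
                     * math.cos(math.pi * (i + 0.5) * k / n)
                     for k in range(1, n))
        out.append(total)
    return out

def low_pass(signal, keep):
    """Keep only the `keep` lowest-frequency coefficients, zero the rest."""
    coeffs = dct(signal)
    truncated = coeffs[:keep] + [0.0] * (len(coeffs) - keep)
    return idct(truncated)

def sharpest_step(signal):
    """Largest gray-level jump between neighbouring samples."""
    return max(abs(b - a) for a, b in zip(signal, signal[1:]))

# A hard edge: 8 black samples followed by 8 white samples.
edge = [0.0] * 8 + [255.0] * 8
blurred = low_pass(edge, keep=4)

print("sharpest step, original:", sharpest_step(edge))
print("sharpest step, 4 of 16 coefficients kept:", round(sharpest_step(blurred), 2))
```

With all sixteen coefficients the edge is reconstructed exactly; with only the four lowest, the single sharp jump is spread over several samples, which is the loss of edge resolution described above.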
Fractal compression, though better than most, suffers from high transmission bandwidth requirements and slow coding algorithms. Another type of motion (video) image compression technique is the ITU-recommended H.261 standard for videophone/videoconferencing applications. It operates at integer multiples of 64 kbps. Its segmentation and model-based methodology splits an image into several regions of specific shapes, and then encodes the contour parameters representing the region boundaries and the texture parameters approximating the region pixels. A basic difficulty with the segmentation and model-based approach is the low image quality associated with estimating parameters in 3-D space in order to impart naturalness to the 3-D image. The shortcomings of this technique are obvious to those who have used videophone/videoconferencing applications, particularly in comparison with MPEG video compression.
Standard MPEG video compression is accomplished by sending an “I frame” representing motion every fifteen frames regardless of video content. Introducing I frames asynchronously into the video bitstream in the encoder is wasteful and introduces artifacts because there is no correlation between the I frames and the B and P frames of the video. In particular, if an I frame is inserted among B and P frames containing no motion, bandwidth is wasted because the I frame was essentially unnecessary yet consumes significant bandwidth because of its full content. On the other hand, if no I frame is inserted where there is a lot of motion in the video bitstream, the resulting errors and artifacts are so significant that the available bandwidth is exceeded. Because the bandwidth is exceeded, data is dropped, which creates the unwanted blocking effect in the video image. In the desired case, if an I frame is inserted where there is motion (which is where an I frame is desired and necessary), the B and P frames will already be correlated to the new motion sequence and the video image will be satisfactory. This, however, happens only a portion of the time in standard compression techniques like MPEG. Accordingly, it would be extremely beneficial to insert I frames only where warranted by video content.
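The contrast between fixed-interval and content-driven I frame placement can be sketched as follows. This is only an illustration under stated assumptions: frames are flat lists of gray levels, "motion" is approximated by the mean absolute difference between consecutive frames, and the function names and the threshold value are hypothetical. It is not an MPEG encoder API and is not presented as the method of any particular codec.

```python
def mean_abs_diff(frame_a, frame_b):
    """Average absolute gray-level difference between two equal-size frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)

def fixed_interval_i_frames(num_frames, interval=15):
    """I frame positions when one is sent every `interval` frames."""
    return [i for i in range(num_frames) if i % interval == 0]

def content_driven_i_frames(frames, threshold):
    """I frame positions when one is sent only after a large frame change."""
    positions = [0]  # the first frame has nothing to be predicted from
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold:
            positions.append(i)
    return positions

# Toy sequence: 20 identical dark frames, then 20 identical bright frames.
scene_a = [10] * 64
scene_b = [200] * 64
frames = [scene_a] * 20 + [scene_b] * 20

print("fixed interval :", fixed_interval_i_frames(len(frames)))
print("content driven :", content_driven_i_frames(frames, threshold=50))
```

On this toy sequence the fixed schedule places I frames at positions 0, 15, and 30, none of which coincides with the scene change at frame 20, whereas the content-driven rule places them at 0 and 20 only.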
The compression rates required in many applications, including tactical communications, are extremely high, as shown in the following example, making maximal compression of critical importance. Assuming 512 × 512 pixels, 8-bit gray level, and a 30 Hz full-motion video rate, a bandwidth of approximately 60 Mbps is required. To compress such full-motion uncompressed video from 60 Mbps into the required data rate of 128 kbps, a 468:1 still image compression rate is required. The situation is even more extreme for VGA full-motion video, which requires 221 Mbps and thus a 1726:1 motion video compression rate. Such compression rates, of course, greatly exceed any compression rate achievable by state of the art technology for reasonable PSNR (peak signal-to-noise ratio) values of approximately 30 dB. For example, the fourth public release of JPEG has only a 30:1 compression rate and the image has many artifacts due to a PSNR of less than 20 dB, while H.320 has a 300:1 compression ratio for motion and still contains many still/motion image artifacts.
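The arithmetic behind these figures can be checked with a short sketch. It assumes decimal units (1 kbps = 1,000 bits per second) and, for the VGA case, a 640 × 480 frame at 24 bits per pixel; the text does not state the VGA bit depth, so that value is an assumption chosen because it reproduces the quoted 221 Mbps. The function names are hypothetical.

```python
def raw_bit_rate(width, height, bits_per_pixel, frames_per_second):
    """Uncompressed video bit rate in bits per second."""
    return width * height * bits_per_pixel * frames_per_second

def required_ratio(raw_bps, channel_bps):
    """Compression ratio needed to fit raw_bps into channel_bps."""
    return raw_bps / channel_bps

CHANNEL_BPS = 128_000  # 128 kbps target channel, decimal units assumed

gray = raw_bit_rate(512, 512, 8, 30)    # 8-bit gray-level case from the text
vga = raw_bit_rate(640, 480, 24, 30)    # 24 bits per pixel is an assumption

for name, bps in (("512 x 512 gray", gray), ("VGA", vga)):
    print(f"{name}: {bps / 1e6:.1f} Mbps raw, "
          f"{required_ratio(bps, CHANNEL_BPS):.0f}:1 to reach 128 kbps")
```

Under these assumptions the gray-level case comes to about 62.9 Mbps (roughly 490:1) and the VGA case to about 221.2 Mbps (1728:1). The small differences from the 468:1 and 1726:1 figures quoted above come from rounding and from whether "mega" is read as 10^6 or 2^20.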
The requirement is even more stringent when continuity of communication must be maintained while a degraded power budget or multi-path errors in wireless media reduce the allowable data rate to far below 128 kbps. Consequently, state of the art technology is far from providing multi-media parallel channelization and continuity of communication at data rates equal to or lower than 128 kbps.
Very high compression rates, high image quality, and low transmission bandwidth are critical to modern communications, including satellite communications, which require full-motion, high-resolution video and the ability to preserve high-fidelity digital image transmission within a small bandwidth communication channel (e.g., T1). Unfortunately, due to the above limitations, state of the art compression techniques are not able to transmit high quality video in real time on a band-limited communication channel. As a result, it is evident that a compression technique for both still and moving pictures that has a very high compression rate, high image quality, low transmission bandwidth, and a very fast decompression algorithm would be of great benefit. A compression technique having the above characteristics that also preserves high frequency components as well as edge resolution would be particularly useful.
In addition to transmission or storage of compressed still or moving images, another area where the state of the art is unsatisfactory is automatic target recognition (ATR). There are numerous applications, both civilian and military, which require the fast recognition of objects or humans amid significant background noise. Two types of ATR are used for this purpose: soft ATR and hard ATR. Soft ATR is used to recognize general categories of objects such as tanks, planes, or humans, whereas hard ATR is used to recognize specific types or models of objects within a particular category. Existing methods of both soft and hard ATR are Fourier transform-based. These methods are lacking in that Fourier analysis eliminates desired “soft edge” or contour information, which is critical to human cognition. Improved methods are therefore needed to achieve more accurate recognition of general categories of objects by preserving critical “soft edge” information while reducing the amount of data used to represent such objects, thereby greatly decreasing processing time, increasing compression rates, and preserving image quality.