1. Field of Invention
The present invention relates to a compression encoder, and more particularly to a progressive differential motion JPEG codec.
2. Description of Related Arts
With the convergence of computers, communications and media, video compression techniques have become increasingly important. Video compression is often used to translate video images (from a camera, VCR, laser disc, etc.) into digitally encoded data, which may then be easily transferred over a network such as the Internet. When desired, the compressed images are decompressed for viewing on a computer monitor or other such device. Compression is usually necessary because the limited bandwidth of the Internet constrains the transmission of digital images, and because it is desirable to decrease the access or download time of a picture. Higher compression of a digital image means that more images can be stored on a memory device (such as a diskette, hard drive or memory card) and that these images can be transferred faster over limited-bandwidth transmission lines (such as telephone lines, the Internet, etc.). The problem, however, is that higher compression makes it difficult to sustain high-quality decompressed images. Moreover, the bandwidth presently available for Internet streaming is still relatively low, and thus picture quality tends to be poor. For example, a 1 Mbps download capability (e.g., with DSL) limits a consumer's real-time data rate to 1 Mbps at best, which is insufficient to sustain high-quality images. As a result, the streaming approach generally provides a much poorer viewing experience. Thus, efficient and effective compression and decompression of images is highly important and desirable.
One of the most popular and widely used techniques of image compression is the Joint Photographic Experts Group (JPEG) standard. The JPEG standard operates by mapping an 8×8 square block of pixels into the frequency domain by using a discrete cosine transform (DCT). Coefficients obtained by the DCT are divided by a scale factor and rounded to the nearest integer (a process known as quantizing) and then mapped to a one-dimensional vector via a fixed zigzag scan pattern. This one-dimensional vector is encoded using a combination of run-length encoding and Huffman encoding.
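The per-block steps described above can be illustrated with a minimal sketch. It is not the JPEG reference implementation: it assumes a single uniform scale factor rather than a full quantization table, omits the level shift and the differential coding of the DC coefficient, and stops at a simplified run-length stage without Huffman coding. All function names are hypothetical and chosen for illustration only.

```python
import math

N = 8  # JPEG block size

def dct_2d(block):
    """Naive 2-D DCT-II of an N x N block given as a list of rows."""
    def c(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out

def quantize(coeffs, scale):
    """Divide each coefficient by a scale factor and round to an integer.
    Assumption: one uniform scale; real JPEG uses an 8 x 8 table."""
    return [[round(coeffs[u][v] / scale) for v in range(N)] for u in range(N)]

def zigzag_indices():
    """(row, column) pairs in the fixed zigzag scan order."""
    order = []
    for s in range(2 * N - 1):
        diag = [(i, s - i) for i in range(N) if 0 <= s - i < N]
        if s % 2 == 0:
            diag.reverse()  # even diagonals run from bottom-left to top-right
        order.extend(diag)
    return order

def run_length(vector):
    """Simplified run-length stage: (zeros skipped, nonzero value) pairs,
    closed by an end-of-block marker. Huffman coding is not shown."""
    pairs, run = [], 0
    for value in vector:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")
    return pairs

def encode_block(block, scale):
    """DCT, quantization, zigzag scan and run-length coding of one block."""
    quantized = quantize(dct_2d(block), scale)
    vector = [quantized[u][v] for u, v in zigzag_indices()]
    return run_length(vector)
```

For a block of constant pixel value, only the first (DC) coefficient survives quantization, so the run-length output collapses to one pair followed by the end-of-block marker. This illustrates why smooth image regions compress well under this scheme.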
FIGS. 1A and 1B are high-level block diagrams illustrating the basic operations of JPEG in the compression, transmission, and reconstruction of a source image. The source image is represented by one or more components, each of which includes an array of multi-bit pixels. A grayscale image would include a single component while a color image would include up to three components. The operations shown apply to each component.
Source image data 110 is communicated to a compression encoder 120 to provide compressed image data 160. This data may be stored as a file for subsequent retrieval and reconstruction, or it may be transmitted on some communication medium to a remote location for immediate or subsequent reconstruction. In any event, it is contemplated that the compressed image data 160 will be communicated to a decompression decoder 130 to provide reconstructed image data. The compression encoder 120 uses certain data structures for the compression, and relevant portions of these must be communicated as side information for use by the decompression decoder 130 in the image reconstruction. The particular compression technique under discussion contemplates a single set of side information that applies to the entire image component.
In the proposed JPEG standard, the compression encoder 120 includes a discrete cosine transform (DCT) stage 121, a quantizer 122, and a Huffman encoder 123. The decompression decoder 130 includes a Huffman decoder 131, a dequantizer 132, and an inverse discrete cosine transform (IDCT) stage 133. The side information includes a quantization table 140 used by the quantizer 122 or the dequantizer 132 and a set of Huffman code tables 150 used by the Huffman encoder 123 or the Huffman decoder 131.
The image data is divided into blocks of 8 pixels by 8 pixels, and each block is separately processed. Positions within an 8×8 block are denoted by a double subscript, with the first subscript referring to the row and the second subscript referring to the column.
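The block division and the row/column subscript convention can be sketched as follows. This is an illustrative sketch only: the function name `split_into_blocks` is hypothetical, and the code assumes that the image dimensions are exact multiples of the block size, whereas a practical encoder would pad the image edges.

```python
def split_into_blocks(image, size=8):
    """Split a 2-D image (list of rows) into size x size blocks.

    Returns a dict keyed by (block_row, block_col). Inside each block,
    block[i][j] is the pixel at row i, column j, matching the
    double-subscript convention (row first, column second).
    Assumption: height and width are multiples of `size`.
    """
    height, width = len(image), len(image[0])
    if height % size or width % size:
        raise ValueError("image dimensions must be multiples of the block size")
    blocks = {}
    for top in range(0, height, size):
        for left in range(0, width, size):
            blocks[(top // size, left // size)] = [
                row[left:left + size] for row in image[top:top + size]
            ]
    return blocks
```

Each block returned in this way can then be processed separately, in any order.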
In addition, the source image data could be grayscale image data or color image data. There are a number of ways in which a color image can be broken into components. Standard monitors use the RGB characterization where R, G, and B are the red, green and blue components. Standard television broadcasting (NTSC) uses the YUV characterization where Y is the luminance component and U and V are the chrominance components (approximately red and blue). Printers use the CMYK characterization where C, M, Y, and K are the cyan, magenta, yellow, and black components. The CCIR 601 standard describes a linear transformation between the RGB characterization and the YUV characterization. As shown in FIG. 1A, the source image data 110 are linearly transformed from RGB domain into YUV domain. As shown in FIG. 1B, the decompressed image data are linearly transformed from YUV domain into RGB domain.
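The linear transformation between the two characterizations can be illustrated as follows. The text above does not state the coefficients, so the values here are assumptions: the luminance weights are those of CCIR 601, and the chrominance scaling and the offset of 128 follow the full-range convention commonly used with JPEG files. The function names are hypothetical.

```python
def rgb_to_yuv(r, g, b):
    """Forward transform for one pixel with 8-bit-range inputs (0..255).
    Assumed coefficients: CCIR 601 luminance weights, JFIF-style chrominance."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = (b - y) / 1.772 + 128  # blue-difference chrominance
    v = (r - y) / 1.402 + 128  # red-difference chrominance
    return y, u, v

def yuv_to_rgb(y, u, v):
    """Inverse transform, recovering R and B first and then G from Y."""
    r = y + 1.402 * (v - 128)
    b = y + 1.772 * (u - 128)
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b
```

Because both directions are linear, the transform is itself lossless apart from rounding; any loss arises in the later quantization stage.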
Although JPEG is a popular and widely used compression technique, it has several disadvantages. For example, one disadvantage of JPEG is that at low bit rates the DCT produces irregularities and discontinuities in a reconstructed image (known as tiling or blocking artifacts). Blocking artifacts cause the boundary between groups of 8×8 blocks of pixels to become visible in the reconstructed image. These blocking artifacts cause an undesirable degradation in image quality. Another disadvantage of JPEG is that JPEG cannot perform image reconstruction that is progressive in fidelity. In other words, if an image is encoded at a certain fidelity and a lower fidelity is later desired (for example, due to limited bandwidth or storage availability), the image must be decoded and re-encoded.
In order to overcome these shortcomings of JPEG, most modern image compression techniques use a wavelet transform followed by quantization and entropy encoding. The wavelet transform (WT) is preferred over the DCT used in JPEG because WT does not produce blocking artifacts and allows for image reconstruction that is progressive in resolution. Moreover, WT leads to better energy compaction and thus better distortion/rate performance than the DCT.
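Reconstruction that is progressive in resolution can be illustrated with a one-level, one-dimensional sketch. The text above does not name a particular wavelet; the Haar wavelet in its unnormalized averaging form is assumed here purely for simplicity, and the function names are hypothetical.

```python
def haar_step(signal):
    """One level of an unnormalized Haar transform on an even-length list.
    `approx` is a half-resolution version of the signal; `detail` holds
    what must be added back to recover full resolution."""
    half = len(signal) // 2
    approx = [(signal[2 * i] + signal[2 * i + 1]) / 2 for i in range(half)]
    detail = [(signal[2 * i] - signal[2 * i + 1]) / 2 for i in range(half)]
    return approx, detail

def haar_inverse(approx, detail):
    """Exactly invert haar_step."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([a + d, a - d])
    return out
```

A decoder that has received only `approx` can already display a half-resolution signal, and can refine it once `detail` arrives. For smooth input the detail values are small, which reflects the energy compaction mentioned above.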
Most current WT-based compression techniques decompose an image into coefficients and use some form of entropy encoding (such as adaptive Huffman encoding or arithmetic encoding) of the coefficients to further compress the image. These types of encoding, however, can be quite complex and use, for example, complex symbol tables (such as in adaptive Huffman encoding) or complex data structures (such as zerotree data structures) that depend on the data types. Thus, most current WT-based techniques are complex and difficult to implement.
The present invention addresses the above needs by providing decompressed images of high quality, lower latency arising from the compression procedure, the decompression procedure and the network, and smooth network loading.