The present invention relates generally to compression and decompression of data. More specifically, the present invention relates to a fast, low-complexity video coder/decoder.
A number of important image processing applications require a very low-cost, fast, good-quality video codec (coder/decoder) implementation that achieves a good compression ratio. In particular, a low-cost and fast implementation is desirable for low bit rate video applications such as video cassette recorders (VCRs), cable television, cameras, set-top boxes and other consumer devices. It is also often desirable for such a codec to be implemented on a low-cost, relatively small, single integrated circuit.
In general, an image transform codec consists of three steps: 1) a reversible transform, often linear, of the pixels for the purpose of decorrelation, 2) quantization of the transform values, and 3) entropy coding of the quantized transform coefficients. In general, a fast, low cost codec is desirable that would operate on any string of symbols (bits, for example) and not necessarily those produced as part of an image transform. For purposes of illustration, though, and for ease of understanding by the reader, a background is discussed in the context of compression of video images, although the applicability of the invention is not so limited.
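The three steps above can be illustrated with a minimal sketch. It assumes an integer Haar (S-transform) as the reversible transform, a shift-based uniform quantizer, and an empirical entropy estimate standing in for an actual entropy coder. These choices and all function names are illustrative assumptions and are not taken from the invention.

```python
import math
from collections import Counter


def haar_forward(pixels):
    """Step 1: reversible integer Haar transform on pixel pairs (adds and shifts only)."""
    lows, highs = [], []
    for a, b in zip(pixels[0::2], pixels[1::2]):
        d = a - b           # detail: difference of the pair
        s = b + (d >> 1)    # approximation: floor of the pair's average
        lows.append(s)
        highs.append(d)
    return lows, highs


def haar_inverse(lows, highs):
    """Exact inverse of haar_forward."""
    pixels = []
    for s, d in zip(lows, highs):
        b = s - (d >> 1)
        a = b + d
        pixels.extend([a, b])
    return pixels


def quantize(values, shift):
    """Step 2: uniform quantization that drops `shift` low-order bits of each magnitude."""
    return [(abs(v) >> shift) * (1 if v >= 0 else -1) for v in values]


def empirical_entropy(symbols):
    """Step 3 stand-in: bits per symbol that an ideal entropy coder would approach."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


pixels = [100, 102, 101, 99, 50, 52, 200, 198]
lows, highs = haar_forward(pixels)
quantized = quantize(highs, 2)
```

The transform alone is lossless (`haar_inverse(lows, highs)` returns the original pixels); information is discarded only in the quantization step.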
A brief background on video images will now be described. FIG. 1 illustrates a prior art image representation scheme that uses pixels, scan lines, stripes and blocks. Frame 12 represents a still image produced from any of a variety of sources such as a video camera, a television, a computer monitor, etc. In an imaging system where progressive scan is used, each image 12 is a frame. In systems where interlaced scan is used, each image 12 represents a field of information. Image 12 may also represent other breakdowns of a still image depending upon the type of scanning being used. Information in frame 12 is represented by any number of pixels 14. Each pixel in turn represents digitized information and is often represented by 8 bits, although each pixel may be represented by any number of bits.
Each scan line 16 includes any number of pixels 14, thereby representing a horizontal line of information within frame 12. Typically, groups of 8 horizontal scan lines are organized into a stripe 18. A block of information 20 is one stripe high by a certain number of pixels wide. For example, depending upon the standard being used, a block may be 8×8 pixels, 8×32 pixels, or any other size. In this fashion, an image is broken down into blocks and these blocks are then transmitted, compressed, processed or otherwise manipulated depending upon the application. In NTSC video (a television standard using interlaced scan), for example, a field of information appears every 60th of a second, a frame (including 2 fields) appears every 30th of a second, and the continuous presentation of frames of information produces a picture. On a computer monitor using progressive scan, a frame of information is refreshed on the screen every 30th of a second to produce the display seen by a user.
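The breakdown of a frame into stripes and blocks can be sketched as follows. The frame is assumed to be a list of scan lines, each a list of pixel values. The function name and its parameters are hypothetical and chosen only for illustration.

```python
def split_into_blocks(frame, stripe_height=8, block_width=8):
    """Yield (stripe_index, block_index, block) for a frame given as a list of scan lines."""
    for top in range(0, len(frame), stripe_height):
        stripe = frame[top:top + stripe_height]      # a group of scan lines
        width = len(stripe[0])
        for left in range(0, width, block_width):
            # A block is one stripe high by block_width pixels wide.
            block = [line[left:left + block_width] for line in stripe]
            yield top // stripe_height, left // block_width, block


# A synthetic frame of 16 scan lines, 32 pixels each, with 8-bit pixel values.
frame = [[(y * 32 + x) % 256 for x in range(32)] for y in range(16)]
blocks = list(split_into_blocks(frame))
```

With the default 8×8 block size, this frame yields 2 stripes of 4 blocks each; passing `block_width=32` gives one 8×32 block per stripe.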
As mentioned earlier, compression of such video images (for example) involves transformation, quantization and encoding. Many prior art encoding techniques are well known, including arithmetic coding. Arithmetic coding is extremely effective and achieves nearly the highest compression, but at a cost: it is computationally intensive, requires multipliers (and therefore more gates) when implemented in hardware, and runs longer when implemented in software. As such, coders that perform only shifts and adds, without multiplication, are often desirable for implementation in hardware.
One such coder is the Z-coder, described in "The Z-Coder Adaptive Coder," L. Bottou, P. G. Howard, and Y. Bengio, Proceedings of the Data Compression Conference, pp. 13-22, Snowbird, Utah, March 1998. The Z-coder described achieves high compression without the use of multipliers. Although the Z-coder described in the above paper has the promise to be an effective codec, it may not perform as well as described.
Therefore, a compression technique for data in general and for video in particular is desirable which may be implemented in hardware of modest size and very low cost. It would be further desirable for such a compression technique to take advantage of the benefits provided by the Z-coder.
To achieve the foregoing, and in accordance with the purpose of the present invention, a modified Z-coder is disclosed that achieves low cost, fast compression and decompression of data.
A fast, low-complexity, entropy-efficient video coder for wavelet pyramids is described, although the invention is not limited to video compression nor to a transform using wavelets. This coder approaches the entropy-limited coding rate of video wavelet pyramids, is fast in both hardware and software implementations, and has low complexity (no multiplies) for use in ASICs. It uses a modified Z-coder to code the zero/non-zero significance function and Huffman coding for the non-zero coefficients themselves. Adaptation is not required. There is a strong speed-memory trade-off for the Huffman tables, allowing the coder to be customized to a variety of platform parameters.
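As an illustration of Huffman coding applied only to the non-zero coefficients, the following minimal sketch builds a prefix code from symbol counts using the textbook construction. It does not reproduce the Huffman tables of the invention. The function name and the sample coefficient values are assumptions made for the example.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from the symbol frequencies."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Each heap entry: (count, tie-breaker, partial code table for that subtree).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c0, _, t0 = heapq.heappop(heap)   # two least frequent subtrees
        c1, _, t1 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t0.items()}
        merged.update({s: "1" + code for s, code in t1.items()})
        heapq.heappush(heap, (c0 + c1, tie, merged))
        tie += 1
    return heap[0][2]


# Quantized coefficients; zeros are left to the significance function.
coeffs = [0, 1, 0, 0, 1, -1, 0, 1, 2, 0, 1, -1, -3, 0]
nonzero = [c for c in coeffs if c != 0]
code = huffman_code(nonzero)
bits = "".join(code[c] for c in nonzero)
```

Because the code table is fixed once built, no adaptation is needed during coding; a larger lookup table indexed by several bits at once would decode faster at the cost of memory, which is one way to view the speed-memory trade-off.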
The present invention is implementable in a small amount of silicon area, at a modest cost in coding efficiency. With only 15% of the coefficients requiring coding of the coefficient value, identifying that minority of values quickly and efficiently via the significance function is an important step. The average run of correct prediction of significance values is about 20, so efficient run coding is important. While the importance of the 3 bits of context and the asymmetry strongly indicate the use of an arithmetic coder, an arithmetic coder can be too costly.
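The quantities discussed above (the fraction of significant coefficients and the lengths of runs of correct significance prediction) can be measured with a sketch such as the following. The predictor used here, "same as the previous significance bit", is a deliberately simple stand-in assumed for illustration. It is not the 3-bit context predictor referred to above, and the function names are hypothetical.

```python
def significance_map(coeffs):
    """1 where a coefficient is non-zero (significant), 0 where it is zero."""
    return [1 if c != 0 else 0 for c in coeffs]


def correct_prediction_runs(sig):
    """Lengths of maximal runs in which the previous bit correctly predicted the current bit.

    Assumed predictor: the previous significance bit, taken as 0 before the first bit.
    """
    runs, current, prev = [], 0, 0
    for bit in sig:
        if bit == prev:
            current += 1              # prediction was correct; extend the run
        else:
            if current:
                runs.append(current)  # a misprediction ends the run
            current = 0
        prev = bit
    if current:
        runs.append(current)
    return runs


coeffs = [0, 0, 0, 5, 0, 0, -2, -1, 0, 0]
sig = significance_map(coeffs)
fraction_significant = sum(sig) / len(sig)
runs = correct_prediction_runs(sig)
```

On real wavelet data with a better predictor, long runs dominate, which is why coding the run of correct predictions rather than each bit individually pays off.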
The requirement for a fast algorithm implementable in minimal silicon area demands that something other than a traditional arithmetic coder be used. In particular, multiplies are to be avoided as they are very expensive in silicon area. The modified Z-coder presented herein provides a codec that avoids multiplies, provides very good compression and functions appropriately to encode and decode bit streams.
Another advantage of the modified Z-coder is its simplicity and speed, which make it well suited to hardware implementation. In one software embodiment not optimized for speed, the modified Z-coder is several orders of magnitude faster than the commercial (well optimized) MPEG2 software encoder used for the same quality. An optimized modified Z-coder should achieve a 20- to 30-fold improvement in performance with respect to MPEG2.