1. Field of the Invention
The present invention relates to an image/video compression system, method, and implementation in order to provide a high-speed, high-compression, high-quality, multiple-resolution, versatile, and controllable visual communication system. Specifically, the present invention is directed to a wavelet transform (WT) system for digital data compression in image/video signal processing. Due to a number of considerations and requirements of the visual communication device and system, the present invention is directed to providing a highly efficient image/video compression algorithm, WT Software Implementation, WT Hardware Implementation, and Multiple Resolution Video Conference (Mr. Vconf) methodology for local area multiple-point to multiple-point visual communication.
2. Description of the Related Art
In recent years, visual communications have become an exploding market due to advances in video/image data compression, the maturity of telecommunication facilities, and well-developed very large scale integrated (VLSI) technology. A well-developed visual communication system should optimize the processing speed, the versatility of the communication functions, the quality of image/video signals, and the controllability of the overall system. For instance, with regard to processing speed, a good visual system should be able to process high-resolution video (e.g., 24-bit color with 288×352 resolution) in real time (30 frames per second).
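To make the real-time requirement above concrete, the uncompressed data rate of the example video format can be worked out directly. The following is only an illustrative calculation; the function name `raw_bitrate` is hypothetical and not part of any system described here.

```python
def raw_bitrate(width, height, bits_per_pixel, fps):
    """Uncompressed video data rate in bits per second."""
    return width * height * bits_per_pixel * fps


# 24-bit color, 288x352 resolution, 30 frames per second (figures from the text).
bits_per_second = raw_bitrate(352, 288, 24, 30)
print(bits_per_second)        # 72990720
print(bits_per_second / 1e6)  # roughly 73 Mbit/s before any compression
```

At roughly 73 Mbit/s uncompressed, even a modest compression ratio changes which transmission channels are practical.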
The Joint Photographic Experts Group (JPEG) bitmap compression format is widely adopted for compressing either full-color or gray-scale digital images of “natural”, real-world scenes, rather than non-realistic images, such as cartoons or line drawings, black-and-white (1 bit-per-pixel) images or moving pictures. The JPEG color data is stored in values of luminance (brightness) and chrominance (hue). Technically speaking, image/video compression includes three basic operations: transformation, quantization, and data packing. Usually a signal is transformed to a different domain, some operations (e.g., quantization and data packing) are performed on the transformed signal, and an inverse transformation returns it to the original domain. The transformed values are quantized, or mapped to a smaller set of values, to achieve compression. In other words, quantization represents the data with a finite number of levels chosen according to criteria such as image distortion and compression ratio. The actual JPEG data compression divides the image into 8×8 pixel squares (i.e., block processing) and calculates a Discrete Cosine Transform (DCT) for each square. Then the DCT coefficients are quantized and a Variable Length Code compression scheme is applied. The commercially available graphic software packages provide different levels of compression, usually a low to medium level, which is sufficient and economical for web image files sent via modem over a telephone line or when removable media storage capacity is limited. However, the higher the degree of compression an application demands, the greater the loss of detail. In particular, the quantization step blurs sharp edges in images under a high compression ratio, such as 100:1.
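The block-transform and quantization steps described above can be sketched as follows. This is a simplified illustration, not the JPEG standard itself: it assumes an orthonormal DCT-II computed directly, and a single uniform step size in place of a JPEG quantization table. The names `dct_2d` and `quantize` are hypothetical.

```python
import math

N = 8  # JPEG processes the image in 8x8 blocks


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct, unoptimized form)."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coeffs, step):
    """Map each coefficient to an integer level (assumption: one uniform step)."""
    return [[round(c / step) for c in row] for row in coeffs]


# A perfectly flat block has all of its energy in the DC coefficient.
flat = [[100] * N for _ in range(N)]
levels = quantize(dct_2d(flat), 16)
print(levels[0][0])  # 50
```

A larger step size sends more high-frequency coefficients to zero, which raises the compression ratio and is also what blurs sharp edges.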
WT (also known as subband coding or multiresolution analysis) is a better choice than the classical DCT-based JPEG standard for faithfully recreating the original images under high compression ratios due to its lossless nature. The lossless nature of WT results in zero data loss or modification on decompression so as to support better image quality under higher compression ratios at low bit rates and highly efficient hardware implementation. Extensive research in the field of visual compression has led to the development of several successful compression standards such as MPEG-4 and JPEG 2000, both of which allow for the use of wavelet-based compression schemes. WT has been popularly applied to image and video coding applications because of the higher de-correlation of its coefficients and its energy compression efficiency, in both temporal and spatial representation. In addition, the multiple resolution representation of WT is well suited to the properties of the Human Visual System (HVS).
The principle behind the wavelet transform is to hierarchically decompose the input signals into a series of successively lower resolution reference signals and their associated detail signals. At each level, the reference signals and detail signals contain the information needed for reconstruction back to the next higher resolution level. One-dimensional DWT (1-D DWT) processing can be described in terms of a filter bank: wavelet transforming a signal is like passing the signal through this filter bank, wherein an input signal is analyzed in both low and high frequency bands. The outputs of the different filter stages are the wavelet and scaling function transform coefficients. A separable two-dimensional DWT process is a straightforward extension of 1-D DWT. Specifically, in the 2-D DWT process, separable filter banks are applied first horizontally and then vertically. The decompression operation is the inverse of the compression operation. As its final step, the inverse wavelet transform is applied to the de-quantized wavelet coefficients. This produces the pixel values that are used to create the image.
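The decomposition described above can be illustrated with the simplest wavelet, the Haar wavelet. The sketch below assumes an averaging/differencing form of the Haar filter pair and even-length inputs; all function names are hypothetical and chosen for illustration only.

```python
def haar_1d(signal):
    """One filter-bank stage: low band (reference) and high band (detail)."""
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high


def inverse_haar_1d(low, high):
    """Rebuild the next higher resolution level from reference and detail."""
    out = []
    for ref, det in zip(low, high):
        out.extend([ref + det, ref - det])
    return out


def haar_multilevel(signal, levels):
    """Hierarchical decomposition: repeatedly split the reference signal."""
    details = []
    ref = list(signal)
    for _ in range(levels):
        ref, det = haar_1d(ref)
        details.append(det)
    return ref, details


def haar_2d(image):
    """Separable 2-D stage: filter rows (horizontal), then columns (vertical)."""
    rows = []
    for row in image:
        low, high = haar_1d(row)
        rows.append(low + high)
    cols = []
    for col in zip(*rows):
        low, high = haar_1d(list(col))
        cols.append(low + high)
    return [list(r) for r in zip(*cols)]  # transpose back to row-major


low, high = haar_1d([4, 6, 10, 12])
print(low, high)                    # [5.0, 11.0] [-1.0, -1.0]
print(inverse_haar_1d(low, high))   # [4.0, 6.0, 10.0, 12.0]
print(haar_2d([[1, 2], [3, 4]]))    # [[2.5, -0.5], [-1.0, 0.0]]
```

In the 2-D output, the top-left entry is the lower resolution reference and the remaining entries are the horizontal, vertical, and diagonal details.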
There are probably as many wavelet transforms as there are wavelets. Due to the infinite variety of wavelets, it is possible to design a transform which maximally exploits the properties of a specific wavelet, such as morlets, coiflets, wavelants, slantlets, brushlets and wavelet packets. The computational complexity of WT is always a concern for industrial applications. Many traditional techniques struggled to balance the transmission time, memory requirements, and the image quality. The WT computation includes two components: lift/scale operations (i.e., lifting) and split/join operations (i.e., the rearrangement of samples required as a result of upsampling/downsampling). The lifting scheme of WT is general and well suited to experimentation, and its in-place and integer properties made it extremely useful for embedded systems when memory was still expensive. For example, U.S. Pat. Pub. No. 2002/0006229 A1 titled “System and Method for Image Compression and Decompression” used an integer wavelet transform to simplify the computation at the cost of image quality. As another example, U.S. Pat. Pub. No. 2002/0001413 A1 entitled “Wavelet Transform Coding Technique” evaluates and applies alternative coding modes (depending upon data types such as color, gray scale, text, or graphic) for each image block. By adopting a 2–6 (L-H) wavelet filter and a Haar filter for subband decomposition, 11 groups of data are organized into a tree structure and then each coefficient is coded with the same number of bits to keep the total number of bits per block under a budgeted number. As the price of memory has fallen significantly, there is no longer a need to complicate the computation operations in order to minimize memory usage. New wavelet architectures designed with little concern for memory are in demand.
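The split, lift, and join components named above can be sketched with an integer Haar lifting step (often called the S-transform). This is a generic textbook illustration, not the scheme of either cited publication. For readability it returns new lists rather than overwriting the input in place, and it assumes an even number of integer samples; the function names are hypothetical.

```python
def lifting_forward(samples):
    """Integer Haar lifting: split, predict, update."""
    even, odd = samples[0::2], samples[1::2]                 # split
    detail = [o - e for e, o in zip(even, odd)]              # predict odd from even
    approx = [e + (d >> 1) for e, d in zip(even, detail)]    # update (floor of d/2)
    return approx, detail


def lifting_inverse(approx, detail):
    """Undo the lifting steps in reverse order, then join."""
    even = [a - (d >> 1) for a, d in zip(approx, detail)]    # undo update
    odd = [d + e for e, d in zip(even, detail)]              # undo predict
    out = [0] * (2 * len(even))
    out[0::2] = even                                         # join
    out[1::2] = odd
    return out


approx, detail = lifting_forward([5, 8, 3, 3, 10, 1])
print(approx, detail)                    # [6, 3, 5] [3, 0, -9]
print(lifting_inverse(approx, detail))   # [5, 8, 3, 3, 10, 1]
```

Because every step uses only integer additions and shifts, the inverse recovers the input exactly.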
U.S. Pat. Pub. No. 2002/0006229 applies a multilevel uniform thresholding method for quantization. The wavelet coefficients are compared against established threshold values, and any coefficient less than its threshold is set to zero. The wavelet coefficients are then quantized to a number of levels depending upon which quadrant is being processed and the desired compression or quality factor. As mentioned, this quantization essentially scales the wavelet coefficients and truncates them to a predetermined set of integer values rather than using real numbers with an appropriate number of floating points. Retaining those floating points can be very important in image compression so that few coefficients become zero, especially for high spatial frequencies, and better image quality is maintained. A quantization that reserves an appreciable number of floating points to improve image quality without incurring over-burdening calculations is desired.
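A minimal sketch of threshold-then-truncate quantization of the kind described above follows. It is not taken from the cited publication: it assumes the threshold is applied to coefficient magnitudes, that one step size is used per quadrant, and that truncation is toward zero. The function names and the sample values are hypothetical.

```python
def threshold_and_quantize(coeffs, threshold, step):
    """Zero small coefficients, then scale and truncate the rest to integers."""
    levels = []
    for c in coeffs:
        if abs(c) < threshold:
            levels.append(0)             # below threshold: dropped entirely
        else:
            levels.append(int(c / step)) # scale, then truncate toward zero
    return levels


def dequantize(levels, step):
    """Approximate reconstruction; the truncated fraction is not recoverable."""
    return [q * step for q in levels]


coeffs = [0.4, -0.9, 3.7, -12.2, 25.0]
levels = threshold_and_quantize(coeffs, threshold=1.0, step=2.0)
print(levels)                    # [0, 0, 1, -6, 12]
print(dequantize(levels, 2.0))   # [0.0, 0.0, 2.0, -12.0, 24.0]
```

Comparing the input with the de-quantized output shows where image quality is lost: 3.7 comes back as 2.0, and both small coefficients vanish.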
The compression ratio is affected not only by the quantization but also by the subsequent data packing. In the packing operation, the quantized wavelet coefficients are packed using various techniques, including run length coding and Huffman coding, to efficiently encode large numbers of zeros in sequences of binary data. Unpacking the data requires the lookup of Huffman words, the decoding of the run length codes, and the reversal of any other data packing techniques. The resulting quantized wavelet coefficients are then de-quantized. U.S. Pat. No. 6,124,811 titled “Real Time Algorithms and Architectures for Coding Images Compressed by DWT-Based Techniques” proposed adaptive run length coding for the blocking strategy in the JPEG standard. The adaptive run length coding combines standard run-length coding with a modified Huffman coding to allow a variable and exceptionally large run-length value to be encoded while keeping a fixed length structure. This modification attempts to maximize the compression gained during encoding in order to reduce the storage/transfer memory size required for the image data. The coding architecture, as a whole or in part (such as the Huffman coding), can be customized for different WT lifting schemes to improve the total system efficiency.
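The way run length coding exploits long runs of zeros can be sketched as below. This is a plain zero-run coder for illustration only; it is not the adaptive scheme of U.S. Pat. No. 6,124,811, and the Huffman stage that would normally follow is omitted. The pair format and function names are assumptions.

```python
def rle_encode(values):
    """Encode as (zero_run, value) pairs; (run, None) marks trailing zeros."""
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1                 # extend the current run of zeros
        else:
            pairs.append((run, v))   # zeros seen so far, then the nonzero value
            run = 0
    if run:
        pairs.append((run, None))
    return pairs


def rle_decode(pairs):
    """Expand (zero_run, value) pairs back into the original sequence."""
    out = []
    for run, v in pairs:
        out.extend([0] * run)
        if v is not None:
            out.append(v)
    return out


quantized = [12, 0, 0, 0, -6, 1, 0, 0]
packed = rle_encode(quantized)
print(packed)               # [(0, 12), (3, -6), (0, 1), (2, None)]
print(rle_decode(packed))   # [12, 0, 0, 0, -6, 1, 0, 0]
```

The more coefficients the quantizer sets to zero, the fewer pairs are emitted, which is why the quantization and packing stages jointly determine the compression ratio.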