A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
This invention relates to an improved method and apparatus for coding and decoding digital images. More particularly, the present invention is directed towards visual pattern image coding, which partitions an image into image blocks, each image block being reducible to a mean intensity value, an image code, and a gradient magnitude. These three indicia represent a more efficient method of coding complex digital images. Furthermore, the invention represents a substantial improvement in terms of the trade-offs between image quality, computational speed, and data compression over conventional image coding and decoding techniques.
There are numerous methods presently used to code and decode digital images. Most methods require substantial amounts of computation and are not practical for real-time use. To transmit a digital image, enormous amounts of information must be handled. Conventional coding and decoding techniques often require that digital images be stored in memory and then manipulated in accordance with code and decode algorithms. In a typical 4 MHz television signal with 8 bits per pixel (BPP) of gray-level resolution and a 512×512 pixel array, approximately 2.1 million bits are required to transmit and store a single picture frame. Such a voluminous bit requirement quickly fills most modern computer storage media. With the advent of digital images and the means by which those images can be coded prior to transmission and storage, bit manipulation can be minimized. Two influential image coding schemes in present use, Block Truncation Coding (BTC) and Vector Quantization (VQ), have achieved wide recognition. Both conventional methods are designed to reduce the BPP rate to a level below 1.0. Furthermore, these methods also strive to maintain the quality of the original image with little image distortion.
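The frame-size figure quoted above follows from simple arithmetic. The short sketch below reproduces it; the function name `frame_bits` is a hypothetical helper introduced only for illustration.

```python
def frame_bits(width, height, bits_per_pixel):
    """Number of bits needed to hold one frame at the given BPP rate."""
    return width * height * bits_per_pixel

# Uncoded 512 x 512 frame at 8 bits per pixel: 2,097,152 bits.
raw = frame_bits(512, 512, 8)

# The same frame at the 1.0 BPP level that conventional coders aim to beat:
# 262,144 bits.
coded = frame_bits(512, 512, 1.0)

print(raw, coded)
```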
BTC uses a partitioning scheme whereby an image is divided into blocks, or subimages. Operations are performed upon each individual block so that the end result is a binary bit stream or code representing that particular block. After each block is coded, the code is transmitted along a conventional transmission medium to its target or destination. Upon arriving at the destination, the code is received by a decoder, which arranges each decoded block in the same location it occupied in the original image.
BTC is fairly simple and fast. Coding is achieved by three basic steps. The first step is computing a mean intensity value from among all the digital intensity numbers for each partitioned block. In the second step, the mean is subtracted from each pixel intensity value to obtain a deviation. The deviation is representative of the number of pixels that have intensity values above and below the mean intensity, and of the standard deviation about the mean intensity. The third step involves transmitting a binary code corresponding to the mean intensity, an indication of which pixel intensity values are above or below that mean, and the standard deviation of the pixel values in the block. These three indicia represent all the information needed for each partitioned block. Once the coded indicia are transmitted, the decoder functions merely to decode and reconstruct a decoded image from each set of indicia.
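The three steps above can be sketched in a few lines. The following is a minimal illustration only, assuming the commonly used moment-preserving form of BTC in which the decoder reconstructs each block with two output levels chosen to preserve the block mean and standard deviation. The function names `btc_encode` and `btc_decode`, and the representation of a block as a flat list of intensities, are assumptions made for this example and are not taken from the text.

```python
import math

def btc_encode(block):
    """Reduce a block (flat list of intensities) to (mean, std, bitmap)."""
    m = len(block)
    # Step 1: mean intensity of the block.
    mean = sum(block) / m
    # Step 2: deviations from the mean, summarized as a standard deviation.
    std = math.sqrt(sum((p - mean) ** 2 for p in block) / m)
    # Step 3: one bit per pixel, set when the pixel is at or above the mean.
    bitmap = [1 if p >= mean else 0 for p in block]
    return mean, std, bitmap

def btc_decode(mean, std, bitmap):
    """Rebuild a two-level block that keeps the coded mean and std."""
    m = len(bitmap)
    q = sum(bitmap)  # number of pixels at or above the mean
    if q == 0 or q == m:
        # Flat block: every pixel takes the mean value.
        return [mean] * m
    low = mean - std * math.sqrt(q / (m - q))
    high = mean + std * math.sqrt((m - q) / q)
    return [high if bit else low for bit in bitmap]

# Example with a hypothetical 2 x 2 block stored row by row.
block = [2, 2, 10, 10]
mean, std, bitmap = btc_encode(block)
decoded = btc_decode(mean, std, bitmap)
```

In this sketch the decoded block always has the same mean as the original, because the two output levels are weighted by the number of pixels assigned to each.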
The essence of the BTC technique is the ease by which all three indicia can be represented in either a first or second moment. Each moment, commonly referred to as the A or B moment, combines the essence of all three indicia by simple mathematical calculation. For a detailed explanation of the calculation of first and second moments in BTC coding, see D. J. Healy and Robert Mitchell, "Digital Video Bandwidth Compression Using Block Truncation Coding," I.E.E.E. Trans. Commun., Vol. Com-29, No. 12, pp. 1809-1817, Dec. 1981. Although BTC methods provide simple and fast coding, BPP rates are fairly high. Reported BPP rates average around 1.3. Because compression ratios are generally inversely proportional to BPP rates, BTC methods prove inadequate in applications requiring high compression ratios. Using 8 bit pixel intensity values, BTC generally can only achieve compression ratios that are less than 8:1.
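The inverse relationship between BPP rate and compression ratio can be checked numerically. The sketch below assumes the compression ratio is defined as the uncoded bits per pixel divided by the coded BPP rate; the function name `compression_ratio` is hypothetical.

```python
def compression_ratio(source_bits_per_pixel, coded_bpp):
    """Ratio of uncoded to coded bits per pixel."""
    return source_bits_per_pixel / coded_bpp

# 8-bit pixels coded at the average BTC rate of about 1.3 BPP.
btc_ratio = compression_ratio(8, 1.3)
print(f"{btc_ratio:.2f}:1")  # prints "6.15:1"
```

Under that definition, an average rate of 1.3 BPP corresponds to roughly 6:1, which is consistent with the stated bound of less than 8:1.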
The second coding technique that has gained popularity along with BTC is VQ coding. VQ coding is a relatively new image coding method that has recently attracted much attention. Like BTC, VQ partitions an image into numerous image blocks. Each image block must be mapped into an average or mean image block representative of all the blocks contained within a block cluster. See, e.g., Y. Linde, A. Buzo and R. M. Gray, "An Algorithm for Vector Quantizer Design," I.E.E.E. Trans. Commun., Vol. Com-28, pp. 84-95, Jan. 1980. The clustering algorithm collects a large number of blocks drawn from the same sample images. Each block to be coded is then compared with the code vectors in a codebook predefined in memory. Each block is coded by transmitting the code of the block closest to it in the codebook. Decoding is fast and is achieved by a simple look-up in the codebook of the image vector having the specific target code. Since the coder and decoder employ the same codebook, only the index of each code vector need be transmitted.
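The coding and decoding steps described above can be illustrated with a toy example. This is a minimal sketch, assuming that "closest" means smallest squared Euclidean distance and that the codebook has already been designed; the four-entry codebook, the flat-list block layout, and the names `vq_encode` and `vq_decode` are hypothetical and chosen only for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def vq_encode(block, codebook):
    """Return the index of the code vector closest to the block."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(block, codebook[i]))

def vq_decode(index, codebook):
    """Decoding is a plain look-up in the shared codebook."""
    return codebook[index]

# Hypothetical codebook of 2 x 2 blocks stored row by row.
codebook = [
    [0, 0, 0, 0],          # flat dark block
    [255, 255, 255, 255],  # flat bright block
    [0, 0, 255, 255],      # dark top row, bright bottom row
    [0, 255, 0, 255],      # dark left column, bright right column
]

block = [10, 5, 240, 250]
index = vq_encode(block, codebook)    # only this index is transmitted
decoded = vq_decode(index, codebook)
```

With this four-entry codebook each four-pixel block costs a 2-bit index, or 0.5 BPP. Note that the encoder searches every code vector while the decoder performs a single look-up, so the work is concentrated on the coding side.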
The advantage of VQ coding is its inherent ability to achieve lower BPP rates, or conversely, higher compression ratios. By coding a relatively small set of codes rather than the details of each block, VQ coding can achieve compression ratios of approximately 10:1 to 15:1.
Although compression ratios are high, a major problem encountered in VQ coding is the time required to perform the block search when coding. VQ must decide upon a centroid image block for each cluster and match the blocks to be placed within that cluster to the centroid image block value. This process requires a great deal of time-consuming computation. Along with the complexity and time-consumption problem, VQ coding also presents problems such as redundancy in the code vectors and low-quality image reproduction if there is an insufficient number of code vectors. Many redundant image blocks must be mapped and compared to the centroid block, thereby unnecessarily adding time to the vector search operation. Furthermore, if the centroid block of a given cluster has a substantially different intensity value than the centroid block of another cluster, the transmitted code vectors corresponding to the widespread intensity values will produce low-quality, low-resolution decoded images.