1. Field of the Invention
The present invention relates to technology for encoding image data.
2. Description of the Related Art
First, encoding systems according to the prior art will be considered.
Known image data encoding systems include that of the Joint Photographic Experts Group (JPEG), which is an international standard system. The JPEG system, being designed for implementation in hardware, takes a long processing time if executed in software. The encoding process of the JPEG system is outlined below.
(1) Pixel value data are cut out in blocks each having eight pixels by eight lines from image data entered in the order of raster scanning;
(2) Pixel values in each cut-out block are subjected to discrete cosine transform (DCT);
(3) The resultant transform coefficient is subjected to linear quantization; and
(4) The resultant quantization index is subjected to Huffman encoding.
Among these steps, the quantity of computation is particularly large for DCT. DCT can be accomplished by two multiplications of eight-row by eight-column matrices. Therefore 16 rounds of product addition (multiply-accumulate operations) are needed per pixel.
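The figure of 16 product additions per pixel can be checked with a short sketch. The following is only an illustration under stated assumptions: the function names (dct_matrix, matmul_counted, dct_8x8) are hypothetical, the orthonormal DCT-II basis is assumed, and no fast DCT algorithm is used. It computes the two-dimensional DCT of one block as two 8-by-8 matrix products and counts the multiply-accumulate operations.

```python
import math

N = 8  # block size used by the JPEG system


def dct_matrix(n=N):
    """Orthonormal DCT-II basis matrix C; the 2-D DCT of block X is C * X * C^T."""
    c = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        c.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                  for i in range(n)])
    return c


def matmul_counted(a, b):
    """Multiply two n-by-n matrices; return the product and the multiply-add count."""
    n = len(a)
    out = [[0.0] * n for _ in range(n)]
    macs = 0
    for r in range(n):
        for col in range(n):
            acc = 0.0
            for k in range(n):
                acc += a[r][k] * b[k][col]
                macs += 1
            out[r][col] = acc
    return out, macs


def dct_8x8(block):
    """Return the DCT coefficients of an 8x8 block and the multiply-adds per pixel."""
    c = dct_matrix()
    ct = [list(row) for row in zip(*c)]        # transpose of C
    tmp, macs1 = matmul_counted(c, block)      # first 8x8 matrix product
    coeffs, macs2 = matmul_counted(tmp, ct)    # second 8x8 matrix product
    return coeffs, (macs1 + macs2) / (N * N)
```

Each matrix product costs 8 x 8 x 8 = 512 multiply-adds, so the two products cost 1024 for the 64 pixels of a block, i.e. 16 per pixel.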
Generally speaking, an irreversible encoding system is a technique to reduce the quantity of data without inviting visual deterioration in image quality by removing unnecessary redundant components in the reproduction of images. Redundant components can be broadly classified into redundancy in resolution and redundancy in gray-scale. The JPEG system achieves a high encoding efficiency by addressing both types of redundancy in combination.
In order to remove redundant components in resolution, relatively heavy processing loads are required, such as frequency conversion and space filtering. On the other hand, removal of redundant components in gray-scale requires only relatively light processing loads, such as quantization and discarding of lower-order bits. If an irreversible encoding system removes only redundant components in gray-scale, the system can be expected not to take a long processing time even if executed in software.
Systems focusing on redundant components in gray-scale include the block truncation coding (BTC) systems. One example is the BTC system disclosed in the Japanese Published Examined Patent Application No. 8-2083. The process of encoding by the BTC system is outlined below.
(1) Pixel value data are cut out in blocks each having four pixels by four lines from image data entered in the order of raster scanning;
(2) The difference between the maximum and minimum values (dynamic range) is computed for the pixel values of each cut-out block;
(3) Blocks whose dynamic ranges are not broader than a predetermined first threshold are subjected to one-level quantization;
(4) Blocks whose dynamic ranges are broader than a predetermined second threshold are subjected to four-level quantization;
(5) All other blocks are subjected to two-level quantization;
(6) The quantization characteristics at each level are computed according to the pixel values of each block; and
(7) The computed quantization characteristics and quantization index are subjected to entropy encoding.
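Steps (2) through (6) above can be sketched as follows. This is a minimal illustration, not the method of the cited application: the threshold values t1 and t2, the uniform placement of levels between the minimum and maximum, and the function names are all assumptions, and the entropy encoding of step (7) is omitted.

```python
def btc_quantize_block(block, t1=8, t2=64):
    """Quantize one 4x4 block given as a flat list of 16 pixel values.

    t1 and t2 are hypothetical first and second thresholds on the dynamic range.
    Returns (levels, reference_level, level_interval, quantization_indexes).
    """
    lo, hi = min(block), max(block)
    dynamic_range = hi - lo                      # step (2)
    if dynamic_range <= t1:                      # step (3): one level
        levels = 1
    elif dynamic_range > t2:                     # step (4): four levels
        levels = 4
    else:                                        # step (5): two levels
        levels = 2
    if levels == 1:
        # One level: represent the block by its average (assumed choice).
        reference = round(sum(block) / len(block))
        return levels, reference, 0, [0] * len(block)
    # Step (6): levels spread uniformly from lo to hi (assumed choice);
    # four-level quantization therefore needs a division by three.
    interval = dynamic_range / (levels - 1)
    indexes = [min(levels - 1, int((p - lo) / interval + 0.5)) for p in block]
    return levels, lo, interval, indexes


def btc_dequantize_block(levels, reference, interval, indexes):
    """Reconstruct pixel values from the quantization characteristics and indexes."""
    return [round(reference + i * interval) for i in indexes]
```

Because the number of levels is chosen separately for each block, two adjacent blocks may be reconstructed with different characteristics.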
The BTC system is subject to lighter processing loads than the JPEG system, but it is liable to give rise to the following problems.
First, since the number of gray-scale levels for quantization is switched block by block, the blocks are susceptible to distortion. For instance, an input image whose density profile is shown in FIG. 18 is considered. In the figure, the horizontal axis represents the positions of pixels, and the vertical axis represents pixel values. This input image, when subjected to BTC encoding and BTC decoding, is turned into an output image whose density profile is shown in FIG. 19. In this process, a level difference in pixel value arises on the boundary between the block of pixel positions 1 to 4, which includes an edge, and the block of pixel positions 5 to 8, which includes no edge. In such a case, because of the characteristics of human vision, even a slight level difference is likely to be detected as a block-shaped distortion.
Second, there is an overhead for encoding and transmitting the quantization characteristics. The quantization characteristics consist of the number of quantization levels, the reference level and the level interval. Where the gray-scale accuracy of input pixels is 8 [bits/pixel], the original data quantity of additional information is 1.6+8+8=17.6 [bits]. If the compression ratio of entropy encoding is 2, the quantity of additional information will be 8.8 [bits]. Since the data quantity of a block is 16 [bytes], the ratio of the additional information will be about 1/14.5. The smaller the block size, the less likely block distortion is to occur, but this would mean a greater ratio of the additional information, and there is a practical limit to block size reduction.
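The arithmetic above can be reproduced with a few lines. The sketch assumes that the 1.6 [bits] term is the cost of choosing among three possible numbers of quantization levels (log2 of 3, about 1.58); this reading and the function name are assumptions not stated in the text.

```python
import math


def btc_side_info_ratio(bits_per_pixel=8, block_pixels=16,
                        num_level_choices=3, entropy_ratio=2.0):
    """Return (raw side-info bits, coded side-info bits, block bits / coded bits)."""
    # Number of levels (assumed log2(3) bits) + reference level + level interval.
    raw_bits = math.log2(num_level_choices) + bits_per_pixel + bits_per_pixel
    coded_bits = raw_bits / entropy_ratio          # after entropy encoding
    block_bits = block_pixels * bits_per_pixel     # 16 bytes = 128 bits
    return raw_bits, coded_bits, block_bits / coded_bits
```

With the default arguments the three results are approximately 17.6, 8.8 and 14.5, matching the figures in the text. Calling the function with block_pixels=4 shows how quickly the share of additional information grows as the block shrinks.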
Third, determining quantization characteristics block by block entails heavy processing loads. In order to determine quantization characteristics, the largest pixel value and the smallest pixel value in each block must be found. Further, to compute the reference level and the level interval, the average of pixel values in each block is calculated. In particular, where four-level quantization is chosen, division by three is required. Of these loads, that of calculating the largest and smallest values is especially heavy.
Fourth, the BTC system involves complex encoding and decoding units. The BTC system, as illustrated in FIG. 20, encodes quantization indexes computed block by block, put together into pages, and represented on a bit plane basis by a reversible binary encoding system, for instance the Modified Modified READ (MMR) system, which is an international standard system. On the other hand, the additional information indicating the quantization characteristics is variable-length encoded block by block.
Since two different ways of processing, block processing and bit plane processing, are needed, the overall configuration of the system is made complex. Incidentally, in FIG. 20, reference numeral 91 denotes a blocking unit; 93, a quantization unit; 94, a paging unit; 95, a binary encoding unit; and 96, a variable length encoding unit.
An overall approach according to the present invention will be described below.
The challenge is to obtain an irreversible image encoding apparatus which is subject to lighter processing loads than the JPEG system, is free from the block distortion which the BTC system entails, and can be realized in a simpler configuration than the BTC system.
An encoding system to obviate the shortcomings of the BTC system will be discussed below. First, quantization characteristics are switched pixel by pixel without blocking. Second, to dispense with the transmission of quantization characteristics as additional information, inverse quantized pixel values, i.e. pixel values whose number of gray-scale levels is limited, are encoded instead of encoding the quantization index. Third, in order to alleviate the load of determining the quantization characteristics, only pseudo-contours are taken note of.
This system, since it involves the restriction of the number of gray-scale levels but no blocking, is free from block distortion. Only pseudo-contours need to be taken note of as a distortion attributable to encoding. The biggest challenge here is to work out a technique by which quantization can be carried out with the smallest possible number of gray-scale levels while restraining the generation of pseudo-contours.
The ease of pseudo-contour generation varies with the image pattern. For instance, in the part of an image pattern having edges, distortion is unlikely to be detected even if there is some quantization error. In contrast, in a uniform gray-scale part with little noise, even a slight quantization error is likely to be detected as a pseudo-contour. Even in the same gray-scale part, a high noise level would have the effect of masking the pseudo-contour. Based on these facts, a model is supposed in which large areas of consecutive similar pixel values adjoin each other and a pseudo-contour is detected when a gray-scale level gap occurs on any boundary between such areas. Next, as an image pattern most susceptible to the generation of a pseudo-contour (stress pattern), a succession of uniform stripes gradually varying in pixel value from one to the next is supposed. As the parameters representing the characteristics of the stress pattern, the width S of each stripe and the difference D in pixel value between stripes are chosen. Furthermore, for various combinations of S and D, the range in which no pseudo-contour can be detected is determined by a sensory evaluation test. Results such as those shown in FIG. 11, for example, can be obtained. Entries "PRESENT" in FIG. 11 mean that a pseudo-contour is present for the corresponding combination of S and D. By matching the pattern of each part of the input image with the stress pattern and measuring the parameters S and D, the extent of pseudo-contour generation can be predicted. In this process, the closer the pattern of the input image is to the stress pattern, the higher the accuracy of prediction, and the less close it is, the higher the predicted level of generation. However, from the viewpoint of preventing pseudo-contour generation, these are welcome characteristics because any prediction error that may arise is on the safe side.
In the above-described adaptive quantization on a pixel-by-pixel basis while restraining the generation of pseudo-contours, transmitting the quantization index and quantization characteristics for each individual pixel would result in too low an efficiency of encoding. In view of this, now is considered an alternative way whereby pixel values restricted in the number of gray-scale levels by quantization or inverse quantization are directly transmitted.
The challenge here is to find an efficient way to carry out entropy encoding of gray-scale-restricted image data. Obviously, the probability of the emergence of exactly the same pixel value is higher with gray-scale-restricted image data than with ordinary image data. Therefore, the conventional predictive encoding system can be usefully applied. For instance, according to the JPEG independent system, which is an international standard system, an already encoded pixel a adjacent to a pixel x to be encoded is referenced, and the pixel value difference between the referenced pixel a and the pixel x to be encoded, i.e. the prediction error e=(x−a), is subjected to Huffman encoding. Since restriction of gray-scale levels increases the probability for the same pixel values to adjoin each other, the frequency of a 0 difference in pixel value rises, resulting in an enhanced efficiency of encoding.
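The effect of gray-scale restriction on the prediction error e=(x−a) can be seen in a small sketch. The restriction used here simply clears the lower-order bits, which is an assumed choice made for brevity; the function names are hypothetical, and the first pixel of a row is predicted from 0 by assumption.

```python
from collections import Counter


def restrict_levels(value, keep_bits=6, total_bits=8):
    """Restrict gray-scale levels by clearing the lower-order bits (assumed method)."""
    shift = total_bits - keep_bits
    return (value >> shift) << shift


def prediction_errors(row):
    """Prediction error e = x - a, where a is the previously encoded (left) pixel."""
    errors = []
    a = 0  # assumed predictor for the first pixel of the row
    for x in row:
        errors.append(x - a)
        a = x
    return errors


def zero_error_fraction(row):
    """Fraction of pixels whose prediction error is exactly 0."""
    errors = prediction_errors(row)
    return Counter(errors)[0] / len(errors)
```

For a slowly varying row such as [100, 101, 102, 103, 100, 101, 102, 103], no prediction error is 0 before restriction, whereas after restricting every pixel to 64 levels seven of the eight errors are 0.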
However, this method will have no effect unless the same pixel values are positioned in the range of pixels to be referenced in prediction. The reference range of predictive encoding, constrained by the hardware dimensions and the quantity of arithmetic operation, is in the order of only a few pixels.
Yet, in actual image data, similar pixel values are often sporadically positioned in repetition in a much wider range. For instance, in an image of blue sky, the same blue color appears between clouds. Therefore, by registering in a dictionary pixel values that have emerged in the past and, when any entry in the dictionary is hit, encoding that dictionary index, the efficiency of encoding can be improved.
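A dictionary of previously seen pixel values can be sketched as follows. The details are assumptions made for illustration only: the dictionary size, the move-to-front update and the eviction of the oldest entry are not specified in the text, and a miss here carries the raw pixel value rather than a prediction error.

```python
def dictionary_encode(pixels, size=4):
    """Encode pixels as ('hit', index) or ('miss', value) symbols.

    The dictionary is a most-recently-used list (assumed update rule).
    """
    dictionary = []
    symbols = []
    for x in pixels:
        if x in dictionary:
            idx = dictionary.index(x)
            symbols.append(("hit", idx))
            dictionary.pop(idx)
        else:
            symbols.append(("miss", x))
            if len(dictionary) == size:
                dictionary.pop()          # evict the least recently used entry
        dictionary.insert(0, x)           # most recent value moves to the front
    return symbols


def dictionary_decode(symbols, size=4):
    """Rebuild the pixels by mirroring the encoder's dictionary updates."""
    dictionary = []
    pixels = []
    for kind, payload in symbols:
        if kind == "hit":
            x = dictionary.pop(payload)
        else:
            x = payload
            if len(dictionary) == size:
                dictionary.pop()
        dictionary.insert(0, x)
        pixels.append(x)
    return pixels
```

Because the decoder updates its dictionary in exactly the same way as the encoder, the dictionary itself never has to be transmitted.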
Further, if the conventional predictive encoding system is applied to a gray-scale-restricted image, the number of gray-scale levels which a pixel value predicted by a prediction formula can take will increase. For instance, suppose that an image which originally has 256 gray-scale levels is restricted to 64 gray-scale levels. In predictive encoding, if a predicted value x' is computed by the prediction formula x'=(a+b)/2 from the pixel b immediately above the pixel x to be encoded and the pixel a immediately to its left, the predicted value x' can take more than 64 gray-scale levels, sometimes including values that cannot occur in the restricted image in the first place. The prediction error e=(x−x') can then take more kinds of values, resulting in no better encoding efficiency. By restricting the gray-scale levels of the result of computing the prediction formula as well, the number of values the prediction error can take decreases, and the encoding efficiency can be enhanced accordingly.
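The growth in the number of predicted values, and its removal by restricting the prediction result, can be counted directly. The sketch below assumes that restriction clears the two lower-order bits of an 8-bit value and that the average is taken by integer division; both choices and the function names are assumptions for illustration.

```python
def restrict_levels(value, keep_bits=6, total_bits=8):
    """Restrict to 2**keep_bits gray-scale levels by clearing lower-order bits."""
    shift = total_bits - keep_bits
    return (value >> shift) << shift


def predict(a, b, restrict=True):
    """Average of left pixel a and upper pixel b, optionally gray-scale restricted."""
    p = (a + b) // 2
    return restrict_levels(p) if restrict else p


def count_prediction_values(restrict):
    """Number of distinct predicted values over all pairs of restricted pixels."""
    levels = sorted({restrict_levels(v) for v in range(256)})   # 64 levels
    return len({predict(a, b, restrict) for a in levels for b in levels})
```

Under these assumptions the unrestricted average takes 127 distinct values, whereas the restricted one takes only the 64 values that can actually occur in the image.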
In view of the problems noted above, the present invention provides a quantization characteristics determining apparatus having a unit that selects the values of pixels peripheral to a pixel to be encoded out of input pixel values and outputs them; a unit that quantizes and inverse quantizes the values of the peripheral pixels according to preset quantization characteristics; a unit that measures the length S of the consecutive occurrence of the same pixel values from the inverse quantized pixel values and/or the pixel value differences D between the pixel to be encoded and the peripheral pixels; and a unit that determines quantization characteristics from the measured length S and/or the pixel value differences D.
This configuration makes it possible to readily determine the quantization characteristics, which give yardsticks for avoiding image quality deterioration by pseudo-contours, from the length S and the pixel value differences D.
In this configuration, the input pixel values may as well be preprocessed by a low-pass filter.
Alternatively, the relationship of a length S and/or the values of pixel value differences D, both figured out in advance by a sensory evaluation test, to the frequency of pseudo-contour generation may be held, and the quantization characteristics may be determined from the relative magnitude or magnitudes of the measured length S and/or the values of pixel value differences D.
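A one-dimensional sketch of measuring S and D and choosing a quantization step from them is given below. It is only an illustration under loud assumptions: the window of preceding pixels, the preset step, the limits s_limit and d_limit and the two candidate steps are hypothetical stand-ins for the relationship obtained by the sensory evaluation test, whose actual values are not given in the text.

```python
def requantize(value, step):
    """Quantize and inverse quantize with a uniform step (assumed characteristics)."""
    return (value // step) * step


def measure_s_and_d(row, i, window=8, preset_step=16):
    """Measure S and D for pixel row[i] from the preceding pixels in the window.

    S: length of the run of peripheral pixels, counted backwards from row[i],
       whose requantized value equals that of row[i].
    D: largest absolute difference between row[i] and any peripheral pixel.
    """
    target = requantize(row[i], preset_step)
    peripheral = row[max(0, i - window):i]
    s = 0
    for p in reversed(peripheral):
        if requantize(p, preset_step) != target:
            break
        s += 1
    d = max((abs(row[i] - p) for p in peripheral), default=0)
    return s, d


def choose_step(s, d, s_limit=4, d_limit=8, fine_step=4, coarse_step=16):
    """Pick a fine step where a pseudo-contour is likely (long S, small D)."""
    if s >= s_limit and d <= d_limit:
        return fine_step      # flat, low-noise area: quantization error is visible
    return coarse_step        # edge or noisy area: quantization error is masked
```

In a flat area the run S is long and D is small, so the fine step is selected; next to an edge S collapses and D becomes large, so the coarse step can be used.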
Further in view of the above-noted problems, the invention provides an image encoding apparatus having a unit that inputs pixel values; a unit that determines quantization characteristics on a pixel-by-pixel basis according to the characteristics of the input pixel values; a unit that quantizes or inverse quantizes the input pixel values according to the determined quantization characteristics; and a unit that subjects image data gray-scale-restricted by quantization or inverse quantization to entropy encoding.
This configuration restricts the gray-scale levels by quantization or inverse quantization, and the number of gray-scale levels of image data (the number of kinds of pixel values) is thereby reduced. Accordingly, the image data can be encoded at a high efficiency by entropy encoding.
Further in this configuration, the entropy encoding unit may be provided with a unit that selects the values of reference pixels peripheral to a pixel to be encoded out of gray-scale-restricted pixel values and outputs them; a unit that computes the predicted value of the pixel to be encoded from the values of the reference pixels on the basis of a prediction formula, and computes as a prediction error any difference between the pixel value of the pixel to be encoded and its predicted value; and a variable-length encoding unit that subjects the prediction error to variable-length encoding and outputs the resultant code.
Alternatively, the entropy encoding unit may be provided with a unit that selects the values of reference pixels peripheral to a pixel to be encoded out of gray-scale-restricted pixel values and outputs them; a unit that computes the predicted value of the pixel to be encoded from the values of the reference pixels on the basis of a prediction formula, and computes as a prediction error any difference between the pixel value of the pixel to be encoded and its predicted value; a dictionary encoding unit that stores gray-scale-restricted pixel values in a dictionary, outputs, if the pixel value of the pixel to be encoded hits any entry in the dictionary, that dictionary index, or outputs, if no entry is hit, identifying information indicating the failure to hit, and updates the dictionary appropriately; a selecting unit that selects and outputs the dictionary index if any entry in the dictionary is hit or, if no entry is hit, the prediction error; and a variable-length encoding unit that subjects the output of the selecting unit to variable-length encoding and outputs the resultant code.
Or the prediction error computing unit may be provided with a prediction unit that computes the predicted value from the value of the reference pixel; a unit that quantizes or inverse quantizes the predicted value according to preset quantization characteristics; and a prediction error computing unit that computes the difference between the quantized or inverse quantized predicted value and the pixel value of the pixel to be encoded.
Further, the quantization/inverse quantization unit may output as they are the n higher-order bits of an m-bit pixel value as the quantization index, use the quantization index as it is as the n higher-order bits of the inverse quantized value, cut out the m−n higher-order bits of the quantization index, and use them as the m−n lower-order bits of the inverse quantized value.
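This bit-level quantization and inverse quantization can be sketched with shifts alone. The function names and the default values m=8 and n=6 are assumptions for illustration; the sketch also assumes m−n does not exceed n, so that the index has enough higher-order bits to copy.

```python
def quantize(value, m=8, n=6):
    """Quantization index: the n higher-order bits of an m-bit pixel value."""
    return value >> (m - n)


def inverse_quantize(index, m=8, n=6):
    """Inverse quantized value: the index as the n higher-order bits, with the
    m-n higher-order bits of the index copied into the m-n lower-order bits."""
    low_bits = m - n
    if low_bits > n:
        raise ValueError("sketch assumes m - n <= n")
    return (index << low_bits) | (index >> (n - low_bits))
```

With m=8 and n=6, the pixel value 255 gives index 63 and is reconstructed as 255, and 0 is reconstructed as 0, so the full range from black to white is preserved without any multiplication or division.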
Also in view of the above-noted problems, the present invention provides an image encoding apparatus having a unit that inputs pixel values; a unit that determines characteristics of restricting the number of pixel values on a pixel-by-pixel basis according to the characteristics of the input pixel values; a unit that converts the input pixel values into output pixel values according to the determined characteristics of restricting the number of pixel values; and a unit that subjects image data restricted in the number of pixel values by the characteristics of restricting the number of pixel values to entropy encoding.
In this configuration as well, since the variety of pixel values before the entropy encoding is restricted, the efficiency of entropy encoding is enhanced. Any method to realize the characteristics of restricting the number of pixel values may be used as long as it involves a mapping that reduces the number of elements of the pixel value set. The number of gray-scale levels may as well be reduced by quantization or inverse quantization processing or the like as described above. Thus, the unit that converts input pixel values into output pixel values may be a unit that restricts the number of gray-scale levels, such as a unit that carries out quantization or inverse quantization. Or else, the variety of output pixel values may as well be restricted by the use of two or more tables differing in the number of output elements. In this case, the unit that converts input pixel values into output pixel values serves as a tabulating unit for these tables. Multiple tables may also be prepared on the basis of the appearance frequency of pixel values in the whole image.
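The table-based alternative can be sketched with two lookup tables that differ in the number of output elements. The table contents (uniform bins reconstructed at their midpoints), the table names and the per-pixel choice list are assumptions for illustration; tables derived from the appearance frequency of pixel values are not shown.

```python
def build_table(num_outputs, total_levels=256):
    """Lookup table mapping each input level to one of num_outputs output values."""
    step = total_levels // num_outputs
    return [(v // step) * step + step // 2 for v in range(total_levels)]


# Two hypothetical tables differing in the number of output elements.
TABLES = {"fine": build_table(64), "coarse": build_table(16)}


def restrict_pixels(pixels, choices):
    """Convert each input pixel through the table chosen for that pixel."""
    return [TABLES[choice][p] for p, choice in zip(pixels, choices)]
```

For example, restrict_pixels([100, 100], ["fine", "coarse"]) maps the same input value to 102 through the 64-output table and to 104 through the 16-output table.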
According to the invention, there is further provided a computer-readable recording medium recording a computer program for causing a computer to execute encoding, having: a step to generate data quantized with preset quantization characteristics for a pixel to be encoded and pixels peripheral to it; a step to measure the length S of the consecutive occurrence, in the peripheral pixels, of the same data as the quantized data of the pixel to be encoded, and/or the pixel value differences D between the pixel to be encoded and the peripheral pixels; a step to determine the quantization characteristics of the pixel to be encoded from the measured length S and/or the pixel value differences D; and a step to quantize the value of the pixel to be encoded according to the determined quantization characteristics.