1. Field of the Invention
The present invention generally relates to conversion and encoding of signals, such as image signals, and specifically relates to generation of encoded data by conversion and encoding, and recompression of the encoded data.
2. Description of the Related Art
In conversion and encoding of an image using wavelet transform, Japanese Patent Publication No. JP 6-326990 A discloses technology wherein, when linear quantization of wavelet coefficients is performed, a lower frequency subband is provided with a greater number of smaller quantization steps, and a higher frequency subband is provided with a lesser number of larger (wider) quantization steps, such that human vision properties are adequately reflected.
Further, J. Katto and Y. Yasuda, “Performance Evaluation of Subband Encoding and Optimization of its Filter Coefficients,” Journal of Visual Communication and Image Representation, vol.2, pp. 303-313, December 1991, discloses technology that uses the inverse value (or an integral multiple thereof) of the square root of the subband gain as the step size for linear quantization of each subband at the time of encoding. The purpose is to minimize the mean square value of the errors that appear in the signal reproduced by decoding the encoded signal and applying reverse frequency conversion to the decoded subbands.
As for human vision properties, a measurement example of human vision sensitivity is disclosed by J. Katto and Y. Yasuda, “Performance Evaluation of Subband Encoding and Optimization of its Filter Coefficients,” Journal of Visual Communication and Image Representation, vol.2, pp. 303-313, December 1991. Further, a standard document of JPEG 2000 (refer to, for example, Yasuyuki Nomizu, “Next-Generation Image Encoding Method JPEG 2000,” Triceps, Inc., Feb. 13, 2001) provides an example of weights of subbands based on the human vision sensitivity, details of which are disclosed by Marcus J. Nadenau and Julien Reichel, “Opponent Color, Human Vision and Wavelets for Image Compression,” Proceedings of the Seventh Color Imaging Conference pp. 237-242, Scottsdale, Ariz., Nov. 16-19, 1999, IS&T.
Generally, a process of conversion and encoding includes frequency conversion of original signals to subbands, quantization of frequency domain coefficients constituting the subbands, and entropy encoding of the quantized coefficients, which are performed in this sequence, and is referred to as Procedure 100. Here, the subband is a group of the “frequency domain coefficients” that are classified for each of predetermined frequency bands. The “frequency domain coefficients,” which are also called frequency coefficients or coefficients, are DCT coefficients if the frequency conversion is carried out by DCT (discrete cosine transform), and wavelet coefficients if the conversion is carried out by wavelet transform. Further, as is widely known, the quantization is carried out to raise the compression ratio of data, and a typical method is linear quantization wherein coefficients are divided by a constant that is called the step size. An example of this type of conversion encoding is disclosed by Yasuyuki Nomizu, “Next-Generation Image Encoding Method JPEG 2000,” Triceps, Inc., Feb. 13, 2001.
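The linear quantization step of Procedure 100 can be illustrated by the following minimal sketch. The function names, the truncation toward zero, and the reconstruction at the midpoint of the quantization interval are assumptions made for illustration only; the text above states only that coefficients are divided by the step size.

```python
def quantize(coeff: float, step: float) -> int:
    """Linear quantization: divide by the step size and truncate toward zero (assumed rounding rule)."""
    return int(coeff / step)


def dequantize(index: int, step: float) -> float:
    """De-quantization: reconstruct at the midpoint of the interval (assumed rule); zero stays zero."""
    if index == 0:
        return 0.0
    sign = 1 if index > 0 else -1
    return sign * (abs(index) + 0.5) * step


index = quantize(37.0, 8.0)           # 37 / 8 = 4.625, truncated to 4
reproduced = dequantize(index, 8.0)   # (4 + 0.5) * 8 = 36.0
```

A larger step size maps more coefficient values to the same index, which raises the compression ratio at the cost of a larger quantization error.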
Now, given that the frequency coefficients are quantized and entropy encoded by Procedure 100, when the compression ratio of the encoded data is desired to be raised, the following have to be performed in this sequence: decoding of the entropy encoded signal, de-quantization of the decoded frequency coefficients, re-quantization of the de-quantized frequency coefficients, and entropy encoding. This sequence is called Procedure 101. Procedure 101 is redundant, and, in addition, errors introduced at the time of de-quantization carry over into the re-quantization, such that errors accumulate.
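The accumulation of errors in Procedure 101 can be seen in a small numerical sketch. The quantizer below (truncation toward zero, midpoint reconstruction), the step sizes 3 and 5, and the coefficient value 9.5 are all hypothetical choices made for illustration; entropy encoding and decoding are omitted because they are lossless.

```python
def quantize(coeff, step):
    """Linear quantization with truncation toward zero (assumed rounding rule)."""
    return int(coeff / step)


def dequantize(index, step):
    """Midpoint reconstruction (assumed rule); zero stays zero."""
    if index == 0:
        return 0.0
    sign = 1 if index > 0 else -1
    return sign * (abs(index) + 0.5) * step


coeff = 9.5

# Quantizing the original coefficient directly with the coarser step 5.
direct = dequantize(quantize(coeff, 5.0), 5.0)          # 7.5

# Procedure 101: encode with step 3, de-quantize, then re-quantize with step 5.
first_pass = dequantize(quantize(coeff, 3.0), 3.0)      # 10.5
recompressed = dequantize(quantize(first_pass, 5.0), 5.0)  # 12.5

direct_error = abs(coeff - direct)              # 2.0
cumulative_error = abs(coeff - recompressed)    # 3.0
```

For this coefficient the de-quantized value 10.5 crosses a decision boundary of the coarser quantizer that the original value 9.5 does not, so the recompressed result is farther from the original than a direct quantization would be. Not every coefficient behaves this way, but the effect cannot be excluded once the original value has been lost.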
To cope with the problem, in recent years an encoding method, which is also known as a “post quantization” method, enabling recompression without decoding the encoded signals has been proposed. Since the recompression is performed not by decoding the encoded signal, but by discarding unnecessary codes in the state of the entropy code, cumulative errors do not occur. A representative example of the post quantization method is JPEG 2000. In such a “recompression-able” encoding method as above, first, lossless (or almost lossless) encoded data are generated and held, and then, the encoded data are recompressed at a desired compression ratio by discarding unnecessary codes as desired.
In order to enable recompression by discarding codes, a method called “bit plane encoding” is used, wherein frequency coefficients are decomposed into bit planes, and each bit plane is independently encoded. In bit plane encoding, compression is performed by outputting only selected codes of high-order bit planes, which is implemented by one of the following processes:
(i) entropy encoding is performed on only selected high-order bit planes; and
(ii) entropy encoding is performed on bit planes beyond necessity (typically, all bit planes), and the entropy codes of selected low-order bit planes are discarded.
The implementation referred to above as (ii) finally outputs only the codes of selected high-order bit planes, and corresponds to the recompression. In bit plane encoding, compression is fundamentally realized by discarding bit planes, or entropy codes thereof, not by linear quantization of the coefficients. Further, as mentioned above, the post quantization can be performed either in the encoding process, or in a separate process after completing the encoding. In this specification, “post quantization” means both cases.
Now, in either of the cases (i) and (ii) above, a problem yet to be solved is how the required high-order bit planes (or unnecessary low-order bit planes) are to be determined such that objectives, such as minimizing a mathematical quantization error and optimizing the subjective quality of the image, are met. This is discussed in more detail below.
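The decomposition into bit planes, and the effect of keeping only selected high-order bit planes, can be sketched as follows. The function names and the four sample magnitudes are hypothetical, signs are assumed to be handled separately, and the entropy encoding of each bit plane is omitted.

```python
def to_bit_planes(magnitudes, num_planes):
    """Split coefficient magnitudes into bit planes, most significant plane first."""
    return [[(m >> p) & 1 for m in magnitudes]
            for p in range(num_planes - 1, -1, -1)]


def from_bit_planes(planes, num_planes):
    """Rebuild magnitudes from the leading planes; discarded low-order planes count as zero."""
    values = [0] * len(planes[0])
    for i, plane in enumerate(planes):
        shift = num_planes - 1 - i
        values = [v | (bit << shift) for v, bit in zip(values, plane)]
    return values


magnitudes = [13, 6, 1, 10]             # binary 1101, 0110, 0001, 1010
planes = to_bit_planes(magnitudes, 4)   # four planes, high-order first
kept = planes[:2]                       # discard the two low-order planes
approx = from_bit_planes(kept, 4)       # [12, 4, 0, 8]
```

Because each bit plane is a separate unit, the two low-order planes can be dropped from data that has already been encoded, which is what makes recompression without decoding possible.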
First, the case wherein required high-order bit planes (or, unnecessary low-order bit planes) are determined such that a mathematical quantization error (mean-square value of errors) is minimized at a given compression ratio is considered.
When the entropy encoded data are decoded, Procedure 100 is followed in the reverse sequence. Specifically, the quantized frequency coefficients are de-quantized and put into a reverse frequency conversion process, and signal values are reproduced. Here, in the reverse frequency conversion process, “a gain when the frequency coefficients are de-converted to the signal values” is different for every subband. Subband gain Gs is defined as the “square of the gain.” An error Δe generated by quantization of the frequency coefficients is multiplied by the square root of the subband gain through the inverse transform for reproducing the signals, and is represented by √Gs×Δe.
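The relation between a coefficient error Δe and the resulting error in the reproduced signal can be checked numerically. In the following sketch, the gain is assumed to be measured as the Euclidean norm of the pattern that a single coefficient contributes to the signal, and the three taps are an assumed example of such a pattern; neither assumption is taken from the references cited above.

```python
import math

# Assumed pattern contributed to the signal by one low-frequency coefficient.
SYNTHESIS_TAPS = [0.5, 1.0, 0.5]


def subband_gain(taps):
    """Subband gain Gs as the square of the gain, i.e. the sum of squared taps (assumed measure)."""
    return sum(t * t for t in taps)


def signal_error(taps, coeff_error):
    """Error pattern that one coefficient error leaves in the reproduced signal."""
    return [t * coeff_error for t in taps]


gs = subband_gain(SYNTHESIS_TAPS)               # 0.25 + 1.0 + 0.25 = 1.5
err = signal_error(SYNTHESIS_TAPS, 2.0)         # [1.0, 2.0, 1.0]
err_norm = math.sqrt(sum(e * e for e in err))   # equals sqrt(gs) * 2.0
```

Under these assumptions, a coefficient error of 2.0 appears in the signal with magnitude √Gs×2.0, so the same quantization error is more harmful in a subband with a large gain than in a subband with a small gain.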
As disclosed by the non-patent reference 2, generally, in order to minimize the mean square errors generated in a signal after the inverse transform (the signal consisting of multiple signal values) at a given compression ratio, a simple encoding method is to perform linear quantization of each subband by the inverse value (or a value equal to the inverse value multiplied by a constant) of the square root of the subband gain. Accordingly, in the case of a conventional encoding method that does not use bit plane encoding, if coefficients are quantized by the step size (or a value equal to the step size multiplied by a constant), which is in inverse proportion to the square root of the subband gain, the mean square errors are minimized.
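The choice of step size described above can be sketched as follows. The subband names and gain values are hypothetical and were chosen only to give round numbers; the function name and the constant are likewise assumptions.

```python
import math


def step_sizes(subband_gains, constant=1.0):
    """Step size per subband, in inverse proportion to the square root of the subband gain."""
    return {name: constant / math.sqrt(gs) for name, gs in subband_gains.items()}


# Hypothetical subband gains Gs.
gains = {"LL": 4.0, "HL": 1.0, "LH": 1.0, "HH": 0.25}
steps = step_sizes(gains)   # LL: 0.5, HL: 1.0, LH: 1.0, HH: 2.0

# An error of one step in a coefficient reaches the signal as sqrt(Gs) * step,
# which comes out the same for every subband with this choice of step sizes.
signal_errors = {name: math.sqrt(gains[name]) * steps[name] for name in gains}
```

With these step sizes, a subband with a large gain is quantized finely and a subband with a small gain is quantized coarsely, so that each subband contributes an error of the same size to the reproduced signal.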
Now, a typical flow of the process using 5×3 wavelet transform in JPEG 2000 includes wavelet transform of an original signal to subbands, and encoding of only the required high-order bit planes (or high-order sub bit planes) of the wavelet coefficients for every subband, which are performed in this sequence, and called Procedure 102. Here, the sub bit planes are subsets of bit planes.
As described above, linear quantization is not performed according to the method using 5×3 wavelet transform. For this reason, the technique and means for minimizing the mean square error in the signal after the inverse transform, which rely on linear quantization, cannot be applied. Moreover, in the case of bit plane encoding, the technique and means for determining the required high-order bit planes (or unnecessary low-order bit planes) that generate the minimum mean square error have not been clarified. When a bit plane is divided into two or more subsets (i.e., sub bit planes), and encoding is performed for every sub bit plane, the technique and means for determining the required high-order sub bit planes (or unnecessary low-order sub bit planes) that generate the minimum mean square error are even less clear. This is another problem to be solved.
Further, a typical flow of the process using 9×7 wavelet transform in JPEG 2000 includes wavelet transform of an original signal to subbands, linear quantization of wavelet coefficients for every subband, and encoding only required high-order bit planes (or high-order sub bit planes) of the quantized wavelet coefficients for every subband, which are performed in this sequence, and called Procedure 103.
In this case, “linear quantization of the coefficients by the step size that is in inverse proportion to the square root of the subband gain” is possible. However, performing linear quantization at the encoding stage is not suitable for the purpose of obtaining “coded data of a desired compression ratio by generating and holding lossless (or almost lossless) encoded data, and by discarding unnecessary codes as desired.” While it is desirable to minimize quantization in the encoding stage, and to perform post quantization thereafter when using the 9×7 wavelet transform, the technique and means for minimizing the mean square errors generated in the signal after an inverse transform are not clear. The technique and means in the case of encoding for every sub bit plane are even less clear. This poses another problem to be solved.
Next, obtaining “the optimal quality of image for a given compression ratio” is considered.
As indicated by the patent reference 1, human vision is more sensitive to a lower frequency region than a higher frequency region. Accordingly, the human vision sensitivity is higher for quantization errors in lower frequency subbands than in higher frequency subbands. Therefore, an effective method for linear quantization of wavelet coefficients assigns a smaller step size to lower frequency subbands, and a larger step size to higher frequency subbands, such that the human vision sensitivity is properly reflected in the linear quantization process, as Yasuyuki Nomizu, “Next-Generation Image Encoding Method JPEG 2000,” Triceps, Inc., Feb. 13, 2001 discloses.
Although this method cannot be applied to the case wherein 5×3 wavelet transform is used by JPEG 2000, it can be applied to the 9×7 wavelet transform such that “coefficients are quantized with the step size in inverse proportion to the magnitude of the human vision sensitivity corresponding to the frequency of subbands.” However, it is not suitable for achieving the objective to “obtain data at a desired compression ratio by generating and holding lossless (or almost lossless) encoded data, and by discarding unnecessary codes afterwards.” While it is also desirable to minimize quantization at the encoding step, and to perform post quantization afterwards, when using 9×7 wavelet transform, the technique or means for determining required high-order bit planes or high-order sub bit planes (alternatively, unnecessary low-order bit planes and low-order sub bit planes) so that the optimal quality of image can be visually obtained in the case of the post quantization are not clear. This poses another problem to be solved.
Further, considering that the human vision property is sensitive to “the quantization errors of pixels, not errors of frequency conversion coefficients,” it is desirable that both the human vision sensitivity and square roots of subband gain be considered at the post quantization. In addition, in bit plane encoding, discarding codes of n low-order bit planes (representing frequency coefficients) has the same effect as carrying out linear quantization of the frequency coefficients by 2 to the n-th power, and this is the reason for the process being called post quantization.
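The equivalence between discarding n low-order bit planes and linear quantization by 2 to the n-th power can be verified with a short sketch. The function names are hypothetical, and the coefficients are assumed to be integers held in sign-and-magnitude form, so that the discarding acts on the magnitude only.

```python
def discard_low_planes(coeff: int, n: int) -> int:
    """Drop the n low-order bit planes of the magnitude, keeping the sign."""
    sign = -1 if coeff < 0 else 1
    return sign * (abs(coeff) >> n)


def linear_quantize(coeff: int, step: int) -> int:
    """Linear quantization: divide by the step size and truncate toward zero."""
    return int(coeff / step)


# The two operations agree for every tested coefficient and every n.
equivalent = all(
    discard_low_planes(c, n) == linear_quantize(c, 2 ** n)
    for c in range(-64, 65)
    for n in range(0, 5)
)
```

For example, discarding the two low-order bit planes of the coefficient 13 (binary 1101) leaves binary 11, which is 3, the same value obtained by dividing 13 by a step size of 4 and truncating.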