This invention relates to an image coding/decoding method and to a recording medium on which a program for this method has been recorded. More specifically, the invention relates to an image coding/decoding method that is capable of reproducing a high-quality image (CG images, animated images and natural images, etc.) at high speed even with comparatively low-performance hardware (CPU, memory, etc.) such as that of a game machine, and to a recording medium on which the program of this method has been recorded.
FIGS. 16 and 17 are diagrams (1) and (2) for describing the prior art, in which FIG. 16 illustrates the concept of image compression processing in accordance with the JPEG standard.
In accordance with the JPEG standard, which currently is the main scheme for compressing still pictures, an image is divided into blocks of 8×8 pixels, as shown in FIG. 16(A), and the block is converted to DC (a mean value) and to coefficient values ranging from a fundamental frequency to a frequency which is 63 times the fundamental frequency by a two-dimensional DCT (Discrete Cosine Transform). By utilizing the fact that the frequency components of a natural image concentrate in the low-frequency region, each coefficient value is quantized at a different quantization step to such an extent that image quality will not decline, and variable-length coding (Huffman coding) is performed after the quantity of information is reduced.
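The transform and quantization steps described above can be sketched as follows. This is a minimal illustration, not the JPEG reference implementation: the names dct2, quantize and STEPS are hypothetical, the step table is invented for the example (it is not a table from the JPEG standard), and the Huffman stage is omitted. With the orthonormal scaling assumed here, the DC coefficient equals 8 times the block mean.

```python
import math

N = 8  # block size used by JPEG

def dct2(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of rows."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            acc = 0.0
            for x in range(N):
                for y in range(N):
                    acc += (block[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * acc
    return out

# Hypothetical step table: coarser steps for higher frequencies.
STEPS = [[8 + 4 * (u + v) for v in range(N)] for u in range(N)]

def quantize(coeffs, steps):
    """Divide each coefficient by its own step and round to an integer."""
    return [[round(coeffs[u][v] / steps[u][v]) for v in range(N)]
            for u in range(N)]

flat = [[100] * N for _ in range(N)]   # a block with no AC content
levels = quantize(dct2(flat), STEPS)   # only levels[0][0] is non-zero
```

Each block yields 64 quantized values (one DC and 63 AC), all of which are then passed to the variable-length coder.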
In a case where such coded image data is utilized in a home game machine, the fact that there are limitations upon CPU performance and memory capacity results in various disadvantages when an image compression method (JPEG, etc.) involving a large decoding burden is implemented by software. For example, with the JPEG scheme, 64 codes subject to variable-length coding are generated for a block of 8×8 pixels, thereby inviting an increase in computation load at the time of decoding.
FIG. 16(B) illustrates a portion of Huffman code.
With Huffman coding, a coefficient H13, etc., having a high frequency of appearance is coded by bits of relatively short code length, and a coefficient H6, etc., having a low frequency of appearance is coded by bits of relatively long code length. As a consequence, these codes are packed into each octet (byte) unevenly in the manner illustrated. This increases computation load greatly at the time of decoding. In a conventional game system, therefore, the state of the art is such that sacrifice of image quality is inevitable in order to decode images at a speed that allows the images to appear as a moving picture.
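The uneven packing described above can be illustrated with a small sketch. The code table below is hypothetical (the actual codes of FIG. 16(B) are not given in the text), and pack and unpack are illustrative names. The point is that code boundaries do not coincide with byte boundaries, so the decoder has to examine the stream bit by bit.

```python
# Hypothetical prefix-free table: frequent symbols get short codes.
TABLE = {"H13": "0", "H1": "10", "H9": "110", "H6": "111"}

def pack(symbols, table):
    """Concatenate the codes and pad the tail with zero bits to a full byte."""
    bits = "".join(table[s] for s in symbols)
    bits += "0" * (-len(bits) % 8)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

def unpack(data, table, count):
    """Decode `count` symbols by walking the stream one bit at a time."""
    inverse = {code: sym for sym, code in table.items()}
    out, current = [], ""
    for byte in data:
        for shift in range(7, -1, -1):
            current += "1" if (byte >> shift) & 1 else "0"
            if current in inverse:
                out.append(inverse[current])
                current = ""
                if len(out) == count:
                    return out
    return out

symbols = ["H13", "H6", "H1", "H13", "H9", "H6"]
packed = pack(symbols, TABLE)          # 13 code bits spread over 2 bytes
restored = unpack(packed, TABLE, len(symbols))
```

In this example the fifth code starts in the first byte and ends in the second, which is the kind of straddling that makes byte-oriented decoding impossible.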
In regard to image quality, the higher the frequency component, the coarser the precision with which quantization is carried out. As a result, image information concerning contours is lost, thereby producing mosquito noise. Such a scheme is not suitable for the compression of characters and animated images. In particular, since game software makes abundant use of artificial images (CG images and animated images, etc.), a decline in subjective image quality due to mosquito noise is a major problem.
In this regard, the following literature (1) to (5) has been reported in recent years:
(1) Michael F. Barnsley, Lyman P. Hurd “FRACTAL IMAGE COMPRESSION”, A K Peters Ltd., 1993;
(2) Takashi Ida, Takeshi Datake “Image Compression by Iterative Transformation Coding”, 5th Circuit and System Karuizawa Workshop Collection of Papers, pp. 137-142, 1992;
(3) Hyunbea You, Takashi Takahashi, Hiroyuki Kono, Ryuji Tokunaga “Improving LIFS Image Coding Scheme by Applying Gram Schmidt Orthogonalization”, The Institute of Electronics, Information and Communication Engineers Research Report, vol. NLP-98, no. 146, pp. 37-42, 1998;
(4) Toshiaki Watanabe, Kazuo Ozeki “Study of AC Component Prediction Scheme Using Mean Values”, Image Coding Symposium (PCSJ89), pp. 29-30, October 1989; and
(5) Takashi Takahashi, Ryuji Tokunaga “High-Speed Computation Algorithm for Predicting AC Components from Block Mean Values of Images”, The Institute of Electronics, Information and Communication Engineers Papers, Vol. J81-D-II, No. 4, pp. 778-780, April 1998.
Schemes (1) and (2) relate to fractal coding as a compression method involving little decoding computation load, scheme (3) relates to improvements in adaptive orthogonal transformation having a coding efficiency equivalent to that of the JPEG scheme, and schemes (4) and (5) concern AC-component prediction based upon block mean values (DC values).
Among these, scheme (3) is a block coding method which divides an image into square blocks of K×K pixels and approximates all blocks by the AC-component prediction method, fractal conversion method or adaptive orthogonal transformation, depending upon an allowable error Z. The AC-component prediction method is utilized in block coding by mean-value separation in which an AC component (addition data) of a local block is found from the block mean values (DC values) of blocks surrounding the local block, and the residual between this and an image of interest is coded. Adaptive orthogonal transformation is a method in which use is made of the autosimilarity possessed by an image, a base vector for approximating a block image is extracted from an image (nest) corresponding to a vector quantization code book, and an orthogonal base system of the minimum required dimensions is constructed by the Gram-Schmidt method.
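The idea of building an orthogonal base system from nest vectors by the Gram-Schmidt method can be sketched as follows. This is only an illustration under stated assumptions, not the procedure of scheme (3) itself: the greedy selection rule (take the candidate with the largest projection of the remaining error), the stopping test against Z using the Euclidean norm, and the names approximate, candidates and max_bases are all hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def approximate(d, candidates, Z, max_bases=8):
    """Greedily pick candidate vectors until the error magnitude is below Z.

    Each candidate is orthogonalized against the bases chosen so far
    (Gram-Schmidt) and normalized before its coefficient is computed.
    Returns (indices of chosen candidates, coefficients, final error vector).
    """
    err = list(d)
    ortho, picks, betas = [], [], []
    while math.sqrt(dot(err, err)) >= Z and len(picks) < max_bases:
        best = None
        for idx, u in enumerate(candidates):
            v = list(u)
            for q in ortho:                      # remove components already covered
                proj = dot(v, q)
                v = [vi - proj * qi for vi, qi in zip(v, q)]
            n2 = dot(v, v)
            if n2 < 1e-12:                       # linearly dependent: skip
                continue
            q = [vi / math.sqrt(n2) for vi in v]
            beta = dot(err, q)
            if best is None or abs(beta) > abs(best[1]):
                best = (idx, beta, q)
        if best is None:
            break
        idx, beta, q = best
        err = [e - beta * qi for e, qi in zip(err, q)]
        ortho.append(q)
        picks.append(idx)
        betas.append(beta)
    return picks, betas, err

picks, betas, err = approximate(
    [3.0, 4.0, 0.0, 0.0],
    [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]],
    Z=0.5)
```

Because every new base is orthogonal to the earlier ones, each coefficient can be computed independently and the error shrinks monotonically as bases are added.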
However, the fractal conversion of schemes (1) and (2) necessitates iterative computation in decoding and consumes work space on the scale of the image plane. This scheme is therefore not suitable for video game machines.
The adaptive orthogonal transformation of scheme (3) utilizes addition data, the size of which is equivalent to that of the image of interest, as a nest. As a consequence, a very large work space is required at the time of decoding. Furthermore, though image quality is improved because blocks decompressed at the time of decoding are sequentially written back to corresponding blocks in the nest, the load imposed by address computation and data transfer is great. Further, Huffman coding is applied to compression of the coordinates of the base and to sampling coefficients in scheme (3). In the case of a natural image, however, no base occurs markedly more often than any other. Consequently, compression efficiency is not improved, and the computation spent on Huffman coding is simply wasted. With adaptive orthogonal transformation, there are also cases where, depending upon the image, the minimum required number of orthogonal bases is large. When the number of bases is large, the number of bits used is greater than when the residual vector is coded directly, and coding efficiency declines as a result.
With the AC component prediction method of scheme (4), there is a tendency for overshoot or undershoot to occur in the vicinity of contours in the image, and image quality tends to be degraded in the case of artificial images in which luminance rises sharply.
In the case of the indirect method of AC component prediction of scheme (5), not only is the load on the CPU great but it is also required to have a storage area for interpolated values that are generated along the way.
FIG. 17 illustrates the concept of the indirect method of AC component prediction.
In accordance with the indirect method of AC component prediction, as shown in FIG. 17(A), the DC values of sub-blocks S1 to S4 in a block S of interest are estimated in accordance with the following equations from the DC values (S, U, R, B, L) of four surrounding blocks and the block of interest:
S1=S+(U+L−B−R)/8
S2=S+(U+R−B−L)/8
S3=S+(B+L−U−R)/8
S4=S+(B+R−U−L)/8
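The four estimates can be written directly as code. The function name subblock_dcs is illustrative; the argument order follows the (S, U, R, B, L) listing in the text.

```python
def subblock_dcs(S, U, R, B, L):
    """Estimate the DC values of the four sub-blocks of block S.

    S1 is the upper-left, S2 the upper-right, S3 the lower-left and
    S4 the lower-right sub-block.
    """
    S1 = S + (U + L - B - R) / 8
    S2 = S + (U + R - B - L) / 8
    S3 = S + (B + L - U - R) / 8
    S4 = S + (B + R - U - L) / 8
    return S1, S2, S3, S4

# A block that is brighter above than below: the upper sub-blocks rise.
estimates = subblock_dcs(S=100, U=120, R=100, B=80, L=100)
```

The correction terms of the four equations cancel, so the mean of S1 to S4 is always S; the prediction redistributes luminance inside the block without changing its DC value.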
In FIG. 17(B), the above equations are applied recursively, whereby the pixel values of four pixels P1 to P4 in sub-block S1 can be estimated in accordance with the following equations:
P1=S1+(U3+L2−S3−S2)/8
P2=S1+(U3+S2−S3−L2)/8
P3=S1+(S3+L2−U3−S2)/8
P4=S1+(S3+S2−U3−L2)/8
Similarly, the pixel values of four pixels P1 to P4 in sub-block S2 can be estimated in accordance with the following equations:
P1=S2+(U4+S1−S4−R1)/8
P2=S2+(U4+R1−S4−S1)/8
P3=S2+(S4+S1−U4−R1)/8
P4=S2+(S4+R1−U4−S1)/8
The pixel values of four pixels P1 to P4 in the sub-blocks S3, S4 can be estimated in a similar manner.
However, in order to obtain all of the predicted values P1 to P4 of the original image by starting from the initial DC values (S, U, R, B, L), the above-cited equations must be applied in steps to achieve refinement. Not only is a memory for storing intermediate values necessary, but there is also an increase in computation load upon the CPU.
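The stepwise refinement can be sketched as a loop that doubles the resolution of the whole DC image on each pass. The function names are illustrative, and the treatment of blocks at the image border (the nearest edge value is reused) is an assumption, since the text does not specify it.

```python
def refine_once(img):
    """One indirect prediction step: every value becomes a 2 x 2 group."""
    h, w = len(img), len(img[0])

    def at(y, x):
        # Assumption: outside the image, reuse the nearest edge value.
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    out = [[0.0] * (2 * w) for _ in range(2 * h)]
    for y in range(h):
        for x in range(w):
            S = at(y, x)
            U, B = at(y - 1, x), at(y + 1, x)
            L, R = at(y, x - 1), at(y, x + 1)
            out[2 * y][2 * x] = S + (U + L - B - R) / 8          # upper left
            out[2 * y][2 * x + 1] = S + (U + R - B - L) / 8      # upper right
            out[2 * y + 1][2 * x] = S + (B + L - U - R) / 8      # lower left
            out[2 * y + 1][2 * x + 1] = S + (B + R - U - L) / 8  # lower right
    return out

def predict_indirect(dc_image, K):
    """Refine a DC image of K x K blocks down to single pixels."""
    img, size = dc_image, 1
    while size < K:
        img = refine_once(img)   # each pass builds a full intermediate image
        size *= 2
    return img

pixels = predict_indirect([[100.0, 120.0], [80.0, 60.0]], K=4)  # 8 x 8 result
```

For K=4 two passes are needed, and the first pass produces an intermediate image that exists only to feed the second. That intermediate storage and the repeated sweep are the costs referred to above.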
The present invention has been devised in view of the problems of the prior art mentioned above and an object thereof is to provide an image coding/decoding method in which the compression rate of computer-utilizable image data is raised while the image data is maintained at a high image quality, and in which computation load at the time of decoding can be reduced, as well as a recording medium having the program of this method recorded thereon.
The foregoing object is attained by the arrangement of FIG. 1(A). Specifically, an image coding method according to claim (1) of the present invention comprises dividing image data into a plurality of pixel blocks and generating a DC image comprising mean values of respective ones of the blocks; separating from each pixel block a corresponding block mean value and obtaining a residual vector for every block; and in a case where the magnitude of the residual vector becomes equal to or greater than an allowable value, obtaining one or two or more orthogonal bases for approximating the residual vector by an adaptive orthogonal transformation that employs a nest of the DC image; and coding an orthogonal base system comprising a linear combination of these orthogonal bases.
In claim (1) of the present invention, the load upon the CPU and memory at the time of image coding/decoding is greatly alleviated by virtue of the arrangement in which one, two or more orthogonal bases are obtained from the nest of the DC image. Further, an improvement in image quality can be expected in an artificial image such as a CG image or animated image which has many flat luminance components and, as a result, exhibits strong correlation with respect to the DC image.
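The first two steps of the coding method, generating the DC image and separating the block mean to obtain a residual vector, can be sketched as follows. The function names, the raster ordering of pixels inside a residual vector and the use of the Euclidean norm as the "magnitude" are assumptions made for the illustration; image dimensions are assumed to be multiples of K.

```python
import math

def dc_image_and_residuals(image, K):
    """Split `image` (list of rows) into K x K blocks.

    Returns the DC image (one mean per block) and a dict mapping
    (block_row, block_col) to the residual vector of that block.
    """
    h, w = len(image), len(image[0])
    dc, residuals = [], {}
    for by in range(0, h, K):
        dc_row = []
        for bx in range(0, w, K):
            pixels = [image[by + y][bx + x] for y in range(K) for x in range(K)]
            mean = sum(pixels) / (K * K)
            dc_row.append(mean)
            residuals[(by // K, bx // K)] = [p - mean for p in pixels]
        dc.append(dc_row)
    return dc, residuals

def magnitude(vector):
    """Euclidean norm, used here as the residual magnitude (an assumption)."""
    return math.sqrt(sum(c * c for c in vector))

image = [[10, 12, 50, 50],
         [14, 16, 50, 50],
         [0, 0, 1, 3],
         [0, 0, 5, 7]]
dc, residuals = dc_image_and_residuals(image, K=2)
```

Flat blocks such as the upper-right one give a zero residual, which is why artificial images with many flat areas correlate strongly with the DC image.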
Preferably, in claim (2) of the present invention, in a case where the magnitude of the residual vector is less than the allowable value in claim (1) of the present invention, information indicative of number of bases=0 is coded instead of obtaining orthogonal bases.
For example, with regard to an artificial image in which luminance rises sharply, code indicative of number of bases=0 is generated in limited fashion with respect to a pixel block in which a contour portion does not exist nearby (i.e., a pixel block in which the magnitude of the residual vector is less than the allowable value).
On the decoding side, on the other hand, the indirect method of AC component prediction or the direct method of AC component prediction according to the present invention is applied in limited fashion to the corresponding pixel block, thereby making it possible to reproduce the pixel block with little CPU and memory load.
Preferably, in claim (3) of the present invention, in a case where total amount of code of the orthogonal base system obtained in claim (1) of the present invention is equal to or greater than total amount of code of the residual vector, the residual vector itself is coded instead of the orthogonal base system. As a result, no purposeless decline in image compression rate is brought about.
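Taken together, claims (2) and (3) amount to a three-way decision per pixel block, which can be sketched as follows. The function name, the mode labels and the way code amounts are passed in (a fixed number of bits per base and a fixed number of bits for a directly coded residual) are hypothetical simplifications.

```python
def choose_mode(residual_magnitude, Z, n_bases, bits_per_base, residual_bits):
    """Pick how a pixel block is coded.

    "no_bases": magnitude below the allowable value Z, only 'number of
                bases = 0' is coded (claim 2).
    "residual": the base system would cost at least as much as coding
                the residual vector directly (claim 3).
    "bases":    the orthogonal base system is coded (claim 1).
    """
    if residual_magnitude < Z:
        return "no_bases"
    if n_bases * bits_per_base >= residual_bits:
        return "residual"
    return "bases"

mode = choose_mode(residual_magnitude=10.0, Z=4.0,
                   n_bases=2, bits_per_base=16, residual_bits=64)
```

The comparison in the second branch uses "equal to or greater than", matching the wording of claim (3), so a tie is resolved in favor of coding the residual vector itself.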
Preferably, in claim (4) of the present invention, in a case where a first error ⟨dnk⟩ with respect to (between) a residual vector ⟨d⟩ becomes less than the allowable value owing to a number nk of bases obtained in advance in claim (1) of the present invention, second error vectors ⟨dm⟩ following use of m (0 ≤ m < nk) bases in the order in which the bases were obtained are scalar-quantized by quantization coefficients Qyk predetermined in accordance with a number yk (= nk − m) of remaining bases that have not been used, and the result is subjected to scalar inverse quantization; the smallest error is selected from among third error vectors ⟨d′m⟩ of nk types, which are obtained by subtracting the scalar inverse-quantized second error vectors from the second error vectors ⟨dm⟩, and the first error ⟨dnk⟩; and the corresponding bases and, if necessary, the corresponding second error vector, are coded.
One example of the coding method will be described in detail with reference to (c), (d) in FIG. 12(A). In (c) of FIG. 12(A), assume that there has been obtained a linear combination
⟨d⟩ ≈ β1⟨v1⟩ + β2⟨v2⟩ + β3⟨v3⟩  (nk=3)
comprising nk (e.g., nk=3) bases for which the magnitude ∥⟨dnk⟩∥ of the error with respect to the residual vector ⟨d⟩ is made less than an allowable value Z. It should be noted that ⟨v⟩ indicates that v is a vector.
A first error vector ⟨d3⟩ = ⟨dnk⟩ in the case where nk=3 bases have been used and second error vectors ⟨d0⟩ to ⟨d2⟩ following use of m (0 ≤ m < nk) bases in the order in which they were obtained are related as follows:
⟨d0⟩ = ⟨d⟩
⟨d1⟩ = ⟨d⟩ − β1⟨v1⟩
⟨d2⟩ = ⟨d⟩ − β1⟨v1⟩ − β2⟨v2⟩
⟨d3⟩ = ⟨d⟩ − β1⟨v1⟩ − β2⟨v2⟩ − β3⟨v3⟩
In (d) of FIG. 12(A), the second error vectors ⟨d0⟩ to ⟨d2⟩ are scalar-quantized (and clipped) by respective ones of quantization coefficients Qyk (e.g., Q3=6, Q2=7, Q1=8) predetermined in accordance with the remaining number of bases yk=3, 2, 1 not used, and these are subjected to inverse quantization by the same quantization coefficients Qyk to thereby find inverse-quantized second error vectors ⟨d0Q′⟩ to ⟨d2Q′⟩.
⟨d0Q′⟩ = [⟨d0⟩/Q3] × Q3
⟨d1Q′⟩ = [⟨d1⟩/Q2] × Q2
⟨d2Q′⟩ = [⟨d2⟩/Q1] × Q1
where the symbol [ ] indicates that the result of computation is made a whole number.
Furthermore, there are found third error vectors ⟨d′0⟩ to ⟨d′2⟩, which are obtained by subtracting these inverse-quantized second error vectors from respective ones of the second error vectors ⟨d0⟩, ⟨d1⟩, ⟨d2⟩.
⟨d′0⟩ = ⟨d0⟩ − ⟨d0Q′⟩
⟨d′1⟩ = ⟨d1⟩ − ⟨d1Q′⟩
⟨d′2⟩ = ⟨d2⟩ − ⟨d2Q′⟩
The magnitudes of these third error vectors ⟨d′0⟩ to ⟨d′2⟩ are not necessarily greater than the magnitude of the first error vector ⟨d3⟩. In other words, the final decoding error may be smaller when, as the result of coding the two bases β1⟨v1⟩ + β2⟨v2⟩ and utilizing the amount of code of the one remaining base to scalar-quantize the second error vector ⟨d2⟩, the error after decoding becomes the third error vector ⟨d′2⟩, than when, as the result of coding the three β1⟨v1⟩ + β2⟨v2⟩ + β3⟨v3⟩, the error after decoding becomes the first error vector ⟨d3⟩.
Accordingly, in claim (4) of the present invention, from among the third error vectors ⟨d′0⟩, ⟨d′1⟩, ⟨d′2⟩ and first error vector ⟨d3⟩, the smallest error is selected and the corresponding bases and, if necessary, the corresponding second error vector, are coded.
In terms of the example set forth above, if the third error vector ⟨d′2⟩ gives the smallest error, {β1⟨v1⟩ + β2⟨v2⟩} is adopted as the orthogonal base system and these bases are coded. In addition, the second error vector ⟨d2⟩ is coded (scalar-quantized) conjointly. The quantization coefficients Qyk are decided in such a manner that the code thus generated will not exceed the total amount of code in a case where nk-number of bases are used. As a result, image quality can be improved without increasing the amount of code per pixel block.
Thus, the residual vector ⟨d⟩ can be coded with higher precision by virtue of the arrangement of claim (4) of the present invention in which joint use is made of adaptive orthogonal transformation and, if necessary, scalar quantization of second error vectors after m-number of bases have been used. Further, when this is carried out, the second error vectors are scalar-quantized by the quantization coefficients Qyk conforming to the remaining number of bases yk (= nk − m), and these are coded to an amount of code equivalent to the remaining number yk of bases. As a result, only image quality is increased and not the total amount of code per pixel block. In addition, code length per pixel block can be put into easily decodable form (a multiple of a prescribed number of bits), thereby making it possible to greatly reduce computation load at the time of decoding.
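The selection among the first error and the nk third errors can be sketched as follows. Several points are assumptions made for the illustration: the "[ ]" operation is taken to be Python's round(), clipping is omitted, the error magnitude is the Euclidean norm, the check that the generated code fits the budget is not modeled, and the name select_coding and the sample vectors are hypothetical. The quantization coefficients Q3=6, Q2=7, Q1=8 are the example values from the text.

```python
import math

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def select_coding(d, bases, betas, Q):
    """Choose how many bases m to keep so that the decoding error is smallest.

    d     : residual vector
    bases : base vectors in the order in which they were obtained
    betas : their coefficients
    Q     : dict mapping the number yk of unused bases to a quantization step
    Returns (m, quantized second error vector or None, error magnitude).
    """
    nk = len(bases)
    errs = [list(d)]                       # errs[m] = error after m bases
    for beta, v in zip(betas, bases):
        errs.append([e - beta * vi for e, vi in zip(errs[-1], v)])
    best = (nk, None, norm(errs[nk]))      # start from the first error
    for m in range(nk):
        step = Q[nk - m]
        quantized = [round(e / step) for e in errs[m]]          # scalar quantization
        dequantized = [q * step for q in quantized]             # inverse quantization
        third = [e - dq for e, dq in zip(errs[m], dequantized)]
        if norm(third) < best[2]:
            best = (m, quantized, norm(third))
    return best

Q = {3: 6, 2: 7, 1: 8}
bases = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
m, quantized, error = select_coding([40, -30, 8, -8], bases, [40, -30, 8], Q)
```

With these sample values the case m=2 wins: two bases plus the quantized second error vector reproduce the residual exactly, whereas three bases leave an error of magnitude 8. This mirrors the situation described in the text in which ⟨d′2⟩ is smaller than ⟨d3⟩.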
An image coding/decoding method according to claim (5) of the present invention based upon the arrangement shown in FIG. 1(B) is such that on the basis of a total of five items of DC image data of upper, bottom, left, right blocks U, B, L, R inclusive of a local block S comprising a block mean value of K×K pixels, items of pixel data P1 to P4 of (K/2)×(K/2) pixels of a first sub-block S1 at the upper left of the local block S are obtained in accordance with the following equations:
P1=S+(2U+2L−2S−B−R)/8
P2=S+(2U−B−R)/8
P3=S+(2L−B−R)/8
P4=S+(2S−B−R)/8
items of pixel data P1 to P4 of (K/2)×(K/2) pixels of a second sub-block S2 at the upper right of the local block S are obtained in accordance with the following equations:
P1=S+(2U−B−L)/8
P2=S+(2U+2R−2S−B−L)/8
P3=S+(2S−B−L)/8
P4=S+(2R−B−L)/8
items of pixel data P1 to P4 of (K/2)×(K/2) pixels of a third sub-block S3 at the lower left of the local block S are obtained in accordance with the following equations:
P1=S+(2L−U−R)/8
P2=S+(2S−U−R)/8
P3=S+(2B+2L−2S−U−R)/8
P4=S+(2B−U−R)/8
and/or items of pixel data P1 to P4 of (K/2)×(K/2) pixels of a fourth sub-block S4 at the lower right of the local block S are obtained in accordance with the following equations:
P1=S+(2S−U−L)/8
P2=S+(2R−U−L)/8
P3=S+(2B−U−L)/8
P4=S+(2B+2R−2S−U−L)/8
In claim (5) of the present invention, CPU and memory load are greatly alleviated at the time of image coding/decoding by virtue of the arrangement in which image data of K×K pixels in the local block S is obtained directly, in stepless fashion, from the neighboring DC image data S, U, B, L, R inclusive of the local block. It should be noted that the method of claim (5) of the present invention can be utilized in the prediction of AC components at the time of image coding and in the reproduction of AC components at the time of image decoding.
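The direct method can be sketched as a single function that fills a whole block from the five DC values with no intermediate pass. The sketch assumes K=4, so that each (K/2)×(K/2) sub-block holds exactly the four pixels P1 to P4, and it assumes that P1 to P4 are laid out like S1 to S4 (upper left, upper right, lower left, lower right). The function name predict_direct is illustrative.

```python
def predict_direct(S, U, B, L, R):
    """Return a 4 x 4 block predicted directly from five DC values."""
    s1 = [S + (2*U + 2*L - 2*S - B - R) / 8, S + (2*U - B - R) / 8,
          S + (2*L - B - R) / 8,             S + (2*S - B - R) / 8]
    s2 = [S + (2*U - B - L) / 8,             S + (2*U + 2*R - 2*S - B - L) / 8,
          S + (2*S - B - L) / 8,             S + (2*R - B - L) / 8]
    s3 = [S + (2*L - U - R) / 8,             S + (2*S - U - R) / 8,
          S + (2*B + 2*L - 2*S - U - R) / 8, S + (2*B - U - R) / 8]
    s4 = [S + (2*S - U - L) / 8,             S + (2*R - U - L) / 8,
          S + (2*B - U - L) / 8,             S + (2*B + 2*R - 2*S - U - L) / 8]
    # Assemble the sub-blocks: S1 | S2 on top, S3 | S4 below.
    return [[s1[0], s1[1], s2[0], s2[1]],
            [s1[2], s1[3], s2[2], s2[3]],
            [s3[0], s3[1], s4[0], s4[1]],
            [s3[2], s3[3], s4[2], s4[3]]]

block = predict_direct(S=100, U=120, B=80, L=90, R=110)
```

Averaging the four pixels of sub-block S1 gives S+(U+L−B−R)/8, which is the value the indirect method assigns to S1, and the mean of all 16 pixels is S. The direct equations therefore agree with the indirect method at the sub-block level while needing only the five DC values as input.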
Further, a recording medium according to claim (6) of the present invention is a computer-readable recording medium on which has been recorded a program for causing a computer to execute the processing described in any one of claims (1) to (5) of the present invention.