1. Field of the Invention
The present invention generally relates to an image compression apparatus, an image decompression apparatus, an image compression method, an image decompression method, a computer program product, and a computer-readable recording medium recording a computer-readable program.
2. Description of the Related Art
As image input and output technologies progress, high-definition color still images are in great demand these days. Taking the digital camera (DC) as an example of an image input apparatus, price reductions of high-performance charge coupled devices (CCDs) having three million or more pixels have been achieved, and such high-performance CCDs are widely used in popularly priced DC products. The CCD owes the improvement of its performance largely to progress in silicon processing and semiconductor device technology; through this progress, the trade-off between miniaturization and signal-to-noise ratio has been resolved. Additionally, it is said that the number of pixels will maintain an upward trend for some time.
Further, brilliant progress has been achieved in realizing high definition at low prices, both in the hard copy field, for example printers such as dye sublimation printers, and in the soft copy field of displays such as the CRT, the LCD (liquid crystal display), and the PDP (plasma display panel).
Since such high-performance, low-priced image input/output products have been introduced to the market, demand for high-definition still images has been increasing, and it is predicted that this demand will increase in every field hereafter. This trend has actually been accelerated by developments in network-related technology, including personal computers (PCs) and the Internet. Especially recently, opportunities for communicating images have risen sharply, since mobile devices including mobile phones and notebook personal computers are spreading very rapidly. Therefore, it can be concluded that there will be ever greater demand for multifunctional, high-performance image compression/decompression technology that facilitates handling high-definition still images.
At the present time, JPEG (Joint Photographic Experts Group) is most widely used as an image compression/decompression algorithm that facilitates the handling of such high-definition still images. Additionally, JPEG 2000 became an international standard in 2001. JPEG 2000 has an algorithm of higher performance than JPEG, and at the same time, has significant multiple functions installed. For this reason, JPEG 2000 is expected to succeed JPEG as the image compression/decompression standard format of the next generation for high-definition still images.
FIG. 1 is a block diagram for explaining the basics of the JPEG algorithm. The JPEG algorithm includes a color space transformer/inverse transformer 150, a discrete cosine transformer/inverse transformer 151, a quantization/reverse quantization part 152, and an entropy coder/decoder 153.
Generally, in order to obtain a high compression rate, an irreversible encoding scheme is used. Thus, perfect reconstruction of the original image data, that is, so-called lossless compression, is not performed. However, this irreversible encoding makes it possible to avoid problems such as increases in transmission time and in the memory size required for processing. Since JPEG has this advantage, JPEG is currently the most widely used compression/decompression algorithm for still images.
FIG. 2 is a block diagram for explaining the basics of the JPEG 2000 algorithm. The JPEG 2000 algorithm includes a two-dimensional reversible wavelet transformer/inverse transformer 161, a quantization/reverse quantization part 162, an entropy coder/decoder 163, and a tag processing part 164.
As mentioned above, JPEG is currently the most widely used compression/decompression method for still images. However, demand for still higher definition continues, and JPEG is gradually reaching its technical limit. For example, block noise and mosquito noise become more conspicuous as the definition of original images increases. In other words, the deterioration of image quality in JPEG files is no longer negligible. For this reason, improving image quality at low bit rates, that is, at high compression rates, is recognized as the most important issue of technical development.
JPEG 2000 has been developed as an algorithm that can solve the above-mentioned problems. In addition, it is predicted that in the near future, JPEG 2000 will be used together with the JPEG format that is currently the mainstream format.
Comparing FIGS. 1 and 2, one of the most different points is the transforming method. JPEG employs discrete cosine transform (DCT), while JPEG 2000 employs discrete wavelet transform (DWT). The main reason why JPEG 2000 employs DWT is that DWT offers an advantage in achieving better image quality in a high compression area than DCT.
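For readers unfamiliar with the wavelet transform, the following is a minimal sketch of one level of a 1-D Haar transform, the simplest DWT. Note that this is illustrative only: JPEG 2000 itself uses the 5/3 (reversible) or 9/7 (irreversible) filter banks, not the Haar filters.

```python
# Illustrative sketch: one level of a 1-D Haar wavelet transform.
# JPEG 2000 actually uses the 5/3 or 9/7 filter banks; Haar is shown
# here only because it is the simplest example of a DWT filter pair.

def haar_dwt_1d(signal):
    """Split a signal into low-pass (averages) and high-pass (details)."""
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high

def haar_idwt_1d(low, high):
    """Perfectly reconstruct the original signal from the two sub-bands."""
    signal = []
    for l, h in zip(low, high):
        signal.extend([l + h, l - h])
    return signal

samples = [9, 7, 3, 5, 6, 10, 2, 6]
low, high = haar_dwt_1d(samples)
print(low)   # [8.0, 4.0, 8.0, 4.0]
print(high)  # [1.0, -1.0, -2.0, -2.0]
print(haar_idwt_1d(low, high))  # [9.0, 7.0, 3.0, 5.0, 6.0, 10.0, 2.0, 6.0]
```

The low-pass output is a half-resolution version of the signal, while the high-pass output holds the detail needed for exact reconstruction; unlike the block-based DCT, this split applies to the whole signal at once.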
Further, another significant difference between JPEG and JPEG 2000 is that JPEG 2000 includes the tag processing part 164 for forming codes at the final stage. The tag processing part 164 generates and interprets code streams. Additionally, JPEG 2000 can realize various useful functions by employing code streams. For example, FIGS. 3A, 3B, 3C and 3D are schematic diagrams, each showing sub-bands of each decomposition level in a case where the number of decomposition levels is three. As shown in FIGS. 3A, 3B, 3C and 3D, JPEG 2000 can stop the compression/decompression process of still images in an arbitrary layer (decomposition level) corresponding to a layer of octave division in DWT on a block basis.
Additionally, as shown in FIGS. 1 and 2, in many cases, JPEG and JPEG 2000 include the color space transformers/inverse transformers 150 and 160, respectively, as input/output parts for original images. The color space transformers/inverse transformers 150 and 160 perform transformation, or inverse transformation, from RGB color systems composed of the primary color components red (R), green (G) and blue (B), or from YMC color systems composed of the complementary color components yellow (Y), magenta (M) and cyan (C), to YCrCb or YUV color systems.
In the following, a detailed description will be given of the JPEG 2000 algorithm.
FIG. 4 is a schematic diagram showing examples of components of a color image divided into tiles. Generally, as shown in FIG. 4, each of the components 181 (R), 182 (G) and 183 (B) (here, the RGB primary color system is shown) of the color image is divided into rectangular areas (tiles) 181t, 182t and 183t, respectively. The compression/decompression process is performed on each of the tiles, for example, R00 through R15, G00 through G15, and B00 through B15 independently.
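The tile division described above can be sketched as follows. The toy 4×4 component and the 2×2 tile size are hypothetical, chosen only to keep the example small:

```python
# Illustrative sketch of dividing one component into rectangular tiles,
# as in FIG. 4. Image size and tile size here are hypothetical.

def split_into_tiles(component, tile_w, tile_h):
    """Return a list of (tile_row, tile_col, tile) for independent processing."""
    height, width = len(component), len(component[0])
    tiles = []
    for top in range(0, height, tile_h):
        for left in range(0, width, tile_w):
            tile = [row[left:left + tile_w] for row in component[top:top + tile_h]]
            tiles.append((top // tile_h, left // tile_w, tile))
    return tiles

# A toy 4x4 component split into four 2x2 tiles (R00..R03 in FIG. 4 terms).
component = [[r * 4 + c for c in range(4)] for r in range(4)]
tiles = split_into_tiles(component, 2, 2)
print(len(tiles))  # 4
print(tiles[0])    # (0, 0, [[0, 1], [4, 5]])
```

Each returned tile can then be compressed or decompressed independently of the others, which is what allows memory usage and processing to stay bounded per tile.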
In coding, the data of each tile of each component are input to the color space transformer/inverse transformer 160, which performs color space transformation on the data. Thereafter, a two-dimensional wavelet transformation (forward transformation) is applied to the data by the two-dimensional reversible wavelet transformer/inverse transformer 161, so that the data are spatially divided into frequency bands.
FIG. 3A illustrates an original image tile (0LL, decomposition level: 0) obtained by dividing the original image into the tiles. The original image tile is transformed with a two-dimensional reversible wavelet transformation so as to divide the original image tile into sub-bands (1LL, 1HL, 1LH and 1HH) on the decomposition level 1 as shown in FIG. 3B. Then, subsequently, a low frequency component 1LL in level 1 is transformed with a two-dimensional reversible wavelet transformation. Thus, the low frequency component 1LL is divided into sub-bands (2LL, 2HL, 2LH and 2HH) on the decomposition level 2 as shown in FIG. 3C. Similarly, a low frequency component 2LL is transformed with a two-dimensional reversible wavelet transformation so that the low frequency component 2LL is divided into sub-bands (3LL, 3HL, 3LH and 3HH) on the decomposition level 3 as shown in FIG. 3D.
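The recursive octave decomposition of FIGS. 3A through 3D can be sketched as follows, again using the Haar filters purely for simplicity (JPEG 2000 itself uses the 5/3 or 9/7 filter banks):

```python
# Illustrative sketch of the octave decomposition of FIGS. 3A-3D:
# each level splits the current LL band into LL, HL, LH and HH,
# and only the LL band is transformed again at the next level.
# Haar filters are used here for simplicity, not the JPEG 2000 filters.

def haar_dwt_2d(tile):
    """One level of a 2-D Haar DWT: returns (LL, HL, LH, HH) sub-bands."""
    h, w = len(tile), len(tile[0])
    ll = [[0.0] * (w // 2) for _ in range(h // 2)]
    hl = [[0.0] * (w // 2) for _ in range(h // 2)]
    lh = [[0.0] * (w // 2) for _ in range(h // 2)]
    hh = [[0.0] * (w // 2) for _ in range(h // 2)]
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            a, b = tile[i][j], tile[i][j + 1]
            c, d = tile[i + 1][j], tile[i + 1][j + 1]
            ll[i // 2][j // 2] = (a + b + c + d) / 4
            hl[i // 2][j // 2] = (a - b + c - d) / 4
            lh[i // 2][j // 2] = (a + b - c - d) / 4
            hh[i // 2][j // 2] = (a - b - c + d) / 4
    return ll, hl, lh, hh

def octave_decompose(tile, levels):
    """Recursively transform the LL band, as in FIGS. 3A-3D."""
    subbands = {}
    ll = tile
    for level in range(1, levels + 1):
        ll, hl, lh, hh = haar_dwt_2d(ll)
        subbands[f"{level}HL"] = hl
        subbands[f"{level}LH"] = lh
        subbands[f"{level}HH"] = hh
    subbands[f"{levels}LL"] = ll
    return subbands

# An 8x8 tile decomposed to level 3 yields the ten sub-bands of FIG. 3D.
tile = [[(r + c) % 7 for c in range(8)] for r in range(8)]
bands = octave_decompose(tile, 3)
print(len(bands))           # 10
print(len(bands["3LL"]))    # 1 (a single row remains at level 3)
```

Note how each level halves both dimensions: an 8×8 tile gives 4×4 level-1 bands, 2×2 level-2 bands, and 1×1 level-3 bands, matching the shrinking LL quadrant in FIGS. 3B through 3D.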
Further, in FIGS. 3A, 3B, 3C and 3D, the sub-bands that are targets of coding in the respective decomposition levels are indicated by gray parts. For example, as shown in FIG. 3D, when the decomposition level is three, the gray sub-bands (3HL, 3LH, 3HH, 2HL, 2LH, 2HH, 1HL, 1LH and 1HH) are targets of the coding. In this case, the sub-band 3LL is not coded.
Next, the bits that are targets of coding are determined in a prescribed coding order. The quantization/reverse quantization part 162 generates a context from the bits surrounding each target bit. The entropy coder/decoder 163 receives the context and the target bit, and codes the tiles of each component by probability estimation.
In this way, the coding process is performed on every component of the original image on a tile by tile basis.
Last, the tag processing part 164 combines all coded data from the entropy coder/decoder 163 into a single codestream, and at the same time, performs a process of adding tags to the codestream thereof.
FIG. 5 briefly illustrates the structure of the codestream. The codestream includes a main header 191, tile-part headers 192, bit-streams 193 and a tag 194. The main header 191 and the tile-part headers 192 are tag information. The main header 191 is added to the beginning of the codestream. Each tile-part header 192 is added to the beginning of the tile part constituting each of the tiles, and the corresponding bit-stream 193 follows it. The bit-streams 193 are the coded data of the respective tiles. The tag 194 is added to the end of the codestream.
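A minimal sketch of assembling such a codestream follows. The byte strings used for the headers and the end tag are purely illustrative placeholders, not the actual JPEG 2000 marker segments:

```python
# Hypothetical sketch of the codestream layout of FIG. 5: a main header,
# then a tile-part header and bit-stream per tile, then a terminating tag.
# The marker byte strings below are placeholders, not real JPEG 2000 markers.

def build_codestream(main_header, tile_streams, end_tag):
    """Concatenate the header, per-tile (header, bitstream) pairs, and end tag."""
    parts = [main_header]
    for tile_header, bitstream in tile_streams:
        parts.append(tile_header)
        parts.append(bitstream)
    parts.append(end_tag)
    return b"".join(parts)

codestream = build_codestream(
    b"MAIN",
    [(b"T0HDR", b"\x01\x02"), (b"T1HDR", b"\x03\x04")],
    b"EOC",
)
print(codestream)  # b'MAINT0HDR\x01\x02T1HDR\x03\x04EOC'
```

Because each tile's coded data sit behind their own tile-part header, a decoder can locate and decode any single tile without reading the others, which is what enables the independent per-tile decoding described next.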
On the other hand, in decoding, contrary to the coding, image data are generated from the data of each tile of each component in the codestream. In this case, the tag processing part 164 interprets the tag information added to the codestream that is externally input thereto. Then, the tag processing part 164 divides the codestream into codestreams corresponding to the respective tiles of the respective components, and a decoding process is performed independently on each of these codestreams. The positions of the bits that are targets of decoding are determined according to an order based on the tag information in the codestream. At the same time, the quantization/reverse quantization part 162 generates a context from the arrangement of the bits (decoding of which has already been completed) surrounding the target bit. The entropy coder/decoder 163 receives the context and the codestream, performs decoding by probability estimation so as to generate the target bit, and writes the generated bit to the position of the target bit.
The data thus decoded are still spatially divided into frequency bands. Therefore, each tile of each component of the image data is restored by applying a two-dimensional inverse wavelet transformation to the decoded data by the two-dimensional reversible wavelet transformer/inverse transformer 161. The restored data are then transformed back to data of the original color system by the color space transformer/inverse transformer 160.
Additionally, the above-mentioned idea of “tile” of JPEG 2000 can be used for the conventional JPEG compression/decompression format as an image area that is handled independently.
In the above, a description is given of general still images. However, the above-mentioned technique may also be used for moving images. That is, by treating each frame of a moving image as a single still image, it is possible to create (encode) or display (decode) video data at a frame rate appropriate for an application. This function is called the motion compression/decompression process of still images. Additionally, the phrase “motion still image” is used here to indicate continuous still images in which one frame corresponds to one still image.
This method offers a function that is not provided by video files of the MPEG format widely used for moving images. In other words, the method has the advantage that a high-quality still image can be handled on a frame basis. Accordingly, the method is beginning to attract attention in business fields such as broadcasting stations. It is highly likely that the method will also come into wide use among general consumers.
Among the specifications required of a compression/decompression algorithm for motion still images, processing speed differs most from that required of a general still-image compression/decompression algorithm. The reason is that the frame rate, which influences the quality of a moving image, depends on the processing speed. Therefore, at the present time, only a limited number of methods that depend heavily on hardware such as ASICs and DSPs can realize this function. It is conceived that progress must be awaited in areas such as process device technology in the semiconductor field and parallelizing compiler technology in the software field.
However, according to the above-described conventional technology, there is a problem in that “borders of tiles” stand out when the compression/decompression process is performed under conditions of a high compression rate. In fact, the data volume of an image becomes very large when the original image that is the target of the compression/decompression process is spatially very large, or when each color component has a deep gradation level. This technical problem has newly arisen as the above-described market demand for high-definition still images has grown.
When the compression/decompression process is performed on an original image having a very large volume of data, an extremely large memory area is required for holding the process results and for the working area in which the image data are processed. Additionally, the process time required for compression or decompression also becomes very long. In order to avoid such problems, generally, an original image is divided into rectangular regions, so-called “tiles”, and the compression/decompression process is performed on each tile independently. Thanks to this idea of dividing a space into “tiles”, it is possible to keep the increase in memory size and process time to a practical level. As another idea for handling an image by dividing it into regions, in addition to the above-described “tile”, there is a unit called a “block”. The “block” is used in conventional JPEG, and consists of 8×8 pixels. The object of employing the “block” is to divide the image into units of frequency transformation, whereas the object of employing the “tile” is to divide the image into units of entropy coding for memory reduction and parallel operation. Thus, the “block” and the “tile” differ fundamentally. In other words, the “block” is a unit used for an operation performed in a preliminary step toward coding.
However, a new problem, namely the above-described “revealing of tile borders”, has arisen from dividing an original image into tiles. This phenomenon occurs when decoding compressed image data that were generated by encoding the original image with lossy compression at a high compression rate. In particular, the phenomenon in many cases greatly degrades the subjective image quality when displaying a moving image that employs a high compression rate.
The reason can be explained as follows. When the horizontal low-pass/high-pass filters and the vertical low-pass/high-pass filters used for performing the two-dimensional wavelet transformation carry out their respective filter calculations, the target area of the calculations unexpectedly extends into areas outside the borders of a tile, where no image data exist. This extension becomes larger as the decomposition level becomes deeper.
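The border problem can be illustrated with a sketch: a filter whose support extends past the tile border must be supplied with samples from outside the tile, commonly by mirror (symmetric) extension. The 3-tap low-pass filter below is hypothetical, chosen only to keep the example short:

```python
# Sketch of why filter support crosses tile borders: near the edge of a
# tile, the filter taps reach samples that do not exist inside the tile,
# so a symmetric (mirror) extension supplies them. The 3-tap smoothing
# filter used here is hypothetical, for illustration only.

def symmetric_extend(samples, margin):
    """Mirror samples about each border, e.g. [a,b,c] -> [b,a,b,c,b] for margin 1."""
    left = samples[1:margin + 1][::-1]
    right = samples[-margin - 1:-1][::-1]
    return left + samples + right

def filter_row(samples, taps):
    """Convolve one tile row with the filter, using symmetric extension."""
    margin = len(taps) // 2
    ext = symmetric_extend(samples, margin)
    return [
        sum(t * ext[i + k] for k, t in enumerate(taps))
        for i in range(len(samples))
    ]

row = [10, 20, 30, 40]
print(symmetric_extend(row, 2))               # [30, 20, 10, 20, 30, 40, 30, 20]
print(filter_row(row, [0.25, 0.5, 0.25]))     # [15.0, 20.0, 30.0, 35.0]
```

The extended samples are only a guess at the neighboring tile's true content, and because the LL band is filtered again at every level, the influenced region near the border grows with each decomposition level, which is why the tile borders become visible at high compression rates.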