In recent years, as the technologies for image input apparatuses such as digital cameras and scanners have improved, the resolution of image data captured by such apparatuses has been increasing. A low-resolution image has only a small data size and poses no problem for transfer and storage. However, the image data size becomes huge as resolution increases, thus requiring a long transfer time and a large storage capacity.
Hence, upon transferring or storing an image, it is common practice to use high-efficiency coding, either to remove redundancy from the image or to reduce the data size by processing the image within a visually allowable range. A coding scheme that can perfectly reconstruct the original image by decoding is called lossless coding, and a coding scheme that yields a visually approximate image but cannot perfectly reconstruct the original image is called lossy coding. In lossy coding, it is important to reduce the code size by altering portions where slight deterioration is visually inconspicuous, but such a process depends largely on the characteristics of the image. Image data includes various types: a natural image, generated either by taking a silver halide photo of a person, landscape, or the like and scanning the photo with a scanner, or by photographing such an object directly with a digital camera; a text/line image, obtained by rasterizing text/line information; a CG image, obtained by rendering two-dimensional image data or a three-dimensional shape generated by a computer; and the like. To obtain high reproduced image quality, the required resolution and the required number of gray levels vary depending on the image type. In general, a text/line image requires a higher resolution than a natural image.
One conventional scheme of high-efficiency coding uses wavelet transformation. In this scheme, an image to be encoded is decomposed into a plurality of frequency bands (subbands) using discrete wavelet transformation. The transform coefficients of the respective subbands undergo quantization and entropy coding by various methods to generate a code sequence. In wavelet transformation of an image, the processes of which are shown in FIGS. 4A, 4B, and 4C, an image to be encoded (FIG. 4A) undergoes a one-dimensional transformation process in each of the horizontal and vertical directions and is thereby decomposed into four subbands. Furthermore, a method of repetitively decomposing only the low-frequency subband (LL subband) is normally used. FIG. 5 shows an example of the subbands obtained when this two-dimensional decomposition is performed twice.
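The decomposition described above can be sketched as follows. This is only an illustrative sketch, not the coding method of any particular apparatus: the Haar (average/difference) filter, the function names `haar_1d`, `haar_2d`, and `decompose`, and the convention that the first letter of a subband name refers to the horizontal filter are all assumptions made for the example. Quantization and entropy coding are omitted.

```python
def haar_1d(seq):
    """One-level 1-D Haar split into (low, high) halves; len(seq) must be even."""
    low = [(seq[i] + seq[i + 1]) / 2 for i in range(0, len(seq), 2)]
    high = [(seq[i] - seq[i + 1]) / 2 for i in range(0, len(seq), 2)]
    return low, high


def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def haar_2d(image):
    """Horizontal 1-D pass, then vertical 1-D pass: returns LL, LH, HL, HH."""
    # Horizontal pass: every row splits into a low half and a high half.
    rows = [haar_1d(row) for row in image]
    left = [low for low, _ in rows]
    right = [high for _, high in rows]

    def vertical(block):
        # Vertical pass: every column splits into a low half and a high half.
        cols = [haar_1d(col) for col in transpose(block)]
        top = transpose([low for low, _ in cols])
        bottom = transpose([high for _, high in cols])
        return top, bottom

    ll, lh = vertical(left)    # horizontally low-pass half
    hl, hh = vertical(right)   # horizontally high-pass half
    return ll, lh, hl, hh


def decompose(image, levels=2):
    """Repeatedly decompose only the LL subband, as in FIG. 5 for levels=2."""
    subbands = {}
    ll = image
    # The first pass yields the highest-frequency subbands (index == levels).
    for n in range(levels, 0, -1):
        ll, lh, hl, hh = haar_2d(ll)
        subbands[f"LH{n}"] = lh
        subbands[f"HL{n}"] = hl
        subbands[f"HH{n}"] = hh
    subbands["LL"] = ll
    return subbands
```

For an 8 x 8 input and `levels=2`, the result holds seven subbands: a 2 x 2 LL subband, 2 x 2 subbands LH1, HL1, and HH1, and 4 x 4 subbands LH2, HL2, and HH2.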
One merit of image coding using wavelet transformation is that scalable decoding of spatial resolutions is easy to implement. When wavelet transformation is done as shown in FIG. 5, and the coefficients of the respective subbands are encoded and transferred in turn from low-frequency subband LL toward high-frequency subband HH2, the decoding side can decode images while gradually increasing the resolution: it obtains a reconstructed image of ¼ resolution upon receiving the coefficients of the LL subband, one of ½ resolution upon receiving LL, LH1, HL1, and HH1, and one of the original resolution upon additionally receiving LH2, HL2, and HH2.
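The resolution-scalable decoding described above can be illustrated with a minimal sketch. It assumes a Haar (average/difference) filter and unquantized coefficients, neither of which is specified in the text, and the names `inverse_haar_1d`, `inverse_haar_2d`, and `progressive_decode` are hypothetical. The decoder yields a new image each time the three subbands of the next level have arrived.

```python
def inverse_haar_1d(low, high):
    """Undo the split low=(a+b)/2, high=(a-b)/2 for each coefficient pair."""
    out = []
    for l, h in zip(low, high):
        out.extend((l + h, l - h))
    return out


def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def inverse_haar_2d(ll, lh, hl, hh):
    """Merge four subbands into one image with twice the width and height."""
    def merge_vertical(top, bottom):
        cols = [inverse_haar_1d(t, b)
                for t, b in zip(transpose(top), transpose(bottom))]
        return transpose(cols)

    left = merge_vertical(ll, lh)    # horizontally low-pass half
    right = merge_vertical(hl, hh)   # horizontally high-pass half
    return [inverse_haar_1d(l, r) for l, r in zip(left, right)]


def progressive_decode(packets):
    """Yield one image per completed resolution level.

    `packets` is an iterable of (subband_name, coefficients) pairs sent in
    the order LL, LH1, HL1, HH1, LH2, HL2, HH2, ...
    """
    packets = iter(packets)
    name, image = next(packets)
    if name != "LL":
        raise ValueError("the stream must start with the LL subband")
    yield image                      # lowest resolution
    level = 1
    pending = {}
    for name, coeffs in packets:
        pending[name] = coeffs
        wanted = [f"LH{level}", f"HL{level}", f"HH{level}"]
        if all(key in pending for key in wanted):
            image = inverse_haar_2d(image, *(pending.pop(k) for k in wanted))
            yield image              # resolution doubled in each direction
            level += 1
```

With a seven-subband stream for a 4 x 4 image, the generator yields a 1 x 1 image after LL, a 2 x 2 image after HH1, and the 4 x 4 image after HH2. The decoding side may stop consuming the stream at any of these points.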
However, the aforementioned conventional high-efficiency coding method is not very efficient when encoding image data that includes both a natural image and a text/line image, since it does not take into account that these image types require different resolutions to obtain high image quality.
When an image includes portions having different required resolutions, as in a mixed text/photo image, the spatial-resolution scalability of wavelet transformation can be used: the data needed to decode at high resolution is encoded only for the regions that require it, and the data needed to decode at high resolution is discarded for regions that do not require high resolution.
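As a rough illustration of discarding high-resolution data outside a required region, the sketch below zeroes the coefficients of one highest-frequency subband (for example HH2) that do not overlap a rectangular region. The function name `discard_outside_region`, the rectangular region format, and the assumption that each coefficient covers a 2 x 2 block of full-resolution pixels (true for a Haar-like two-tap filter) are all hypothetical. An actual encoder would likely omit the discarded coefficients from the code sequence rather than store zeros.

```python
def discard_outside_region(subband, region):
    """Zero every coefficient of a highest-frequency subband outside `region`.

    `region` is (top, left, bottom, right) in full-resolution pixel
    coordinates, with bottom and right exclusive. The coefficient at (r, c)
    is assumed to cover pixel rows 2r..2r+1 and columns 2c..2c+1.
    """
    top, left, bottom, right = region
    result = []
    for r, row in enumerate(subband):
        new_row = []
        for c, coeff in enumerate(row):
            # Keep the coefficient if its 2 x 2 pixel block overlaps the region.
            overlaps = (2 * r < bottom and 2 * r + 2 > top and
                        2 * c < right and 2 * c + 2 > left)
            new_row.append(coeff if overlaps else 0.0)
        result.append(new_row)
    return result
```

Applying the same function to LH2, HL2, and HH2 with the bounding box of a text region keeps full resolution there, while the rest of the image remains decodable at the lower resolution carried by LL, LH1, HL1, and HH1.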
However, the aforementioned conventional high-efficiency coding method is not very efficient, since image data that includes regions having different required resolutions, such as an image that includes both a natural image and a text/line image, must first be scanned at high resolution before it is encoded.
The present invention has been made in consideration of the aforementioned problems, and has as its object to implement efficient image coding upon encoding image data that includes regions which require different resolution levels.
It is another object of the present invention to perform encoding that generates a code sequence which allows the decoding side to identify a resolution of interest at an early stage of decoding the encoded image data.