(1) Field of the Invention
The present invention relates to an image data encoding apparatus which is capable of progressive transmission of an image.
(2) Description of the Related Art
When transmitting an original high-resolution image, such as a home page image, over the Internet, the progressive transmission algorithm is most commonly used. In the home page image, text data and picture or photographic data, both represented by bit maps, often coexist. Although the bit-map representation of text or characters is not efficient for the transmission, the bit-map representation of the home page image is needed for the progressive transmission because the receiver equipment does not always include the fonts that are exactly the same as the fonts in the transmitter equipment.
In progressive transmission of an image, a low resolution representation of the image is first sent. This low resolution representation requires very few bits to encode. The image is then updated, or refined, to the desired fidelity by transmitting more and more information. To encode an image for progressive transmission, it is necessary to create a sequence of progressively lower resolution images from the original higher resolution image. The JBIG specification recommends generating one lower resolution pixel for each two by two block in the higher resolution image.
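By way of illustration only, the creation of a sequence of progressively lower resolution images can be sketched as follows. The sketch assumes a gray-level image held as a list of rows and replaces each two by two block by the rounded-down mean of its four pixels; this averaging rule and the function names reduce_2x2, build_pyramid and transmission_order are assumptions made for the example and are not taken from the JBIG specification.

```python
def reduce_2x2(image):
    """Halve the resolution: one output pixel per 2x2 block (mean, rounded down).

    A trailing odd row or column, if any, is dropped in this sketch.
    """
    rows, cols = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) // 4
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


def build_pyramid(image, levels):
    """Return [original, half resolution, quarter resolution, ...]."""
    pyramid = [image]
    for _ in range(levels):
        pyramid.append(reduce_2x2(pyramid[-1]))
    return pyramid


def transmission_order(pyramid):
    """Progressive transmission sends the lowest resolution first."""
    return list(reversed(pyramid))


original = [
    [0, 0, 8, 8],
    [0, 0, 8, 8],
    [4, 4, 12, 12],
    [4, 4, 12, 12],
]
pyramid = build_pyramid(original, levels=2)
for level in transmission_order(pyramid):
    print(level)
```

In this sketch the receiver first obtains the single-pixel image, then the two by two image, and finally the original, so each step refines the previous one.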
When a home page image is transmitted over the Internet by progressive transmission and displayed on the receiver equipment, the operator viewing the display of the receiver equipment can quickly recognize the contents of the received image. If the received image is not the desired one, the operator can stop receiving the image, and the time and cost of the transmission between the transmitter and the receiver can be saved.
The JBIG algorithm proposed by the Joint Bilevel Image Experts Group defines a standard for the progressive encoding of bilevel images (see ISO/IEC International Standard 11544; ITU-T T.82). The progressive reduction standard (PRES) of the JBIG is essentially a progressive scheme that encodes a bilevel image by creating representations at successively lower resolutions in a bottom-up manner. Resolution is halved at every stage. The image is then transmitted in a top-down manner, that is, lower resolutions first. Resolution reduction is done by using a block of pixels in the higher resolution image and pixels already encoded in the lower resolution image. The values of these pixels are used as an index into a predefined lookup table that can be specified by the user.
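By way of illustration only, a table-driven resolution reduction of the kind outlined above can be sketched as follows. The neighborhood (a two by two block of higher resolution pixels plus the left and upper lower resolution pixels), the 64-entry table and the majority rule used to fill it are all assumptions made for the example; they are not the neighborhood or the table of the standard, and the names make_majority_table and reduce_with_table are hypothetical.

```python
def make_majority_table():
    """Build a hypothetical 64-entry table.

    Index bits, most significant first: four high-resolution pixels of the
    2x2 block, then the left and the upper low-resolution neighbours.
    """
    table = []
    for index in range(64):
        ones = sum((index >> shift) & 1 for shift in (5, 4, 3, 2))
        left = (index >> 1) & 1
        above = index & 1
        if ones > 2:
            table.append(1)
        elif ones < 2:
            table.append(0)
        else:
            # Tie: follow the low-resolution pixels determined earlier.
            table.append(1 if (left or above) else 0)
    return table


def reduce_with_table(high, table):
    """Halve a bilevel image, choosing each output pixel by table lookup."""
    rows, cols = len(high) // 2, len(high[0]) // 2
    low = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            block = (high[2 * r][2 * c], high[2 * r][2 * c + 1],
                     high[2 * r + 1][2 * c], high[2 * r + 1][2 * c + 1])
            left = low[r][c - 1] if c > 0 else 0
            above = low[r - 1][c] if r > 0 else 0
            index = 0
            for bit in (*block, left, above):
                index = (index << 1) | bit
            low[r][c] = table[index]
    return low


high = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]
print(reduce_with_table(high, make_majority_table()))
```

Because the table is passed in as an argument, a different reduction rule can be tried by supplying a different table without changing the scanning loop.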
Japanese Laid-Open Patent Application No. 2-137475 discloses a multi-resolution encoding method for the progressive transmission. This encoding method is aimed at reducing the number of bits required to be encoded. In the encoding method of the above publication, the highest resolution image, which requires the largest number of bits to encode, is transformed into a bilevel image. The highest resolution image is reconstructed on the receiver equipment by using the pixels in the bilevel image and pixels in the second highest resolution image.
Japanese Laid-Open Patent Application No. 4-322562 discloses a multi-resolution encoding method for the progressive transmission. This encoding method is aimed at increasing the efficiency of the multi-resolution encoding. In the encoding method of the above publication, an original high resolution image is divided into edge portions and the remaining portions. Only the edge portions of the original image are transformed into bilevel image data, and the difference pixels between the original image and the bilevel image are created. The bilevel image data and the difference pixels are separately encoded by using suitable encoding schemes that are different from each other. The efficiency of the multi-resolution encoding is thus increased.
Generally, when the bit-map representation of characters is transmitted by progressive transmission, the characters in the images reconstructed on the receiver equipment at early stages of the transmission are often not readable to a human viewer or the operator of the receiver equipment. From the point of view of speedy recognition of the received image, the time and cost spent on the progressive transmission of the bit-map representation of unreadable characters contribute nothing and are in fact detrimental.
The PRES method of the JBIG is a progressive scheme that encodes a bilevel image, and it creates representations at successively lower resolutions in a bottom-up manner regardless of whether or not the received image is readable on the receiver equipment. Hence, the PRES method does not eliminate the above-mentioned problem.
The encoding method of Japanese Laid-Open Patent Application No. 2-137475 saves time and cost only for the transmission of the highest resolution image and does not contribute to saving time and cost for the transmission of the lower resolution images. Hence, the method of the above publication does not cure the shortcomings of the PRES method.
The encoding method of Japanese Laid-Open Patent Application No. 4-322562 requires the division of the original image into the edge portions and the remaining portions, and the amount of bits required to be encoded is considerably increased, which increases the time for the transmission of the entire image. Further, the hardware needed for implementing the encoding method of the above publication is complicated, which raises the cost of the hardware. Hence, the method of the above publication also does not cure the shortcomings of the PRES method.
As described above, when a conventional multi-resolution encoding method is used for encoding the bit-map representation of characters, it is highly probable that unreadable-character bits are included in the lower resolution images before they are transmitted to the receiver equipment. This increases the amount of unreadable-character bits needed for the progressive transmission, which is detrimental to speedy searching and recognition of reconstructed images obtained by the progressive transmission. A detailed description will now be given of the above-mentioned problems of the conventional multi-resolution encoding method with reference to FIG. 6A through FIG. 6C.
FIG. 6A, FIG. 6B and FIG. 6C show results of reconstructed images from the coded data produced by a conventional multi-resolution encoding method.
Specifically, FIG. 6A shows a result of a reconstructed image based on the entire coded data of an original high-resolution image in which a photograph area and a text area coexist. FIG. 6B shows a result of a reconstructed low-resolution image based on only the fourth-level coded data and the low-frequency coded data of the original image. FIG. 6C shows a result of a reconstructed lower-resolution image based on only the low-frequency coded data of the original image.
For example, the reconstructed image of FIG. 6A has a resolution of 75 dpi (dots per inch), and the reconstructed image of FIG. 6C has a resolution of 37.5 dpi. In the conventional multi-resolution encoding method, it is possible to selectively use a desired one of the encoding schemes such as those of FIG. 6B and FIG. 6C. That is, when a reconstructed image is to be displayed on a low-resolution CRT (cathode-ray tube) display device, the encoding scheme of FIG. 6B is selected, whereas when a reconstructed image is to be displayed on a lower-resolution portable terminal, the encoding scheme of FIG. 6C can be selected.
As shown in FIG. 6C, when the conventional multi-resolution encoding method is used, the characters in the text area of the reconstructed image are not easily readable to a human viewer. The conventional multi-resolution encoding method allows the inclusion of unreadable-character bits in the lower resolution image before the image is transmitted to the receiver equipment. This increases the amount of unreadable-character bits needed for the progressive transmission, which is detrimental to speedy searching and recognition of the reconstructed image obtained by the progressive transmission.
Further, when a document is viewed as a thumbnail for searching and recognition, the document generally contains both important words and negligible words from the point of view of the searching and recognition. As shown in FIG. 6B, when the conventional multi-resolution encoding method is used, the characters in the text area of the reconstructed image may be readable to a human viewer, but both important words and negligible words are contained in the reconstructed image at the same resolution in a mixed manner. Hence, it is difficult for the reconstructed image obtained by the progressive transmission to have appropriate readability for a specific purpose such as viewing as a thumbnail for searching and recognition.
An object of the present invention is to provide an improved image data encoding apparatus in which the above-described problems are eliminated.
Another object of the present invention is to provide an image data encoding apparatus which ensures speedy searching and recognition of reconstructed images obtained by the progressive transmission while avoiding the increase of the amount of unreadable-character bits needed for the transmission.
Another object of the present invention is to provide an image data encoding apparatus which enables the reconstructed images obtained by the progressive transmission to have appropriate readability for a specific purpose such as viewing as a thumbnail for searching and recognition.
The above-mentioned objects of the present invention are achieved by an image data encoding apparatus which produces compressed image data from input image data through a multi-resolution encoding in order to carry out a progressive transmission, the apparatus including: a wavelet transform unit which produces a first subband coefficient and a second subband coefficient as an output in response to each of blocks of the input image data, each block having a predetermined number of pixels, the first and second subband coefficients being produced by performing a multiple decomposition process in which only the first subband coefficient is further decomposed; a first text transform unit which produces an intensity coefficient of a first kind as well as a set of coefficients of a second kind as an output in response to each of the blocks of the input image data, the intensity coefficient including significant bits, and only the intensity coefficient being further decomposed in a multiple decomposition process of the first text transform unit; a second text transform unit which produces a set of coefficients of the second kind as an output in response to each of the blocks of the input image data, without producing any coefficient of the first kind, one of the coefficients of the second kind including significant bits but not being further decomposed; and a selector which selects one of the output of the wavelet transform unit, the output of the first text transform unit and the output of the second text transform unit, as an input to an entropy coder, in accordance with a combination of an area-discriminating signal and a text-transform control signal.
The image data encoding apparatus of the present invention makes it possible to suitably apply one of the first text transform and the second text transform to a specific text area of the original image. The image data encoding apparatus of the present invention can avoid the inclusion of unreadable-character bits in the lower resolution images before the subsequent-level decomposition is performed. In the progressive transmission, the lower resolution images are first sent. Accordingly, by suitably applying one of the first text transform and the second text transform to a specific text area of the original image, the image data encoding apparatus of the present invention allows speedy searching and recognition of the reconstructed images obtained by the progressive transmission while avoiding the increase of the amount of the unreadable-character bits needed for the transmission. Further, the image data encoding apparatus of the present invention allows the reconstructed images obtained by the progressive transmission to have appropriate readability for a specific purpose such as viewing as a thumbnail for searching and recognition.
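By way of illustration only, two elements of the apparatus described above can be sketched as follows: a multiple decomposition process in which only the first (low-frequency) output is further decomposed, and a selector that passes one of three transform outputs to the entropy coder. The pairwise average and difference filter, the one-dimensional data, the signal encodings Area and TextMode, and the mapping of signal combinations to outputs are all assumptions made for the example; the text transforms themselves are not sketched, and their outputs are treated as opaque values.

```python
from enum import Enum


class Area(Enum):
    """Hypothetical encoding of the area-discriminating signal."""
    PICTURE = 0
    TEXT = 1


class TextMode(Enum):
    """Hypothetical encoding of the text-transform control signal."""
    FIRST = 1
    SECOND = 2


def decompose(samples, levels):
    """Multi-level decomposition; only the low-frequency part is split again.

    Each level replaces pairs (a, b) by a rounded-down average (kept for the
    next level) and a difference a - b (stored and not decomposed further).
    """
    low = list(samples)
    details = []
    for _ in range(levels):
        if len(low) < 2:
            break
        pairs = [(low[i], low[i + 1]) for i in range(0, len(low) - 1, 2)]
        details.append([a - b for a, b in pairs])
        low = [(a + b) // 2 for a, b in pairs]
    return low, details


def select(area, text_mode, wavelet_out, text1_out, text2_out):
    """Choose the entropy-coder input from the two control signals.

    Assumed mapping: picture areas use the wavelet output; text areas use
    the first or second text transform output as directed by text_mode.
    """
    if area is Area.PICTURE:
        return wavelet_out
    return text1_out if text_mode is TextMode.FIRST else text2_out


low, details = decompose([10, 6, 4, 8], levels=2)
print(low, details)
print(select(Area.TEXT, TextMode.SECOND, "wavelet", "text-1", "text-2"))
```

In this sketch the difference values produced at each level are left untouched by later levels, which mirrors the statement that only the first subband coefficient is further decomposed.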