1. Field of the Invention
The present invention relates to an image processing apparatus, an image processing method, and a program and a storage medium therefor.
2. Related Background Art
Recently, with the widespread use of scanners, the digitization of documents has become common practice. However, storing a full-color, A4-sized document scanned at 300 dpi in bit map form requires a huge amount of memory, upwards of 24 Mbytes. Data of this size is not suitable for attachment to, and transmission with, mail.
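The memory figure above can be checked with a short calculation. The following is a rough sketch only; it assumes an A4 sheet of 210 mm by 297 mm and 24-bit RGB pixels (3 bytes per pixel), neither of which is stated in the text, and the function name is hypothetical.

```python
# Rough size of an uncompressed full-color A4 scan.
# Assumptions (not from the text): A4 = 210 mm x 297 mm, 3 bytes per pixel.
MM_PER_INCH = 25.4


def raw_scan_bytes(width_mm, height_mm, dpi, bytes_per_pixel=3):
    """Return the size in bytes of an uncompressed bit map of the page."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px * height_px * bytes_per_pixel


size = raw_scan_bytes(210, 297, 300)
print(size, "bytes, about", round(size / 2**20, 1), "MiB")
```

Under these assumptions the bit map is 2480 by 3508 pixels, which is consistent with the figure of upwards of 24 Mbytes.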
Therefore, JPEG, a well known compression technique, is commonly used to compress full-color image data. JPEG is very effective for natural images, such as photographs, and the quality of the images it produces is high. However, when high-frequency portions, such as symbols, are compressed using JPEG, image deterioration called mosquito noise occurs, and the compression rate is also reduced. Since an office document generally includes many symbol portions, the document is binarized, the binary document is compressed using MMR, and the coordinates of the symbol portions and the representative colors of the symbols therein are obtained, so that an office document prepared in color can be easily represented. Further, for a complicated color document, such as a magazine, an area to be compressed is divided into background and symbol portions. The background is compressed using JPEG, while the symbols are binarized using an optimal threshold value and the obtained binary images are compressed using MMR, following which color information is added to the obtained MMR data. In this manner, even a fairly complicated color document can be represented using a small data file.
Therefore, a technique is required for calculating the representative color of symbols in a symbol portion. The following is an example conventional method used for calculating the representative symbol color.
First, a rough, three-dimensional histogram is prepared for multi-valued image data in a black portion by referring to the binary image of a symbol area. Then, a fine histogram is prepared for the pixels of a multi-valued image that corresponds to the highest value in the rough three-dimensional histogram, and the highest value that is thereby obtained is determined to be the representative color.
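The two-stage histogram described above can be sketched as follows. This is only an illustration under stated assumptions: the bin width of the rough histogram (3 bits per channel), the function name, and the data layout are hypothetical and are not taken from the conventional method itself.

```python
from collections import Counter


def representative_color(pixels, mask, coarse_bits=3):
    """Pick a representative color from the pixels marked black in the mask.

    pixels: list of (r, g, b) tuples with values 0..255 (multi-valued image)
    mask:   list of bool, True where the binary image is black
    coarse_bits: bits kept per channel in the rough histogram (assumption)
    """
    shift = 8 - coarse_bits
    symbol_pixels = [p for p, is_black in zip(pixels, mask) if is_black]
    if not symbol_pixels:
        return None

    def coarse_bin(p):
        return (p[0] >> shift, p[1] >> shift, p[2] >> shift)

    # Stage 1: rough three-dimensional histogram over the black portion.
    rough = Counter(coarse_bin(p) for p in symbol_pixels)
    top_bin = rough.most_common(1)[0][0]

    # Stage 2: fine histogram over the pixels that fall in the top rough bin.
    fine = Counter(p for p in symbol_pixels if coarse_bin(p) == top_bin)
    return fine.most_common(1)[0][0]


# Small usage example with made-up pixel values.
pixels = ([(0, 128, 0)] * 5 + [(10, 130, 5)] * 3
          + [(255, 255, 255)] * 4 + [(200, 230, 200)] * 2)
mask = [p != (255, 255, 255) for p in pixels]
color = representative_color(pixels, mask)
```

In this made-up sample the two greenish values share one rough bin, so that bin wins the first stage and the most frequent exact value inside it is returned.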
However, when the above method is employed to calculate the representative color of a symbol, although a desirable color can be calculated for a symbol having a height of 12 points or more when read at a resolution of 300 dpi or higher, for a 10 point or smaller symbol, the ratio of the originally calculated representative color data to the black of the binary image is small, and a desired color cannot be calculated.
An explanation will now be given, while referring to FIG. 19, for a case wherein calculations are performed to obtain the representative color of a large symbol, and for a case wherein calculations are performed to obtain the representative color of a small symbol.
FIG. 19 is a diagram showing a sample wherein green symbols are written on a white background. A binary result 1901 is obtained for a comparatively thick symbol, and the multi-valued image of a black portion in the binary result 1901 has a level change 1902. In the level change 1902, since the level remains steady for a long time at portions 1903 and 1904, which correspond to the representative color of the symbol, the color is distributed in the color space RGB as is shown in FIG. 20A. A block 2002 in FIG. 20A is the color green in FIG. 19, i.e., indicates the representative color of the symbol. Since the block 2002, of the symbol portion, has a specific size, it can be extracted comparatively easily.
In contrast, for a fine symbol 1906 in FIG. 19, the level change in a multi-valued image has a shape 1907, and as soon as the level reaches portions 1908 and 1909, which correspond to the representative color of the symbol, it returns to the level of the background portion. In this case, the color distribution in the RGB color space is as is shown in FIG. 20B, and compared with the block 2002 in FIG. 20A, using the obtained data it is difficult to calculate a point 2005 in FIG. 20B. Through the binarization process, the portion to the left of a broken line is binarized as a black symbol, and when the representative color is calculated using the conventional method, the point 2005 is obtained because it is the most frequent value present. This is not preferable because, compared with the desired symbol color, the obtained color has a whitish-green cast.
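The difference between the thick and the fine symbol can be reproduced with a one-dimensional toy model. The sketch below is illustrative only: the single-channel levels, the 5-pixel box blur standing in for the scanner's optical blur, the threshold, and all names are assumptions, not values from the figures.

```python
from collections import Counter

BACKGROUND, SYMBOL = 255, 60  # hypothetical single-channel levels
THRESHOLD = 200               # hypothetical binarization threshold


def scan_line(stroke_width, length=40, blur=5):
    """Return a blurred scan line crossing one stroke of the given width."""
    ideal = [BACKGROUND] * length
    start = (length - stroke_width) // 2
    for i in range(start, start + stroke_width):
        ideal[i] = SYMBOL
    half = blur // 2
    blurred = []
    for c in range(length):
        # Box filter; indices are clamped at the ends of the line.
        window = [ideal[min(max(j, 0), length - 1)]
                  for j in range(c - half, c + half + 1)]
        blurred.append(sum(window) // blur)
    return blurred


def most_frequent_symbol_level(levels):
    """Binarize at THRESHOLD and return the most frequent level among black pixels."""
    black = [v for v in levels if v < THRESHOLD]
    return Counter(black).most_common(1)[0][0]


thick = most_frequent_symbol_level(scan_line(12))
thin = most_frequent_symbol_level(scan_line(2))
print("thick stroke:", thick, "thin stroke:", thin)
```

In this model the 12-pixel stroke keeps a plateau at the true symbol level, so the most frequent value is the symbol level itself. The 2-pixel stroke never reaches that level after blurring, so the most frequent value among the black pixels is a lighter, background-tinted level.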
In order to avoid the occurrence of this phenomenon, a method is available whereby a binary image is thinned and the conventional representative color calculation is performed using the thinned image. When this method is applied, however, the defect described in the following explanation occurs.
To simplify the explanation, a symbol “∘” is used as an example.
Assume that in FIG. 21 a green symbol “∘” is drawn on a white background. The level shift for the symbol “∘” has a change 2104. Originally it would be ideal for the center indentation to be returned to the white level; however, the complete return to the white level of the symbol “∘”, a small point, may not be possible. If the binarization process is performed by using a threshold value 2105, a solid black dot 2102 is obtained as the binary result. And if the thinning process is then performed for this dot 2102, a black dot 2103 is obtained. In accordance with the level 2104, the position of the multi-valued image indicated by this binary image is a point 2106, which is not a preferable level for the representative color.
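The crushing of a small ring-shaped symbol can likewise be imitated in one dimension. The following is a toy sketch under assumed values: the levels, the 3-pixel box blur, the threshold, and the run-midpoint rule used as a crude stand-in for thinning are all hypothetical.

```python
BACKGROUND, SYMBOL, THRESHOLD = 255, 60, 200  # hypothetical levels


def blur3(levels):
    """3-pixel box blur with clamped ends (stand-in for scanner blur)."""
    n = len(levels)
    return [(levels[max(i - 1, 0)] + levels[i] + levels[min(i + 1, n - 1)]) // 3
            for i in range(n)]


def thin_runs(mask):
    """Crude 1-D thinning: keep only the middle pixel of each black run."""
    thinned = [False] * len(mask)
    i = 0
    while i < len(mask):
        if mask[i]:
            j = i
            while j < len(mask) and mask[j]:
                j += 1
            thinned[(i + j - 1) // 2] = True
            i = j
        else:
            i += 1
    return thinned


# Cross-section of a small ring: two 2-pixel strokes around a 2-pixel hole.
ideal = [BACKGROUND] * 40
for i in (17, 18, 21, 22):
    ideal[i] = SYMBOL

scanned = blur3(ideal)
binary = [v < THRESHOLD for v in scanned]   # the hole is filled: one solid run
skeleton = thin_runs(binary)
sampled = [v for v, keep in zip(scanned, skeleton) if keep]
print("levels sampled after thinning:", sampled)
```

Here the blurred hole stays below the threshold, so binarization yields a single solid run, and thinning that run lands on the hole rather than on either stroke. The sampled level is therefore lighter than any stroke pixel, which mirrors the undesirable point 2106 described above.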
Since this “crushed phenomenon” occurs for a symbol having a small point, it is apparent that the thinning process is not effective.
The binary image that is the output employed for representing a symbol is used to calculate the representative color for the symbol. However, it is preferable that binarization be performed using a threshold value that optimally represents the symbol, so that no blurring of the symbol occurs. It is further known that, while taking the succeeding OCR process into account, it is better for a binarized symbol to become solid than it is for it to become blurred, since better OCR results can be obtained.
FIG. 22 is a graph showing a typical histogram for the brightness of a symbol area. A point 2201 is a desirable threshold point for obtaining a binary image. However, when binarization is performed at this point 2201, pixels in the transition from the background to the symbol portion are binarized as black dots. This is preferable for the output image, but when the representative color of the symbol is calculated, these pixels constitute noise.
This state is shown in FIG. 23. When binarization is performed at the point 2201 in FIG. 22, this is the equivalent of binarization being performed at a level 2301 in FIG. 23, and the binary image that is obtained also includes many portions 2302 and 2303 that are in the transition from the background to the symbol.
As is described above, since the binary image that is the output employed to represent a symbol is used for the calculation of the representative color of the symbol, it is not possible to calculate an optimal representative color for the symbol portion.
Furthermore, according to the conventional method, only one representative color can be obtained for each symbol area, and a symbol area in which multiple colors appear cannot be handled.