In recent years, the digitization of information has advanced, and demand has grown for storing or transmitting digital documents in place of paper documents. In particular, with the advent of low-cost storage media and increased communication bandwidth, the documents to be digitized are shifting from monochrome binary documents to full-color documents.
Note that document digitization involves more than photoelectrically converting a paper document into image data using a scanner or the like. It also includes segmenting the document into regions of different natures, such as the text, symbols, figures, photos, and tables that form the document, and converting the text, figure, photo, and table portions into text code information, vector data, image data, and structure data, respectively.
The first stage of such a document digitization process is a region segmentation process, i.e., a process that analyzes the contents of a one-page document image and segments the image into partial elements having different natures, such as text, figures, photos, and tables. FIG. 25 shows an example of region segmentation.
As an implementation example of such a region segmentation process, U.S. Pat. No. 5,680,478 “Method and Apparatus for character recognition” (Shin-Ywang et al./Canon K.K.) or the like is known. In this example, sets of 8-coupled contour blocks of black pixels and 4-coupled contour blocks of white pixels are extracted from a document image, and characteristic regions in the document, such as text regions, pictures or figures, tables, frames, and lines, are extracted on the basis of their shapes, sizes, set states, and the like. In the example shown in FIG. 25, characteristic regions of a document such as text regions (blocks 1, 3, 4, and 6), a picture & figure region (block 2), a table region (block 5), and a frame/line (block 7) are extracted.
Note that an 8-coupled contour block of black pixels (to be referred to as a black pixel block hereinafter) is a set of black pixels which are coupled from a given black pixel in one of eight directions, as shown in FIG. 14. Also, a 4-coupled contour block of white pixels (to be referred to as a white pixel block hereinafter) is a set of white pixels which are coupled from a given white pixel in one of four directions, as shown in FIG. 16.
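The two coupling rules defined above can be illustrated with a short sketch. It is not the implementation of the cited patent; the function name `pixel_blocks`, the direction tables, and the convention that 1 denotes a black pixel and 0 a white pixel are all assumptions made for illustration. The sketch collects each block by a breadth-first search that follows only the permitted directions.

```python
from collections import deque

# Neighbour offsets as (row, column). Names are illustrative, not from the source.
FOUR_DIRECTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]
EIGHT_DIRECTIONS = FOUR_DIRECTIONS + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def pixel_blocks(image, value, directions):
    """Return the coupled blocks of pixels equal to `value`.

    Each block is a list of (row, column) positions reachable from one
    another by steps in the given directions.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    blocks = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != value or seen[r][c]:
                continue
            # Start a new block and grow it breadth-first.
            seen[r][c] = True
            queue = deque([(r, c)])
            block = []
            while queue:
                y, x = queue.popleft()
                block.append((y, x))
                for dy, dx in directions:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] == value):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blocks.append(block)
    return blocks


# Assumed convention: 1 = black pixel, 0 = white pixel.
image = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
black_blocks = pixel_blocks(image, 1, EIGHT_DIRECTIONS)  # diagonal pixels couple
white_blocks = pixel_blocks(image, 0, FOUR_DIRECTIONS)   # diagonal steps not allowed
```

In this small image the three diagonal black pixels form one black pixel block under the eight-direction rule, while the white pixels on either side of the diagonal form two separate white pixel blocks under the four-direction rule.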
The aforementioned region segmentation process presupposes, by its operating principle, that the input document image is a monochrome binary image. Therefore, to perform region segmentation of a color document using this technique, the document image must first be converted into a binary image. In general, a color image is converted into a binary image by calculating a threshold value from the pixel luminance distribution and converting each pixel of the image into a white or black pixel according to which side of this luminance threshold value it falls on.
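The general binarization described above can be sketched as follows. The source does not name a luminance formula or a threshold criterion, so both choices here are assumptions: luminance is approximated with the Rec. 601 luma weights, and the threshold is chosen by Otsu's method, which picks the level that maximizes the between-class variance of the luminance histogram. The function names are hypothetical.

```python
def luminance(rgb):
    """Approximate luminance of an (R, G, B) pixel with 8-bit channels."""
    r, g, b = rgb
    # Rec. 601 luma weights; an assumption, the source names no formula.
    return min(255, int(round(0.299 * r + 0.587 * g + 0.114 * b)))


def otsu_threshold(values):
    """Return the 0..255 level that maximizes between-class variance."""
    hist = [0] * 256
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # sum of the luminance values of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # no pixels in the dark class yet
        if w1 == 0:
            break     # no pixels left for the light class
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def binarize(color_image):
    """Return (binary image, threshold); 1 = black pixel, 0 = white pixel."""
    lum = [[luminance(p) for p in row] for row in color_image]
    t = otsu_threshold([v for row in lum for v in row])
    return [[1 if v <= t else 0 for v in row] for row in lum], t


DARK, LIGHT = (20, 20, 20), (230, 230, 230)
page = [
    [LIGHT, DARK, LIGHT],
    [LIGHT, DARK, LIGHT],
]
binary, threshold = binarize(page)  # binary is [[0, 1, 0], [0, 1, 0]]
```

A single threshold of this kind applies to the whole image, which is why it cannot handle a page that mixes dark-on-light and light-on-dark text.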
Methods of calculating the threshold value used to binarize a color image include calculating a single threshold value common to the entire image and calculating separate threshold values for respective regions. In the binarization method proposed in Japanese Patent Application No. 11-238581 by the present applicant, an optimal threshold value is dynamically calculated for each region in accordance with the contents of the input document and is used to attain optimal binarization for each region. In particular, for a color document that includes both high-luminance characters on a low-luminance background and low-luminance characters on a high-luminance background, this method can automatically convert all characters into black characters on a white background, yielding a binary image that is well suited as input to the region segmentation process.
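The effect of per-region thresholding with polarity normalization can be illustrated by the sketch below. It is not the method of Japanese Patent Application No. 11-238581, whose details are not given here. It assumes that the regions are already known as rectangles, uses the mean luminance of each region as that region's threshold, and treats the majority class as background; all three are simplifying assumptions, and the function names are hypothetical.

```python
def binarize_region(lum, top, left, bottom, right):
    """Binarize the rectangle lum[top:bottom][left:right].

    Returns rows of 0 (white) and 1 (black), with the minority class
    (assumed to be the characters) always mapped to black.
    """
    pixels = [lum[y][x] for y in range(top, bottom) for x in range(left, right)]
    # Assumption: the regional mean serves as the regional threshold.
    threshold = sum(pixels) / len(pixels)
    dark_count = sum(1 for p in pixels if p <= threshold)
    # Assumption: the background is whichever class holds the majority.
    background_is_dark = dark_count > len(pixels) / 2
    out = []
    for y in range(top, bottom):
        row = []
        for x in range(left, right):
            is_dark = lum[y][x] <= threshold
            # A pixel is text when its class differs from the background class.
            row.append(int(is_dark != background_is_dark))
        out.append(row)
    return out


def binarize_by_region(lum, regions):
    """Binarize each (top, left, bottom, right) region independently."""
    result = [[0] * len(row) for row in lum]
    for top, left, bottom, right in regions:
        patch = binarize_region(lum, top, left, bottom, right)
        for dy, patch_row in enumerate(patch):
            result[top + dy][left:right] = patch_row
    return result


# Luminance values: dark text on a light background in the first row,
# light text on a dark background in the second row.
lum = [
    [220, 30, 220, 220],
    [40, 240, 40, 40],
]
binary = binarize_by_region(lum, [(0, 0, 1, 4), (1, 0, 2, 4)])
# binary is [[0, 1, 0, 0], [0, 1, 0, 0]]
```

Both rows come out as black characters on a white background, so the resulting binary image no longer records that the second row had a dark background.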
FIG. 24 shows region segmentation of a document that includes a colored background, using the previously proposed binarization method. Referring to FIG. 24, a color document 2301 includes a dark-colored background region in its lower half, on which light-colored characters are printed, while dark-colored characters are printed on the light-colored background of the remaining portion. As can be seen from FIG. 24, the upper and lower halves of such a document have separate meanings.
When a color document such as the document 2301 is binarized by the aforementioned binarization method, a binary image 2302 in FIG. 24 is generated. In the binary image 2302, the background color is removed and expressed by white pixels, and all characters are expressed by black pixels. When the binary image 2302 then undergoes the conventional region segmentation process, a result 2303 shown in FIG. 24 is obtained. In this case, since the information on the region with the colored background in the lower half of the image has been lost, TEXT1 and TEXT2 remain coupled, although each should be separated into two regions at its center.
That is, the information that delimits the extent of a text region by means of its background color, which the color image originally contains, is lost upon binarization.