1. Field of the Invention
The present invention relates to an image processing apparatus that subjects image data to image processing.
2. Description of the Related Art
With the development of digital technology, an increasing number of documents have been digitized, and the management of these digitized documents has become an important problem.
Under these circumstances, image data is divided into regions by image region discrimination or layout analysis, and each divided region is subjected to image processing, whereby character information is detected. In most cases, however, the image region discrimination information itself has not been used effectively.
Jpn. Pat. Appln. KOKAI Publication No. 4-160981 (Document 1), for instance, discloses that at least two regions, that is, a character region and a gray-scale region of an original image, are separated from image data, and the respective regions are individually subjected to image processing.
Jpn. Pat. Appln. KOKAI Publication No. 5-225378 (Document 2) discloses that an input document is segmented into blocks, and each block is classified into a photo part, a character part or a background part by a threshold-value process. Neighboring blocks that are classified into the same kind are integrated into an independent region.
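The block-based approach described for Document 2 can be illustrated with a minimal sketch. The text above does not give the features or threshold values used in Document 2, so everything specific here is an assumption made for illustration: the gray-level range and "extreme pixel" fraction used as features, the threshold values, and the function names classify_block, classify_blocks and merge_neighbors. Neighboring blocks are taken to mean 4-connected blocks.

```python
from collections import deque


def classify_block(pixels, flat_range=16, extreme_fraction=0.9):
    """Label one block of gray values (0-255) with two hypothetical thresholds."""
    # Nearly uniform gray level: treat as background (assumed rule).
    if max(pixels) - min(pixels) < flat_range:
        return "background"
    # Mostly very dark or very bright pixels: treat as character (assumed rule).
    extreme = sum(1 for p in pixels if p <= 64 or p >= 192)
    if extreme / len(pixels) >= extreme_fraction:
        return "character"
    # Everything else, e.g. smooth mid-gray variation, is treated as photo.
    return "photo"


def classify_blocks(image, block_size):
    """Split a 2D list of gray values into blocks and label each block."""
    rows, cols = len(image), len(image[0])
    grid = []
    for top in range(0, rows, block_size):
        row_labels = []
        for left in range(0, cols, block_size):
            pixels = [
                image[y][x]
                for y in range(top, min(top + block_size, rows))
                for x in range(left, min(left + block_size, cols))
            ]
            row_labels.append(classify_block(pixels))
        grid.append(row_labels)
    return grid


def merge_neighbors(grid):
    """Integrate 4-connected blocks that share a label into regions."""
    height, width = len(grid), len(grid[0])
    seen = set()
    regions = []
    for r in range(height):
        for c in range(width):
            if (r, c) in seen:
                continue
            label = grid[r][c]
            seen.add((r, c))
            queue = deque([(r, c)])
            members = []
            while queue:
                y, x = queue.popleft()
                members.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and (ny, nx) not in seen and grid[ny][nx] == label:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            regions.append((label, sorted(members)))
    return regions
```

In this sketch, classify_blocks(image, block_size) yields a grid of labels and merge_neighbors(grid) returns a list of (label, block coordinates) pairs. The granularity of the result is fixed by block_size, and two blocks of the same kind that do not touch remain separate regions.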
Jpn. Pat. Appln. KOKAI Publication No. 2000-20726 (Document 3) discloses that a character string region is extracted from a character region extraction section and a specific region extraction section for, e.g. a photo or a figure/table.
In Document 1, image data is divided into regions and an image of each region is subjected to image processing. However, the layout information obtained when the image data is divided into regions is not used effectively.
In Document 2, an input document is divided into a plurality of blocks, and each block is classified into a photo part, a character part, a background part, etc. However, there is a problem relating to the block size. In addition, there is the problem that only neighboring blocks can be integrated.
In Document 3, a character string region is extracted from the character region extraction section and the specific region extraction section for, e.g., a photo or a figure/table. However, Document 3 is silent on other effective uses of the extracted region information.