1. Field of the Invention
The present invention relates to an image processing device, an image processing method, and an image processing program, and more specifically to obtaining specific document data by extracting the necessary document blocks from image data obtained by reading a document such as a newspaper or magazine.
2. Description of the Related Art
It is sometimes desirable to extract the data of only a specific document from a document that occupies a full page surface, such as a newspaper or magazine.
For example, Japanese Laid-Open Patent Application No. H9-204511 proposes a device that reads a document such as a newspaper or magazine to obtain image data, and extracts the character images of headlines from that image data. The device then records the character code data of each headline, obtained by applying a character recognition process to the extracted character images, in association with the character image data of the body text corresponding to that headline.
Although the device disclosed in this publication can obtain the character image data of the corresponding body text when the character code data of a headline is specified, it uses the character image data of the body text directly, in the layout of the original document. As a result, the document block (document region) in which the obtained document data appears has an irregular shape, which makes the obtained document data difficult to read. Moreover, when the obtained document data are pasted into a region of definite form, the irregular shape leaves much white space, which is inefficient.
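The white-space problem described above can be illustrated with a small sketch. This is not taken from the cited publication; it assumes, purely for illustration, that an irregular document block is approximated by non-overlapping rectangles given as (left, top, width, height) in pixels, and that the region of definite form is the smallest upright rectangle enclosing the block. The names `Rect`, `bounding_box`, and `white_space_ratio` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation of one rectangular piece of a document block:
# (left, top, width, height), all in pixels.
Rect = Tuple[int, int, int, int]


def bounding_box(rects: List[Rect]) -> Rect:
    """Return the smallest upright rectangle that encloses every piece."""
    left = min(x for x, _, _, _ in rects)
    top = min(y for _, y, _, _ in rects)
    right = max(x + w for x, _, w, _ in rects)
    bottom = max(y + h for _, y, _, h in rects)
    return (left, top, right - left, bottom - top)


def white_space_ratio(rects: List[Rect]) -> float:
    """Fraction of the enclosing rectangle not covered by the block.

    Assumes the pieces do not overlap, so their areas can simply be summed.
    """
    _, _, box_w, box_h = bounding_box(rects)
    used_area = sum(w * h for _, _, w, h in rects)
    return 1.0 - used_area / (box_w * box_h)


# An L-shaped block: a tall column plus a short strip extending to its right.
l_shaped = [(0, 0, 100, 300), (100, 200, 200, 100)]

# A regular block: a single rectangular column.
regular = [(0, 0, 100, 300)]

if __name__ == "__main__":
    print(f"L-shaped block white space: {white_space_ratio(l_shaped):.2%}")
    print(f"Regular block white space:  {white_space_ratio(regular):.2%}")
```

Under these assumptions, the L-shaped block leaves roughly 44% of its enclosing rectangle empty, whereas the rectangular block leaves none, which is the inefficiency noted above for irregularly shaped document blocks.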