This invention is applicable to electronic equipment such as an OCR (optical character reader), copier, facsimile machine or processor for implementing an electronic database and, more particularly, relates to an image processing apparatus and method for extracting a specific desired area from a document image.
Two methods of extracting a desired area from a document are available. In the first method, the operator designates the desired area in the input image every time an area is to be extracted. This method involves reading the document image using a scanner, displaying the scanned image on a display monitor and having the operator designate the desired area using a mouse or the like.
The second method involves creating a template in which the positions and sizes of rectangular areas have been decided in advance, applying the rectangular areas decided by the template directly to an input image and then extracting these areas from the input image. In this case, rectangular areas whose positions and sizes have been decided by the template are extracted from a scanned document image, and the operator need no longer perform the laborious task of specifying extraction areas one after another.
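By way of illustration only, the second method may be sketched as follows. The sketch assumes that the template is a list of rectangles given as (x, y, width, height) in pixels and that the image is a list of pixel rows; the names `Rect`, `crop` and `extract_by_template` are hypothetical and do not appear in the description.

```python
from typing import List, NamedTuple


class Rect(NamedTuple):
    """A rectangular area fixed in advance by the template (assumed layout)."""
    x: int
    y: int
    width: int
    height: int


Image = List[List[int]]  # assumed representation: rows of pixel values


def crop(image: Image, rect: Rect) -> Image:
    """Cut the pixels inside rect out of the image."""
    rows = image[rect.y:rect.y + rect.height]
    return [row[rect.x:rect.x + rect.width] for row in rows]


def extract_by_template(image: Image, template: List[Rect]) -> List[Image]:
    """Apply every template rectangle directly to the input image."""
    return [crop(image, rect) for rect in template]
```

Because the rectangles are applied as they are, the result depends entirely on the document content lying exactly where the template expects it.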
The first method is disadvantageous in that the operator must specify the desired area each time. This method, therefore, is not suited to the processing of a large number of documents. The second method using the template is disadvantageous in that if there is a disparity in position or size between an area to be extracted from the input image and the rectangular area decided by the template, all or part of the area to be extracted may be omitted in the extraction process.
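The effect of such a disparity can be illustrated numerically. The following is a minimal sketch under assumed conditions: areas are axis-aligned rectangles given as (x, y, width, height), and the helper names `intersection_area` and `covered_fraction` are hypothetical.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x: int
    y: int
    width: int
    height: int


def intersection_area(a: Rect, b: Rect) -> int:
    """Area shared by two axis-aligned rectangles (0 if they do not meet)."""
    w = min(a.x + a.width, b.x + b.width) - max(a.x, b.x)
    h = min(a.y + a.height, b.y + b.height) - max(a.y, b.y)
    return max(w, 0) * max(h, 0)


def covered_fraction(template_area: Rect, actual_area: Rect) -> float:
    """Fraction of the actual area that the template rectangle captures."""
    total = actual_area.width * actual_area.height
    return intersection_area(template_area, actual_area) / total


# Illustrative values only: the actual area is shifted 20 pixels right
# and 30 pixels down relative to the template rectangle.
template_area = Rect(100, 100, 200, 50)
actual_area = Rect(120, 130, 200, 50)
print(covered_fraction(template_area, actual_area))  # 0.36
```

In this assumed case, only 36 percent of the area to be extracted falls inside the template rectangle, and the remainder is omitted.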
The present invention has been devised in view of the foregoing problems and its object is to provide an image processing apparatus and method whereby it is possible to extract a desired area from a document image in reliable fashion.
Another object of the present invention is to make possible the rapid and reliable extraction of a desired area from a large quantity of document images.
A further object of the present invention is to provide an image processing apparatus and method whereby it is possible to reliably extract a desired area from an entered document image while employing a template.
An image processing apparatus according to one mode of the present invention for attaining the foregoing objects comprises: holding means for holding position, size and attribute as template information in regard to one or a plurality of areas in an image; image input means for inputting a document image; first extraction means for extracting block areas from the document image input by the image input means and evaluating attributes of the extracted block areas; and second extraction means for extracting, from block areas that have been extracted by the first extraction means, a block area that at least partially overlaps an area indicated by the template information and whose attribute agrees with the attribute included in the template information.
An image processing method according to another mode of the present invention for attaining the foregoing objects comprises: a holding step of holding position, size and attribute as template information in regard to one or a plurality of areas in an image; an image input step of inputting a document image; a first extraction step of extracting block areas from the document image input at the image input step and evaluating attributes of the extracted block areas; and a second extraction step of extracting, from block areas that have been extracted at said first extraction step, a block area that at least partially overlaps an area indicated by the template information and whose attribute agrees with the attribute included in the template information.
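The selection performed at the second extraction step may be sketched as follows. This is an illustrative sketch only, not the claimed implementation: it assumes that the block areas and their attributes have already been obtained by the first extraction step, that all areas are axis-aligned rectangles, and that attributes are plain strings such as "text" or "picture". The names `Area`, `overlaps` and `select_blocks` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Area:
    """Position, size and attribute of a template area or a block area."""
    x: int
    y: int
    width: int
    height: int
    attribute: str  # assumed labels, e.g. "text", "table", "picture"


def overlaps(a: Area, b: Area) -> bool:
    """True if the two rectangles share at least some area."""
    return (a.x < b.x + b.width and b.x < a.x + a.width and
            a.y < b.y + b.height and b.y < a.y + a.height)


def select_blocks(blocks: List[Area], template: List[Area]) -> List[Area]:
    """Keep each block that at least partially overlaps a template area
    and whose attribute agrees with that template area's attribute."""
    return [
        block for block in blocks
        if any(overlaps(block, t) and block.attribute == t.attribute
               for t in template)
    ]
```

Under these assumptions, a text block that is displaced from the template rectangle but still partially overlaps it is extracted whole, whereas an overlapping block of a different attribute, or a block lying entirely outside the template area, is not.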
Other features and advantages of the present invention will be apparent from the following description taken in conjunction with the accompanying drawings, in which like reference characters designate the same or similar parts throughout the figures thereof.