1. Field of the Invention
The present invention relates to a document processing apparatus and method for reading a document (original) and either copying its image as a hard copy or converting the image into image data and storing the image data.
2. Description of the Related Art
Two techniques are known: one reads a document (original) as image data and outputs a hard copy of the image data, and the other stores the image data as an image file. The former is used in the image readers of copy machines and personal computers. The latter is used in filing devices and databases.
When such a conventional image processing apparatus reads a document as image data and copies the image data as a hard copy, or converts a document image into image data and stores the image data, the apparatus generally treats one page (one sheet of paper) of the original, in a regular size, as the unit of processing.
More specifically, a document contains various kinds of data, such as titles, text, illustrations, drawings, and photographs, laid out on the sheet surface. However, when all images on the document are processed together, whether to copy the document or to convert it into image data and store it, the necessary and unnecessary portions of the document cannot be separated.
In addition, conventional methods of recognizing the layout of a document are used only to search for text regions. Although a text region can be detected, it is impossible to separate the necessary and unnecessary portions and to output or store only the necessary portions as a document image.