1. Field of the Invention
This invention relates to a document image processing device and a document image processing method for correcting the location of a document image, and to a computer-readable memory medium storing a program that causes a computer to perform such processing.
2. Description of the Related Art
Systems have come into wide use in which a paper document is converted into an electronic document by a scanner or the like, the electronic document is stored and managed in any of various image file formats, and the stored document is reproduced on a display device or on an output device such as a printer. In some cases, the document image obtained by scanning a paper document is displaced from its proper location, for example because of how the paper document was set on the scanner or, where a document feeding type scanner is used, because of skew during feeding.
In a system that stores and manages electronic documents converted from paper documents as described above, it is desirable that each document image be stored and managed in the best condition. Accordingly, various methods have been proposed for correcting the locational deviation of a scanned document image so as to align the locations of document images.
For example, as a conventional technique for correcting the locational deviation of an image, Japanese Published Unexamined Patent Application No. Hei 11-120288 discloses a method for documents that include a table with ruled lines at a position serving as the reference: the positions of the vertical and horizontal lines of the table are extracted from the run lengths of black pixels, and the locational deviation is detected from those positions. However, this method requires that the document include a table with ruled lines; it cannot detect or correct the locational deviation of a document that has no table.
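The run-length approach described above can be illustrated with a minimal sketch. This is not the implementation disclosed in the cited application; it assumes a binary image held as a list of rows (1 for a black pixel, 0 for white), and the names `longest_black_run`, `find_ruled_lines`, `deviation_from_reference`, and the parameter `min_run` are hypothetical. A row or column is treated as a ruled line when it contains a sufficiently long run of black pixels, and the deviation is taken as the offset of the first detected line from a reference position.

```python
from typing import List, Optional, Sequence, Tuple

# Assumed convention for this sketch: 1 = black pixel, 0 = white pixel.
Image = List[List[int]]


def longest_black_run(pixels: Sequence[int]) -> int:
    """Return the length of the longest run of consecutive black pixels."""
    best = current = 0
    for p in pixels:
        current = current + 1 if p else 0
        best = max(best, current)
    return best


def find_ruled_lines(image: Image, min_run: int) -> Tuple[List[int], List[int]]:
    """Return (row indices, column indices) whose longest black run >= min_run."""
    rows = [y for y, row in enumerate(image)
            if longest_black_run(row) >= min_run]
    columns = [x for x, col in enumerate(zip(*image))
               if longest_black_run(col) >= min_run]
    return rows, columns


def deviation_from_reference(image: Image, min_run: int,
                             ref_row: int, ref_col: int
                             ) -> Optional[Tuple[int, int]]:
    """Return (dx, dy) of the first ruled lines relative to the reference.

    Returns None when no horizontal or vertical ruled line is found, which
    mirrors the limitation that the method needs a table with ruled lines.
    """
    rows, columns = find_ruled_lines(image, min_run)
    if not rows or not columns:
        return None
    return columns[0] - ref_col, rows[0] - ref_row


# Hypothetical 6x6 page: a table corner with its top edge on row 2
# and its left edge on column 1.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
]
print(deviation_from_reference(page, min_run=4, ref_row=1, ref_col=0))
```

A page with no ruled lines yields `None` here, which corresponds to the case in which this kind of method cannot be applied.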
As a further example, Japanese Published Unexamined Patent Application No. Hei 11-282959 discloses a method in which the coordinates at which a character string should be located in a document of a predetermined format are stored in advance as a dictionary, the position of the character string is detected from the input document image by the pixel projection method, and the deviation is detected from the difference between the coordinate value in the dictionary and the coordinate value detected by the pixel projection method. However, this method requires a large amount of memory because the document image data must be multi-gradation data. It is also applicable only to stylized documents in which the positions of characters and character strings are specified in advance; for any other document it cannot detect or correct the locational deviation. Consequently, this method cannot be used in applications where documents of differing formats are stored and managed. Furthermore, the correction processing is interrupted when the character string is not detected, and no consideration is given to the subsequent processing.
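The projection-based comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the method of the cited application: the image is a list of rows of ink values (0 meaning no ink), the character string is located simply as the first row and first column whose projected ink exceeds a threshold, and the names `projection`, `first_band_start`, `detect_deviation`, and `expected_xy` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# Assumed convention for this sketch: each pixel is an ink value, 0 = no ink.
Image = List[List[int]]


def projection(image: Image, axis: str) -> List[int]:
    """Sum ink values along each row ("rows") or each column ("columns")."""
    if axis == "rows":
        return [sum(row) for row in image]
    if axis == "columns":
        return [sum(col) for col in zip(*image)]
    raise ValueError("axis must be 'rows' or 'columns'")


def first_band_start(profile: Sequence[int], threshold: int = 0) -> Optional[int]:
    """Return the first index whose projected ink exceeds the threshold."""
    for index, value in enumerate(profile):
        if value > threshold:
            return index
    return None


def detect_deviation(image: Image,
                     expected_xy: Tuple[int, int]
                     ) -> Optional[Tuple[int, int]]:
    """Return (dx, dy) between the detected string origin and the stored one.

    expected_xy plays the role of the dictionary coordinate. None is returned
    when no character string is detected, the case in which the conventional
    processing is said to be interrupted.
    """
    x = first_band_start(projection(image, "columns"))
    y = first_band_start(projection(image, "rows"))
    if x is None or y is None:
        return None
    return x - expected_xy[0], y - expected_xy[1]


# Hypothetical 5x6 image: a block of ink occupying rows 2-3, columns 3-5.
scan = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
print(detect_deviation(scan, expected_xy=(2, 1)))
```

Because the sketch depends on a stored `expected_xy` for each format, it shares the limitation noted above: a document whose layout is not registered in advance gives it nothing to compare against.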