1. Field of the Invention
The present invention relates to a document image processing method and system which extract text regions comprising a plurality of regions, detect the construction of the extracted text regions, and determine the reading order in which the plurality of regions in the text regions are to be read.
2. Prior Art
An OCR (optical character recognition) process or document database production process needs a preliminary process to be performed prior to the relevant character reading (recognition) process. In the preliminary process, the reading order in which the plurality of text regions (image regions each containing characters) of an input document image are to be read is determined.
Two systems, (1) and (2), for obtaining such a reading order are now described.
(1) Japanese Laid-Open Patent Application No. 3-269689 discloses an example of a document reading-in system for facilitating such a reading-order determination operation. The system produces an initial state including point coordinates respectively representing a plurality of regions of an input document image. Determination means then exchanges the positions of adjacent point coordinates with one another as appropriate, so that the regions are arranged according to the above reading order. Further, the determination of the reading order is facilitated by adding a non-text region, such as one consisting of a ruled line, to the input document image before the above initial state is obtained.
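The adjacent-exchange ordering described for system (1) can be illustrated by the following minimal sketch. It is not taken from the cited application: the `RegionPoint` type, the function names, and in particular the comparison rule `right_before_left` (right block first, upper block as tie-break, as would suit vertical text lines) are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class RegionPoint:
    """Hypothetical representative point of one region of the document image."""
    name: str
    x: float  # horizontal position, increasing to the right
    y: float  # vertical position, increasing downward


def right_before_left(a: RegionPoint, b: RegionPoint) -> bool:
    """Assumed rule for vertical text lines: True if a is to be read before b."""
    if a.x != b.x:
        return a.x > b.x  # the block further to the right is read first
    return a.y < b.y      # tie-break: the upper block is read first


def order_by_adjacent_exchange(
    points: List[RegionPoint],
    precedes: Callable[[RegionPoint, RegionPoint], bool],
) -> List[RegionPoint]:
    """Starting from the initial state, exchange adjacent points until no
    adjacent pair is out of order (a plain bubble sort)."""
    ordered = list(points)  # copy, so the initial state is left untouched
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(ordered) - 1):
            if precedes(ordered[i + 1], ordered[i]):
                ordered[i], ordered[i + 1] = ordered[i + 1], ordered[i]
                swapped = True
    return ordered


# Hypothetical initial state: three blocks on the same horizontal band.
initial_state = [
    RegionPoint("left", 10, 0),
    RegionPoint("right", 90, 0),
    RegionPoint("middle", 50, 0),
]
result = [p.name for p in order_by_adjacent_exchange(initial_state, right_before_left)]
```

Because only adjacent points are compared, the result depends entirely on the comparison rule being meaningful for every adjacent pair in the sequence.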
This system depends on the initial state and compares the positions of adjacent points defining the regions so as to detect the proper reading order. Suppose that the input document image contains a region consisting of a title of the document and a region consisting of the ordinary (body) text subsequent to the title, and that the title region and the body-text region are not adjacent but apart, or that the positional relationship between them does not allow the above determination means to determine their order. In this case, the order relationship between the title region and the body-text region cannot be determined. Further, the system handles a non-text region in a manner similar to that for text regions, and does not provide for the various states of an input document image in which figures, or ruled lines perpendicular to the text-line direction in the image, are present.
(2) Japanese Laid-Open Patent Application No. 1-183784 discloses a document image processing system for extracting the columns in an input document image in accordance with a proper reading order. For this purpose, this system produces a tree graph including nodes respectively representing the columns and then detects the logical construction of the columns using the tree graph.
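As a rough illustration of how a tree graph of columns can yield a reading order, the sketch below performs a pre-order traversal. The `ColumnNode` type, the convention that children are stored in the order in which they are to be read, and the example layout are assumptions made for illustration; the cited application may construct and traverse its tree graph differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ColumnNode:
    """Hypothetical tree-graph node representing one column."""
    label: str
    # Assumption: children are already stored in the order they are to be read.
    children: List["ColumnNode"] = field(default_factory=list)


def reading_order(node: ColumnNode) -> List[str]:
    """Pre-order traversal: a column is read before the columns beneath it."""
    order = [node.label]
    for child in node.children:
        order.extend(reading_order(child))
    return order


# Hypothetical layout: one wide column above two side-by-side columns.
page = ColumnNode(
    "wide top column",
    [ColumnNode("lower left column"), ColumnNode("lower right column")],
)
columns_in_order = reading_order(page)
```

Such a traversal presupposes that every text region can be assigned to a column node in the first place.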
Since this system relies on the column arrangement of the input image, some states of the input image may not allow the system to determine the columns to which the nodes of the tree graph are to correspond. Such states include those in which no clear columns are found in the input image, and those in which an irregular column arrangement appears, for example one in which the upper half consists of two columns while the lower half consists of three columns. Further, this system likewise does not provide for the various states of an input document image in which figures, or ruled lines perpendicular to the text-line direction in the image, are present.
Further, since the above system (1) is intended for vertical text-line documents, its reading order places the right block first and the left block second for adjacent blocks. On the other hand, since the above system (2) is intended for horizontal text-line documents, its node order is determined so that the lower region is subsequent to the upper region for two vertically adjacent regions. That is, each system provides for only one of the two text-line directions. Further, neither system provides for input document images including text regions having attributes different from that of the in-order reading regions (regions to be read successively), such as figures/tables, titles, headers (text regions located at the head of a page and apart from the body part of the page), footers (text regions located at the foot of a page and apart from the body part of the page), or the like.