The present invention generally relates to table region identification methods, and more particularly to a method for identifying a table region within an image obtained by a character recognition apparatus from an original document.
A document generally contains not only a character region but also image regions such as a drawing region, a photograph region, or a table region. When such a document is processed by a character recognition device, different techniques must be applied to the different regions. The character region is usually handled by a character recognition technique. A drawing or photograph region is handled either by storing its data as image data or by reading that image data as pseudo two-level picture data through the dither technique and storing the result. A table region is conventionally handled by recognizing the ruled lines of the table as a frame and performing character recognition on the image within the frame. A ruled line herein refers to a vertical or horizontal border or straight line, such as a line between columns or between rows of a table or list on a document. The frame described above is made up of horizontal ruled lines and vertical ruled lines.
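The dither technique mentioned above can be illustrated with a minimal sketch. It assumes ordered dithering with a 2x2 threshold matrix and 8-bit grayscale input; the source names neither a specific matrix nor a bit depth, and the names `BAYER_2X2` and `ordered_dither` are hypothetical.

```python
# Illustrative 2x2 threshold matrix (an assumption; the source specifies none).
BAYER_2X2 = [[0, 2],
             [3, 1]]


def ordered_dither(gray, levels=256):
    """Convert grayscale rows (0 = black, levels-1 = white) to two-level data.

    Returns rows of 1 (black) and 0 (white). Each pixel is compared with a
    threshold that depends on its position, so a uniform gray area becomes
    a pattern of black and white dots.
    """
    n = len(BAYER_2X2)
    out = []
    for y, row in enumerate(gray):
        out_row = []
        for x, value in enumerate(row):
            # Position-dependent threshold spread evenly over (0, levels).
            threshold = (BAYER_2X2[y % n][x % n] + 0.5) * levels / (n * n)
            out_row.append(1 if value < threshold else 0)
        out.append(out_row)
    return out


# A uniform mid-gray patch becomes a checkerboard of black and white pixels.
patch = ordered_dither([[128, 128], [128, 128]])
```

Under these assumptions, half of the pixels in a mid-gray patch turn black, which is how two-level data approximates intermediate tones.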
Region identification is a prerequisite for applying these different recognition methods to the different regions of the input data read from a document. Among region identification techniques, a conventional frame pattern identification method may be used to identify a table region within a document. In such a method, however, the procedures for extracting long vertical and horizontal lines (which may constitute a frame pattern) from the document, and for deciding whether they actually make up the above-mentioned frame pattern, are often too complicated to carry out at a practical level, which makes high-speed region identification difficult to achieve.
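The line extraction step of such a conventional method can be sketched as follows. This is only an assumed, simplified illustration: the image is taken to be a list of rows of 0 (white) and 1 (black) pixels, and the helper names `long_runs` and `candidate_ruled_lines` and the parameter `min_len` are hypothetical. The subsequent decision on whether the extracted lines form a frame is not shown.

```python
def long_runs(line, min_len):
    """Return (start, length) for each run of black pixels at least min_len long."""
    runs = []
    start = None
    # A trailing 0 acts as a sentinel so a run touching the edge is closed.
    for i, pixel in enumerate(list(line) + [0]):
        if pixel and start is None:
            start = i
        elif not pixel and start is not None:
            if i - start >= min_len:
                runs.append((start, i - start))
            start = None
    return runs


def candidate_ruled_lines(image, min_len):
    """Collect long horizontal and vertical black runs from a two-level image.

    Horizontal entries are (row, start_column, length);
    vertical entries are (column, start_row, length).
    """
    horizontal = [(y, s, n)
                  for y, row in enumerate(image)
                  for s, n in long_runs(row, min_len)]
    vertical = [(x, s, n)
                for x, col in enumerate(zip(*image))
                for s, n in long_runs(col, min_len)]
    return horizontal, vertical


# A small rectangular frame drawn with black pixels.
image = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
]
horizontal, vertical = candidate_ruled_lines(image, min_len=4)
```

Even this simplified sketch scans every row and every column of the image before any frame decision is made, which hints at why the full procedure is costly.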