The present invention generally relates to a method and apparatus for recognizing a table area formed in a binary image of a document. The present invention is suitably applied to a character recognition apparatus.
A character recognition apparatus processes a binary image of a scanned document. Such a binary image can be classified into, for example, a character area, a photograph/graphics area and a table area. In some character recognition apparatuses, different image processes are executed for different types of areas. For example, a process for a table area includes the steps of segmenting the image within a table into parts by using the coordinates of ruled lines, and recognizing the characters included in each part. Various character recognition methods have been proposed. For example, Japanese Laid-Open Patent Application No. 57-104363 discloses a pattern recognizing method in which each pixel unit in a binary image is scanned and a line extending in a main scanning direction or a sub scanning direction is detected.
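The following is a minimal sketch of the general idea described above: scanning a binary image in the main and sub scanning directions to find long runs of black pixels, and then using the coordinates of the detected lines to segment a table into parts. It is not the method of the cited Japanese application. The function names, the run-length criterion `min_len`, and the assumptions that ruled lines are one pixel thick and span the whole table are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed encoding: 1 = black pixel, 0 = white pixel


def find_runs(pixels: List[int], min_len: int) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs of black runs at least min_len long."""
    runs = []
    start = None
    # A trailing white sentinel pixel closes a run that reaches the edge.
    for i, pixel in enumerate(pixels + [0]):
        if pixel and start is None:
            start = i
        elif not pixel and start is not None:
            if i - start >= min_len:
                runs.append((start, i - 1))
            start = None
    return runs


def horizontal_lines(image: Image, min_len: int) -> List[Tuple[int, int, int]]:
    """Scan each row (main scanning direction); return (y, x_start, x_end)."""
    return [(y, s, e) for y, row in enumerate(image)
            for s, e in find_runs(row, min_len)]


def vertical_lines(image: Image, min_len: int) -> List[Tuple[int, int, int]]:
    """Scan each column (sub scanning direction); return (x, y_start, y_end)."""
    columns = [list(col) for col in zip(*image)]
    return [(x, s, e) for x, col in enumerate(columns)
            for s, e in find_runs(col, min_len)]


def cells_from_lines(h_lines, v_lines) -> List[Tuple[int, int, int, int]]:
    """Segment the table into parts lying between adjacent ruled lines.

    Assumes one-pixel-thick lines that span the whole table.
    Each part is returned as (left, top, right, bottom), lines excluded.
    """
    ys = sorted({y for y, _, _ in h_lines})
    xs = sorted({x for x, _, _ in v_lines})
    return [(xs[i] + 1, ys[j] + 1, xs[i + 1] - 1, ys[j + 1] - 1)
            for j in range(len(ys) - 1)
            for i in range(len(xs) - 1)]


# A small hypothetical table with two parts side by side.
SAMPLE: Image = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
]

h = horizontal_lines(SAMPLE, min_len=5)
v = vertical_lines(SAMPLE, min_len=4)
parts = cells_from_lines(h, v)
```

In this sketch, each resulting part would then be passed to a character recognition step. A practical apparatus would additionally have to tolerate skew, noise and line thickness, which are ignored here.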
However, the method proposed in the above-mentioned Japanese application is directed to a table made up of frames, each of which is surrounded by ruled lines on all of its sides. In practice, there are many tables which do not have ruled lines on all sides. For example, some tables have no ruled lines on either of their two sides. The method proposed in the above Japanese application cannot extract the frames of such a table, with the result that such tables are not handled as table areas.
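The limitation can be illustrated with a simplified, hypothetical model of frame extraction. The sketch below is not the method of the cited application. It assumes that a frame is accepted only when it is bounded by ruled lines on all four sides, that every ruled line spans the whole table, and that the two missing sides are the left and right sides. The function name `closed_frames` and the coordinates are invented for the example.

```python
from typing import List, Tuple


def closed_frames(h_ys: List[int], v_xs: List[int]) -> List[Tuple[int, int, int, int]]:
    """Return frames (left, top, right, bottom) bounded by ruled lines on all four sides.

    h_ys: y coordinates of horizontal ruled lines.
    v_xs: x coordinates of vertical ruled lines.
    """
    ys = sorted(set(h_ys))
    xs = sorted(set(v_xs))
    return [(xs[i], ys[j], xs[i + 1], ys[j + 1])
            for j in range(len(ys) - 1)
            for i in range(len(xs) - 1)]


# Fully ruled table: three horizontal and three vertical ruled lines.
full = closed_frames([0, 10, 20], [0, 30, 60])

# The same table without ruled lines on its left and right sides:
# only the middle vertical line remains, so no frame is closed.
open_sided = closed_frames([0, 10, 20], [30])
```

Under these assumptions the fully ruled table yields four frames, while the open-sided table yields none, even though both carry the same two-by-two arrangement of entries.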