1. Field of the Invention
The present invention relates to a method and system for image segmentation applicable to a document image processing system such as an Optical Character Recognition (OCR) system. The invention further relates to a method and system able to discriminate both a regular and an irregular form area of a document image having a polygonal or complex structure. The invention still further relates to a method and system for determining whether characters are interior to or exterior to a form.
2. Discussion of the Background
As technology improves, computers and image processing and image forming systems become less expensive, and therefore more popular and more accessible to the general public. This increasing popularity in turn drives further improvement of image processing systems. An example of an improved image processing system is disclosed in U.S. Pat. No. 5,335,290 issued to Cullen et al., which is incorporated herein by reference. Cullen et al. sets forth a technique for segmenting a document image into areas which contain text and areas which do not contain text. That system demonstrates features used in image processing systems, such as compression of a bit-mapped image and construction of a rectangle in order to process images.
In addition to the classification of a document image into a text area and a non-text area, Japanese Laid-Open Patent Application 7-37036, published Feb. 7, 1995, discloses an image segmentation function which uses an image analysis technique to classify a text area and a picture area of a document image. JP 7-37036 discloses the use of a circumscribing rectangle and of a provisional ruled line which is extracted from the document image. The present invention has grown out of the system and process disclosed in JP 7-37036 and relies on some of the techniques disclosed in that application. Specifically, portions of the present invention, including at least parts of FIGS. 2, 3A, 3B, 4A, 4B, 5A-5C, 6A-6F, 14 and 15 of the present application, are based on techniques set forth in JP 7-37036. However, JP 7-37036 is directed to the processing of a regular form image having a rectangular construction, and difficulties would arise if the technique of JP 7-37036 were applied to irregularly shaped forms.