Field of the Invention
This invention relates to a method for handwritten text recognition, and in particular, to a method of segmenting lines and words from a handwritten text image.
Description of Related Art
Handwriting recognition plays an important role in the field of artificial intelligence. It refers to the ability of a computer to receive and interpret intelligible handwritten input from sources such as paper documents, photographs, touch screens and other devices. Processing an image containing text may involve, for example, extracting a text region from the image, extracting lines of text from the region (line segmentation), then extracting words of text from the lines (word segmentation), before applying text recognition.
For handwritten text, line and word segmentation are often challenging because handwriting varies widely from writer to writer. Several methods have been proposed to address this task. For example, U.S. Pat. Appl. Pub. No. 2007/0041642, entitled “Post-OCR image segmentation into spatially separated text zones”, describes “a post-recognition procedure to group text recognized by an Optical Character Reader (OCR) from a document image into zones. Once the recognized text and the corresponding word bounding boxes for each word of the text are received, the procedure described dilates (expands) these word bounding boxes by a factor and records those which cross. Two word bounding boxes will cross upon dilation if the corresponding words are very close to each other on the original document. The text is then grouped into zones using the rule that two words will belong to the same zone if their word bounding boxes cross upon dilation. The text zones thus identified are sorted and returned.” (Abstract.)
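By way of illustration only, the zone-grouping rule quoted above may be sketched as follows. This is a minimal sketch and not the implementation of the cited publication: the box representation, the function names `dilate`, `boxes_cross` and `group_into_zones`, the default dilation factor of 1.5, and the treatment of zone membership as transitive (words linked through a chain of crossing boxes share a zone) are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical box representation: (left, top, right, bottom) in pixels.
Box = Tuple[float, float, float, float]


def dilate(box: Box, factor: float) -> Box:
    """Expand a box about its center; a factor of 1.0 leaves it unchanged."""
    left, top, right, bottom = box
    dx = (right - left) * (factor - 1.0) / 2.0
    dy = (bottom - top) * (factor - 1.0) / 2.0
    return (left - dx, top - dy, right + dx, bottom + dy)


def boxes_cross(a: Box, b: Box) -> bool:
    """Return True when two axis-aligned boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def group_into_zones(boxes: List[Box], factor: float = 1.5) -> List[List[int]]:
    """Group word indices into zones; words whose dilated boxes cross share a zone.

    Assumption: membership is transitive, implemented here with union-find.
    """
    dilated = [dilate(b, factor) for b in boxes]
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Record every pair of dilated boxes that cross by merging their sets.
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if boxes_cross(dilated[i], dilated[j]):
                parent[find(j)] = find(i)

    zones: Dict[int, List[int]] = {}
    for i in range(len(boxes)):
        zones.setdefault(find(i), []).append(i)
    return sorted(zones.values())


if __name__ == "__main__":
    words = [(0, 0, 10, 10), (12, 0, 22, 10), (100, 100, 110, 110)]
    print(group_into_zones(words))
```

In this example the first two boxes are separated by a gap of two pixels and cross once dilated, so they fall into one zone, while the distant third box forms a zone of its own.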
U.S. Pat. No. 5,933,525, entitled “Language-independent and segmentation-free optical character recognition system and method”, describes “a language-independent and segment free OCR system and method [which] comprises a unique feature extraction approach which represents two dimensional data relating to OCR as one independent variable (specifically the position within a line of text in the direction of the line) so that the same CSR technology based on HMMs can be adapted in a straightforward manner to recognize optical characters. After a line finding stage, followed by a simple feature-extraction stage, the system can utilize a commercially available CSR system, with little or no modification, to perform the recognition of text by and training of the system. The whole system, including the feature extraction, training, and recognition components, are designed to be independent of the script or language of the text being recognized. The language-dependent parts of the system are confined to the lexicon and training data. Furthermore, the method of recognition does not require pre-segmentation of the data at the character and/or word levels, neither for training nor for recognition. In addition, a language model can be used to enhance system performance as an integral part of the recognition process and not as a post-process, as is commonly done with spell checking, for example.” (Abstract.)
Chinese Patent Appl. Pub. No. CN 100527156C, entitled “Picture words segmentation method”, describes “a method for detecting text image, comprising the steps of: (1) The combined picture on each color component edge map obtained cumulative edge map; (2) the cumulative edge map is set for an edge point in the picture of the respective colors, depending on the color point edge, with the clustering of the cumulative edge map is divided into several sub-edge map sheets, each sub-edge map contains similar color edge points; (3) in each sub-edge map, multiple horizontal and vertical projection, according to the regional projection in the vertical direction and horizontal segmentation, positioning text in the image area. In the present invention, after obtaining original cumulative edge map using the clustering method based on the color of the cumulative edge map is divided into several sub-edge map, edge map of the sub edge is simplified, so that the detection area is relatively simple text pictures and accurate.” (Abstract.)
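The horizontal and vertical projection referred to in step (3) of the abstract quoted above is a common technique for locating text and may be sketched, for illustration only, as follows. The sketch assumes a binary map given as a list of rows in which 1 marks an edge or ink pixel; it omits the color clustering of steps (1) and (2), and the names `nonzero_runs`, `horizontal_projection`, `vertical_projection` and `locate_text_regions` are hypothetical and do not come from the cited publication.

```python
from typing import List, Tuple

# Hypothetical input format: rows of 0/1 values, 1 = edge or ink pixel.
Image = List[List[int]]


def nonzero_runs(profile: List[int], min_count: int = 1) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index ranges where profile >= min_count."""
    runs: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(profile):
        if value >= min_count and start is None:
            start = i                      # a run begins
        elif value < min_count and start is not None:
            runs.append((start, i))        # a run ends at a gap
            start = None
    if start is not None:
        runs.append((start, len(profile)))  # run reaches the end of the profile
    return runs


def horizontal_projection(image: Image) -> List[int]:
    """Count the marked pixels in each row."""
    return [sum(row) for row in image]


def vertical_projection(image: Image, top: int, bottom: int) -> List[int]:
    """Count the marked pixels in each column, restricted to rows top..bottom-1."""
    width = len(image[0]) if image else 0
    return [sum(image[r][c] for r in range(top, bottom)) for c in range(width)]


def locate_text_regions(image: Image) -> List[Tuple[int, int, int, int]]:
    """Split into row bands, then split each band into column ranges.

    Returns boxes as (left, top, right, bottom) with exclusive right/bottom.
    """
    regions = []
    for top, bottom in nonzero_runs(horizontal_projection(image)):
        columns = vertical_projection(image, top, bottom)
        for left, right in nonzero_runs(columns):
            regions.append((left, top, right, bottom))
    return regions


if __name__ == "__main__":
    sample = [
        [0, 0, 0, 0, 0, 0],
        [1, 1, 0, 0, 1, 1],
        [1, 1, 0, 0, 1, 1],
        [0, 0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0, 0],
    ]
    print(locate_text_regions(sample))
```

A single pass of this kind separates regions only at rows or columns that are entirely blank, which is one reason projection-based segmentation tends to be less reliable on handwriting, where adjacent lines and words frequently touch or overlap.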