Automatic OCR is the main tool for extracting textual information from digital images. Hence, the performance of document-processing systems depends, to a large extent, on OCR quality. Indeed, even slight improvements in OCR reading rates translate into significant savings in the cost of document handling.
The OCR process, in turn, can be viewed as a combination of two main sub-processes: (1) segmentation and (2) recognition. The first sub-process locates and “isolates” the characters. The second classifies the characters in question and assigns to each character a corresponding alpha-numerical symbol. For high-quality images, characters are well separated and the segmentation process becomes relatively straightforward. However, typical (scanned) images suffer from low contrast and a high degree of noise. Moreover, characters are frequently connected (due to the low quality of printers and typewriters). All these factors complicate the segmentation process, and wrong segmentation leads to recognition failure and, in the worst case, to substitution errors.
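As a rough illustration of the first sub-process, the following sketch (a simplified column-projection approach, not the specific method discussed here) isolates characters in a binary text-line image by finding runs of columns that contain foreground pixels. Note that touching characters produce a single merged span, which is exactly the failure mode described above.

```python
def segment_columns(image):
    """Return (start, end) column spans of character regions.
    `image` is a list of rows; 1 = ink, 0 = background."""
    width = len(image[0])
    # A column is "inked" if any row has a foreground pixel in it.
    inked = [any(row[c] for row in image) for c in range(width)]
    spans, start = [], None
    for c, on in enumerate(inked):
        if on and start is None:
            start = c                  # a character region begins
        elif not on and start is not None:
            spans.append((start, c))   # the region ends at a blank column
            start = None
    if start is not None:
        spans.append((start, width))   # region runs to the right edge
    return spans

# Two "characters" separated by one blank column:
line = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
]
print(segment_columns(line))  # [(0, 2), (3, 4)]
```

If the blank column between the two glyphs were filled by noise or a printing defect, the same routine would return a single span, forcing the recognizer to classify two merged characters as one.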
Consider, for example, a character “n” that is badly segmented and “truncated” on its right side. Such a character can easily be misinterpreted as an “r”. In such a case, the word “counting” may be recognized as “courting”. Verifying the word against an English dictionary will not help, since both words are legal. Therefore, there is a need in the art for improved segmentation methods for enhancing OCR.
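The failure of dictionary verification in this case can be made concrete with a toy check (the word list below is a hypothetical stand-in for a full lexicon): both the intended word and the misrecognized word pass the lookup, so the substitution error goes undetected.

```python
# Stand-in lexicon for illustration; a real system would use a full dictionary.
DICTIONARY = {"counting", "courting", "count", "court"}

def is_valid_word(word):
    """Return True if the word appears in the lexicon."""
    return word.lower() in DICTIONARY

print(is_valid_word("counting"))  # True -- the intended word
print(is_valid_word("courting"))  # True -- the OCR error also passes
```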