Optical Character Recognition (OCR) within computer systems involves the conversion of data encoded in image form into machine-encoded text. A common example is conversion from a known image format, such as PDF, TIFF or JPEG, among others, into a text-encoded format. OCR provides a form of information entry that allows printed paper data records to be converted to a format that can be electronically edited, searched, stored more compactly, displayed online, and used in various machine processes.
The accuracy of OCR has become more important as the computerization of business processes has increased. Unfortunately, known OCR programs do not provide the accuracy required for automated and systematic use in business processes. There is accordingly a need for an improved OCR system that provides the level of accuracy required for demanding applications such as business processes.