Marking documents with machine-readable characters to facilitate automatic document recognition using character recognition systems is well known in the art. For example, cheques issued by banks and other financial institutions have pre-printed account identification information thereon that is intended to be automatically read when cheques are processed by a cheque-processing system. The account identification information is in the form of a character string, typically thirty (30) characters in length, adjacent to and running parallel to the bottom edge of the cheque. To enhance readability, the characters in account identification character strings are often printed according to the “E13B” font specification developed by the American Bankers Association. The use of E13B characters in conjunction with specifications that govern the manner by which information is to be laid out on cheques, such as for example Standard 006: Standards and Specifications for MICR Encoded Documents, 2002, published by the Canadian Payments Association, has improved the ability of cheque-processing systems to correctly recognize account identification information pre-printed on cheques.
Although the use of the aforementioned E13B characters and cheque layout specifications has improved account identification information recognition in cheque-processing systems, advancements to improve accuracy and reduce the need for manual intervention in cheque-processing systems, and indeed in character recognition systems in general, are continually being sought.
As a result, many systems that automatically recognize characters printed on documents and reduce the need for manual document handling have been considered. For example, U.S. Pat. No. 4,564,752 to Lepic et al. discloses a sorting machine including a character recognition system to read information on documents fed to the sorting machine. When the sorting machine receives a document that includes information that cannot be read by the character recognition system, the document is diverted and an image of the unreadable information on the document is captured. The captured image of the unreadable information is then presented to an operator, allowing the operator to view the information and manually code the information into the sorting machine. The diverted document is then reinserted and sorted based on the manual coding provided by the operator.
U.S. Pat. No. 3,764,978 to Tyburski et al. discloses a system for recognizing characters including both optical and magnetic readers to recognize characters printed on cheques using magnetic black ink. During processing both optical and magnetic images of the printed characters are acquired. By acquiring both optical and magnetic images, poor image quality in one of the images resulting from marks, such as stains, can be overcome by relying on the other image during character recognition.
U.S. Pat. No. 4,315,246 to Milford also discloses a character recognition system including both optical and magnetic readers for reading and identifying characters printed on cheques using magnetic ink. The character recognition system compares strings of characters identified by the optical and magnetic readers to validate read character strings.
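By way of illustration only, the general idea of using two independent reads of the same character string, either to fall back on one read where the other fails or to validate characters on which the reads agree, may be sketched as follows. This sketch is not the method disclosed in either of the above references; the function name `cross_validate`, the use of `?` to mark an unreadable character, and the merging rules are all assumptions made for the purpose of the example.

```python
from typing import List, Tuple


def cross_validate(optical: str, magnetic: str, unknown: str = "?") -> Tuple[str, List[int]]:
    """Merge two reads of the same character string (hypothetical helper).

    A position is accepted when both reads agree, or when exactly one
    read produced a character and the other marked it unreadable.
    Positions where the reads disagree, or where neither read produced
    a character, are left as `unknown` and reported as unresolved.
    """
    if len(optical) != len(magnetic):
        raise ValueError("both reads must cover the same number of characters")

    merged: List[str] = []
    unresolved: List[int] = []
    for index, (o, m) in enumerate(zip(optical, magnetic)):
        if o == m and o != unknown:
            merged.append(o)          # both readers agree: validated
        elif o == unknown and m != unknown:
            merged.append(m)          # optical read failed: rely on magnetic
        elif m == unknown and o != unknown:
            merged.append(o)          # magnetic read failed: rely on optical
        else:
            merged.append(unknown)    # conflict or double failure
            unresolved.append(index)
    return "".join(merged), unresolved


if __name__ == "__main__":
    # Each read has one unreadable position that the other read supplies.
    print(cross_validate("12?45", "1234?"))
    # The reads conflict at index 2, so that position stays unresolved.
    print(cross_validate("12345", "12845"))
```

In such a sketch, any unresolved positions would still require some other form of handling, such as the operator intervention described above.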
Neural networks have also been employed in cheque-processing systems to recognize characters in captured cheque images. In cheque-processing systems employing such neural networks, the neural networks are trained by manually associating collections of captured character images with the corresponding characters.
As will be appreciated from the above discussion, various character recognition systems have been considered. Unfortunately, known character recognition systems such as those noted above suffer from disadvantages and as a result, improvements are desired. It has been found that when capturing an image of a document including characters to be recognized, noise in the captured image negatively impacts the ability of the character recognition systems to recognize the characters. Such noise may take the form of dark specks or blotches in or overlapping a background around a character. Additionally, the noise may take the form of a missing part or parts of a character, which may occur as a result of a less than ideal application of ink to the document surface or as a result of surface wear.
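The two forms of noise noted above may be illustrated with a small sketch, offered only as an aid to understanding. The 5x5 binary glyph, the helper names `add_specks`, `drop_ink` and `hamming`, and the use of a pixel-wise difference count are all hypothetical and are not taken from any of the systems discussed; the sketch merely shows how specks and missing ink each cause a captured character image to deviate from its ideal form.

```python
from typing import List, Sequence, Tuple

Grid = List[List[int]]

# Hypothetical 5x5 binary glyph: 1 = ink, 0 = background.
GLYPH = [
    "01110",
    "10001",
    "10001",
    "10001",
    "01110",
]


def to_grid(rows: Sequence[str]) -> Grid:
    """Convert rows of '0'/'1' characters into a grid of integers."""
    return [[int(ch) for ch in row] for row in rows]


def add_specks(grid: Grid, coords: Sequence[Tuple[int, int]]) -> Grid:
    """Return a copy with the given pixels set to ink (dark specks)."""
    out = [row[:] for row in grid]
    for r, c in coords:
        out[r][c] = 1
    return out


def drop_ink(grid: Grid, coords: Sequence[Tuple[int, int]]) -> Grid:
    """Return a copy with the given pixels cleared (missing ink or wear)."""
    out = [row[:] for row in grid]
    for r, c in coords:
        out[r][c] = 0
    return out


def hamming(a: Grid, b: Grid) -> int:
    """Count the pixels at which two equally sized grids differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


if __name__ == "__main__":
    clean = to_grid(GLYPH)
    specked = add_specks(clean, [(0, 0), (2, 2)])      # specks in the background
    worn = drop_ink(clean, [(0, 1), (0, 2), (0, 3)])   # top stroke missing
    print(hamming(clean, specked), hamming(clean, worn))
```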
It is therefore an object of the present invention to provide a novel method and system for recognizing a candidate character in a captured image.