The present technology relates to imaging systems, and more specifically, to a handheld scanner with trainable optical character recognition functionality.
Optical imaging systems that incorporate optical character recognition (OCR) are used, for example, for reading fonts and other symbols on packages or articles. One of the most common of these imaging systems is the handheld scanner. OCR is generally understood to be the electronic conversion of scanned images of handwritten, typewritten, or printed text into machine-encoded text. It is important for imaging systems to achieve a quality scan so that the image can be electronically searched, stored more compactly, displayed on-line, and used in machine processes such as machine translation, text-to-speech, and text mining applications.
In order to improve scanning results, some optical imaging systems utilize standardized machine readable fonts, such as OCR-A and OCR-B, which were created to make the OCR process more accurate. Standardized fonts make decoding the font in an image far less complicated because the imaging system is made aware of the simplified fonts it is attempting to scan, and the individual characters in the fonts are designed to be easily distinguishable. For example, the numeral “zero” contains a slash in order to help discriminate it from the alphabetical “o” (lower case) and “O” (upper case). Nevertheless, many imaging applications, especially those where a handheld scanner is desired to scan an object or article, do not use standardized fonts.
Some modern OCR systems can be “trained” to recognize alternate fonts and other symbols. Yet the training process is complicated and time-consuming: each font and/or symbol must be scanned and then manually associated with the desired electronic character or data. The training process involves use of a computer on which a user can view scanned images and match each image to the desired data. Once all the fonts or symbols are associated with the desired data, an electronic file containing all the association data can be generated and transferred to the imaging system, which uses it to improve the results of a scan.
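By way of illustration only, the association file described above might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the format used by any actual OCR system: the names `Association`, `build_training_file`, and `load_training_file`, the JSON layout, and the representation of a scanned glyph as a flattened binary bitmap are all hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """One manual pairing of a scanned glyph with its desired data (hypothetical)."""
    glyph_id: str   # hypothetical identifier for the scanned glyph image
    pixels: tuple   # flattened binary bitmap of the glyph (assumed representation)
    label: str      # character or data the user assigned to the glyph


def build_training_file(associations):
    """Serialize all associations into one JSON document (assumed file format)."""
    records = [
        {"glyph_id": a.glyph_id, "pixels": list(a.pixels), "label": a.label}
        for a in associations
    ]
    return json.dumps({"version": 1, "associations": records}, indent=2)


def load_training_file(text):
    """Rebuild a bitmap-to-label lookup table on the imaging system side."""
    data = json.loads(text)
    return {tuple(r["pixels"]): r["label"] for r in data["associations"]}
```

In this sketch, the computer-side tool would call `build_training_file` once the user has labeled every glyph, and the imaging system would call `load_training_file` on the transferred text to obtain a lookup table. An exact bitmap lookup is used only to keep the example short; a practical system would need a tolerant matching step.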
Some current handheld scanners have limited OCR functionality, yet they require pre-configured machine readable fonts, such as OCR-A and OCR-B. Training current handheld scanners on alternate fonts or symbols is not an option because the handheld scanners lack the processing power and user interface needed to provide the association between fonts and symbols and the desired data. In addition, as with any imaging device, providing quality results for each image scan can be difficult given the numerous variables that affect the quality of an image scan.
What is needed are systems and methods that allow a handheld scanner to be trained on alternate fonts and/or symbols. What is also needed are systems and methods that can improve the quality of results for each image scan by influencing the variables that affect the quality of an image scan.