Optical Character Recognition (OCR) generally refers to the mechanism of converting images of typed, handwritten or printed text into machine-encoded text (e.g., American Standard Code for Information Interchange (ASCII)), whether from a scanned document, a photo of a document, a scene photo (e.g., an image acquired from a surveillance camera that includes a license plate number) or subtitle text in an image (e.g., closed captioning text). Generally, an OCR mechanism is a computer-implemented process that includes the steps of acquiring an image containing a string of characters to be recognized, recognizing individual characters in the input image as characters of an alphabet, segmenting the characters into one or more strings of characters, and performing a string recognition mechanism to return an output string of characters that corresponds to one or more model strings that are searched for in the image (e.g., license plates, serial numbers, postal codes, addresses, etc.).
OCR has a wide range of applications including the recognition of vehicle license plate numbers (e.g., for use in automated traffic law enforcement, surveillance, access control, tolls, etc.), the recognition of serial numbers on parts in an automated manufacturing environment, the recognition of labels on packages (e.g., pharmaceutical packaging, food and beverage packaging, household and personal products packaging, etc.), and various document analysis applications.
Despite sophisticated OCR techniques, OCR errors frequently occur due to the non-ideal conditions of image acquisition, the partial occlusion or degradation of the depicted characters, and especially the structural similarity between certain characters (e.g., Z and 2, 0 and D, 1 and I). For example, the recognition of vehicle license plate numbers must overcome lighting conditions that are both variable (according to the time of day, weather conditions, etc.) and non-uniform (e.g., due to shadows and specular reflection), as well as perspective distortion and partial occlusion or degradation of the characters (e.g., due to mud, wear of the paint, etc.).
To improve the overall performance of OCR systems, a post-processing stage is performed, during which OCR errors are automatically detected and corrected. A popular technique to automatically correct errors in words is “dictionary lookup”: an incorrect word, that is, one that does not belong to a predefined “dictionary” of valid words, is replaced by the closest valid word in the dictionary. This is often achieved by selecting the dictionary word that has the minimum “edit distance” to the incorrect word. The edit distance between two strings is the minimum number of edit operations (deletions, insertions, and substitutions) needed to transform the first string into the second string. In some techniques, the edit distance has been generalized to an edit cost by assigning a weight to each edit operation according to the type of operation, the character(s) of the alphabet involved in the operation and/or recognition scores.
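The dictionary lookup and edit cost described above can be sketched as follows. This is a minimal illustration rather than the method of any particular OCR system: the function names, the confusable-pair table (taken from the similar characters mentioned earlier) and the weight of 0.25 are assumptions chosen for the example.

```python
from typing import Callable, Sequence


def unit_substitution(x: str, y: str) -> float:
    """Plain edit distance: every substitution of differing characters costs 1."""
    return 0.0 if x == y else 1.0


# Hypothetical table of structurally similar characters (Z/2, 0/D, 1/I).
CONFUSABLE = {frozenset("Z2"), frozenset("0D"), frozenset("1I")}


def confusable_substitution(x: str, y: str) -> float:
    """Assumed weighting: swapping look-alike characters is cheaper (0.25)."""
    if x == y:
        return 0.0
    return 0.25 if frozenset((x, y)) in CONFUSABLE else 1.0


def edit_cost(
    source: str,
    target: str,
    substitution: Callable[[str, str], float] = unit_substitution,
    indel: float = 1.0,
) -> float:
    """Minimum total cost of deletions, insertions and substitutions
    needed to transform `source` into `target` (dynamic programming)."""
    # prev[j] = cost of transforming the processed prefix of source into target[:j]
    prev = [j * indel for j in range(len(target) + 1)]
    for i, s in enumerate(source, start=1):
        curr = [i * indel]
        for j, t in enumerate(target, start=1):
            curr.append(
                min(
                    prev[j] + indel,                  # delete s
                    curr[j - 1] + indel,              # insert t
                    prev[j - 1] + substitution(s, t)  # substitute s with t
                )
            )
        prev = curr
    return prev[-1]


def closest_word(word: str, dictionary: Sequence[str]) -> str:
    """Dictionary lookup: return the valid word with the minimum edit distance.
    A word already in the dictionary has distance 0 and is returned unchanged."""
    return min(dictionary, key=lambda candidate: edit_cost(word, candidate))
```

With the default unit costs, `edit_cost` is the ordinary edit distance; passing `confusable_substitution` shows how a weighted edit cost can favor corrections that swap look-alike characters.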
Methods of automatic string correction based on the dictionary lookup paradigm are useful in cases where valid input strings are those belonging to a limited dictionary of valid strings. However, they are inadequate for correcting strings that are not of the word type. There is an increasing number of OCR applications in which valid strings are not words but strings satisfying a “template” of some sort; such strings include vehicle license plate numbers, serial numbers, ID numbers, ZIP codes, etc. In particular, these strings may include non-blank characters (that belong to an alphabet or a set of numbers), as well as space or blank characters between the non-blank characters; such blank characters are not addressed by standard OCR and string correction techniques.
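To make the notion of a template concrete, the sketch below checks a string against one hypothetical template: three uppercase letters, one blank, three digits. The format and the names `PLATE_TEMPLATE` and `satisfies_template` are assumptions for illustration only; the text above does not prescribe any particular template or representation.

```python
import re

# Hypothetical template: three uppercase letters, one blank, three digits.
PLATE_TEMPLATE = re.compile(r"[A-Z]{3} [0-9]{3}")


def satisfies_template(candidate: str) -> bool:
    """Return True if the whole string matches the assumed template,
    including the blank between the letter group and the digit group."""
    return PLATE_TEMPLATE.fullmatch(candidate) is not None


# Number of distinct strings that satisfy this template. Listing them all
# as a "dictionary" is impractical, which is why dictionary lookup fits
# template-type strings poorly.
VALID_STRING_COUNT = 26 ** 3 * 10 ** 3
```

Under this assumed template, `satisfies_template("ABC 123")` holds, while `"ABC123"` (missing blank) and `"A8C 123"` (digit in a letter position) are rejected.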