ALPR (Automatic License Plate Recognition) is an image-processing approach that often functions as the core module of “intelligent” transportation infrastructure applications. License plate recognition techniques, such as ALPR, can be employed to identify a vehicle by automatically reading a license plate utilizing image processing and character recognition technologies. A license plate recognition operation can be performed by locating a license plate in an image, segmenting the characters in the captured image of the plate, and performing an OCR (Optical Character Recognition) operation with respect to the characters identified.
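As a rough illustration of the segment-then-recognize portion of the operation described above, the following minimal sketch splits an already located, already binarized plate image into character regions using a vertical projection profile. Projection-based segmentation is an assumption chosen for brevity and is not stated in the text; the names segment_columns, recognize_plate, and classify_char are hypothetical, and the plate localization step is omitted.

```python
import numpy as np

def segment_columns(binary_plate):
    """Return (start, end) column ranges that contain ink (nonzero pixels).

    Assumes the plate has already been located, cropped, and binarized,
    and that adjacent characters are separated by at least one empty column.
    """
    has_ink = binary_plate.any(axis=0)  # True for columns with any ink
    spans, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                   # a character region begins
        elif not ink and start is not None:
            spans.append((start, x))    # the region ends at the first empty column
            start = None
    if start is not None:
        spans.append((start, len(has_ink)))
    return spans

def recognize_plate(binary_plate, classify_char):
    """Apply a caller-supplied (hypothetical) per-character classifier to each segment."""
    return "".join(classify_char(binary_plate[:, a:b])
                   for a, b in segment_columns(binary_plate))
```

Touching characters or a plate frame that bridges two characters would defeat this simple profile, which is one reason the noise sources discussed below make plate OCR harder than document OCR.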
In general, an OCR engine can be optimized for performance with respect to a document having a uniform substrate (often the ‘paper’) with known or unknown characters. The substrate of a license plate (the ‘plate’ background), however, is quite non-uniform due to noise, while the set of characters and fonts is constrained. Hence, an OCR engine optimized for document OCR is not optimal for license plate OCR. The task of recognizing characters on a license plate is particularly difficult due to a number of challenging noise sources, for example, highly non-uniform backgrounds, touching or partially occluding objects (e.g., license plate frames), excessive shadows, and generally poor image contrast. Such noise sources present a much more challenging OCR problem than that typically seen in standard document scanning applications.
Most prior art ALPR approaches do not meet all of the performance demands of, for example, transportation businesses and enterprises. Several OCR technologies have been tested utilizing sample tolling images provided from an actual tolling installation. In one prior art implementation, for example, an image distortion model (IDM) nearest neighbor (NN) data driven (DD) classifier was used to implement an OCR engine. The performance of this OCR method is highly correlated with the training set size and can be quite sensitive to how well the segmented characters are centered in the input images. As such, it was found that performance improves somewhat when the training set is supplemented with characters that are shifted variants of the original set.
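The idea of supplementing a training set with shifted variants can be sketched as follows. This is a minimal illustration only, assuming that each character is a 2-D NumPy array with a background value of 0 and that shifts are whole-pixel translations; the function names and the max_shift parameter are hypothetical and do not come from the prior art implementation.

```python
import numpy as np

def shift_image(img, dy, dx, fill=0):
    """Translate img by (dy, dx) pixels, padding the exposed border with fill."""
    h, w = img.shape
    out = np.full_like(img, fill)
    if abs(dy) >= h or abs(dx) >= w:
        return out                       # shifted entirely out of frame
    src_y0, src_y1 = max(0, -dy), min(h, h - dy)
    src_x0, src_x1 = max(0, -dx), min(w, w - dx)
    dst_y0, dst_x0 = max(0, dy), max(0, dx)
    out[dst_y0:dst_y0 + (src_y1 - src_y0),
        dst_x0:dst_x0 + (src_x1 - src_x0)] = img[src_y0:src_y1, src_x0:src_x1]
    return out

def augment_with_shifts(images, labels, max_shift=1):
    """Return every image at every offset in [-max_shift, max_shift]^2.

    The (0, 0) offset is included, so the originals are kept.
    """
    aug_images, aug_labels = [], []
    for img, label in zip(images, labels):
        for dy in range(-max_shift, max_shift + 1):
            for dx in range(-max_shift, max_shift + 1):
                aug_images.append(shift_image(img, dy, dx))
                aug_labels.append(label)
    return np.stack(aug_images), np.array(aug_labels)
```

With max_shift equal to 1, each training character yields nine samples, which grows the training set and the cost of a nearest neighbor search accordingly.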
In another prior art implementation, a Tesseract OCR engine algorithm breaks up the edges of characters, both external and internal, into features with orientation, length, and direction. Starting with a topmost character edge pixel, the algorithm traverses along the edge pixels until the direction changes by more than 45 degrees from the starting point, at which point the feature is saved and a new feature is created. This process is repeated until all of the edge pixels are mapped to a unique feature. A classifier can be employed to map the test character features to those contained in training.
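The edge traversal described above can be approximated with the following minimal sketch. It is not the Tesseract implementation; it assumes the edge has already been traced into an ordered list of (x, y) pixel coordinates, measures direction per step with atan2, and records a hypothetical feature dictionary with start, end, length (number of steps), and direction (degrees).

```python
import math

def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def make_feature(points, start_idx, end_idx, direction):
    """Summarize one run of edge steps as a feature record."""
    return {
        "start": points[start_idx],
        "end": points[end_idx],
        "length": end_idx - start_idx,   # number of steps in the run
        "direction": direction,          # direction of the run's first step
    }

def split_edge_into_features(points, max_turn_deg=45.0):
    """Split an ordered edge path into features.

    A new feature starts whenever a step's direction differs from the
    current feature's starting direction by more than max_turn_deg.
    Each step between consecutive pixels belongs to exactly one feature.
    """
    features = []
    if len(points) < 2:
        return features
    start_idx, start_dir = 0, None
    for i in range(1, len(points)):
        (x0, y0), (x1, y1) = points[i - 1], points[i]
        step_dir = math.degrees(math.atan2(y1 - y0, x1 - x0))
        if start_dir is None:
            start_dir = step_dir
        elif angle_diff(step_dir, start_dir) > max_turn_deg:
            features.append(make_feature(points, start_idx, i - 1, start_dir))
            start_idx, start_dir = i - 1, step_dir
    features.append(make_feature(points, start_idx, len(points) - 1, start_dir))
    return features
```

For an L-shaped path such as [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)], the sketch yields two features of two steps each, split at the corner pixel (2, 0).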
The Tesseract engine was developed for decoding documents and relies heavily on a secondary dictionary classifier to determine which character is likely to be in a given position of a word, which helps to improve performance for similar-looking characters (0/D, 8/B, 0/O, etc.) that are typically a problem for OCR. For an LPR application, however, such dictionary information is not readily available, and as a result the performance suffers. Note that obtaining DMV (Department of Motor Vehicles) records of valid license plate character sequences to improve ALPR is highly problematic in practical implementations. Because of problems such as poor image quality due to ambient lighting conditions, image perspective distortion, and interference characters, such prior art approaches are unable to accurately recognize the license plate characters.
Based on the foregoing, it is believed that a need exists for an improved method and system for recognizing a license plate character utilizing a machine learning classifier. A need also exists for an improved method for locally preprocessing a character image utilizing a quantization transformation, as will be described in greater detail herein.