1. Technical Field
The present disclosure generally relates to computer vision, and in particular relates to optical character recognition in industrial environments.
2. Description of the Related Art
Optical character recognition (OCR) is the mechanical or electronic conversion of scanned or photographed images of alphanumeric or other characters into machine-encoded/computer-readable alphanumeric or other characters. OCR is used as a form of data entry from some sort of original data source, such as product packaging (e.g., a bag of chips, a box, etc.), books, receipts, business cards, mail, or any other object having characters printed or inscribed thereon. OCR is a common method of digitizing printed characters so that the characters can be electronically identified, edited, searched, stored more compactly, displayed on-line, or used in machine processes such as machine translation, text-to-speech, verification, key data extraction and text mining.
Typically, the OCR process can be viewed as a combination of two main sub-processes: (1) segmentation and (2) recognition or classification. The segmentation sub-process locates and “isolates” the individual characters. The recognition or classification sub-process classifies the characters in question and assigns to each character a corresponding alphanumeric or other character or symbol. The OCR process is typically divided into two sub-processes because the classification sub-process is computationally expensive, and therefore it is advantageous that the classification sub-process not be performed throughout an entire image, but rather only at select locations where the segmentation sub-process has detected a potential character. For high quality images, characters are well separated and the segmentation sub-process becomes relatively straightforward. However, images often suffer from low contrast, a high degree of noise, character variation, character skew, background variation, or other non-ideal factors. These factors complicate the segmentation sub-process, and segmentation errors lead to failures in the recognition sub-process.
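By way of illustration only, the two sub-processes described above may be sketched as follows. The sketch assumes an already binarized image, uses a vertical projection profile for segmentation, and uses nearest-template matching by Hamming distance for classification. These techniques, along with the names `segment_columns`, `classify`, `recognize`, `TEMPLATES`, and `IMAGE`, are hypothetical choices made for this example and are not drawn from any particular OCR system.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]  # rows of pixels: 0 = background, 1 = ink


def parse(rows: List[str]) -> Image:
    """Build a binary image from text rows, where '#' marks an ink pixel."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]


def segment_columns(image: Image) -> List[Tuple[int, int]]:
    """Segmentation: return (start, end) column ranges containing ink.

    The end index is exclusive. Characters are assumed to be separated
    by at least one empty column, which holds only for clean images.
    """
    if not image:
        return []
    width = len(image[0])
    has_ink = [any(row[x] for row in image) for x in range(width)]
    spans: List[Tuple[int, int]] = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            spans.append((start, x))
            start = None
    if start is not None:
        spans.append((start, width))
    return spans


def classify(glyph: Image, templates: Dict[str, Image]) -> str:
    """Classification: label of the nearest same-sized template, or '?'."""
    best_label, best_dist = "?", None
    for label, tmpl in templates.items():
        if len(tmpl) != len(glyph) or len(tmpl[0]) != len(glyph[0]):
            continue  # skip templates whose shape differs from the glyph
        dist = sum(a != b for t_row, g_row in zip(tmpl, glyph)
                   for a, b in zip(t_row, g_row))
        if best_dist is None or dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def recognize(image: Image, templates: Dict[str, Image]) -> str:
    """Run classification only on the regions found by segmentation."""
    glyphs = ([row[s:e] for row in image] for s, e in segment_columns(image))
    return "".join(classify(g, templates) for g in glyphs)


# Hypothetical 3x3 templates and a clean two-character test image.
TEMPLATES = {
    "L": parse(["#..", "#..", "###"]),
    "T": parse(["###", ".#.", ".#."]),
}
IMAGE = parse([
    "#...###",
    "#....#.",
    "###..#.",
])

print(recognize(IMAGE, TEMPLATES))
```

In this sketch the classifier is invoked only on the column ranges returned by the segmentation step, which reflects the rationale for dividing the process. Noise or skew that bridges the empty column between two characters would merge them into one span, and the classification would then fail.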
OCR is frequently used in computer vision systems to detect text associated with various manufacturing processes. For example, OCR may be used to verify that a label has been correctly printed on the packaging of a product. In a specific implementation, OCR may be used to verify the presence of a “best before date” label on a product's packaging. The best before date is a piece of information written on the package of perishable foods that identifies the date past which a product is no longer suitable for consumption. Typically, inkjet printers are used to write the best before date on the otherwise pre-printed packages since inkjet printers represent a mature and reliable technology, are capable of printing at high speeds, and are relatively low cost. Nevertheless, errors may occasionally occur in the printing process (e.g., low contrast, missing characters), which makes it necessary to verify that the labels are printed correctly.
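As a minimal sketch of such a verification step, the text returned by an OCR system may be compared against the date that the printer was instructed to print. The label layout “BEST BEFORE DD/MM/YYYY” and the names `DATE_PATTERN`, `parse_best_before`, and `label_is_valid` are assumptions made for this example only; actual label formats vary by product and region.

```python
import re
from datetime import date, datetime
from typing import Optional

# Assumed label layout for this example: "BEST BEFORE DD/MM/YYYY".
DATE_PATTERN = re.compile(r"BEST BEFORE ([0-9]{2}/[0-9]{2}/[0-9]{4})")


def parse_best_before(ocr_text: str) -> Optional[date]:
    """Return the printed date, or None if the text is malformed."""
    match = DATE_PATTERN.fullmatch(ocr_text.strip().upper())
    if match is None:
        return None  # e.g., a missing or misread character
    try:
        return datetime.strptime(match.group(1), "%d/%m/%Y").date()
    except ValueError:
        return None  # well-formed digits but not a real calendar date


def label_is_valid(ocr_text: str, expected: date) -> bool:
    """True only if the OCR text parses to exactly the expected date."""
    return parse_best_before(ocr_text) == expected


expected = date(2025, 11, 5)
print(label_is_valid("BEST BEFORE 05/11/2025", expected))  # correct print
print(label_is_valid("BEST BEFORE 05/1/2025", expected))   # missing character
```

A label with a missing character fails the pattern match and is rejected, so a package bearing such a label could be flagged for removal from the line.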
Setting up OCR parameters for a given application can be very difficult, especially for an inexperienced user. Unlike many applications, which can be solved in advance by a system integrator, present OCR systems can require technicians or engineers to be present on the production floor to train or modify the runtime parameters, which incurs significant time and expense. Further, the OCR parameters of a system may need to be adjusted when new parts or products are introduced, or when new printing systems or labels are used. For existing systems, if the default OCR segmentation parameters do not allow the OCR system to perform at a suitable level, the user may be confronted with a list of tens of parameters (e.g., 20-30 parameters) that might require adjustment. Adjusting these parameters may require significant training and/or experience.