U.S. Pat. No. 6,405,925 and application 20070278306 (both to Symbol Technologies) detail imager-based barcode readers (as opposed to laser-based). These references particularly concern methods for identifying barcodes, and their specific types, in the context of other imagery. In an exemplary arrangement, contrast statistics and directional vectors associated with detected edge lines are used to identify which sub-region(s) of the image data, if any, likely correspond to a barcode. A barcode decoder then processes any thus-identified image sub-region(s) to extract a payload.
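The general idea of scoring an image sub-region by contrast and edge direction can be illustrated with a minimal sketch. It is not the method of the cited references: the function names, the axis-aligned gradient measure, and both thresholds are assumptions chosen for illustration, and the tile is taken to be a small grid of 0-255 grayscale values.

```python
def tile_stats(tile):
    """Return (contrast, directionality) for a 2D list of grayscale values.

    contrast is the spread between the brightest and darkest pixel.
    directionality is 0.0 when horizontal and vertical gradient energy are
    equal, and 1.0 when all gradient energy lies along one axis.
    """
    height, width = len(tile), len(tile[0])
    gx = gy = 0
    for y in range(height - 1):
        for x in range(width - 1):
            gx += abs(tile[y][x + 1] - tile[y][x])  # change across columns
            gy += abs(tile[y + 1][x] - tile[y][x])  # change across rows
    flat = [value for row in tile for value in row]
    contrast = max(flat) - min(flat)
    total = gx + gy
    directionality = abs(gx - gy) / total if total else 0.0
    return contrast, directionality


def looks_like_barcode(tile, min_contrast=100, min_directionality=0.8):
    """Hypothetical test: high contrast plus one dominant edge direction."""
    contrast, directionality = tile_stats(tile)
    return contrast >= min_contrast and directionality >= min_directionality


# Example tiles: vertical bars, and a checkerboard with no dominant direction.
stripes = [[0, 255, 0, 255] for _ in range(4)]
checkerboard = [[255 if (x + y) % 2 else 0 for x in range(4)] for y in range(4)]
```

This sketch only measures axis-aligned gradients, so a rotated barcode would need a true gradient-angle histogram rather than the two sums used here.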
Since these references concern dedicated barcode readers, they are not designed for more general-purpose image processing. In more general arrangements, consideration may be given to barcodes that might not be characterized by high contrast edges (e.g., barcodes that are in “soft” focus), and to other image scenes that might present high contrast linear edges, yet are not barcodes (e.g., a white picket fence against a blue sky background).
Google, in its U.S. Pat. No. 7,565,139, teaches a system that processes input imagery by applying multiple recognition processes, e.g., optical character recognition (OCR), object recognition, and facial recognition. Each process produces a confidence score with its results. If the facial recognition confidence score is higher than the other scores, then the image is presumed to be a face, and those results are used for further processing. If the OCR score is the highest, the image is presumed to depict text, and is treated on that basis. The other recognition processes are handled in the same manner.
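The highest-score-wins arbitration just described can be sketched as follows. This is an illustrative outline only, not the cited patent's implementation: `pick_best` and the three stand-in recognizers are hypothetical names, and the fixed scores they return are invented for the example.

```python
from typing import Any, Callable, Dict, Optional, Tuple

# Each recognizer takes image bytes and returns (confidence, result).
Recognizer = Callable[[bytes], Tuple[float, Any]]


def pick_best(
    image: bytes, recognizers: Dict[str, Recognizer]
) -> Optional[Tuple[str, float, Any]]:
    """Run every recognizer and keep the result with the highest confidence."""
    best = None
    for name, recognize in recognizers.items():
        confidence, result = recognize(image)
        if best is None or confidence > best[1]:
            best = (name, confidence, result)
    return best


# Stand-in recognizers with fixed, made-up scores (illustration only).
def fake_ocr(image: bytes) -> Tuple[float, Any]:
    return 0.82, "SALE 50% OFF"


def fake_face(image: bytes) -> Tuple[float, Any]:
    return 0.10, None


def fake_object(image: bytes) -> Tuple[float, Any]:
    return 0.35, "storefront"


RECOGNIZERS = {"ocr": fake_ocr, "face": fake_face, "object": fake_object}
```

Note that every recognizer runs on every image, whichever one wins, so the cost grows with the number of recognition processes.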
It will be recognized that this is a brute force approach, in which all possible recognition processes are tried in order to get a useful result. Indeed, the processing is performed by a remote server, since timely execution of the various involved algorithms is apparently beyond the capabilities of mobile platforms.
Pixto (since acquired by Nokia) teaches a more sophisticated approach to mobile visual query in its application 20080267504. In the Pixto arrangement, a mobile handset obtains GPS information to determine the geographical context in which imagery is captured. If the handset is found to be in a shopping mall, a barcode recognition process is preferentially applied to captured image data. If the handset is found to be outdoors, an object recognition process may be most appropriate. (The phone may load an object glossary emphasizing local points of interest, e.g., the Statue of Liberty in New York Harbor.) A set of rules, based on location context, is thus applied to determine what image recognition processing should be performed. (Pixto also teaches looking for stripes in imagery to indicate barcodes, and looking for regions of high spatial frequency content as possibly indicating text.)
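A location-driven rule set of this kind might be sketched as a lookup from a context label to an ordered list of recognition processes. The context labels, the table contents beyond the first entry of each list, the default ordering, and the function name are all assumptions; how GPS coordinates are mapped to a label such as "shopping_mall" is outside the sketch.

```python
from typing import Dict, List

# Fallback ordering when the location context is unknown (assumed).
DEFAULT_ORDER: List[str] = ["object", "barcode", "ocr"]

# Only the first entry of each list reflects the description above;
# the remaining ordering is an illustrative guess.
CONTEXT_RULES: Dict[str, List[str]] = {
    "shopping_mall": ["barcode", "ocr", "object"],
    "outdoors": ["object", "ocr", "barcode"],
}


def recognition_order(context: str) -> List[str]:
    """Return the recognition processes to try, most preferred first."""
    return list(CONTEXT_RULES.get(context, DEFAULT_ORDER))
```

Returning a copy of the list keeps callers from accidentally modifying the shared rule table.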
In accordance with certain embodiments of the present technology, drawbacks associated with the foregoing approaches are overcome, and new features are provided.
In one particular embodiment, color saturation of input image data is used as a metric to discriminate whether a first set of image recognition processes (e.g., object or facial recognition) is more likely to be relevant than a second set of image recognition processes (e.g., OCR or barcode reading). Such a classification technique can be used in conjunction with other known arrangements, including those taught in the references noted above, to improve their performance and usefulness.
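One way such a saturation metric could be computed is sketched below. The sketch assumes that low average saturation (as in black printing on a white background) points toward OCR or barcode reading and that higher saturation points toward object or facial recognition; that direction, the 0.25 threshold, and the function names are illustrative assumptions, not values taken from this description.

```python
import colorsys

# Hypothetical cut-off on mean HSV saturation (range 0.0 to 1.0).
SATURATION_THRESHOLD = 0.25


def mean_saturation(pixels):
    """Mean HSV saturation of an iterable of (r, g, b) tuples, channels 0-255."""
    total = 0.0
    count = 0
    for r, g, b in pixels:
        _, saturation, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        total += saturation
        count += 1
    return total / count if count else 0.0


def likely_process_set(pixels, threshold=SATURATION_THRESHOLD):
    """Pick the set of recognition processes to favor for this image."""
    if mean_saturation(pixels) >= threshold:
        return ("object", "face")   # colorful imagery (assumed direction)
    return ("ocr", "barcode")       # near-monochrome imagery (assumed direction)
```

Because the metric is a single pass over the pixels, it is cheap enough to run before any recognition process is started, which is what lets it act as a pre-classifier.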
The foregoing and other features and advantages of the present technology will be more apparent from the following detailed description, which proceeds with reference to the accompanying drawings.