With the rapid growth in large collections of digital images and increasing capabilities for quickly and conveniently acquiring such images in natural settings, interest in developing efficient ways for extracting useful information from these images in an automated fashion is increasing as well. For example, the wide proliferation of mobile computing devices (such as smart phones) with integrated cameras and network (e.g., Internet) access gives rise to a desire for technologies that enable analyzing a picture of an object of interest—such as a product, building, etc.—on the fly to retrieve relevant information associated with the object (e.g., a product description, the name of the building, etc.). It will be evident that object-detection and -recognition capabilities have vast application potential in e-commerce, tourism, and other contexts.
Accordingly, much research has been done, and is being done, on computer-vision approaches for detecting and recognizing certain types of objects within images. Given the ubiquity of text objects (such as words, numbers, or symbols) in our environment, text recognition is a task of particular importance. A number of text-recognition approaches that are successful in certain circumstances have been developed. For instance, commercially available optical character recognition (OCR) systems achieve high performance on text-containing images obtained, e.g., by scanning a page of a book or other printed medium, where text is typically displayed in constrained settings, e.g., on a uniform (typically white) background, in standard fonts, etc. However, these systems generally do not provide satisfactory performance on textual images acquired in natural settings, e.g., photos of billboards, traffic signs, product labels, etc. Such images are often characterized by noisy backgrounds, perspective distortion, irregular sizes and fonts, unusual aspect ratios, and so on, resulting in low classification performance (i.e., incorrectly identified text) and/or an impracticably high computational load. Accordingly, alternative text-recognition approaches that achieve higher performance, particularly on images of text occurring in natural settings, are desirable.