The ability to easily capture and store digital photographs has made such photographs a valuable source of information. One area in which they have proven to be a valuable resource is text detection. Text detection systems employ text detection algorithms to identify areas in images, such as images of street sides, that are most likely to contain text. The identified areas can then be processed by a text recognition algorithm (optical character recognition, or OCR). There are two advantages to using text detection prior to OCR. First, because conventional OCR algorithms are typically slow, it is computationally advantageous to identify the areas that are likely to contain text, which reduces the area that the text recognition algorithm has to process and thus its computational workload. Second, identifying areas that are likely to contain text enables the pruning of areas that do not contain text, which can reduce the error rate of an OCR algorithm.
In some applications, text detection can be beneficial even where there is no need to actually recognize the text. For example, as part of the texturing of building models for services such as Microsoft Virtual Earth™, there can be a need to stitch several ground-level images into one unified texture to model a building façade. When doing so, it is beneficial to avoid stitching the images in areas that include text (such as shop signs), in order to prevent the generation of unreadable text in the final texture.
Accordingly, detecting text in natural scenes (as opposed to scans of book pages, faxed documents, and business cards) is an important step for a number of applications. Other applications where such functionality can be vital include computerized aids for visually impaired persons, precise and automatic geo-coding of businesses, automatic navigation in urban environments, recognition of goods on store shelves, and the like.
Natural images can include components that have a wide range of text fonts, language types, colors and illumination changes. Some conventional systems that are used to detect text in natural images rely on particular color contrast, horizontal and vertical features, windows of expected pixel height and boundaries. Because such systems rely on these parameters, they can fail to detect text reliably where a natural image includes the aforementioned wide range of text fonts, language types, colors and/or illumination changes. Consequently, many conventional systems produce a large number of false detections (e.g., false positives) and are thus inadequate for many text detection applications.