Technological advances in computer hardware, software, and networking have led to efficient, cost-effective computing systems (e.g., desktop computers, laptops, handhelds, cell phones, servers . . . ) that can communicate with each other from essentially anywhere in the world. Such systems continue to evolve into more reliable, robust and user-friendly systems. As a consequence, more and more industries and consumers are purchasing computers and utilizing them as viable electronic alternatives to traditional paper and verbal media for exchanging information. Many industries and consumers are leveraging computing technology to improve efficiency and decrease cost. For instance, consumers can scan and store documents, create an album of digital images with text overlays, search and retrieve specific information (e.g., web pages with various types of data), upload pictures from digital cameras, view financial statements, transmit and/or receive digital facsimiles, exchange correspondence (e.g., email, chat rooms, voice over IP . . . ), etc.
The data available to computing systems includes text, images, audio, video, drawings, tables, etc. In addition, the data can include any combination thereof. For example, printed text often appears in many types of images; a scan of a printed page will often contain multiple components including text, images, and line drawings; and photographs often portray scenes in which text plays a meaningful role. Automatic text detection is a key technology in many applications involving these types of data. For instance, automatic text detection can be utilized to identify the parts of a printed document image to which Optical Character Recognition (OCR) should be applied. It can also be used as part of a broader analysis of an entire page's layout, and/or to quickly identify text in photographic images that might lead to a richer understanding of the rest of the image. The text in any given image can take on a wide variety of forms. For example, the characters representing text may be large or small; white, black, or colored; aligned in rows or columns; and may appear together or in isolation. Non-text regions can be relatively complex since they can include any sort of object or drawing.
Conventional text detection techniques are designed and tested with many types of images. One particularly difficult situation where robustness is important is when the image includes artifacts such as dithering patterns. In general, dithering occurs rather frequently in real document images, and can easily confuse a text detector when the dithering is adjacent to text. This may be because dithering dots are often indistinguishable from marks such as the dots on i's. Dithering dots further away from text are often more easily distinguishable. Many dithering dots share many qualities, and may even be identical, regardless of whether they are near text or not. Since detection of most dithering dots is relatively easy, a model for the characteristics of the dominant dithering components can be used to facilitate the detection of dots closer to text. In other words, dots close to text can be identified as dithering because they look like other dots that are unambiguously identified as dithering. This type of processing is broadly called transduction. In transduction, the set of test samples is classified as a group. The transduction approach allows the unique statistical structure of the group to play a role in the classification. Another technique that can be utilized in a text detection system is inductive classification. With this classification approach, test samples are treated independently rather than as a group.
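The difference between the inductive and transductive approaches described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the `Dot` record, the use of pixel area as the sole feature, the distance threshold `far`, and the `tolerance` band are hypothetical and are not taken from any particular text detector. The inductive rule labels each dot on its own, while the transductive rule first collects the dots that are unambiguously dithering, models their area, and then relabels nearby dots that fit that model.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class Dot:
    area: float          # pixel area of the connected component (assumed feature)
    dist_to_text: float  # distance in pixels to the nearest text-like component


def classify_inductive(dot, far=20.0):
    """Label one dot independently: only dots far from text are called dithering."""
    return "dither" if dot.dist_to_text >= far else "text"


def classify_transductive(dots, far=20.0, tolerance=2.0):
    """Label all dots as a group, using the confident ones to model dithering."""
    labels = [classify_inductive(d, far) for d in dots]
    confident = [d.area for d, lab in zip(dots, labels) if lab == "dither"]
    if not confident:
        return labels  # no group statistics available; keep the inductive labels
    centre = mean(confident)
    spread = max(pstdev(confident), 1.0)  # floor avoids a zero-width model
    for i, d in enumerate(dots):
        # Relabel a dot near text when its area matches the dominant dithering dots.
        if labels[i] == "text" and abs(d.area - centre) <= tolerance * spread:
            labels[i] = "dither"
    return labels


page = [
    Dot(area=4, dist_to_text=50),
    Dot(area=5, dist_to_text=40),
    Dot(area=4, dist_to_text=35),
    Dot(area=4, dist_to_text=3),   # dithering dot adjacent to text
    Dot(area=12, dist_to_text=2),  # dot of an "i"
]
inductive_labels = [classify_inductive(d) for d in page]
transductive_labels = classify_transductive(page)
```

In this toy page the fourth dot is labeled as text by the inductive rule because it sits next to text, but the transductive rule relabels it as dithering because its area matches the three dots that were confidently identified far from text. The larger dot of the "i" does not fit the model and keeps its text label.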
In general, with conventional text detection approaches, achieving better than an 85% detection rate is not very difficult. The remaining undetected text is often statistically unusual, and distinguishing it from other document elements and/or sensor noise in the image can be quite difficult. For automatic text detection to be practically useful, however, a text detector should have an accuracy as close to perfect as possible, while being fast and robust. Therefore, there is a need to improve the detection of statistically unusual text, the overall text detection rate, robustness, and performance.