Technological advances in electronics are quickly reducing the size, weight, cost, and power consumption of cameras. Thus, mobile computing devices such as cell phones, PDAs, and notebook computers can capture information using small, inexpensive, low-resolution digital cameras that are now designed as subcomponents thereof. Such cheap and versatile cameras make it possible to easily photograph a wide variety of documents without using cumbersome peripheral devices such as a scanner. Documents ranging from books and legal documents to bills and business cards can now be photographed instantaneously on cell phones, PDAs, and laptop computers.
However, optical character recognition (OCR) of such photographed documents presents a different challenge. Conversions from paper to digital representations and back are straightforward when utilizing desktop computers and suitable peripheral scanning devices. In contrast, the rapid evolution of technologies suitable for supporting mobile computing now makes imaging of such documents by such devices more complicated. For example, it is fairly difficult to print and scan documents when untethered from a suitable peripheral device. Mobile printers are heavy, power hungry, and expensive. Portable scanners are equally unwieldy. Moreover, these low-resolution cameras, whether standalone or embedded in a mobile computing device, present a new challenge to the OCR of such photographed documents. Traditional printed-character recognition systems proceed by first binarizing the image, then segmenting the characters, and then recognizing the characters. Because these systems perform the steps separately, recognition rates are much lower than desired, thereby diminishing the incentive to use such device capabilities. The OCR process is much more complicated when using these low-resolution cameras because it is virtually impossible to do quality binarization or character segmentation independent of the recognition process. Segmentation is where the OCR engine organizes the pixels of a pixelated scanned image into characters.
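The staged approach described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tiny darkness maps `SHARP` and `BLURRED` are hand-made rather than taken from a real camera, the two-glyph `TEMPLATES` table stands in for a real recognizer, and the function names are hypothetical. It only shows how a hard binarization decision can remove the gap that a later, independent segmentation step relies on.

```python
# Illustrative sketch only: a toy "binarize, then segment, then recognize"
# pipeline. All data and names below are hypothetical, not from a real system.

# Exact-match templates for two toy glyphs, 3 rows tall (1 = ink, 0 = paper).
TEMPLATES = {
    ((1, 0), (1, 0), (1, 1)): "L",
    ((1,), (1,), (1,)): "I",
}

# Hand-made darkness maps (0 = white, 9 = black) of "L", a gap column, "I".
SHARP = [
    [9, 0, 0, 9],
    [9, 0, 0, 9],
    [9, 9, 0, 9],
]
# Same layout, but with made-up blur bleeding into the gap column.
BLURRED = [
    [8, 4, 5, 8],
    [8, 4, 5, 8],
    [8, 8, 5, 8],
]


def binarize(gray, threshold=5):
    """Stage 1: hard-threshold each pixel into ink (1) or paper (0)."""
    return [[1 if v >= threshold else 0 for v in row] for row in gray]


def segment(binary):
    """Stage 2: split into glyphs wherever a column contains no ink."""
    width = len(binary[0])
    glyphs, start = [], None
    for c in range(width + 1):
        has_ink = c < width and any(row[c] for row in binary)
        if has_ink and start is None:
            start = c
        elif not has_ink and start is not None:
            glyphs.append(tuple(tuple(row[start:c]) for row in binary))
            start = None
    return glyphs


def recognize(glyph):
    """Stage 3: look the glyph up; unknown shapes become '?'."""
    return TEMPLATES.get(glyph, "?")


def ocr(gray):
    """Run the three stages strictly in sequence, with no feedback."""
    return "".join(recognize(g) for g in segment(binarize(gray)))


print(ocr(SHARP))    # the gap column survives binarization
print(ocr(BLURRED))  # the gap column is thresholded to ink, so the glyphs merge
```

In this toy case the blurred input yields a single merged blob that matches no template, and because the stages run independently, the recognizer has no way to ask for a different threshold or a different cut.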
If such mobile technology can be provided with an OCR capability for capturing and processing document data, it is expected that people will use the cameras therein to take pictures of many different types of documents, including restaurant bills, interesting articles, reports, book covers, business cards, screen shots, slides projected on a wall, and maps. The incentive to capture a wide variety of documents is high, since such documents, once in electronic form, can later be stored in large quantities, indexed, archived, edited, and even printed back. Additionally, for the information worker, this presents a wealth of opportunities.
Current cheap cameras (e.g., about $50) do not provide sufficient resolution to capture a whole document page at 11-point font in one exposure and perform character segmentation independently of character recognition. Low-resolution camera images are so blurry and of such poor quality that binarization, segmentation, and recognition cannot be performed independently while still maintaining high recognition rates. For instance, at low resolution, the middle column of the character “o” is often a better candidate for a cut than many of the true cuts between letters. One cannot hope for good performance without addressing the segmentation problem effectively. Improvements in low-resolution OCR benefit high-resolution cameras as well, thereby allowing users to take pictures from farther away and out of focus.
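The remark about the character “o” can be made concrete with a small sketch. The darkness values below are hand-made for illustration and are not measurements from any camera; the names `IMAGE`, `column_ink`, and `best_cut` are hypothetical. The sketch assumes a simple segmenter that cuts at the interior column containing the least ink, which is one common heuristic and not necessarily the method of any particular OCR system.

```python
# Illustrative sketch only: hand-made darkness map (0 = white, 9 = black)
# of a blurry "o" (columns 0-2), a gap (column 3), and an "l" (column 4).
IMAGE = [
    [3, 7, 3, 4, 9],
    [8, 1, 8, 4, 9],
    [8, 0, 8, 4, 9],
    [8, 1, 8, 4, 9],
    [3, 7, 3, 4, 9],
]
O_MIDDLE = 1  # column through the hollow centre of the "o"
TRUE_GAP = 3  # column that actually separates the two letters


def column_ink(image):
    """Total darkness in each column (a vertical projection profile)."""
    return [sum(col) for col in zip(*image)]


def best_cut(image):
    """Pick the interior column with the least ink as the cut position."""
    ink = column_ink(image)
    return min(range(1, len(ink) - 1), key=lambda c: ink[c])


ink = column_ink(IMAGE)
print("ink per column:", ink)
print("chosen cut column:", best_cut(IMAGE))
print("true gap column:", TRUE_GAP)
```

With these assumed values, blur fills the true gap with more ink than the hollow centre of the “o” contains, so a cut chosen on ink alone splits the letter instead of separating the two letters. Only knowledge of what the resulting pieces would be recognized as can rule out such a cut.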
Thus, what is needed is an OCR capability that can resolve low resolution symbols.