Cellular telephones, personal digital assistants (PDAs) and other portable electronic devices have become fixtures of everyday life over the last several years. As they have evolved, prices have continued to fall while the devices' capabilities have expanded. Currently, such devices can be used in many places to make a wireless connection to the Internet, play games, and carry out email and other text messaging functions. It can readily be anticipated that, as time goes by, the capabilities of such devices will continue to expand as prices continue to fall, making use of such devices a permanent part of people's daily lives.
Wireless two-way communication products are emerging which will enable users to have portable live video and still image transmission capabilities. For example, cellular telephones and PDAs are being developed with an integrated camera and display to provide video telephone calls and image capture. Similarly, digital cameras will likely be equipped with wireless transceivers, enabling them to transfer images to other devices for printing, storage, and sharing. Such capability is likely to become more prevalent in the future, and it can reasonably be expected that the resolution of the captured images will be enhanced over time. It is also expected that, due to memory constraints in portable devices, still images will be captured and then transmitted over wireless networks and the Internet for remote storage.
Recently, Optical Character Recognition (OCR) has become an increasingly “mainstream” application. The technology has become accurate, fast and stable. In addition, as the power of OCR systems has advanced considerably, the prices of OCR software applications have decreased. Documents of almost any form can be readily converted into editable computer files.
Optical Character Recognition is a process of capturing an image of a document and then extracting the text from that image. During the recognition process, the document is analyzed for several key factors such as layout, fonts, text and graphics. The document is then converted into an electronic format that can be edited with application software. The document can be in many different languages and can have many different forms and features. For example, some of the latest OCR applications can read over ninety (90) different languages, and can read tables as well as images contained within a document. The latest OCR readers utilize neural network-based recognition and feature extraction technologies to achieve accuracy rates over 99.9975%, or one character misread in 40,000. To achieve even higher accuracy rates, software applications that perform check-digit validation can be used to reduce this error rate to fewer than one in 3,000,000 characters.
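The relationship between the quoted accuracy rate and the "one in 40,000" figure, and the way a check digit can catch a misread character, can be illustrated with a minimal sketch. The text above does not name a particular check-digit scheme; the Luhn algorithm is used here purely as an assumed example, and the function names `one_in_n` and `luhn_valid` are hypothetical.

```python
def one_in_n(accuracy_percent: float) -> int:
    """Convert an accuracy percentage into 'one error per N characters'."""
    error_percent = 100.0 - accuracy_percent
    return round(100.0 / error_percent)


def luhn_valid(digits: str) -> bool:
    """Return True if a digit string passes the Luhn check (assumed scheme)."""
    if not digits.isdigit():
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


if __name__ == "__main__":
    print(one_in_n(99.9975))            # errors per character, as "one in N"
    print(luhn_valid("79927398713"))    # a number whose check digit is correct
    print(luhn_valid("79927398718"))    # same number with the last digit misread
```

In such a scheme, a recognized field that fails the check can be flagged for re-recognition or manual review rather than accepted, which is one way the residual error rate could be lowered.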
There are two basic methods used for OCR: Matrix Matching and Feature Extraction. The simpler and more common of the two is Matrix Matching. It compares what the OCR device sees as a character against a library of character matrices or templates. When an image matches one of these prescribed templates within a given level of accuracy, the OCR application assigns that image the corresponding American Standard Code for Information Interchange (ASCII) symbol. Feature Extraction, also known as Intelligent Character Recognition (ICR), is OCR without strict matching to prescribed templates. The results of ICR applications vary with the amount of computing intelligence applied by the device. The application looks for general features such as open areas, closed shapes, diagonal lines, line intersections, etc. When there is little or no variation within the type styles and there is a limited set of type styles, Matrix Matching is the preferable method. Where the characters are less predictable, Feature Extraction is the preferred method.
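Matrix Matching as described above can be sketched in a few lines. The example below is illustrative only: the 3x5 templates, the `TEMPLATES` library, the function names and the 0.9 threshold are all assumptions chosen for the sketch, not values taken from any particular OCR product.

```python
from typing import Optional, Sequence

# Hypothetical template library: each character is a 3x5 grid,
# where '#' is an inked pixel and '.' is background.
TEMPLATES = {
    "I": ("###", ".#.", ".#.", ".#.", "###"),
    "L": ("#..", "#..", "#..", "#..", "###"),
    "T": ("###", ".#.", ".#.", ".#.", ".#."),
}


def match_score(glyph: Sequence[str], template: Sequence[str]) -> float:
    """Fraction of pixels that agree (assumes both grids have the same size)."""
    pairs = [(g, t) for g_row, t_row in zip(glyph, template)
             for g, t in zip(g_row, t_row)]
    return sum(g == t for g, t in pairs) / len(pairs)


def matrix_match(glyph: Sequence[str], min_score: float = 0.9) -> Optional[str]:
    """Return the best-matching ASCII character, or None if no template
    matches within the required level of accuracy."""
    best_char, best_score = max(
        ((char, match_score(glyph, tmpl)) for char, tmpl in TEMPLATES.items()),
        key=lambda pair: pair[1],
    )
    return best_char if best_score >= min_score else None


if __name__ == "__main__":
    noisy_l = ("#..", "#..", "##.", "#..", "###")   # an 'L' with one stray pixel
    unknown = ("#.#", ".#.", "#.#", ".#.", "#.#")   # resembles no template
    print(matrix_match(noisy_l))
    print(matrix_match(unknown))
```

The threshold plays the role of the "given level of accuracy": a glyph with a small amount of noise is still assigned a character, while a shape that resembles no template is rejected. A Feature Extraction approach would instead compare derived properties such as closed shapes or line intersections.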