The present invention relates to a system for the automated conversion of displayed text to audio.
A vast amount of information is available in “hardcopy” print media such as books, newspapers, leaflets, and mailings as well as electronic print media such as online documents. Many people, however, are unable to avail themselves of this information due to visual impairment or illiteracy.
There are a variety of techniques employed to audibly convey the content of print media to those who cannot read it. For example, print media may be recorded onto tapes which may then be made available for audio replay. However, this is highly inefficient and has found only limited use with respect to popular novels and certain educational materials.
One existing system is capable of capturing an image of print media using a scanner or fax machine, recognizing the printed words from the image, and reciting each word in the order printed by relying upon phonemes. In this system, the optical character recognition software requires that the text portion of the image be orthogonally oriented with respect to the boundaries of the image. In other words, if the text is diagonally skewed on the print media, the software in this system will not be capable of interpreting the text. Accordingly, to ensure that the text portion of an image is properly oriented, this system physically aligns the print media in an orthogonal orientation using a tray, a frame, or other structure. While the print media is physically maintained in this orientation, the system linearly scans it, reading successive rows of pixels into memory. The resulting data is arranged in a digital image format. The system then processes the digital image, identifies the printed letters, forms words from the letters, matches each word to an associated audio file of that word, and plays the audio files in the proper sequence.
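The final stage of the existing system described above, in which words formed from recognized letters are matched to audio files and queued in print order, might be sketched roughly as follows. This is an illustrative sketch only and is not taken from the system itself. The lookup table AUDIO_FILES, its file paths, and the function names words_in_print_order and build_playlist are hypothetical, and the input is assumed to be text lines already produced by character recognition from a properly oriented image.

```python
import re

# Hypothetical lookup table: recognized word -> audio file of that word.
# The paths are placeholders chosen for illustration only.
AUDIO_FILES = {
    "the": "audio/the.wav",
    "soup": "audio/soup.wav",
    "of": "audio/of.wav",
    "day": "audio/day.wav",
}


def words_in_print_order(recognized_lines):
    """Form words from recognized text lines, top to bottom and left to right."""
    words = []
    for line in recognized_lines:
        # Treat runs of letters (and apostrophes) as words; skip punctuation.
        words.extend(re.findall(r"[A-Za-z']+", line))
    return words


def build_playlist(recognized_lines, audio_files=AUDIO_FILES):
    """Pair each word with its audio file, preserving the printed sequence.

    A word with no associated audio file is paired with None so that the
    caller can decide how to handle it.
    """
    playlist = []
    for word in words_in_print_order(recognized_lines):
        playlist.append((word, audio_files.get(word.lower())))
    return playlist
```

For example, build_playlist(["The soup", "of the day."]) pairs the five printed words, in order, with their audio files. Actual playback is omitted because it depends on the audio hardware and libraries available.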
Unfortunately, using such a system is cumbersome. First, and particularly with respect to users of desktop flatbed scanners, a visually impaired person may have difficulty properly aligning the print media with respect to the scanning surface. Second, desktop flatbed scanners and fax machines are often too bulky and/or heavy to be used in a variety of social contexts, such as for a menu in a restaurant or for a magazine in a waiting room. Finally, such a system requires that print media be fed into the device page by page, which is not practical with respect to many items such as menus, bound books, or magazines.