1. Field of the Invention
The present invention relates to the field of speech processing and, more particularly, to opening more applications to speech synthesis by using a printer driver architecture as a mechanism to feed data to a text-to-speech engine.
2. Description of the Related Art
Many applications include text-to-speech (TTS) processing capabilities, which permit each application to audibly present machine generated speech that has been automatically constructed from textual content present within the application. This TTS processing capability is especially useful for visually impaired computer users who have difficulty interpreting visually displayed content and for users of mobile and embedded computing devices, where the mobile and embedded computing devices may lack a screen, possess a tiny screen unsuitable for displaying large amounts of content, or be used in an environment where it is not appropriate for a user to visually focus upon a display. An inappropriate environment can include, for example, a vehicle navigation environment, where outputting navigation information to a display for viewing can be distracting to a driver.
For most of these applications having TTS capabilities, the computer readable instructions responsible for providing the TTS processing capabilities are embedded within the code of the application itself, and can be accessed through a user interface specific to the application. For example, an “options” menu under a “tools” heading can open an interface dialogue box through which an application's TTS capabilities can be configured by a user.
Unfortunately, many applications lack text-to-speech capabilities. Notably included among the applications currently lacking TTS capabilities are a popular PDF reader and many text editing and word processing programs, such as the NOTEPAD application and the WORDPAD application. It is very cumbersome, if not impossible, for a user to convert content within an application that lacks integrated TTS capabilities into speech output.
For example, one technique for generating speech output is to “cut and paste” content from a first application that lacks TTS capabilities to a second application that includes TTS capabilities. After pasting the content into the second application, the TTS capabilities of the second application can be used to generate speech output. This approach is inefficient, is subject to manual user errors during the cut and paste process, consumes substantial computing resources such as RAM, requires a user to possess an application with TTS capabilities, and is generally cumbersome to implement.
Another approach is to generate a file in a format of the first application and to convert this file using a conversion application into an audio format, where the converted file includes encoded speech which has been generated by a text-to-speech engine based upon the content of the original file. For example, conversion programs exist that convert PDF formatted documents into MP3 formatted audio files, where TTS conversion of textual content included within the PDF file occurs during the conversion process.
The conversion approach has numerous shortcomings. First, the solution is limited to particular types of file formats, such as PDF formatted documents and MP3 formatted audio files, and cannot be generally applied in a file-format independent manner. Second, the solution requires a user to perform multiple steps that include: (1) saving content included within an open application to a file, (2) instantiating a conversion application, (3) selecting the saved file from the conversion application and providing a name and location for the new file, (4) executing the file conversion operation, and (5) using a third application to open the newly converted file, where the third application audibly presents the text-to-speech converted content. Consequently, like the cut and paste method, the file conversion method is inefficient and cumbersome for a user to utilize.