A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the United States Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
The present invention is directed to messaging systems, and more particularly, to a messaging system that provides text-to-speech conversion of e-mail documents for playback to a caller over a telephone connection to the system.
Electronic mail, or “e-mail”, has become an almost ubiquitous form of communication. Corporate computer networks have for some time provided e-mail services to the employees of many corporations. With the advent of the Internet, millions of home computer users now have access to e-mail services. Recognizing that e-mail users may not always have ready access to their computers, which are the primary means by which users retrieve e-mail messages, many messaging system vendors have begun to provide access to e-mail messages from a telephone handset. Using a text-to-speech converter and an interface to a telephone network, such as the Public Switched Telephone Network (PSTN) or a private branch exchange (PBX), these messaging systems enable a user to dial in to the system from a standard telephone and to have the text of an e-mail message played back to the user over the telephone handset. With this capability, e-mail users can retrieve their e-mail messages from virtually anywhere.
U.S. Pat. No. 5,715,370 discloses a system for extracting selectable fields of text from various structured data files, including e-mail message files, and then feeding the extracted text data to a text-to-speech converter so that the selected text can be played back to a user over a telephone handset. A personal computer user, for example, can use this system to telephone a personal computer and then listen to portions of e-mail messages and other files that have been converted to speech by the text-to-speech converter.
U.S. Pat. No. 5,825,854 discloses a telephone access system for accessing a computer through a telephone handset. Audio instructions are provided to the user to select between a plurality of audio dialogs. An audio dialog may provide access to voice mail, electronic mail, facsimiles, or other data stored on the computer. Within an audio dialog, the user is provided with instructions and controls that allow the user to remotely manipulate the information stored in the computer from the telephone handset. For example, the system can allow the user to listen to an e-mail message that has been converted from text to speech, and then to reply to the message.
U.S. Pat. Nos. 5,530,740 and 5,737,395 both disclose universal messaging systems that integrate a voice and facsimile messaging system with an e-mail messaging system via a network. The integrated systems are capable of converting text messages, such as e-mail messages, into voice messages for playback over a telephone handset.
U.S. Pat. No. 5,479,411 discloses another universal messaging system that integrates voice, facsimile, and e-mail messaging and that converts portions of e-mail messages into voice messages for playback over a telephone handset.
In all of the foregoing systems, the load on the text-to-speech converter that converts the e-mail messages into speech can be significant, particularly in larger messaging systems that support simultaneous access by hundreds and even thousands of users. Consequently, there is a need for methods and apparatus that help to reduce the load on the text-to-speech converter in these systems. The present invention satisfies this need.
The present invention is directed to a method, apparatus, and computer program product for reducing the load on a text-to-speech converter in a messaging system in which e-mail messages are converted to speech for playback to a user over a telephone handset.
A messaging system in accordance with the present invention comprises a storage unit for storing e-mail messages, a text-to-speech converter for converting the different text segments of e-mail messages into speech signals for playback to a user via a telephone handset, and a cache for storing the speech signals of selected ones of previously converted text segments. Upon a subsequent request by a user to convert the text segments of a new e-mail message to speech signals for playback via a telephone handset, the speech signals of previously converted text segments that are identical to any text segments of the new e-mail message are played back from the cache, thus avoiding the need for the text-to-speech converter to convert those text segments of the new e-mail message to speech. The load on the text-to-speech converter is thereby reduced.
A method of the present invention, for use in a messaging system that comprises a storage unit for storing e-mail messages, each comprising a plurality of different text segments, and a text-to-speech converter for converting the text segments of an e-mail message into speech signals for playback to a user via a telephone handset, comprises (i) storing the speech signals of selected ones of previously converted text message segments in a cache, (ii) receiving a request from a user to convert the text segments of a new e-mail message to speech signals for playback to the user over a telephone handset, (iii) comparing a text segment of the new e-mail message to the previously converted text segments for which speech signals are stored in the cache, and (iv) if one of the previously converted text segments matches the text segment of the new e-mail message, playing back the stored speech signal for the previously converted text segment from the cache instead of performing a text-to-speech conversion on the text segment of the new e-mail message.
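By way of illustration only, steps (i) through (iv) can be sketched in Python as follows. This is a minimal sketch, not the disclosed implementation: the class name SegmentSpeechCache, the counter conversions, and the stand-in converter fake_tts are hypothetical, and the speech signal is represented simply as a byte string.

```python
from typing import Callable, Dict, List


class SegmentSpeechCache:
    """Maps previously converted text segments to their speech signals."""

    def __init__(self, tts: Callable[[str], bytes]) -> None:
        self._tts = tts                      # hypothetical text-to-speech converter
        self._cache: Dict[str, bytes] = {}   # text segment -> stored speech signal
        self.conversions = 0                 # how many times the converter was invoked

    def speech_for(self, segment: str) -> bytes:
        # Steps (iii) and (iv): compare the segment with the cached segments
        # and reuse the stored speech signal when an identical one exists.
        cached = self._cache.get(segment)
        if cached is not None:
            return cached
        # No match: convert the segment, then store the result (step (i)).
        self.conversions += 1
        speech = self._tts(segment)
        self._cache[segment] = speech
        return speech

    def play_message(self, segments: List[str]) -> List[bytes]:
        # Step (ii): a playback request covering every segment of one message.
        return [self.speech_for(s) for s in segments]


def fake_tts(text: str) -> bytes:
    """Stand-in converter (an assumption): returns bytes instead of real audio."""
    return text.upper().encode("utf-8")


cache = SegmentSpeechCache(fake_tts)
cache.play_message(["FROM: alice", "Hello"])
cache.play_message(["FROM: alice", "Goodbye"])   # "FROM: alice" comes from the cache
```

In this sketch the two messages contain four segments in total, but the converter runs only three times because the repeated "FROM:" segment is served from the cache.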
In one embodiment, only the speech signals of text segments that satisfy a predetermined maximum length requirement are stored in the cache. For example, the predetermined maximum length requirement may be forty (40) characters. In another embodiment, all of the “TO:”, “FROM:”, “CC:”, and “RE:” segments of each e-mail message are cached regardless of length, but a predetermined maximum length requirement (e.g., 40 characters maximum) is still applied to the message bodies. In yet another alternative embodiment, rather than applying a maximum length requirement to the body of the message, each individual sentence of the message body is cached and when the cache approaches its storage capacity, the cached speech signals for longer sentences are discarded first to make room for newly cached sentences. Alternatively, the discard determination can be based on a combination of length and a least recently used algorithm, with a weighting factor arbitrating between the two.
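The caching and discard policies described above might be sketched as follows. This is one possible reading under stated assumptions, not the disclosed implementation: the names should_cache, Entry, and pick_victim are hypothetical, and the scoring formula (a weighted sum of normalized length and normalized age) is only one way a weighting factor could arbitrate between length and least recent use.

```python
from dataclasses import dataclass
from typing import Dict

HEADER_KINDS = {"TO", "FROM", "CC", "RE"}


def should_cache(kind: str, text: str, max_len: int = 40) -> bool:
    """Header segments are always cached; body text only if short enough."""
    return kind in HEADER_KINDS or len(text) <= max_len


@dataclass
class Entry:
    length: int      # characters in the cached text segment
    last_used: int   # logical clock tick of the most recent playback


def pick_victim(entries: Dict[str, Entry], now: int, weight: float) -> str:
    """Return the key of the cached segment to discard.

    weight=1.0 ranks by length alone (longest discarded first);
    weight=0.0 ranks by age alone (least recently used discarded first).
    Both terms are normalized to the range [0, 1] before weighting
    (the normalization is an assumption of this sketch).
    """
    max_len = max(e.length for e in entries.values()) or 1
    max_age = max(now - e.last_used for e in entries.values()) or 1

    def score(key: str) -> float:
        e = entries[key]
        age = now - e.last_used
        return weight * e.length / max_len + (1 - weight) * age / max_age

    return max(entries, key=score)


entries = {
    "short old": Entry(length=10, last_used=1),
    "long recent": Entry(length=100, last_used=9),
}
by_length = pick_victim(entries, now=10, weight=1.0)
by_recency = pick_victim(entries, now=10, weight=0.0)
```

With a weight of 1.0 the longer segment is discarded first, matching the length-based embodiment; with a weight of 0.0 the sketch reduces to a plain least recently used policy. Intermediate weights trade one criterion against the other.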
Additional features and advantages of the present invention will become evident hereinafter.