This invention relates to voicemail messaging.
Messaging systems allow a message recipient to listen to an audio message via his telephone or other audio terminal. In so-called voicemail systems, when the message is accessed from the voicemail system, the voicemail system typically presents header information, such as the time of receipt of the message and the identity of the sender, if known, and plays a recorded message, consisting of a segment of audio material, to the recipient. The recipient can navigate through the recorded message using his telephone keypad or voice input to effect a skip, rewind, pause, or other similar operation. Recently, integrated messaging systems have been introduced that have voice interfaces that can handle conventional voicemail messages as well as messages of other media types, such as email. In the latter case, a textual email message is delivered to the recipient's mailbox. When retrieved by the recipient through his audio terminal, the email header information is converted to audio and presented to the recipient together with the body of the message, which is played for the recipient using text-to-speech processing. Thus, in both the traditional voicemail systems and the integrated messaging systems, the body of the message is interpreted as a monolithic chunk of recorded audio or text, the latter being converted to audio, which audio in either case is played linearly to the recipient when he accesses his messaging system from his telephone or other audio terminal.
Voicemail and other messaging systems have revolutionized the way people communicate with each other in today's electronic age. Although the messaging systems available today are generally useful and have found widespread popularity, we have recognized that additional and highly advantageous functionality can be achieved in accordance with our invention.
The present invention is directed to a structured message that includes a plurality of messaging elements. The sending of such a message to a recipient's messaging system is the subject of co-pending patent application Ser. No. 09/318,140, filed on even date hereof. The assembly and presentation of the message, as described in further detail herein below, is the subject matter of patent application Ser. No. 09/318,450, also filed on even date hereof, and now U.S. Pat. No. 6,240,391, issued on May 29, 2001.
The structured message sent by a sender includes a plurality of messaging elements. These messaging elements may illustratively include textual fragments, speech fragments in attached audio files, references to audio or textual fragments stored at specified addresses, and explicit or implicit instructions that define the structure of the message. The message, including a plurality of such messaging elements, is delivered to an address, indicated in the message, of the recipient's mailbox on a messaging system that has the capability of interpreting the instructions incorporated within the structured message. That messaging system, upon retrieval of the message by the recipient, assembles, in accordance with the instructions that define the message structure, an audio message using the messaging elements associated with the message content, and presents that assembled message to the recipient in its intended format.
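By way of illustration only, the assembly step described above may be sketched in Python as follows. The class names, the `assemble` function, and the `fetch` callable are hypothetical names introduced for this sketch and are not taken from the specification; the sketch assumes that a reference to a stored fragment resolves to either a textual or an audio fragment, and it produces only an ordered play-out plan rather than actual audio.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple, Union


@dataclass
class TextFragment:
    text: str  # to be converted to speech at play-out


@dataclass
class AudioAttachment:
    filename: str  # audio file attached to the message


@dataclass
class RemoteReference:
    address: str  # address at which a fragment is stored


Resolved = Union[TextFragment, AudioAttachment]
Element = Union[TextFragment, AudioAttachment, RemoteReference]


def assemble(elements: List[Element],
             fetch: Callable[[str], Resolved]) -> List[Tuple[str, str]]:
    """Return an ordered play-out plan of (action, argument) pairs."""
    plan: List[Tuple[str, str]] = []
    for element in elements:
        # A reference is first resolved to the fragment stored at its address.
        if isinstance(element, RemoteReference):
            element = fetch(element.address)
        if isinstance(element, TextFragment):
            plan.append(("tts", element.text))
        elif isinstance(element, AudioAttachment):
            plan.append(("play", element.filename))
        else:
            raise TypeError(f"unsupported messaging element: {element!r}")
    return plan
```

In this sketch the order of the elements in the list stands in for the instructions that define the message structure; a fuller model would carry those instructions explicitly.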
Advantageously, the delivery of the structured message may enable interactions between the recipient and the message content, and between the recipient and the outside world. In particular, the embedded instructions within the message may be such as to allow a dialog between the recipient and the messaging system. Indeed, that dialog can, in accordance with the embedded instructions, allow the recipient to navigate between messaging elements through voice and/or keypad inputs, as if the recipient were connected to an active interactive voice response (IVR) system. The recipient will thus hear those content-related messaging elements from within the structured message that are associated with and are responsive to his command inputs.
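The navigation described above may be pictured, purely as an illustrative sketch, as a small state machine in which each messaging element carries a prompt and a table of inputs leading to other elements. The dictionary layout, the element names, and the `run_dialog` function are assumptions made for this example only; recognition of voice or keypad input is assumed to have already reduced each input to a short string.

```python
from typing import Dict, List

# Hypothetical message: each element has a prompt and the inputs it accepts.
MESSAGE: Dict[str, dict] = {
    "intro": {
        "prompt": "Press 1 for the agenda, 2 for directions.",
        "choices": {"1": "agenda", "2": "directions"},
    },
    "agenda": {
        "prompt": "The meeting starts at nine.",
        "choices": {"0": "intro"},
    },
    "directions": {
        "prompt": "Take exit 12.",
        "choices": {"0": "intro"},
    },
}


def run_dialog(elements: Dict[str, dict], start: str,
               inputs: List[str]) -> List[str]:
    """Return the prompts the recipient would hear for a sequence of inputs."""
    current = start
    heard = [elements[current]["prompt"]]
    for key in inputs:
        target = elements[current]["choices"].get(key)
        if target is not None:
            current = target
        # An unrecognized input simply replays the current element's prompt.
        heard.append(elements[current]["prompt"])
    return heard
```

For example, `run_dialog(MESSAGE, "intro", ["1", "0", "2"])` walks from the introduction to the agenda, back to the introduction, and then to the directions.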
The structured message may also contain embedded addresses, or "links" as they are currently known in the Internet art, that specify a telephone address such as a telephone number, or an IP telephony address. If the recipient performs an action, such as making a keypad entry or supplying a voice input, during his interaction with a structured message, which action is interpreted by the messaging system to represent a selection by the recipient of a specific link, placement of a call to that telephone number or address associated with that link is effected by the messaging system. Alternatively, the structured message may contain embedded links that specify a destination for messaging rather than telephony connections. Examples of the latter include email addresses and Web services for HTTP upload.
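As a minimal sketch of how a messaging system might distinguish the kinds of links described above, the following Python fragment classifies a link by its URI scheme. Encoding links as `tel:`, `sip:`, `mailto:`, and `http(s):` URIs is an assumption of this example and is not stated in the specification; the function name and the returned action labels are likewise hypothetical.

```python
from urllib.parse import urlparse


def classify_link(link: str) -> str:
    """Map an embedded link to the kind of action the system would take."""
    scheme = urlparse(link).scheme.lower()
    if scheme in ("tel", "sip"):
        # Telephone number or IP telephony address: a call is placed.
        return "place_call"
    if scheme == "mailto":
        # Messaging destination: gathered content is sent by email.
        return "send_email"
    if scheme in ("http", "https"):
        # Messaging destination: gathered content is uploaded to a Web service.
        return "http_upload"
    raise ValueError(f"unsupported link: {link!r}")
```

Keeping the classification separate from the action itself lets the same selection logic serve both telephony and messaging destinations.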
The messaging system can gather information from the recipient during his interaction with the structured message, which information is then sent to a destination specified, for example, by the sender, such as a server or email address. Receipt of that gathered information may result in a response from the specified destination, which response is processed by the messaging system and forwarded to the recipient. This, in effect, initiates an interactive session between the recipient and a service that is active at the destination specified in the original structured message.
The various capabilities of the structured message can also be combined in several ways. As an example, a structured message may cause coordinated data and telephony actions. Thus, the messaging system can collect input data from the recipient, communicate that data to a specified destination system, such as a server, and place a telephone call to a phone number associated with that destination system. That destination system can then be provided with information over the telephone call that enables it to access the separately sent data. The destination system then may use that data to enhance the handling of the telephone call in various ways.
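One way to picture the coordinated data and telephony actions described above is sketched below. The sketch assumes that the information provided over the telephone call is a randomly generated token that also accompanies the separately sent data; the specification does not say what form that information takes. The function name and the `send_data` and `place_call` callables are hypothetical stand-ins for the data and telephony interfaces of a messaging system.

```python
import uuid
from typing import Callable, Dict


def coordinated_action(collected: Dict[str, str],
                       phone_number: str,
                       send_data: Callable[[dict], None],
                       place_call: Callable[[str, str], None]) -> str:
    """Send the collected data, then place a call carrying a matching token."""
    # Assumption: a random token links the data record to the call.
    token = uuid.uuid4().hex
    send_data({"token": token, "data": collected})
    # The destination system can use the token to look up the data record.
    place_call(phone_number, token)
    return token
```

A destination system receiving the call could then retrieve the record stored under the same token and use it to enhance the handling of the call.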
In a specific illustrative embodiment, the structured message is formulated by a sender using, for example, a phone markup language (PML) to define the structure and the inherent embedded instructions associated with the structure of the message. The message then consists of PML markup interleaved with other messaging elements such as textual fragments that will be converted to speech by the messaging system, and/or audio and textual fragments that are made part of the message as attached files or that are retrievable from a designated address. After formulation, the composite message is sent over a data network, such as, for example, an IP network like the Internet, to the messaging system, which stores the composite message for later retrieval by the intended recipient.
The messaging system includes those functionalities necessary to interpret the embedded instructions within the stored structured message and to audibly present it to the recipient, while also being able to receive and interpret a recipient's audio or touch-tone inputs for interaction with the message in accordance with the instructions. In the specific illustrative embodiment, the messaging system receives and stores the PML-formatted message sent by the sender over the data network. Upon being accessed by the recipient for retrieval of the message, the system accesses the message, and a processor interprets the PML markup within the message to effect playing of the textual and/or audio fragments of the message to the recipient in accordance with the embedded instructions associated with that markup. Thus, for example, for a structured PML-formatted message including fragments of text and attached audio files, the message is formulated by converting the text to speech using a text-to-speech processor, and inserting the appropriate audio file(s) during the play-out to the recipient in the proper sequence, as determined by the embedded instructions within the PML-formatted message. Further, the illustrative messaging system includes a detector for detecting the recipient's touch-tone keypad inputs and an automatic speech recognizer (ASR) processor for recognizing and interpreting the recipient's voice and touch-tone inputs to effect interaction and navigation within the structured message as allowed by the markup within the message, as well as the transfer to and interaction with other destinations as specified by the markup.
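The interpretation of markup interleaved with text and audio, as described above, may be sketched as follows. The `<message>` and `<audio>` tags and the `src` attribute are invented for this illustration and should not be taken as actual PML syntax, which is not set out in this section; the sketch assumes an XML-like markup and yields only the sequence of text-to-speech and audio play-out steps.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hypothetical markup: text interleaved with a reference to an attached file.
EXAMPLE = (
    '<message>Hi Bob. Here is the recording: '
    '<audio src="clip.wav"/> Call me back.</message>'
)


def playout_plan(markup: str) -> List[Tuple[str, str]]:
    """Turn interleaved text and audio tags into ordered play-out steps."""
    root = ET.fromstring(markup)
    plan: List[Tuple[str, str]] = []

    def add_text(text) -> None:
        # Non-empty text is queued for text-to-speech, whitespace normalized.
        if text and text.strip():
            plan.append(("tts", " ".join(text.split())))

    add_text(root.text)
    for child in root:
        if child.tag == "audio":
            plan.append(("play", child.attrib["src"]))
        else:
            raise ValueError(f"unsupported tag: {child.tag}")
        # Text following a tag is kept in sequence after that tag.
        add_text(child.tail)
    return plan
```

Applied to `EXAMPLE`, the function produces a text-to-speech step, an audio play-out step, and a second text-to-speech step, in the order in which they appear in the markup.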
Advantageously, the structured message can be formulated by the sender through an editor with a graphical user interface running on a computer. Through the input of textual files, previously recorded audio fragments, as well as contemporaneously recorded fragments, the sender is able to formulate the structured message. Alternatively, the structured message could be created "by hand" with a text editor and an audio file recording utility.
Advantageously, if the structured message is sent to a plurality of recipients, the invention allows information to be gathered from each, without requiring real-time telephonic communications with each individual recipient to collect that information.
Although noted above as being associated with audio messaging, it should be understood that the present invention could equally be applied to multimedia messaging in which the messaging elements of the structured message may include video fragments that are assembled by the messaging system in accordance with the instructions embedded within the structured message.