The present invention relates generally to a method of transmitting audio messages over a network, and more particularly, a method of recording and retrieving audio attachments to electronic mail by use of a touch-tone telephone.
Electronic mail ("email") has proliferated as a common method of communication. Initial communications consisted of ASCII (American Standard Code for Information Interchange) text. In an ASCII file, each alphabetic, numeric, or special character is represented with a 7-bit binary number (a string of seven 0s or 1s), so 128 possible characters are defined. However, basic ASCII text email messages have progressed to include graphics, audio and even video. Graphic images, digital audio files and digital video all require an encoding and decoding process when transmitted over the Internet. A user wishing to encode a voice message and send the message to a preselected email address had to accomplish several steps and have certain hardware and software equipment. The user would typically record their voice message on a computer using a sound card attached to or integrated into the motherboard of the computer.
The voice message is a sequence of analog signals that are converted to digital signals by the audio card, using a microchip called an analog-to-digital converter (ADC). When sound is played, the digital signals are sent to the speakers where they are converted back to analog signals that generate varied sound.
Audio files are usually compressed for storage or faster transmission. Audio files can be sent in short stand-alone segments, for example, as files in the WAV format. In order for users to receive sound in real-time for a multimedia effect, listening to music, or in order to take part in an audio or video conference, sound must be delivered as streaming sound. More advanced audio cards support wavetables, or precaptured tables of sound. The most popular audio file format today is MP3 (MPEG-1 Audio Layer-3).
Once these digital audio files reside on the hard drive of the user, the user would attach the file to an email sent to a selected recipient. When the file is attached, it might be transmitted in a standardized protocol such as Multipurpose Internet Mail Extensions (herein "MIME"). MIME is an extension of the original Internet e-mail protocol that lets people use the protocol to exchange different kinds of data files on the Internet: audio, video, images, application programs, and other kinds, as well as the ASCII handled in the original protocol, the Simple Mail Transfer Protocol (SMTP). In 1991, Nathaniel Borenstein of Bellcore proposed to the Internet Engineering Task Force that SMTP be extended so that Internet (but mainly Web) clients and servers could recognize and handle other kinds of data than ASCII text. As a result, new file types were added to "mail" as a supported Internet Protocol file type.
Servers insert the MIME header at the beginning of any Web transmission. Recipients use this header to select an appropriate "player" application for the type of data the header indicates. Some of these players are built into the Web client or browser (for example, all browsers come with GIF and JPEG image players as well as the ability to handle HTML files); other components, such as audio file players, may need to be downloaded.
U.S. Pat. No. 5,945,989 to Freishtat et al. describes a method of adding or altering the content of a website by using a touch-tone telephone. Freishtat et al. describes a process of converting a telephone message into an audio file which can then be posted on a website (col. 2, lines 19-22; col. 4, lines 33-34; and col. 5, lines 6-7). There is also a suggestion that the handset on a touch-tone telephone operates as "a kind of substitute computer keyboard" (col. 2, lines 26-27 and col. 22, lines 24-26). However, the patent does not describe or suggest any means for transmitting the audio file by email. Nor does the touch-tone telephone entry describe or suggest a method of keying in any alphanumeric character based on the number of times the telephone button is depressed within a specified wait loop. Rather, the Freishtat et al. patent requires the user to establish a pre-existing touch-tone ID for each page element (col. 6, lines 48-49; col. 9, lines 62-65; and col. 10, lines 3-4). Accordingly, while the Freishtat et al. patent describes a method of digitizing recorded audio from a touch-tone telephone to a file for publication on a web server, there is no description or suggestion that the recorded audio file would be transmitted to a predetermined email recipient. Furthermore, there is no teaching or suggestion of a method of keying in the necessary array of alphanumeric characters to properly designate an email recipient over a touch-tone telephone without pre-existing email address identifiers.
U.S. Pat. No. 5,996,006 to Speicher describes an online dating service that converts audio files received via telephone into digital files for retrieval on the Internet (col. 5, lines 27-29 and col. 6, lines 37-39). However, the Speicher patent does not describe or suggest a method of directly sending the recorded audio file to a predetermined email recipient. Nor does the Speicher patent describe or teach a method of keying in the alphanumeric characters necessary to establish a preselected Internet email address over a touch-tone telephone. (See col. 6, lines 60-63, wherein Speicher teaches that the email address must be recorded by audio, then later manually translated to alphanumeric form.)
A number of companies such as Onebox.com, BuzMe.com, Inc., Getmessage.com, American Voicemail Network, Inc., and Excite, Inc. currently offer services wherein a pre-configured recipient account may be set up to receive email with audio file attachments originating from a regular voice mail system. However, in all of these systems, the recipient must set up an account in advance. Furthermore, these systems require the sender of the audio voice mail message to know ahead of time the recipient's voicemail telephone number, which typically has a unique extension for that recipient.
Consequently, there is a need in the art for a method of transmitting digital audio file attachments to a preselected email address without requiring the recipient to first set up an account with a service.
There is a further need in the art for a method of transmitting digital audio file attachments wherein the only information required by the sender is the recipient's email address.
However, in view of the prior art at the time the present invention was made, it was not obvious to those of ordinary skill in the pertinent art how the identified needs could be fulfilled.
The above and other objects of the invention are achieved in the embodiments described herein by providing a method of transmitting one or more audio file attachments in an electronic message from a touch-tone telephone comprising the steps of dialing into a predetermined telephone number, sending one or more DTMF signals on the touch-tone telephone corresponding to a preselected email address wherein the one or more DTMF signals is associated with a predetermined alphanumeric character, assembling a string of alphanumeric characters by repeating the DTMF signal entry until the preselected email address has been completed, recording an audio voice message over the touch-tone telephone, converting the audio voice message into a digital audio file, attaching the digital audio file to an electronic message directed to the preselected email address, and transmitting the electronic message to the preselected email address.
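The attaching step described above can be illustrated with a minimal Python sketch. It is not the implementation of the method, only one way the final steps might look, assuming the recording has already been converted to WAV bytes. The function name `build_voice_email`, the sender address, the subject line and the file name are all hypothetical.

```python
from email.message import EmailMessage


def build_voice_email(recipient: str, wav_bytes: bytes,
                      sender: str = "voicemail@example.invalid") -> EmailMessage:
    """Wrap a recorded voice message in an email addressed to `recipient`."""
    msg = EmailMessage()
    msg["From"] = sender          # hypothetical address of the central server
    msg["To"] = recipient         # the address keyed in over the telephone
    msg["Subject"] = "Voice message"
    msg.set_content("A voice message is attached.")
    # Adding the attachment turns the message into a MIME multipart message
    # with the audio carried as a base64-encoded audio/wav part.
    msg.add_attachment(wav_bytes, maintype="audio", subtype="wav",
                       filename="message.wav")
    return msg
```

Transmission itself would then hand the resulting message to a mail server, for example through the standard `smtplib` module; that step is omitted here because it depends on the deployment.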
In a preferred embodiment, a subscriber record is maintained so that the caller does not need to repeatedly enter in the same email addresses. The steps involved for utilizing a subscriber record system comprise sending an identification code to a central server by sending DTMF signals on the touch-tone telephone, associating the identification code with a subscriber record, validating the authenticity of the subscriber record, and authorizing the transmission of the electronic message based upon whether the identification code is authentic.
When a recipient receives a message from the caller, the return email address will typically be that of the central server. It is preferable that the recipient have the ability to correspond directly back to the caller by email if possible. This is accomplished by associating an alphanumeric reply string with the subscriber record and encoding the alphanumeric reply string into the electronic message in a reply-to field wherein a recipient of the electronic message may send a return electronic message addressed to the alphanumeric reply string.
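The subscriber lookup and the reply-to encoding might be sketched as follows. The subscriber table, the identification code `4471`, the addresses and the function names are hypothetical and serve only to illustrate the idea.

```python
from email.message import EmailMessage
from typing import Optional

# Hypothetical subscriber records keyed by the identification code the
# caller sends as DTMF signals.
SUBSCRIBERS = {"4471": {"reply_to": "caller@example.com"}}


def authorize(identification_code: str) -> Optional[dict]:
    """Return the subscriber record for a valid code, or None if unknown."""
    return SUBSCRIBERS.get(identification_code)


def address_message(msg: EmailMessage, record: dict,
                    server_address: str = "voicemail@example.invalid") -> EmailMessage:
    """Send from the central server but direct replies to the caller."""
    msg["From"] = server_address
    msg["Reply-To"] = record["reply_to"]
    return msg
```

With the `Reply-To` header set, a typical mail client addresses the recipient's reply to the caller rather than to the central server.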
Email addresses are unique and there is generally no margin of error if they are incorrectly entered. Therefore, it is important that the entry of the individual alphanumeric characters be as easy and seamless as possible. This may be performed by providing a wait loop of predetermined duration to identify the predetermined alphanumeric character, identifying the predetermined alphanumeric character according to the number of identical DTMF signals received during the wait loop, and appending the predetermined alphanumeric character as identified at the end of the wait loop.
For example, pressing the numeral "2" once within two seconds results in the alphanumeric character "2" being recorded by the server. Pressing the numeral "2" twice within two seconds results in the alphanumeric character "A" being recorded by the server. Pressing the numeral "2" three times within two seconds results in the alphanumeric character "B" being recorded by the server. Pressing the numeral "2" four times within two seconds results in the alphanumeric character "C" being recorded by the server. Pressing the number "1" once within two seconds results in the alphanumeric character "1" being recorded by the server. Pressing the number "1" twice within two seconds results in the alphanumeric character "@" being recorded by the server. Pressing the number "1" three times within two seconds results in the alphanumeric character "." being recorded by the server.
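A minimal Python sketch of the wait-loop decoding follows. Only the mappings for the "1" and "2" keys come from the examples above; the entries for the remaining keys are an assumption that follows the letters printed on a standard keypad. The wrap-around when a key is pressed more times than it has characters, and the names `KEYMAP` and `decode_presses`, are also assumptions.

```python
WAIT_SECONDS = 2.0  # duration of the wait loop used in the examples above

# Keys "1" and "2" follow the text; the remaining rows are assumed.
KEYMAP = {
    "1": "1@.", "2": "2ABC", "3": "3DEF", "4": "4GHI", "5": "5JKL",
    "6": "6MNO", "7": "7PQRS", "8": "8TUV", "9": "9WXYZ", "0": "0",
}


def pick(key: str, count: int) -> str:
    """Character selected by pressing `key` `count` times (wraps around)."""
    chars = KEYMAP[key]
    return chars[(count - 1) % len(chars)]


def decode_presses(events, wait: float = WAIT_SECONDS) -> str:
    """Decode (time_in_seconds, key) pairs, sorted by time, into a string.

    A wait loop opens at the first press of a key. Identical presses that
    arrive before it expires are counted; a different key or a press after
    expiry closes the loop and appends the selected character.
    """
    out = []
    current_key, window_start, count = None, 0.0, 0
    for t, key in events:
        if key == current_key and (t - window_start) < wait:
            count += 1
        else:
            if current_key is not None:
                out.append(pick(current_key, count))
            current_key, window_start, count = key, t, 1
    if current_key is not None:
        out.append(pick(current_key, count))
    return "".join(out)
```

For instance, two presses of "2", then two presses of "1", then three presses of "2", each group inside its own two-second window, decode to `A@B`.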
Additional steps to verify the correct email address of the proposed recipient may include providing a text-to-speech audio confirmation of the string of predetermined alphanumeric characters comprising the preselected email address. For even more specific confirmation, the text-to-speech confirmation is played upon determination of each alphanumeric character selected. Once the email address is entered and confirmed, the next set of preferred steps includes prompting for the audio voice message by an automated voice response, recording the audio voice message, and detecting a DTMF stop signal.
The transmission of the audio voice messages presents a unique opportunity to disseminate information. Additional audio segments may be spliced into the original audio voice message for public service announcements, musical interludes, or commercial advertisements. Furthermore, such additional information is not limited to the audio segments of the email attachment, but may also be incorporated as text or graphic elements in the body of the electronic message. Accordingly, an additional step to the method described above might include encoding a sponsor message into the electronic message, wherein encoding the sponsor message comprises the step of appending an audio sponsor message to the digital audio file. Alternatively, the method might include encoding a text-based sponsor message into the body of the electronic message, or encoding one or more graphic elements into the body of the electronic message.
There is also an opportunity to present additional information, not just to the recipient of the electronic message, but also to the sender by playing an audio sponsor message upon making a connection to the predetermined telephone number. In order to enhance the effectiveness of the audio sponsor message, another step might include selecting the audio sponsor message from an array of audio sponsor messages according to one or more demographic factors of the caller wherein the one or more demographic factors are resolved from a caller-ID string.
Accordingly, if the caller-ID string identifies the call is originating from Florida, the sponsor message selected from the array might include advertisements for suntan lotion but not snow shovels. Caller-ID strings may also be used in combination with other databases to provide more detailed demographics on the caller. For example, the method might include the additional steps of cross-referencing the caller-ID string against relative property values of the origin of the call, assigning a financial rating variable, and selecting the audio sponsor message from the array of audio sponsor messages according to the financial rating variable. For example, presenting a budget automobile advertisement to a caller originating from an area of high property values will probably be less effective than presenting a luxury automobile advertisement. In a further embodiment of the invention, the method includes the step of providing a DTMF menu option to transfer into a sponsor's call center system wherein further information on a sponsor's products or services may be obtained.
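The region-based selection might be sketched as follows. The area-code table, the sponsor array, the audio file names and the function name are hypothetical; a real system would presumably consult a fuller database, including the property-value cross-reference described above.

```python
# Hypothetical lookup from North American area code to region.
AREA_CODE_REGION = {"305": "FL", "813": "FL", "907": "AK"}

# Hypothetical array of sponsor messages; `regions` of None means "anywhere".
SPONSOR_MESSAGES = [
    {"file": "suntan_lotion.wav", "regions": {"FL"}},
    {"file": "snow_shovel.wav", "regions": {"AK"}},
    {"file": "generic.wav", "regions": None},
]


def select_sponsor(caller_id: str) -> str:
    """Pick a sponsor audio file from the region implied by the caller-ID."""
    digits = "".join(ch for ch in caller_id if ch.isdigit())
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the leading country code
    region = AREA_CODE_REGION.get(digits[:3])
    for message in SPONSOR_MESSAGES:
        if message["regions"] is not None and region in message["regions"]:
            return message["file"]
    return "generic.wav"  # fall back when the region is unknown
```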
Electronic devices can easily reproduce the DTMF signals. In an alternative embodiment of the invention, a personal digital assistant (herein "PDA") device transmits the one or more DTMF signals through the touch-tone telephone. Popular PDAs include the 3COM PalmPilot® and Windows CE® devices. These PDAs have easy to use and sophisticated address book features. By utilizing a predetermined table of alphanumeric character to DTMF signal conversions, the PDAs can be programmed to dial into the predetermined telephone number and send the appropriate DTMF identifiers for a particular email address. However, in this embodiment, it would be disadvantageous to utilize a wait loop for entry of the DTMF signals, as the PDAs can produce the signals at a much faster rate than can be achieved by manually pressing the buttons on a touch-tone telephone. To overcome this problem, an entry of one or more DTMF signals corresponding to the unique selection of the predetermined alphanumeric character is followed by a stop DTMF signal indicating acceptance of the predetermined alphanumeric character without utilizing the wait loop. Once the DTMF signal or combination of signals is received followed by the stop DTMF signal, the next alphanumeric character may be entered without waiting for a predetermined lapse of time.
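A sketch of the stop-signal variant follows, showing both the conversion table a PDA might use and the matching decoder on the server. The choice of "#" as the stop DTMF signal is an assumption, as are the keypad rows other than "1" and "2" and the function names. Characters missing from the table, such as a hyphen, are not handled.

```python
STOP = "#"  # assumed stop DTMF signal

# Keys "1" and "2" follow the manual-entry examples; the rest are assumed.
KEYMAP = {
    "1": "1@.", "2": "2ABC", "3": "3DEF", "4": "4GHI", "5": "5JKL",
    "6": "6MNO", "7": "7PQRS", "8": "8TUV", "9": "9WXYZ", "0": "0",
}

# Reverse table: character -> repeated key presses, e.g. "B" -> "222".
CHAR_TO_DTMF = {ch: key * (i + 1)
                for key, chars in KEYMAP.items()
                for i, ch in enumerate(chars)}


def encode(address: str) -> str:
    """DTMF sequence a PDA could send for `address`, one stop per character."""
    return "".join(CHAR_TO_DTMF[ch] + STOP for ch in address.upper())


def decode(signals: str) -> str:
    """Rebuild the characters from a stop-delimited DTMF sequence."""
    out = []
    for group in signals.split(STOP):
        if not group:
            continue  # nothing between two stops, or trailing stop
        chars = KEYMAP.get(group[0], "")
        if len(set(group)) != 1 or len(group) > len(chars):
            raise ValueError(f"unrecognized DTMF group: {group!r}")
        out.append(chars[len(group) - 1])
    return "".join(out)
```

Because each character ends with an explicit stop, the decoder never has to wait for a timer, which is the point of this embodiment.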
Although speech recognition technology has advanced considerably, current technology generally requires high memory and CPU processing to enable most speech-to-text processes. This stems from the variations of speech between individuals and the tens of thousands of spoken words that must be recognized for most applications. However, in the case of speaking and translating individual alphanumeric characters, the memory and processing demands are substantially lower. Furthermore, individual alphanumeric character translations are less susceptible to the variations of speech and sound quality of telephone systems. Accordingly, in a preferred embodiment of the invention, entry of a preselected email address may be accomplished entirely through speech recognition means comprising the steps of dialing into a predetermined telephone number, receiving one or more speech elements through the telephone, associating each individual speech element with one or more predetermined alphanumeric characters through a speech recognition means, assembling a string of alphanumeric characters by repeating the steps above until a preselected email address has been completed, recording an audio voice message over the telephone, converting the audio voice message into a digital audio file, attaching the digital audio file to an electronic message directed to the preselected email address, and transmitting the electronic message to the preselected email address.
In order to maintain the low CPU and memory requirements of the system, the one or more speech elements are substantially restricted to an individual alphanumeric character. However, certain common groupings may be detected wherein the phonetic equivalents of "dot com," "dot net," and "dot org" are associated with the alphanumeric character groupings of ".com," ".net," and ".org" respectively through the speech recognition means.
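Assembling the address from recognized speech elements might look like the sketch below. It assumes a recognizer, not shown, that has already turned each speech element into a text token. The three groupings come from the text, whereas the spoken forms "at" and "dot" for the individual symbols, and the function name, are assumptions.

```python
# Groupings named in the text.
GROUPINGS = {"dot com": ".com", "dot net": ".net", "dot org": ".org"}

# Assumed spoken forms for the symbols an email address needs.
SYMBOLS = {"at": "@", "dot": "."}


def assemble_address(tokens) -> str:
    """Join recognized speech tokens into an email address string."""
    out = []
    for token in tokens:
        t = token.strip().lower()
        if t in GROUPINGS:
            out.append(GROUPINGS[t])
        elif t in SYMBOLS:
            out.append(SYMBOLS[t])
        elif len(t) == 1 and t.isalnum():
            out.append(t)  # a single spoken letter or digit
        else:
            raise ValueError(f"unrecognized speech element: {token!r}")
    return "".join(out)
```

Restricting the vocabulary to single characters plus a handful of groupings keeps the lookup to a few dozen entries, which reflects the low memory and CPU demand described above.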
Other steps previously described above and utilizing DTMF signals may also be accomplished through speech recognition means including the steps of sending an identification code to a central server by the speech recognition means, associating the identification code with a subscriber record, validating the authenticity of the subscriber record, and authorizing the transmission of the electronic message based upon whether the identification code is authentic. As an alternative to the identification code, a single spoken code can be used to identify the caller and associate the call with the subscriber record based on the unique characteristics of the caller's speech. Such technology is well known in the field of biometrics.
Speech recognition may be incorporated into a wait loop assembly of the preselected email address by the steps of providing a wait loop of predetermined duration to identify the predetermined alphanumeric character, identifying the predetermined alphanumeric character according to the one or more speech elements received during the wait loop, and appending the predetermined alphanumeric character as identified at the end of the wait loop. In addition, the speech recognition can be used to signal the start and stop of the recording phase wherein the steps comprise prompting for the audio voice message by an automated voice response, recording the audio voice message, and detecting a speech element stop signal. The speech element stop signal should be a unique word or combination of words. For example, if part of the audio message included the phrase, "please do not stop sending our company such wonderful referrals," and the word "stop" was used as the stop signal, "sending our company such wonderful referrals" would not be included in the audio voice message as the recording phase would have already ended.
It is preferred that a combination of voice and DTMF signals be used in this case wherein the caller is prompted to depress a button on their touch-tone telephone to start and stop the recording process. DTMF functions particularly well even during concurrent speech as two discrete tones are emitted which are picked up and interpreted by telephone switches. Each key on the telephone touch pad is represented by its own pair of tones. (The "A", "B", "C", and "D" keys were used for the US military's Autovon phone system.)
When any key is pressed, the tone of the column and the tone of the row are generated, hence dual tone. As an example, pressing the '5' button generates the tones 770 Hz and 1336 Hz.
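The row and column arrangement can be made concrete with a short Python sketch that uses the standard DTMF frequency grid. The function names, the 8000 Hz sample rate and the 0.1 second duration are illustrative assumptions.

```python
import math

# Standard DTMF grid: one low-group tone per row, one high-group per column.
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_frequencies(key: str):
    """Return the (row, column) frequencies in Hz for one keypad key."""
    if len(key) == 1:
        for r, row in enumerate(KEYPAD):
            c = row.find(key)
            if c != -1:
                return ROW_HZ[r], COL_HZ[c]
    raise ValueError(f"not a DTMF key: {key!r}")


def dtmf_samples(key: str, duration: float = 0.1, rate: int = 8000):
    """Synthesize the dual tone as floating-point samples in [-1.0, 1.0]."""
    low, high = dtmf_frequencies(key)
    return [0.5 * (math.sin(2 * math.pi * low * n / rate)
                   + math.sin(2 * math.pi * high * n / rate))
            for n in range(round(duration * rate))]
```

Here `dtmf_frequencies("5")` yields the 770 Hz and 1336 Hz pair given above.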
A preferred embodiment of the invention includes the step of associating the preselected email address with the subscriber record wherein the preselected email address may be retrieved at a later time without re-entry. To make the retrieval of the preselected email address as simple as possible, the method may include the step of tagging the preselected email address with a description audio file wherein the description audio file is played back through the touch-tone telephone for selecting a previously entered email address. When speech recognition capability is included, additional steps may comprise receiving a speech request associated with a previously entered email address, associating the speech request with the previously entered email address through voice recognition means, and confirming identification of the previously entered email address.
A particular problem that currently exists with many voice mail systems is the inability to easily archive the voice messages. In many situations, it would be advantageous to store the voice message in a particular file directory or associate the message with a particular contact. Popular contact management software such as Symantec's ACT!® and Microsoft's Outlook® are capable of organizing and storing binary files such as digital audio. While current voice mail systems permit the user to save a voice mail message, the user must continually cycle through the old messages to find the message of interest. By providing an easy to use, one-touch operation, voice messages stored on telecommunication devices, and particularly wireless devices which are often used outside a formal office setting, may be archived, stored and organized for later retrieval and use.
An embodiment of the invention includes a method of transmitting one or more audio file attachments in an electronic message from a telecommunications device having voice mail capability. The steps comprise storing one or more alphanumeric strings corresponding to one or more preselected email addresses in a telecommunications device, receiving an audio voice message into the telecommunications device, forwarding the audio voice message to a voice mail server, converting the audio voice message into a digital audio file, attaching the digital audio file to an electronic message directed to the preselected email address, and transmitting the electronic message to the one or more preselected email addresses. In a preferred embodiment, the one or more preselected email addresses are assigned to a single button, wherein the depression of the single button forwards the audio voice message to the voice mail server. While the telecommunication device is anticipated to be a wireless device such as a cellular telephone or pager, it may also be a non-wireless telephone.
In addition, the telecommunication devices may be preconfigured to transmit an alphanumeric header string permitting the particular voice mail being forwarded to be associated with a particular contact or file directory on the recipient's computer. For example, if a user receives a voice message on his cellular telephone that relates to a business matter, he might assign a "01" value to the voice message before forwarding the message for ultimate delivery by email. The "01" value is associated with a business matter and is placed in the subject field of the email message. The recipient's email communication program is pre-configured to recognize the "01" value and automatically place the email and attached digital audio file into a predetermined file folder. In Microsoft Outlook® this procedure is called a "rule." Alternatively, the user might assign a value of "02" to the message which directs it to a file folder on the recipient's computer which holds personal voice messages.
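The kind of rule the recipient's mail program would apply can be sketched in Python. The sketch assumes the code is the first whitespace-delimited word of the subject field; the folder names, the default folder and the function name are hypothetical.

```python
from email.message import EmailMessage

# Codes from the example above mapped to hypothetical folder names.
FOLDER_RULES = {"01": "Business", "02": "Personal"}


def folder_for(msg: EmailMessage, default: str = "Inbox") -> str:
    """Choose a destination folder from the code in the subject field."""
    subject = str(msg.get("Subject", "")).strip()
    code = subject.split(maxsplit=1)[0] if subject else ""
    return FOLDER_RULES.get(code, default)
```

Messages whose subject carries no recognized code stay in the default folder, so ordinary email is unaffected by the rule.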
Accordingly, it is an object of the present invention to provide a method of transmitting an audio voice message to an email address without the need of a separate CPU.
It is another object of the present invention to provide a method of transmitting an audio voice message to an email address without requiring a pre-configured voicemail account.
It is another object of the present invention to provide a method of transmitting an audio voice message to an email address without requiring a unique voicemail number or extension for the recipient.
An advantage of the invention is that callers are able to send an audio voice message to any email address without any other information such as telephone, address, extension numbers or the like.
Another advantage of the invention is that those wishing to send an audio voice message do not need to have any computer equipment.
Another advantage of the invention is that persons with disabilities that make it difficult to type regular messages can transmit communications through the Internet by simply pressing touch-tone buttons, or alternatively, by speaking the alphanumeric characters associated with an email address.
These and other important objects, advantages, and features of the invention will become clear as this description proceeds.
The invention accordingly comprises the features of construction, combination of elements, and arrangement of parts that will be exemplified in the description set forth hereinafter and the scope of the invention will be indicated in the claims.