The present invention relates generally to a method of transmitting audio messages over a network, and more particularly to a method of emulating voice messaging using electronic mail technology.
Electronic mail (“email”) has proliferated as a common method of communication.
Initial communications consisted of ASCII (American Standard Code for Information Interchange) text. In an ASCII file, each alphabetic, numeric, or special character is represented with a 7-bit binary number (a string of seven 0s or 1s), so 128 possible characters are defined. However, basic ASCII text email messages have progressed to include graphics, audio and even video. Graphic images, digital audio files and digital video all require an encoding and decoding process when transmitted over the Internet. A user wishing to encode a voice message and send it to a preselected email address had to complete several steps and have certain hardware and software. The user would typically record the voice message on a computer using a sound card attached to, or integrated into, the motherboard of the computer.
The voice message is a sequence of analog signals that are converted to digital signals by the audio card, using a microchip called an analog-to-digital converter (ADC). When sound is played, the digital signals are sent to the speakers where they are converted back to analog signals that generate varied sound. Audio files are usually compressed for storage or faster transmission. Audio files can be sent in short stand-alone segments, for example as files in the WAV format. In order for users to receive sound in real-time for a multimedia effect, listening to music, or in order to take part in an audio or video conference, sound must be delivered as streaming sound. More advanced audio cards support wavetables, or precaptured tables of sound. The most popular audio file format today is MP3 (MPEG-1 Audio Layer-3).
Once these digital audio files reside on the hard drive of the user, the user would attach the file to an email sent to a selected recipient. When the file is attached, it might be transmitted in a standardized protocol such as Multipurpose Internet Mail Extensions (herein “MIME”). MIME is an extension of the original Internet e-mail protocol that lets people use the protocol to exchange different kinds of data files on the Internet: audio, video, images, application programs, and other kinds, as well as the ASCII handled in the original protocol, the Simple Mail Transfer Protocol (SMTP). In 1991, Nathaniel Borenstein of Bellcore proposed to the Internet Engineering Task Force that SMTP be extended so that Internet (but mainly Web) clients and servers could recognize and handle other kinds of data than ASCII text. As a result, new file types were added to “mail” as a supported Internet Protocol file type.
Attempts have been made to develop unified messaging systems that link video, text, audio, document management and the like into a single system. However, such attempts have not provided a means to enable voice messaging to maintain continuity throughout a thread, store outbound voice communications, and provide a reply function equivalent to the simplicity of email replies. Furthermore, many unified messaging applications require expensive proprietary equipment, whereas typical SMTP email servers are inexpensive, well-developed and already employed by most medium or larger business entities.
Consequently, there is a need in the art for a method of transmitting an audio voice message to an email address without the need of a computer.
There is a further need in the art for a method of replying to an audio voice message without the need to key in a telephone number.
There is a further need in the art for a method of maintaining a thread of voice mail correspondence.
There is a further need in the art for a means to store and validate the transmission and content of an outgoing voice message.
There is a further need in the art for a novel dual email and telephone extension identity for a user to access voice messages from a telephone or computer using a common set of alphanumeric characters.
There is a further need in the art for a new means of delivering audio advertisements to a captive audience.
There is a further need in the art for a new means of obtaining survey and demographic data.
There is a further need in the art for a method to provide untethered access to voice and email messaging using voice command zones.
There is a further need in the art for a means of scheduling the delivery of voice messages to enhance the impact on the recipient.
However, in view of the prior art at the time the present invention was made, it was not obvious to those of ordinary skill in the pertinent art how the identified needs could be fulfilled.
The above and other objects of the invention are achieved in the embodiments described herein by providing a computer implemented method of transmitting electronic voice messages comprising the steps of establishing a caller identity associated with a first telephone connection. This may be achieved by parsing the Telco caller ID string and associating that string with a preexisting user record. In other words, a call from 727-507-8558 might be associated with the identity of a law firm and linked to additional records such as address, fax, user name, email address and the like. For most applications, it is preferred that the caller ID string be linked to the email address of the caller in order that the recipient may easily reply to the original message.
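By way of illustration only, the association of a caller ID string with a preexisting user record might be sketched as follows. The record layout, the `USER_RECORDS` table, the function names and the example email address are all assumptions made for this sketch and are not taken from the specification.

```python
import re
from typing import Optional

# Hypothetical preexisting user records, keyed by a normalized ten-digit number.
USER_RECORDS = {
    "7275078558": {
        "name": "Example Law Firm",
        "email": "reception@lawfirm.example",
    },
}


def normalize_caller_id(raw: str) -> str:
    """Keep only the digits of a caller ID string and drop a leading country code 1."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits


def lookup_caller(raw: str) -> Optional[dict]:
    """Return the user record linked to the caller ID string, or None if unknown."""
    return USER_RECORDS.get(normalize_caller_id(raw))
```

In this sketch a call from 727-507-8558 resolves to a record whose `email` field can later serve as the reply address, while an unknown number yields `None`.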
In the next step, a first audio clip is recorded from the first telephone connection. The audio clip may be digitized in any number of computer readable formats including, but not limited to, WAV, AIFF, MP3 and the like. An email target string is then established. This might be resolved by a number of alternative methods. First, a “speed dial” interface may be established. The interface might be resident in an operating system, network appliance or on a website. Users pre-configure their settings similar to speed dials used in standard telephone systems wherein numerals are associated with the email target string. This has the advantage that alphanumeric characters, which are easily entered on a computer keyboard but often problematic to enter on a telephone, need only be entered once. A typical telephone has twelve keys which may be depressed in various iterations to resolve an alphabetic character. Alternatively, the individual alphanumeric characters may be individually spoken and resolved with voice recognition means as disclosed in U.S. patent application Ser. No. 09/517,415 filed Mar. 2, 2000 which is incorporated by reference.
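The speed dial approach might, purely as an example, be reduced to a lookup of keyed numerals against settings the caller configured beforehand. The `SPEED_DIALS` table, its keys and the addresses in it are hypothetical.

```python
from typing import Optional

# Hypothetical per-caller settings, as might be entered beforehand on a website.
# Outer key: the caller's own email address. Inner key: the numerals keyed on the telephone.
SPEED_DIALS = {
    "reception@lawfirm.example": {
        "1": "2224848@uspto.gov",
        "2": "partner@client.example",
    },
}


def resolve_speed_dial(caller_email: str, keyed: str) -> Optional[str]:
    """Map the keyed numerals to an email target string, or None if not configured."""
    return SPEED_DIALS.get(caller_email, {}).get(keyed)
```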
The first audio clip, the caller identity and the email target string are all encapsulated into a first email attachment. The first email attachment is then transmitted to a first email account which is associated with the email target string.
This is typically an SMTP address accessed directly from an information store such as Microsoft Exchange, Novell Groupwise or the like. Alternatively, it may be stored offsite and accessed through a POP3 means. A second telephone connection accesses the first email attachment wherein the first audio clip is broadcast to the second telephone connection. Once broadcast, the caller is then prompted for a reply recording. Responsive to a record signal, a second audio clip from the second telephone connection is then recorded. The second audio clip, the caller identity and the email target string are all encapsulated into a second email attachment which is then transmitted to a second email account associated with the caller identity.
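The encapsulation and reply steps might be sketched with the Python standard library `email` package as shown below. This is a minimal sketch under stated assumptions: the caller identity is represented by the caller's email address in the `From` header, the email target string by the `To` header, and the clip is labeled `audio/x-wav`. The function names and these representational choices are illustrative and are not prescribed by the specification. Actual transmission (for example over SMTP) is omitted.

```python
from email.message import EmailMessage


def encapsulate(audio: bytes, caller_email: str, target: str) -> EmailMessage:
    """Bundle an audio clip, the caller identity and the email target string."""
    msg = EmailMessage()
    msg["From"] = caller_email  # assumption: caller identity carried as an address
    msg["To"] = target          # the email target string
    msg["Subject"] = "Voice message"
    msg.set_content(f"Voice message from {caller_email} to {target}.")
    # Attaching bytes turns the message into multipart/mixed with a base64 part.
    msg.add_attachment(audio, maintype="audio", subtype="x-wav",
                       filename="voice.wav")
    return msg


def reply_to(original: EmailMessage, reply_audio: bytes) -> EmailMessage:
    """Build the second message: sender and recipient of the original are swapped."""
    return encapsulate(reply_audio, str(original["To"]), str(original["From"]))
```

Because the reply address is taken from the first message, the party replying never keys in a telephone number or address.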
In order to establish a thread of messages having a common topic, the first audio clip is appended to the second audio clip. The second audio clip might be placed after the first audio clip in order to preserve the chronological order of the thread, or the second audio clip might be placed before the first audio clip in order to avoid the need to hear previously transmitted audio data. It should be understood that the reply loops may continue on far beyond one initiating email and a single reply. It is also anticipated that carbon copies, blind carbon copies and forwarding be enabled as they are standard SMTP functions.
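As one possible illustration, two clips held as uncompressed WAV data might be joined into a single thread clip with the Python standard library `wave` module. The sketch assumes both clips share the same channel count, sample width and sample rate; the function names and the `newest_first` flag are hypothetical.

```python
import io
import wave


def read_clip(data: bytes):
    """Return ((channels, sample_width, frame_rate), raw_frames) for WAV bytes."""
    with wave.open(io.BytesIO(data), "rb") as reader:
        fmt = (reader.getnchannels(), reader.getsampwidth(), reader.getframerate())
        return fmt, reader.readframes(reader.getnframes())


def make_thread(first: bytes, second: bytes, newest_first: bool = False) -> bytes:
    """Join two WAV clips, either chronologically or with the reply placed first."""
    fmt_first, frames_first = read_clip(first)
    fmt_second, frames_second = read_clip(second)
    if fmt_first != fmt_second:
        raise ValueError("clips must share channels, sample width and rate")
    if newest_first:
        ordered = (frames_second, frames_first)
    else:
        ordered = (frames_first, frames_second)
    out = io.BytesIO()
    with wave.open(out, "wb") as writer:
        writer.setnchannels(fmt_first[0])
        writer.setsampwidth(fmt_first[1])
        writer.setframerate(fmt_first[2])
        for frames in ordered:
            writer.writeframes(frames)
    return out.getvalue()
```

Calling `make_thread(first, second)` preserves chronological order, while `newest_first=True` places the reply ahead of the earlier clip so the listener need not rehear it.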
It is preferred that the recording time is encapsulated into the email attachments with a date/time stamp. This may be achieved by encoding the alphanumeric characters of the date and time into the text body of the email message. Alternatively, the date and time may be synthesized as speech and appended to the audio clip.
Providing system security includes the steps of establishing the caller identity associated with the first telephone connection by receiving a password entry from the first telephone connection and associating the password entry against a preexisting caller account. Receiving the password entry may include interpreting at least one DTMF signal responsive to the keying of buttons on a touch-tone telephone.
Alternatively, the process may include the step of receiving the password entry by interpreting at least one alphanumeric character as spoken into the telephone. In another embodiment, the invention utilizes biometrics by receiving a predetermined call phrase, reducing the call phrase to a voice pattern, and associating the voice pattern against a preexisting caller account by speech identification means such as described by U.S. Pat. No. 5,608,784 of which specification is incorporated by reference.
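The password entry step might be sketched as a comparison of keyed digits against a stored, salted digest. Storing a digest rather than the digits themselves, the account layout, the iteration count and the function names are assumptions of this sketch; the specification requires only that the entry be associated with a preexisting caller account.

```python
import hashlib
import hmac


def pin_digest(pin: str, salt: bytes) -> bytes:
    """Derive a digest from the keyed digits (iteration count chosen arbitrarily)."""
    return hashlib.pbkdf2_hmac("sha256", pin.encode("ascii"), salt, 100_000)


def verify_pin(keyed_digits: str, account: dict) -> bool:
    """Compare the digits decoded from DTMF tones with the account's stored digest."""
    candidate = pin_digest(keyed_digits, account["salt"])
    # compare_digest avoids leaking how many leading bytes matched.
    return hmac.compare_digest(candidate, account["pin_digest"])
```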
The email target string may be established by the steps of interpreting a plurality of DTMF signals responsive to the keying of buttons on a touch-tone telephone.
Alternatively, the step may include receiving a plurality of individually spoken alphanumeric characters representative of the email target string and translating the spoken characters to their binary equivalent by a speech identification means.
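For the keypad alternative, one conceivable scheme is multi-tap entry, in which repeated presses of a key cycle through its characters. The sketch below assumes the DTMF tones have already been decoded into digits and grouped by pauses (shown here as spaces). The `MULTITAP` layout, including the placement of `.` and `@` on the `1` key, is hypothetical.

```python
# Hypothetical multi-tap layout: pressing a key repeatedly cycles through its characters.
MULTITAP = {
    "1": ".@-_", "2": "abc2", "3": "def3", "4": "ghi4", "5": "jkl5",
    "6": "mno6", "7": "pqrs7", "8": "tuv8", "9": "wxyz9", "0": "0",
}


def decode_multitap(key_groups: str) -> str:
    """Turn space-separated runs of one repeated key into an email target string."""
    chars = []
    for group in key_groups.split():
        key = group[0]
        if key not in MULTITAP or group != key * len(group):
            raise ValueError(f"unrecognized key group: {group!r}")
        options = MULTITAP[key]
        # One press selects the first character, two the second, and so on (wrapping).
        chars.append(options[(len(group) - 1) % len(options)])
    return "".join(chars)
```

Under this layout a short address takes dozens of key presses, which illustrates why speed dial or spoken entry may be preferred.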
With sensitive communications, it is preferred that the email attachment be encrypted. As an added security measure, an additional step may be employed by establishing a hash of the first email attachment and storing the hash in a secure storage means.
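The hashing step might be sketched as follows. SHA-256 is chosen here only as an example of a hash function, and the in-memory `HASH_STORE` dictionary is a stand-in for whatever secure storage means is actually used; both are assumptions of this sketch.

```python
import hashlib
import hmac

# Stand-in for a secure storage means; a real system would not use a plain dict.
HASH_STORE = {}


def record_hash(message_id: str, attachment: bytes) -> str:
    """Compute and store a SHA-256 hash of the attachment at transmission time."""
    digest = hashlib.sha256(attachment).hexdigest()
    HASH_STORE[message_id] = digest
    return digest


def matches_record(message_id: str, attachment: bytes) -> bool:
    """Check whether a later copy of the attachment is unchanged."""
    stored = HASH_STORE.get(message_id)
    if stored is None:
        return False
    return hmac.compare_digest(stored, hashlib.sha256(attachment).hexdigest())
```

A copy of the attachment that has been altered by even one byte will no longer match the stored hash.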
Another embodiment of the invention includes the steps of establishing a first SMTP email address, the first SMTP email address having a distinct prefix address and a domain address. For example, in a common email address format such as “2224848@uspto.gov” the prefix address would be “2224848” and the domain address would be “@uspto.gov.” In the next step a primary telephone number is established. This number is preferably a toll-free number that is easily remembered with an alphabetic phrase correlated to the numerals of the number. For illustrative purposes, an example number might be “1-800-555-EMAIL.” In the next step, an extension to the primary telephone number is established. The extension contains alphanumeric characters identical to the distinct prefix address. For example, the above-mentioned email address would have an extension of 2224848. The full dialing string to the account would be “1-800-555-EMAIL, ext. 2224848.” A text-to-speech synthesizer is established which is responsive to a call to the primary telephone number and extension wherein email messages sent to the first SMTP address are synthesized into computer-generated speech. A voice digitizing means is then established wherein reply messages spoken to the primary telephone number and extension are converted into an audio computer file and transmitted to a second SMTP address as an email attachment.
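The relationship between the email address, the extension and the dialing string might be illustrated as follows. The function names are hypothetical, and the letter-to-numeral table is the conventional telephone keypad layout.

```python
# Conventional telephone keypad lettering, inverted into a letter-to-digit table.
_KEYS = {"2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
         "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ"}
LETTER_TO_DIGIT = {letter: digit
                   for digit, letters in _KEYS.items()
                   for letter in letters}


def split_address(address: str):
    """Split an email address into its prefix address and its domain address."""
    prefix, sep, domain = address.partition("@")
    if not (prefix and sep and domain):
        raise ValueError(f"not an email address: {address!r}")
    return prefix, "@" + domain


def vanity_to_digits(number: str) -> str:
    """Replace letters in a telephone number with their keypad numerals."""
    return "".join(LETTER_TO_DIGIT.get(ch.upper(), ch) for ch in number)


def dialing_string(primary_number: str, address: str) -> str:
    """Combine the primary number with an extension equal to the prefix address."""
    prefix, _ = split_address(address)
    return f"{primary_number}, ext. {prefix}"
```

With the example values from the text, `dialing_string("1-800-555-EMAIL", "2224848@uspto.gov")` yields the dialing string given above, and `vanity_to_digits("1-800-555-EMAIL")` shows the numerals actually keyed.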
Callers that wish to retrieve their email from a regular telephone may be identified by businesses as a potential target for marketing new products and services. Furthermore, the caller is somewhat of a captive audience as the caller is seeking information of personal interest. An alternative embodiment of the invention includes the step of broadcasting a commercial to callers of the first SMTP email address. For example, the caller might hear “before we retrieve your messages, please listen to a brief message from our sponsor . . . ” The caller might be presented with an option to pay for a subscription if they find the sponsor messages annoying or may enjoy the service free by their exposure to the advertisements. Another step may include surveying callers to the primary telephone number regarding caller demographics. The demographic data may include age, gender, occupation, residence and the like. Using this information, demographically targeted commercials may be broadcast to callers according to survey results. To encourage callers to engage in the survey questions, an additional step of offering prize incentives for engaging in survey activities may be included. The prize incentives may be long distance telephone credits and, preferably, the assignment of those credits to a preexisting long distance calling card.
One objective of the current invention is to simplify the task of voice messaging and remove the tether of electronic equipment from the user. Accordingly, an alternative embodiment of the invention includes the steps of establishing a voice command zone. The voice command zone is a pre-designated area from which spoken commands and voice recordings are obtained. The zone may be established in an automobile passenger compartment, a human-inhabitable dwelling, or the like. An array of voice command instructions relating to the operation of an electronic message system is established. A microphone input means receives the voice command instructions. Responsive to a record command, an audio message received by the microphone input means is encapsulated into an email attachment and transmitted to a predetermined SMTP address. A play command broadcasts audio files attached to emails through a speaker means. Alternatively, a text-to-voice synthesizer synthesizes text messages into speech which is then broadcast through the speaker means responsive to the play command. For zones that have low ambient noise, an omnidirectional microphone is appropriate. However, where voice commands from a specific individual are desired, an alternative embodiment of the invention includes a target location sensor means. This may include an RF or IR transmitter placed on the individual. A target acquisition means picks up the RF or IR broadcast and points a unidirectional microphone input means towards the individual thereby avoiding extraneous audio noise.
In yet another embodiment of the invention, a caller identity is established with a telephone connection and an audio clip is recorded from the telephone connection. An email target string is established. The audio clip, caller identity and email target string are encapsulated into an email attachment. The email attachment is then transmitted to an email account associated with the email target string and the email attachment is also stored in a sent items repository. This embodiment of the invention serves a critical function, particularly in the business and legal communities, of providing a record of outbound communication. In the prior art, one business person may leave a message on a voice mail system, but has no means to prove that the message was left, much less the actual content of that message. The ability to produce a record of outbound communication from one business to another serves to validate that important communications and messages were actually transmitted. It is preferred that along with the email attachment, the email target string and a date-time stamp string are stored in association with the email attachment.
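A sent items repository of this kind might be sketched with the Python standard library `sqlite3` module. The table name, column names and function names are hypothetical; the sketch stores the email target string and a date-time stamp string alongside each attachment, as preferred above.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Optional


def open_repository(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a sent items repository; in-memory by default for this sketch."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sent_items ("
        "id INTEGER PRIMARY KEY, target TEXT NOT NULL, "
        "sent_at TEXT NOT NULL, attachment BLOB NOT NULL)"
    )
    return conn


def store_sent(conn: sqlite3.Connection, target: str, attachment: bytes,
               sent_at: Optional[datetime] = None) -> int:
    """Store the attachment with its email target string and a date-time stamp."""
    sent_at = sent_at or datetime.now(timezone.utc)
    cursor = conn.execute(
        "INSERT INTO sent_items (target, sent_at, attachment) VALUES (?, ?, ?)",
        (target, sent_at.isoformat(), attachment),
    )
    conn.commit()
    return cursor.lastrowid


def sent_to(conn: sqlite3.Connection, target: str) -> list:
    """Return (date-time stamp, attachment) pairs previously sent to a target."""
    return conn.execute(
        "SELECT sent_at, attachment FROM sent_items WHERE target = ? ORDER BY id",
        (target,),
    ).fetchall()
```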
In an alternative embodiment of the invention, a caller may wish to have a communication delivered at a predetermined time. Email communications are substantially instantaneous. However, the caller may know that an intended recipient is not currently available due to work schedule, time zone differences or the like.
If the communication is immediately transmitted, the recipient may find it buried below more recent communications. Furthermore, many message systems provide immediate feedback with sound or an interface display when a new message is received. Accordingly, it would be advantageous to provide the ability to schedule the delivery of the communication. In this embodiment of the invention, a caller identity is associated with a telephone connection. An audio clip is recorded from the telephone connection. An email target string is established. The audio clip, caller identity and email target string are encapsulated into an email attachment. A broadcast time is established. The email attachment is then held in a queue until the broadcast time is reached wherein the email attachment is transmitted to an email account associated with the email target string. The email target string may be associated with a time zone associated with the physical location of the intended recipient. The broadcast time then may be automatically calculated relative to the time zone.
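The scheduling steps might be sketched as follows. The sketch assumes each email target string is associated with a fixed UTC offset (daylight saving time is ignored for simplicity) and that delivery is wanted at the next 9:00 a.m. local time. The offset table, the default hour, and the class and function names are all hypothetical.

```python
import heapq
from datetime import datetime, timedelta, timezone

# Hypothetical association of email target strings with fixed UTC offsets in hours.
RECIPIENT_UTC_OFFSETS = {"2224848@uspto.gov": -5}


def next_local_delivery(now_utc: datetime, target: str, local_hour: int = 9) -> datetime:
    """Return, in UTC, the next time it is local_hour o'clock for the recipient."""
    zone = timezone(timedelta(hours=RECIPIENT_UTC_OFFSETS.get(target, 0)))
    local_now = now_utc.astimezone(zone)
    candidate = local_now.replace(hour=local_hour, minute=0, second=0, microsecond=0)
    if candidate <= local_now:
        candidate += timedelta(days=1)  # already past that hour today
    return candidate.astimezone(timezone.utc)


class DeliveryQueue:
    """Holds email attachments until their broadcast time is reached."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker so equal times never compare payloads

    def hold(self, broadcast_time: datetime, target: str, attachment: bytes) -> None:
        heapq.heappush(self._heap, (broadcast_time, self._counter, target, attachment))
        self._counter += 1

    def release_due(self, now_utc: datetime) -> list:
        """Pop every (target, attachment) whose broadcast time has been reached."""
        due = []
        while self._heap and self._heap[0][0] <= now_utc:
            _, _, target, attachment = heapq.heappop(self._heap)
            due.append((target, attachment))
        return due
```

A periodic task would call `release_due` with the current time and transmit whatever it returns; the transmission itself is outside this sketch.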
Accordingly, it is an object of the present invention to provide a method of transmitting an audio voice message to an email address without the need of a computer.
It is another object of the present invention to provide a method of replying to an audio voice message without the need to key in a telephone number.
It is another object of the present invention to provide a method of maintaining a thread of voice mail correspondence.
It is another object of the present invention to provide a means to store and validate the transmission and content of an outgoing voice message.
It is another object of the present invention to provide a novel dual email and telephone extension identity for a user to access voice messages from a telephone or computer using a common set of alphanumeric characters.
It is another object of the present invention to provide a new means of delivering audio advertisements to a captive audience.
It is another object of the present invention to provide a new means of obtaining survey and demographic data.
It is another object of the present invention to provide untethered access to voice and email messaging using voice command zones.
It is another object of the present invention to provide a means of scheduling the delivery of voice messages to enhance the impact on the recipient.
An advantage of the invention is that it incorporates well-developed SMTP technology to deliver voice messages globally without incurring long distance charges.
Another advantage of the invention is that those wishing to send an audio voice message do not need to have any computer equipment.
Another advantage of the invention is that recipients of voice messages are no longer required to find and dial the number of the originator; the recipient can easily reply to the voice message with a single click, button or spoken command.
Another advantage of the invention is that the context of a voice message discussion may be maintained as a thread. Past comments may be accessed and referenced as needed.
Another advantage of the invention is that outgoing voice messages may be easily saved in virtually any computer readable medium. Proprietary voice message systems are not required.
Another advantage of the invention is that businesses may offset the costs of the messaging system infrastructure with demographically targeted advertising. Users unable to subscribe to the service can still have access paid for by advertising businesses.
Another advantage of the invention is that delivery of voice messages may be coordinated to coincide with the schedule of the recipient so that the voice message is delivered at the optimum time.
These and other important objects, advantages, and features of the invention will become clear as this description proceeds.
The invention accordingly comprises the features of construction, combination of elements, and arrangement of parts that will be exemplified in the description set forth hereinafter and the scope of the invention will be indicated in the claims.