Voice communication encompasses a rapidly evolving mix of technologies. A relatively recent communication technology that has garnered a lot of attention is Voice over Internet Protocol, referred to as VoIP. VoIP can use the Internet to transmit telephony data (voice and control data associated with a VoIP telephone call) in order to provide voice services to consumers. The Internet is a publicly accessible worldwide system of interconnected computer networks that transmit data using a standardized Internet Protocol (IP) as well as other standard and accepted data transmission protocols such as Transmission Control Protocol (TCP) and User Datagram Protocol (UDP). It is made up of thousands of smaller commercial, academic, domestic, and government networks and is used to transmit and host various information and services, such as electronic mail, online chat, and the interlinked Web pages and other documents on the World Wide Web (WWW).
The main attraction of VoIP technology to businesses and consumers is the lower cost. VoIP technology has been adopted for use by businesses and consumers as a substitute for existing landline or mobile telephone services that use expensive cellular networks and/or the Public Switched Telephone Network (PSTN). By using VoIP, a consumer can make telephone calls using a broadband Internet connection instead of a regular landline or mobile telephone. VoIP voice data is transmitted over a packet-switched network by breaking down voice signals into packets of digital data at the transmitting end of a telephone call (a handset or a computer equipped with a microphone), then sending the data over the Internet using UDP (User Datagram Protocol) to the receiving end of the call. The voice signals are reassembled and played at the handset (telephone) at the receiving end of the telephone call (or through a computer if it is being used in lieu of a telephone handset to receive and place VoIP telephone calls). Control data used to initiate a VoIP telephone call is also transmitted over the Internet.
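The packetization and reassembly steps described above can be sketched as follows. This is a minimal illustration, not an actual VoIP implementation: the 8 kHz sampling rate, 8-bit samples, 20 ms frame length, two-byte sequence header, and the function names packetize and reassemble are all assumptions made for the example.

```python
import struct

SAMPLE_RATE_HZ = 8000   # assumed telephony sampling rate
BYTES_PER_SAMPLE = 1    # assumed 8-bit samples
FRAME_MS = 20           # assumed amount of voice carried per packet


def packetize(voice, frame_ms=FRAME_MS):
    """Split raw voice bytes into packets, each prefixed with a 2-byte sequence number."""
    frame_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * frame_ms // 1000
    packets = []
    for seq, start in enumerate(range(0, len(voice), frame_bytes)):
        payload = voice[start:start + frame_bytes]
        # Hypothetical header: sequence number in network byte order.
        packets.append(struct.pack("!H", seq & 0xFFFF) + payload)
    return packets


def reassemble(packets):
    """Order packets by sequence number and join their payloads.

    Sketch only: assumes fewer than 65536 packets, so the counter never wraps.
    """
    ordered = sorted(packets, key=lambda p: struct.unpack("!H", p[:2])[0])
    return b"".join(p[2:] for p in ordered)
```

For example, packetize applied to 512 bytes of voice data yields four packets under these assumptions, and reassemble recovers the original bytes even if the packets arrive out of order.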
Because it can utilize existing data communication infrastructure put in place for Internet data transfer, the overhead for a VoIP service provider is less than that of the more traditional telephone service providers that typically install, maintain, and upgrade their respective networks and communications equipment. As described in more detail below, the VoIP service providers do have to provide some equipment to manage their network of VoIP handsets (telephones) or computers, but such equipment is less expensive to install, maintain, and update than the network infrastructure supporting the more traditional voice communication telephone network (known as the Public Switched Telephone Network or PSTN). The reason for the lower cost is that the VoIP service providers can leverage the already existing packet-switched network infrastructure in place to transfer data over the Internet (or over an Intranet for a business). Furthermore, voice communication over the PSTN operates over a circuit-switched, rather than packet-switched, network. Therefore, an active call requires a 64-kbps connection between the parties that cannot be used for any other purpose during the call, and is billed by the service provider accordingly. In contrast, the VoIP packet-switched approach allows bandwidth that is not being used by the voice data to be allocated to other purposes.
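The bandwidth argument can be made concrete with a small calculation. The sketch below uses the 64-kbps circuit rate stated above; the voice_activity parameter and the function names are hypothetical, and the model deliberately ignores packet header overhead, so it illustrates the reasoning rather than measuring a real network.

```python
CIRCUIT_RATE_BPS = 64_000  # the dedicated 64-kbps circuit of a circuit-switched call


def reserved_bits(call_seconds):
    """Bits of capacity a circuit-switched call holds for its whole duration."""
    return CIRCUIT_RATE_BPS * call_seconds


def freed_bits(call_seconds, voice_activity):
    """Capacity a packet-switched link could give to other traffic.

    voice_activity is the assumed fraction of the call (0.0 to 1.0) during
    which voice data is actually being sent. Header overhead is ignored.
    """
    if not 0.0 <= voice_activity <= 1.0:
        raise ValueError("voice_activity must be between 0.0 and 1.0")
    return round(reserved_bits(call_seconds) * (1.0 - voice_activity))
```

Under these assumptions, a one-minute circuit-switched call holds 3,840,000 bits of capacity regardless of how much is spoken, whereas a packet-switched call with voice present half the time would leave 1,920,000 of those bits available for other data.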
It would be beneficial to be able to further lower the cost of providing VoIP service to consumers. By lowering the cost of its service, a VoIP service provider will be able to attract a greater number of consumers currently using traditional PSTN and mobile telephone networks. Lower cost is the primary attraction of consumers to VoIP to begin with. So, further reduction in cost will naturally lead to more consumer interest. Additionally, if an individual service provider can provide a lower cost service that is otherwise comparable in quality and features to competing VoIP service providers, the more affordable provider will benefit from customer migration from other VoIP service providers, because the main incentive for such customers to switch to VoIP in the first place was to achieve greater cost savings.
One of the most common business models used for generating income through the use of media is advertising. Many businesses successfully use advertising revenue as a primary business model through the Internet. Many more Internet-based businesses supplement other revenue streams through advertising. Traditionally, advertising through telephone calls has been limited. It has mostly consisted of telemarketing calls or prerecorded messages played while a consumer is on hold for some other purpose. Telemarketing advertisements can either be random or targeted but are not typically initiated by the service provider and therefore do not provide a traditional revenue stream to service providers that can offset the cost of such a service.
However, if a service provider can reliably deliver effective advertisements to its customers, it would naturally be able to use a portion of the advertising revenue to offset the cost of providing the service. For example, if an advertisement can be delivered to a VoIP consumer on that consumer's telephone as a voice message preceding the phone call (there are many other ways to effectively deliver such an advertisement, some of which are discussed below), then the income generated from the advertisement can be used to offset, or eliminate, the actual cost of the telephone call to the consumer. The challenge is to be able to deliver effective targeted advertisements to such consumers.
Consumers who would receive such advertisements would understand that they are receiving discounted service costs in exchange for having to listen to or view the advertisement. However, advertisements that have no relevance to these consumers would be annoying to them, and after experiencing frustration with irrelevant advertisements, such consumers might instead choose to pay more money for an advertisement-free service. Therefore, there is a need for a system that allows the service provider to send targeted advertisements to these consumers. Such targeted advertisements would be directed to those consumers' perceived interests or needs and would therefore not be annoying (or as annoying) for the consumers to view or listen to. Furthermore, advertisers would be willing to pay more money if their advertisements were being targeted to customers with a particular interest in their products or services.
Advertising revenue is generated by many on-line businesses. There are even advertising networks (also known as online advertising networks or ad networks) that represent a number of web sites that sell online advertising space, allowing advertisers to reach broad audiences relatively easily through a single package deal purchase. Often these advertisers pay per click, i.e., they pay a predetermined price for every click on their advertisement by a web user (such clicks will often bring the web user to the advertiser's website). Advertising networks provide a way for media buyers to coordinate advertising campaigns across dozens, hundreds, or even thousands of sites in an efficient manner. The campaigns often involve running advertisements over a category (run-of-category) or an entire network (run-of-network).
Another online advertising method is called opt-in e-mail advertising (also known as permission marketing), which communicates an advertisement by e-mail where the recipient of the advertisement has consented to receive it. Often the consent is the result of offers of free merchandise in exchange for filling out a survey. Some of the advantages of this method are that it provides a direct contact with the consumer and is inexpensive, flexible, and simple to implement. By using the information in the survey, in some cases, the advertisements may, to some extent, be targeted to the consumer's interests. However, unlike the targeted advertisements achieved through the present disclosure, there is no incentive for the consumer to continue to receive or view the advertisements after receiving their free merchandise.
Another method of online advertising is spamming. Spamming is the sending of unsolicited e-mails, usually trying to sell products or services, to web users. While spamming can be economically viable because advertisers have very few operating costs beyond the management of their mailing lists, it is widely reviled because the content of the e-mails is often unacceptable and because it is an annoying distraction to e-mail users who do not wish to receive the e-mails but have to take time to delete them from their inboxes. That is why spamming restrictions have been the subject of legislation in a number of jurisdictions. Spamming also presents a problem because the volume of unsolicited mail it creates results in costs borne by the Internet service providers (which are, in turn, indirectly borne by the service providers' customers); the service providers may be forced to add extra capacity to cope with the increase in traffic or alternatively provide a slower service to their customers.
Contextual advertising is where advertising networks display text-only advertisements that correspond to the keywords of an Internet search or to the content of the page on which the advertisement is shown. Contextual advertisements are believed to have a greater chance of attracting a user because they are based on the user's search query as that correlates to the user's interest at the time of query. Contextual advertising can be seen, for example, in a search query for “wine” which may return an advertisement for a wine seller's website.
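The keyword matching behind contextual advertising can be illustrated with a short sketch. The inventory contents, the advertisement text, and the function name match_ads are hypothetical and chosen only for the example; real advertising networks use far more elaborate relevance calculations.

```python
import re

# Hypothetical advertiser inventory: keyword -> advertisement text.
AD_INVENTORY = {
    "wine": "Visit Example Wine Sellers",
    "television": "New televisions at Example Electronics",
}


def match_ads(query, inventory=AD_INVENTORY):
    """Return advertisements whose keyword appears as a word in the query."""
    words = re.findall(r"[a-z]+", query.lower())
    matched = []
    for word in words:
        ad = inventory.get(word)
        # Keep the first occurrence only so an ad is not listed twice.
        if ad is not None and ad not in matched:
            matched.append(ad)
    return matched
```

With this inventory, a query containing the word "wine" returns the wine seller's advertisement, and a query with no matching keyword returns an empty list.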
GOOGLE ADSENSE, for example, implements contextual advertising by providing its website customer with JavaScript code that, when inserted into the customer's web pages, generates relevant advertisements from the GOOGLE inventory of advertisers. The relevance of the advertisements shown is calculated by a separate GOOGLE program that indexes the content of the web page.
Telephone advertising by telemarketers is accomplished through the use of live sales people or pre-recorded messages. An example of live telephone advertising is where a sales representative calls a consumer on the telephone to sell products or services. Pre-recorded telephone advertising is also used when a caller is put on hold while trying to reach an operator or customer service. For example, a pre-recorded telephone advertisement about a new model television being offered by an electronics manufacturer might be played while a caller is put on hold while trying to reach technical support.
With the increasing popularity of VoIP, telephone advertising has also been modified and inserted into VoIP calls. For example, a pre-recorded advertisement may be inserted immediately before a VoIP call is connected. Further, since many VoIP calls are initiated using a computer, advertisements may be continuously displayed on the computer screen while a VoIP call is in session. Alternatively, if the VoIP call is being initiated from a telephone that has a video screen, the advertisement could be displayed on that screen. However, as discussed above, such random advertisements may annoy the VoIP consumers and cause them to choose an advertisement-free service.
Speech recognition technologies allow computers to convert waveforms of human speech into text. A typical system for accomplishing speech recognition consists of a computer equipped with a microphone and special speech recognition software. The microphone might also be used to convert the analog voice waveform into digital data representing the spoken voice so that it can be analyzed and converted to textual form by the software running on the computer. One well-known speech recognition technique used in such software extracts from the digital data the sounds that group together to form words, referred to as phonemes. Once these phonemes have been extracted and recognized, they are converted into textual words. A common method of converting these phonemes into words is through the use of a hidden Markov model (HMM). An HMM is a statistical model that is applied to a set of phonemes to generate the most likely corresponding words.
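How an HMM can yield the most likely words for a sequence of phonemes can be sketched with the Viterbi algorithm, a standard decoding method for HMMs. The toy model below is an assumption made purely for illustration: the two words, the phoneme labels, and all probabilities are invented, and each hidden state is treated as a whole word, which is a considerable simplification of practical recognizers.

```python
from itertools import groupby


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    if not observations:
        return []
    # Each column maps state -> (best path probability, previous state).
    columns = [{s: (start_p[s] * emit_p[s].get(observations[0], 0.0), None)
                for s in states}]
    for obs in observations[1:]:
        prev = columns[-1]
        column = {}
        for s in states:
            prob, back = max(((prev[r][0] * trans_p[r][s], r) for r in states),
                             key=lambda pair: pair[0])
            column[s] = (prob * emit_p[s].get(obs, 0.0), back)
        columns.append(column)
    # Backtrack from the most probable final state.
    path = [max(states, key=lambda s: columns[-1][s][0])]
    for column in reversed(columns[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path


# Toy model (all values hypothetical).
STATES = ["see", "two"]
START_P = {"see": 0.5, "two": 0.5}
TRANS_P = {"see": {"see": 0.7, "two": 0.3}, "two": {"see": 0.3, "two": 0.7}}
EMIT_P = {
    "see": {"s": 0.45, "iy": 0.45, "t": 0.05, "uw": 0.05},
    "two": {"t": 0.45, "uw": 0.45, "s": 0.05, "iy": 0.05},
}


def phonemes_to_words(phonemes):
    """Decode phoneme labels and collapse consecutive repeats into words."""
    path = viterbi(phonemes, STATES, START_P, TRANS_P, EMIT_P)
    return [word for word, _group in groupby(path)]
```

With this toy model, the phoneme sequence s, iy, t, uw decodes to the words "see" and "two". Collapsing repeated states means a word spoken twice in a row would be merged, which is acceptable for a sketch but not for a real recognizer.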
Speech recognition technologies are often used in transcription. For example, speech recognition can be used to interact with a computer by people who would otherwise have difficulty using a keyboard, such as people with physical limitations like carpal tunnel syndrome. Speech recognition is also used in legal and medical transcription and for the generation of subtitles for television programs. Many automated telephone-based directory systems also employ speech recognition. For example, there are automated telephone-based systems for travel booking and information, financial account information, customer service call routing, and directory assistance that utilize speech recognition technology.
Since under VoIP, the voice data can be (and usually is for VoIP consumers) transmitted via the Internet, there is no premium for long-distance or international calling, which is one of the ways that consumers benefit from cost savings. By analogy, when an Internet user accesses a web page in the United Kingdom from the United States, he or she does not pay any kind of premium international rate but rather only the cost incurred for the basic fee from the Internet service provider. Similarly, for example, under VoIP, a telephone call placed from the United States to the United Kingdom might have no premium charges associated with an international call.
Typically, VoIP voice packets (sometimes referred to as the bearer packets) are transmitted using UDP over IP. UDP is one of the core protocols used in the Internet protocol suite. UDP is used by programs running on networked computers to send datagrams (short packets of information) to each other. UDP is more suited to voice communication data than other packet-switched data transmission protocols such as Transmission Control Protocol (TCP) because UDP has less overhead and does not delay data while waiting for lost packets to be resent, which are very important characteristics for the successful transmission of real-time voice data. TCP, on the other hand, is better suited for reliability because it has built-in error checking and retransmission functionality.
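Sending a single datagram over UDP can be sketched with Python's standard socket module. The example stays on the local loopback interface so that it is self-contained; the function name, payload, buffer size, and timeout are assumptions for illustration and do not represent an actual VoIP bearer format.

```python
import socket


def udp_loopback_roundtrip(payload, timeout=2.0):
    """Send one UDP datagram to a local receiver and return what arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        # Port 0 asks the operating system for any free port.
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(timeout)
        # No connection setup or acknowledgement: the datagram is simply sent.
        sender.sendto(payload, receiver.getsockname())
        data, _sender_address = receiver.recvfrom(2048)
        return data
```

Note that nothing in the sender learns whether the datagram arrived. On a real network a lost datagram would simply never be received, which is the trade-off that makes UDP light enough for real-time voice.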
There are several types of VoIP call-control protocols. H.323 is the most widely deployed. Other protocols used include Simple Gateway Control Protocol (SGCP), Internet Protocol Device Control (IPDC), Media Gateway Control Protocol (MGCP), and Session Initiation Protocol (SIP). Some of these protocols, such as H.323, were created to deal with real-time multimedia transmission over an unreliable data network but were not created specifically for VoIP. H.323 is a standard protocol approved by the International Telecommunication Union (ITU) in 1996 to promote voice transmission over the Internet and provide mechanisms for voice and video communication and data collaboration.
Any of these protocols can be used in connection with VoIP to accomplish the same thing, i.e., call flow over the packet-switched network. Typically, the VoIP service provider will maintain a call manager that is used to establish the VoIP calls. When a consumer initiates a call from his or her VoIP handset (or computer), the control data will be transmitted to the call manager via the packet-switched network. The call manager will then establish the call by transmitting control data back to the IP addresses of both the initiating end and the receiving end of the call, letting them know to initiate the normal call protocols (such as causing the phone to ring at the destination end) and to transmit the packetized voice (bearer) data directly to each other's IP addresses once a call is initiated. To do so, the call manager maintains a list that can be used to translate telephone numbers into IP addresses as appropriate if both handsets are on the VoIP network.
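The call manager's role can be sketched as a directory lookup followed by control messages to both ends. Everything named here is hypothetical: the CallManager class, the message fields, the telephone numbers, and the IP addresses (taken from a range reserved for documentation) are illustrative and do not follow H.323, SIP, or any other actual call-control protocol.

```python
class CallManager:
    """Toy call manager: maps telephone numbers to IP addresses."""

    def __init__(self):
        self._directory = {}  # telephone number -> IP address

    def register(self, number, ip_address):
        """Record the IP address of a handset on the VoIP network."""
        self._directory[number] = ip_address

    def setup_call(self, caller, callee):
        """Return (destination IP, control message) pairs, or None.

        None means at least one number is not on the VoIP network.
        """
        caller_ip = self._directory.get(caller)
        callee_ip = self._directory.get(callee)
        if caller_ip is None or callee_ip is None:
            return None
        # Each end is told where to send its bearer packets directly.
        return [
            (callee_ip, {"action": "ring", "bearer_peer": caller_ip}),
            (caller_ip, {"action": "ringback", "bearer_peer": callee_ip}),
        ]
```

A usage example under these assumptions: after registering "555-0100" at 192.0.2.10 and "555-0101" at 192.0.2.11, calling setup_call("555-0100", "555-0101") yields one control message for each handset, each naming the other handset's IP address as the peer for bearer data.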
A gateway typically is established and maintained by the VoIP service provider to make the connection to and from the PSTN network. If, for example, a VoIP-initiated call's destination is in the PSTN network, then the gateway establishes a circuit-switched call on the PSTN network, which is a circuit-switched rather than packet-switched network that employs time division multiplexing (TDM). In that case, once the call is established, the VoIP voice (or bearer) packets will be transmitted to the IP address of the gateway, which will be expecting those packets after receiving the control data from the call manager and will, in turn, transmit the voice data to the destination over the PSTN network using TDM.
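The routing decision described here, sending bearer packets either to the destination handset or to the gateway, can be sketched as follows. The function name, the directory structure, and the addresses are assumptions for illustration only.

```python
def bearer_destination(callee_number, voip_directory, gateway_ip):
    """Choose the IP address that should receive the caller's bearer packets.

    voip_directory maps telephone numbers of handsets on the VoIP network to
    their IP addresses. Any number not found there is assumed to be on the
    PSTN, so its voice packets go to the gateway for conversion to TDM.
    """
    return voip_directory.get(callee_number, gateway_ip)
```

For instance, with a directory containing only "555-0101" at 192.0.2.11 and a gateway at 192.0.2.1, a call to "555-0101" sends bearer packets to 192.0.2.11, while a call to any other number sends them to 192.0.2.1.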
Similarly, telephone calls coming from the PSTN network that are destined for a VoIP handset pass through a similar gateway that will convert the TDM call data into a packet-switched format for transmission over an IP network. There may also be VoIP gateways internal to the IP network, as well as DNS servers and other network control devices that need to be set up and maintained by the VoIP service provider, depending on how the network is architected.
The bearer packets are transmitted over the packet-switched network using UDP because it is not necessary to try to correct for lost packets on a voice call: the call is happening in real time, and waiting for any such correction would cause delay and jitter on the call. Therefore, it would be impractical to use an error-correcting protocol such as Transmission Control Protocol (TCP).
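One way a receiver can tolerate lost bearer packets without retransmission is to fill each gap with silence and keep playing. The sketch below assumes every packet carries a sequence number and a fixed-size payload; the function name and the use of zero bytes as silence are placeholders, since the actual silence value depends on the voice encoding in use.

```python
def playout(received, expected_count, frame_bytes):
    """Build the audio to play, substituting silence for missing packets.

    received maps sequence number -> payload for the packets that arrived.
    Nothing is re-requested: a missing frame is replaced and playback continues.
    """
    silence = b"\x00" * frame_bytes  # placeholder silence value (assumption)
    frames = [received.get(seq, silence) for seq in range(expected_count)]
    return b"".join(frames)
```

If packets 0 and 2 of three arrive, the result is the first payload, one frame of silence, and the third payload, so the listener hears a brief dropout instead of a stalled call.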
Accordingly, there is a need to provide contextual messaging methods and systems, wherein the particular messages provided to a voice customer are based on keywords extracted from that customer's telephone conversation using speech recognition technologies or on other telephony data. The messages can include different types of information (e.g., advertisements, weather, transportation routes, local and foreign news, schedules, historical information, and the like).