Voice over Internet Protocol (VoIP) is a general term for a family of transmission technologies used to deliver voice communications over IP networks such as the Internet or other packet-switched networks. Other terms frequently encountered and synonymous with VoIP are IP telephony, Internet telephony, voice over broadband (VoBB), broadband telephony, and broadband phone.
Internet telephony refers to communications services—voice, facsimile and other such communications transmissions, and/or voice-messaging applications—that are transported via the Internet, rather than the public switched telephone network (PSTN). The basic steps involved in originating an Internet telephone call include conversion of the analog voice signal to digital format and translation of the signal into Internet protocol (IP) packets for transmission over the Internet; the process is reversed at the receiving end.
VoIP systems employ session control protocols, such as the Session Initiation Protocol (SIP), to control the set-up and tear-down of calls as well as the selection of audio codecs, which encode speech so that it can be transmitted over an IP network as a digital audio stream. The advantage of VoIP is that a single network can carry data packets as well as voice and video packets between users, thereby greatly simplifying communications.
SIP is an open signaling protocol for establishing many kinds of real-time and near-real-time communication sessions, which may also be referred to as dialogs. Examples of the types of communication sessions that may be established using SIP include voice, video, and/or instant messaging. These communication sessions may be carried out on any type of communication device such as a personal computer, laptop computer, telephone, cellular phone, Personal Digital Assistant, etc. One key feature of SIP is its ability to use an end-user's Address of Record (AOR) as a single unifying public address for all communications. Thus, in a world of SIP-enhanced communications, a user's AOR becomes the single address that links the user to all of the communication devices associated with that user. Using this AOR, a caller can reach any one of the user's communication devices, also referred to as User Agents (UAs), without having to know each of the unique device addresses or phone numbers.
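The AOR-to-device relationship described above can be pictured as a lookup table that maps one public address to several device contacts. The following is a minimal sketch only: the class name `LocationService`, its methods, and the example addresses are hypothetical and are not taken from any SIP library.

```python
from collections import defaultdict


class LocationService:
    """Toy table mapping an Address of Record (AOR) to User Agent contacts."""

    def __init__(self):
        # AOR -> list of contact addresses, in registration order
        self._bindings = defaultdict(list)

    def register(self, aor, contact):
        """Bind a device contact to an AOR, ignoring duplicates."""
        if contact not in self._bindings[aor]:
            self._bindings[aor].append(contact)

    def lookup(self, aor):
        """Return every contact currently bound to the AOR (possibly none)."""
        return list(self._bindings.get(aor, []))


# Hypothetical example: one user, two devices, one public address.
service = LocationService()
service.register("sip:alice@example.com", "sip:alice@192.0.2.10:5060")
service.register("sip:alice@example.com", "sip:alice@192.0.2.22:5060")
contacts = service.lookup("sip:alice@example.com")
```

A caller needs to know only `sip:alice@example.com`; the table supplies the device-specific addresses.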
In VoIP and SIP telephony, the audio-encoding algorithm specified by ITU-T standard G.711 is commonly used when voice quality is a high priority. This technique has an encoded-data bandwidth requirement of approximately 64,000 bits per second, per voice-flow direction. Combining both voice-flow directions and adding packet-format overhead, roughly 174,000 bits per second of bandwidth is typically required to support a two-way communication utilizing G.711. Even higher-quality codecs in the G.722 family require approximately the same bandwidth as G.711. Another commonly used technique, G.729, utilizes only 8,000 bits per second per voice-flow direction, but the voice quality is below that of G.711. Similar tradeoffs occur when using various types of video codecs or any other compression technology utilized during real-time or near-real-time communications.
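The arithmetic behind these figures can be sketched as follows. The text does not state the packetization interval or the per-packet overhead, so the values used here are assumptions: 20 ms of audio per packet and 58 bytes of overhead per packet (20 IP + 8 UDP + 12 RTP + 18 Ethernet header and frame check sequence). Under those assumptions the G.711 result lands close to the roughly 174,000 bits per second quoted above.

```python
def per_direction_bps(codec_bps, packet_ms=20, overhead_bytes=58):
    """Bits per second for one voice-flow direction, including packet overhead.

    packet_ms and overhead_bytes are assumed values, not taken from the text.
    """
    packets_per_second = 1000 // packet_ms
    payload_bytes = codec_bps // 8 // packets_per_second
    return (payload_bytes + overhead_bytes) * 8 * packets_per_second


def two_way_bps(codec_bps, packet_ms=20, overhead_bytes=58):
    """Bits per second for both voice-flow directions combined."""
    return 2 * per_direction_bps(codec_bps, packet_ms, overhead_bytes)


g711_two_way = two_way_bps(64_000)  # G.711: 64,000 bit/s of encoded audio
g729_two_way = two_way_bps(8_000)   # G.729: 8,000 bit/s of encoded audio
```

Because the per-packet overhead is fixed, it dominates at low codec rates: under these assumptions G.729 encodes audio at one eighth the rate of G.711 but needs a little over one third of the link bandwidth.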
When an enterprise uses IP to interconnect the telephony systems at its different locations (e.g., when branch offices are linked to the main office through an IP link), the enterprise's contract with the service provider will often specify the maximum bandwidth that may be carried on each of the IP links at any given time. In many cases, there is no ability to exceed the amount specified in the contract because the physical equipment made available by the service provider rarely has capacity beyond the contracted amount. This means that, when the full contracted amount of bandwidth is being utilized, attempts to initiate additional phone calls via that IP link will fail or be denied. Similarly, other attempts to use bandwidth on the fully utilized IP link will fail or be denied.
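The failure mode described above amounts to a hard cap on admitted bandwidth. A minimal sketch of such a cap, assuming a hypothetical `IpLink` class, a 400,000 bit/s contract, and a per-call cost of 174,400 bit/s (an illustrative two-way G.711 figure), might look like this:

```python
class IpLink:
    """Toy model of an IP link with a fixed contracted bandwidth."""

    def __init__(self, contracted_bps):
        self.contracted_bps = contracted_bps
        self.in_use_bps = 0

    def try_admit(self, call_bps):
        """Admit the call only if it fits within the contracted bandwidth."""
        if self.in_use_bps + call_bps > self.contracted_bps:
            return False  # link is full: the call fails or is denied
        self.in_use_bps += call_bps
        return True

    def release(self, call_bps):
        """Return a finished call's bandwidth to the link."""
        self.in_use_bps = max(0, self.in_use_bps - call_bps)


# Assumed figures for illustration only.
link = IpLink(contracted_bps=400_000)
results = [link.try_admit(174_400) for _ in range(3)]
```

With these numbers the first two calls are admitted and the third is denied, even though 51,200 bit/s of contracted bandwidth remains unused.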
One solution for this problem is to use alternative routing, such as via the Public Switched Telephone Network (PSTN), for calls that cannot be carried on the IP link due to bandwidth limitations. This solution has the disadvantages of potentially being more expensive than utilizing the IP link, having less flexibility or functionality than could otherwise be offered if the IP link were utilized, and requiring additional Operation, Administration, Maintenance and Provisioning (OAM&P) complexity to configure and maintain the facility. Another factor that can make this approach impractical is that many branch office locations do not have their own local interface to the PSTN; instead, they rely on PSTN gateways that can be accessed only via the IP link.
Beyond the basic capacity concerns, and more importantly, there is no mechanism to ensure that the contracted bandwidth is utilized in a way that cross-optimizes call quality, individual users' specific priority of service, and full consumption of all available link capacity (i.e., leaving no contracted bandwidth ‘unspent’). Without this optimization, techniques to manage link bandwidth are somewhat crude: they lack the sophistication needed to consider the requirements of specific applications or users, and they place no emphasis on ensuring that the link bandwidth being paid for is fully occupied in the manner most advantageous to the contracting customer.
To keep costs down while reducing the likelihood that calls will fail to go through, enterprises commonly compromise by encoding inter-location calls using G.729 and contracting for enough bandwidth to cover the maximum number of anticipated simultaneous calls. This approach has several disadvantages, including: (1) no calls, even the “important” ones, receive the higher acoustic benefit of G.711 or other high-quality codecs; (2) during non-peak times, enterprises are paying for bandwidth that they are not using; and (3) all calls are treated equally, with no opportunity for variation in approach or capability.
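Disadvantage (2) can be made concrete with a short sizing calculation. This is a sketch under stated assumptions: the per-call figure of 62,400 bit/s is an assumed two-way G.729 cost including packet overhead, and the peak of 20 calls and off-peak load of 5 calls are invented for illustration.

```python
G729_CALL_BPS = 62_400  # assumed two-way cost of one G.729 call, with overhead


def contracted_bps(peak_calls, per_call_bps=G729_CALL_BPS):
    """Bandwidth to contract so that every anticipated peak call fits."""
    return peak_calls * per_call_bps


def idle_bps(contracted, active_calls, per_call_bps=G729_CALL_BPS):
    """Contracted bandwidth left unused at a given number of active calls."""
    return max(0, contracted - active_calls * per_call_bps)


# Hypothetical enterprise: sized for 20 simultaneous calls, 5 active off-peak.
link_bps = contracted_bps(20)
unused_off_peak = idle_bps(link_bps, 5)
```

In this example three quarters of the paid-for bandwidth sits idle off-peak, while every active call is still encoded with the lower-quality codec.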