Conventional telecommunications providers offer a broad variety of services, including both voice and data services. As these providers continue to extend their service offerings, they constantly strive to minimize the costs of providing these services. The consolidation of voice and data services provides a substantial opportunity for savings. However, the consolidation of these services and corresponding decrease in cost must not come at the expense of the quality in voice communications services (referred to herein as “Quality of Service” (QoS)).
Other organizations, such as large corporations, face similar challenges. As with telecommunications providers, the integration of voice and data networks within an organization presents a great potential for savings. Conventional separation of private branch exchange (PBX) and Internet/Intranet often involves significant expense. Organizations typically find it more efficient and cost-effective to design, deploy, maintain, and support a single integrated network than to support separate data and voice solutions.
Conventional voice systems rely on circuit-based equipment. Circuit-based equipment provides high reliability and scalability, almost universal interconnection, and a tremendous installed base. In contrast, conventional packet-based telephone systems, such as Internet telephony, often provide limited reliability and scalability. Protocols conventionally utilized in packet-based networks, such as file transfer protocol (FTP) and hypertext transfer protocol (HTTP), are opportunistic, taking as much bandwidth as is available. Therefore, mixing voice and data in a single, uncontrolled, packet-based network often results in low QoS due to a variety of factors, such as jitter, packet loss, and excessive delay. Callers recognize the degradation in QoS resulting from jitter, packet loss, and delay as voice distortion, loss of portions of words or sentences, echoes, talker overlap, and dropped calls. A certain amount of delay is inherent in any packet-based voice implementation, including a voice over Internet protocol (VoIP) implementation. However, if the total delay is greater than 150–200 milliseconds, QoS will generally be unacceptable.
Jitter, caused by variable inter-packet timing, is one source of QoS degradation in VoIP services. In a converged network in which voice and data packets are interleaved, normally orderly packetized voice arrives at disorderly intervals. Conventional systems implement jitter buffers to address the problem of jitter. Unfortunately, the addition of jitter buffers results in increased delay in the network.
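The trade-off between jitter and buffering delay described above can be illustrated with a minimal sketch. It assumes a fixed playout buffer measured from the smallest observed transit time; the function name `late_packets` and the sample timings are hypothetical and are not taken from any particular system.

```python
def late_packets(send_times_ms, arrival_times_ms, buffer_ms):
    """Count packets that arrive too late to be played out.

    Assumption for illustration: the receiver schedules playout at
    (smallest observed transit time + buffer_ms) after each packet is sent.
    """
    transits = [a - s for s, a in zip(send_times_ms, arrival_times_ms)]
    base_transit = min(transits)
    deadline = base_transit + buffer_ms
    return sum(1 for transit in transits if transit > deadline)


# Voice packets sent every 20 ms arrive at irregular intervals.
send = [0, 20, 40, 60, 80]
arrive = [50, 75, 130, 112, 131]

for buffer_ms in (0, 10, 40):
    print(buffer_ms, late_packets(send, arrive, buffer_ms))
```

A larger buffer leaves fewer late packets, but every packet is then held for that additional time before playout, which is the added delay noted above.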
Packet loss occurs when routers begin to overflow during periods of congestion, forcing them to drop packets. Conventional systems attempt to account for packet loss in a variety of ways. For example, a conventional system may compensate for lost packets by interpolation, replaying the last packet received to fill in the non-contiguous speech. Interpolation is effective only for a very small number of lost packets. Other conventional systems send redundant information so that the information contained in the lost packets may be replaced with information contained in successfully transmitted packets. Sending redundant information results in increased traffic and requires greater bandwidth and therefore may cause greater delay within the network. Another conventional approach sends redundant packets but utilizes a codec that results in a smaller number of packets and therefore requires less bandwidth. Although this approach decreases the bandwidth requirements inherent in sending redundant packets, the approach increases the delay and reduces voice quality due to coding effects.
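The replay form of interpolation mentioned above can be sketched as follows. This is an illustrative assumption about how such concealment might be structured, not a description of any specific codec; the names `conceal_losses`, `max_replays`, and the use of an empty byte string for silence are hypothetical.

```python
SILENCE = b""  # hypothetical placeholder for a silent frame


def conceal_losses(frames, max_replays=2):
    """Fill lost frames (None) by replaying the last frame received.

    After max_replays consecutive losses the gap is filled with silence,
    reflecting that replay only works for a very small number of lost packets.
    """
    output = []
    last_frame = None
    replays = 0
    for frame in frames:
        if frame is not None:
            last_frame = frame
            replays = 0
            output.append(frame)
        elif last_frame is not None and replays < max_replays:
            replays += 1
            output.append(last_frame)
        else:
            output.append(SILENCE)
    return output


received = [b"a", None, b"c", None, None, None]
print(conceal_losses(received))
```

The cap on consecutive replays is a design choice made for this sketch; a real system would choose the limit based on its codec and frame size.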
QoS degradation may also result from delay. Delay causes two problems: echo and talker overlap. Echo is caused by the signal reflection of the speaker's voice from the far end telephone equipment back into the speaker's ear. To eliminate echo, conventional systems may implement an echo canceller. These are active devices used by phone companies to suppress positive feedback (singing) on the phone network. They work by predicting and subtracting a locally generated replica of the echo based on the signal propagating in the forward direction. To eliminate talker overlap, a VoIP system must reduce the total delay experienced during the VoIP call.
Delay includes the time necessary to collect a packet or frame of voice samples to be transmitted, to code and packetize the collected samples, and to transmit the resulting packets over the physical network. Delay results from several sources, including processing delay, queuing delay, transmission delay, and propagation delay.
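As a rough illustration of how these delay sources add up, the sketch below sums the four components and compares the total with the lower end of the 150–200 millisecond range mentioned earlier. All input values, the function names, and the assumed propagation speed of about 200,000 km/s (typical for optical fiber) are illustrative assumptions.

```python
def transmission_delay_ms(packet_bytes, link_bps):
    """Time to clock one packet onto a link of the given bit rate."""
    return packet_bytes * 8 / link_bps * 1000


def propagation_delay_ms(distance_km, speed_km_per_s=200_000):
    """Time for a signal to travel the given distance (assumed fiber speed)."""
    return distance_km / speed_km_per_s * 1000


def total_delay_ms(processing_ms, queuing_ms, packet_bytes, link_bps, distance_km):
    """Sum of processing, queuing, transmission, and propagation delay."""
    return (
        processing_ms
        + queuing_ms
        + transmission_delay_ms(packet_bytes, link_bps)
        + propagation_delay_ms(distance_km)
    )


# Hypothetical call: 30 ms processing, 10 ms queuing,
# 200-byte packets on a 1 Mbit/s link, 4000 km path.
delay = total_delay_ms(30, 10, 200, 1_000_000, 4000)
print(round(delay, 1), "ms; within 150 ms budget:", delay <= 150)
```

In this example the queuing term is the one most affected by congestion, so it is the component a converged network has the most opportunity to control.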
Because of the degradation that can affect voice communications in a packet-based network and because of the complexity and cost of converting existing circuit-based systems, telecommunications providers have been slow to implement packet-based networks for the transmission of voice. Large traditional voice carriers are just beginning to merge their existing Public Switched Telephone Networks (PSTN) with their data networks using Voice over IP (VoIP) or Voice over Asynchronous Transfer Mode (ATM). But the carriers understand that without acceptable QoS, callers will avoid VoIP.
To both provide this voice quality and begin merging the PSTN with the data network, the carrier must provide a level of QoS which provides low loss and a reasonable delay for the RTP voice packets in the IP core, and at the same time provide, as a minimum, best effort service for data. In addition to best effort service for data, the carrier may wish to provide other levels of QoS for other types of communications, including video and fax.
Several conventional approaches exist for maintaining QoS in a mixed-service packet network. These approaches include (1) using differentiated levels of priorities, wherein the voice packets receive the highest priority and the data packets receive a lower priority; (2) reserving a path through the network across which the communication can traverse; and (3) performing endpoint or connection admission control. While each of these approaches has its advantages and disadvantages, none provides a means of ensuring QoS for VoIP communication that is both simple and effective.
One of the simplest conventional approaches for maintaining QoS for VoIP is through the use of differentiated services, assigning different priorities for the real-time packets that carry VoIP traffic relative to other packets in the network. Traditional IP networks use Native IP Forwarding (NIF). A router determines a packet's next hop route by finding the longest match of the destination IP address with a prefix in the router's forwarding table. At the destination point of each hop, a router reexamines the IP header for the destination IP address and performs the longest match on its own forwarding table to determine the next hop. This process repeats hop by hop until the packet reaches its final destination. Note that with NIF, the routing table is the only state information processed and maintained in the router.
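The longest-match lookup performed at each hop can be sketched with Python's standard `ipaddress` module. The forwarding table contents and router names below are hypothetical, and a linear scan is used for clarity; production routers use specialized data structures for this lookup.

```python
import ipaddress


def next_hop(forwarding_table, destination):
    """Return the next hop whose prefix is the longest match for destination.

    forwarding_table maps prefix strings to next-hop names (hypothetical).
    Returns None if no prefix matches.
    """
    dest = ipaddress.ip_address(destination)
    best_network = None
    best_hop = None
    for prefix, hop in forwarding_table.items():
        network = ipaddress.ip_network(prefix)
        if dest in network and (
            best_network is None or network.prefixlen > best_network.prefixlen
        ):
            best_network, best_hop = network, hop
    return best_hop


table = {
    "0.0.0.0/0": "router-a",    # default route
    "10.0.0.0/8": "router-b",
    "10.1.0.0/16": "router-c",
}

print(next_hop(table, "10.1.2.3"))   # matches /0, /8, and /16; /16 is longest
print(next_hop(table, "10.2.0.1"))   # matches /0 and /8
print(next_hop(table, "192.0.2.1"))  # matches only the default route
```

Each router along the path repeats this lookup against its own table, which is why the forwarding table is the only state the router needs.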
In the DiffServ model, packets entering a network domain are classified and marked with a DiffServ code point (DS code point) according to their requirements for Per Hop Behavior (PHB). The PHB is a forwarding behavior that represents queuing and servicing disciplines in the routers. PHBs provide a means of allocating bandwidth and buffers according to the relative requirements of the packets being transferred across the network. Packets are grouped into classifications, and all packets in a classification receive the same treatment. The key characteristic of DiffServ is that classification and treatment are relative. No reservations are made, and thus one classification might receive higher priority relative to other classifications to reduce delay. Another classification might get better treatment relative to other classifications to reduce loss. Ultimately a limited set of resources is divided among the various classifications, and, if traffic is excessive, loss or excessive delay may occur. However, DiffServ has the advantage of not requiring the processing and storing of additional state information needed by Multi-protocol Label Switching (MPLS) (described below).
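One possible per-hop behavior is sketched below as a strict-priority scheduler with a bounded buffer per classification. This is only an assumed queuing discipline chosen for illustration; the class name `PriorityScheduler` and the buffer limit are hypothetical. The code point 46 is the value commonly used for Expedited Forwarding, and 0 denotes best effort.

```python
from collections import deque

EF = 46  # Expedited Forwarding code point, typically used for voice
BE = 0   # best-effort code point


class PriorityScheduler:
    """Strict-priority servicing with one bounded queue per code point."""

    def __init__(self, priority_order, queue_limit):
        self.order = list(priority_order)  # highest priority first
        self.queues = {dscp: deque() for dscp in self.order}
        self.limit = queue_limit

    def enqueue(self, dscp, packet):
        """Queue a packet; return False if its class buffer is full (loss)."""
        queue = self.queues[dscp]
        if len(queue) >= self.limit:
            return False
        queue.append(packet)
        return True

    def dequeue(self):
        """Serve the highest-priority non-empty queue; None if all are empty."""
        for dscp in self.order:
            if self.queues[dscp]:
                return self.queues[dscp].popleft()
        return None


scheduler = PriorityScheduler([EF, BE], queue_limit=2)
scheduler.enqueue(BE, "data-1")
scheduler.enqueue(EF, "voice-1")
scheduler.enqueue(BE, "data-2")
print(scheduler.enqueue(BE, "data-3"))  # False: best-effort buffer is full
print(scheduler.dequeue())              # voice-1 is served before queued data
```

No resources are reserved here: voice is merely served first, so if the voice class itself is overloaded its bounded queue still drops packets, which matches the relative nature of DiffServ treatment described above.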
Another approach for ensuring QoS for VoIP is to set up resource reservations in routers across the IP network. The QoS requirement may be expressed in the form of bandwidth, delay, or jitter, or may involve specifying an explicit route across the network. This approach may be implemented using Multi-protocol Label Switching (MPLS) with some type of bandwidth reservation capability.
MPLS is the most popular standard of label-based forwarding. The foundation for label-based forwarding is Forwarding Equivalency Class (FEC). An FEC is assigned as a packet enters the network and can be based on information gleaned from the packet header including destination IP address or on information not available in the header such as the ingress port. A Label representing the FEC is pre-pended to each packet, and subsequent forwarding decisions are based on these Labels without examining the packet header at each hop. In practical terms, at each hop, rather than examining the destination address in the header, the Label is examined and used as an index to a table that contains the next hop to which the packet should be forwarded. All packets in an FEC are treated equivalently as they are forwarded across the network. This is similar to switching in an ATM or Frame Relay network in which a Virtual Path Identifier/Virtual Circuit Identifier (VPI/VCI) or Data Link Connection Identifier (DLCI) identifies a Permanent Virtual Circuit (PVC) or Switched Virtual Circuit (SVC). The forwarding decision is accomplished by a table lookup in the switch using the VPI/VCI or DLCI along with the ingress port. In an ATM or Frame Relay network, the entries are placed in the table when PVCs or SVCs are established either by signaling or using a network management system. In MPLS, these table entries are placed using a reservation protocol such as RSVP or CR-LDP, which are described below. The addition of these switching tables in routers represents a second form of state information that must be processed and maintained in addition to the routing tables associated with NIF.
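The table-driven forwarding described above can be sketched as follows. Each router holds a table indexed by incoming Label that yields an outgoing Label and a next hop. The router names, Label values, and the use of `None` to indicate that the Label is removed before the final hop are assumptions made for this illustration.

```python
def forward(label_tables, ingress_router, label):
    """Follow a labeled packet hop by hop and return the routers visited.

    label_tables[router][incoming_label] -> (outgoing_label, next_router).
    An outgoing label of None means the label is removed and the
    next router is the egress.
    """
    path = [ingress_router]
    router = ingress_router
    while label is not None:
        out_label, next_router = label_tables[router][label]
        path.append(next_router)
        router, label = next_router, out_label
    return path


# Hypothetical switching tables, as would be installed by signaling.
label_tables = {
    "A": {17: (22, "B")},
    "B": {22: (35, "C")},
    "C": {35: (None, "D")},
}

print(forward(label_tables, "A", 17))
```

Note that no destination address is consulted after the Label is assigned at ingress; the per-router tables are the second form of state information referred to above.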
In MPLS, a label distribution protocol is used to distribute the label and associated next hop information to Label Switching Routers (LSRs) throughout the network. Other information may also be distributed and contained in these tables. Two protocols have been designed to perform this function: Label Distribution Protocol (LDP) and Resource ReSerVation Protocol (RSVP). LDP was originally designed to distribute labels to LSRs but is in the process of being extended to make resource reservations. The extended form of LDP is called Constraint based Routing-LDP (CR-LDP). RSVP was originally designed to make resource reservations, but has been extended to perform label distribution. The extended form of RSVP is called RSVP-Traffic Engineering (RSVP-TE). Both CR-LDP and RSVP-TE perform a signaling function that enables some form of Quality of Service (QoS) across MPLS. This signaling reserves resources, which are essentially router queues. These router queues ultimately represent bandwidth along routes in the network, and this reserving of bandwidth for a particular FEC enables QoS. If insufficient resources are available to provide QoS for a particular call, the connection is refused. This is called connection admission control (CAC).
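The reservation-based admission decision can be illustrated with a minimal sketch. It does not model RSVP-TE or CR-LDP signaling itself; it only shows the all-or-nothing bookkeeping along a path. The link names, capacities, and the function name `admit_call` are hypothetical.

```python
def admit_call(free_kbps, path, demand_kbps):
    """Reserve demand_kbps on every link of path, or refuse the call.

    free_kbps maps link names to unreserved bandwidth and is updated
    in place only when every link on the path can carry the call.
    """
    if any(free_kbps[link] < demand_kbps for link in path):
        return False  # insufficient resources: connection refused
    for link in path:
        free_kbps[link] -= demand_kbps
    return True


free = {"A-B": 100, "B-C": 150}
path = ["A-B", "B-C"]

print(admit_call(free, path, 64))  # first call fits on both links
print(admit_call(free, path, 64))  # second call exceeds what is left on A-B
print(free)
```

Every router on the path must hold and update this reservation state.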
The advantage of this approach is that the reservations are not relative to other traffic on the network as in the case of DiffServ, but are much closer to being guaranteed. One of the problems with this approach is that implementing RSVP-TE or CR-LDP in high-speed core routers requires these routers to process and maintain state information for the label switching tables and reserved bandwidth. Building high-speed core routers with these capabilities is complex and very expensive. Also, these capabilities need to be implemented in every router in the network. This violates one of the principles of TCP/IP, which is to process and maintain a minimum amount of state information in the core, keeping the core fast and simple, while CPU intensive tasks are pushed to the edge. Scalability is a problem as well, since at least in its simplest form, a reservation has to be made for each call originated across the network. In order to avoid this, tunnels can be reserved and calls aggregated into these for transport across the network. This too has its problems in that it makes the process even more complicated and increases the difficulty in fully utilizing resources in the network. It also still requires the core routers to process and maintain the additional state information for the label switching tables and reserved bandwidth.
Another alternative approach is to provide Endpoint or Connection Admission Control (CAC). Traditional PSTNs rely on local switches to perform this CAC function when the network is too busy to process a call. A CAC approach in a packet-based network could rely on a variation of the reservation approach discussed above in which an error code is returned if the attempt to create a reservation is unsuccessful. Upon return of the error code, the phone could emit a busy signal.
An alternative CAC approach maintains a simple core IP network and provides a means for the edge devices to perform CAC. Under such an approach, a packet stream requests service from a network edge device, such as a media gateway, and the device includes a means to detect impending congestion in the IP core. The device either accepts or rejects the request based on the congestion state. This method would push congestion control from the core to the edge and thus simplify the job of the core routers because it requires no support from the core IP routers; the core routers do not process or maintain state information other than traditional routing tables.
Several conventional methods for performing CAC in a packet-based network exist. In one method, the routers use congestion marking. However, this method requires more functionality be added to the router, increasing the complexity of the core routers. Another method utilizes packet drops to determine congestion. But for voice applications, the objective is to avoid congestion before drops occur.
Another conventional method is to use a black box approach to congestion avoidance with implicit feedback based on increased delay. However, conventional methods of this type use window-based flow control for each individual user. Also these conventional methods assume deterministic delays and fail to examine the effects of stochastic delays experienced in an actual network. In addition, these conventional methods utilize round-trip delay rather than one-way delay.
A further conventional method utilizes probing packets. Endpoints, such as media gateways or other hosts, probe the network to detect the level of congestion. The endpoint admits connections only if the level of congestion is sufficiently low. To accurately determine the congestion of the network, the endpoint sends probe packets at the data rate the VoIP call will require and records the resulting level of packet losses, jitter, or another congestion indicator. For example, in one conventional approach, the probe packets are sent in a DiffServ code point that is a low priority FEC. The data, which requires the QoS, is placed in the high priority FEC.
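The admission decision at the end of a probing phase might look like the following sketch, which uses packet loss as the congestion indicator. The function name `admit_after_probe` and the 1% loss threshold are assumptions made for illustration; the actual sending and counting of probe packets is outside the scope of the example.

```python
def admit_after_probe(probes_sent, probes_received, max_loss_fraction=0.01):
    """Admit the call only if the measured probe loss is low enough.

    max_loss_fraction is a hypothetical threshold, not a standard value.
    """
    if probes_sent <= 0:
        raise ValueError("at least one probe must be sent")
    loss_fraction = (probes_sent - probes_received) / probes_sent
    return loss_fraction <= max_loss_fraction


# Probes sent at the rate the call would use, e.g. 50 packets per second
# for 4 seconds gives 200 probes before a decision can be made.
print(admit_after_probe(200, 199))  # 0.5% loss
print(admit_after_probe(200, 190))  # 5% loss
```

The decision cannot be made until the probes have been sent and either feedback has been collected or a timeout has expired, so the length of the probing phase adds directly to call setup time.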
Although a CAC method based on probing may accurately measure congestion, the probing and feedback phases slow down the admission decision significantly. Probing causes a delay while the probing packet is sent and either feedback is received or a timeout period expires. This delay creates a significant setup delay for the VoIP call, on the order of seconds, and VoIP applications will not tolerate such long set-up delays.
In another conventional CAC method, the endpoint attempts to determine the amount of bandwidth a specific communication will require and then attempts to determine if the required bandwidth is available on the network. For example, the patent to Hiroyuki Yokoyama, et al., U.S. Pat. No. 6,324,166, describes a call setup control apparatus, which determines the amount of bandwidth consumed by current calls, compares that amount with the available bandwidth, and accepts or rejects call requests based on the comparison. The patents to Patrick Droz, U.S. Pat. No. 6,292,466, and to Gyeong-Seok Kim, U.S. Pat. No. 6,215,768, describe similar systems and methods. Also, the patent to Sari Saranka, U.S. Pat. No. 6,314,085, describes a similar method for performing CAC based on the probability of cell loss given a known capacity. Utilizing estimated bandwidth requirements to perform CAC is relatively ineffective because the differing coding schemes used to transmit voice over packet networks cause great difficulty in accurately predicting voice bandwidth requirements.
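The dependence of per-call bandwidth on the coding scheme can be illustrated with a short calculation. The sketch assumes 20 ms packetization and 40 bytes of IPv4, UDP, and RTP headers per packet, and ignores link-layer overhead and silence suppression. The 64 kbit/s and 8 kbit/s codec rates correspond to G.711 and G.729 respectively; the function names and the 1000 kbit/s link are hypothetical.

```python
HEADER_BYTES = 40  # IPv4 (20) + UDP (8) + RTP (12), assumed per packet


def call_bandwidth_kbps(codec_kbps, packet_ms=20):
    """IP-level bandwidth of one call direction for a constant-rate codec."""
    payload_bytes = codec_kbps * 1000 / 8 * packet_ms / 1000
    packets_per_second = 1000 / packet_ms
    return (payload_bytes + HEADER_BYTES) * 8 * packets_per_second / 1000


def calls_that_fit(link_kbps, codec_kbps, packet_ms=20):
    """Number of simultaneous calls a bandwidth-based CAC would admit."""
    return int(link_kbps // call_bandwidth_kbps(codec_kbps, packet_ms))


for name, rate in (("G.711", 64), ("G.729", 8)):
    print(name, call_bandwidth_kbps(rate), calls_that_fit(1000, rate))
```

Under these assumptions the same link admits roughly three times as many calls with one codec as with the other, and changing the packetization interval shifts the figures again, which suggests why a fixed bandwidth estimate is hard to get right.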