Internet Telephony allows telephone calls to be carried over an Internet protocol (IP) network either end-to-end between two telephones or computers, or as one or more “hops” in an end-to-end telephone call. A major objective in creating an internet telephony system is to reduce the cost of voice calls while maintaining the same quality level currently provided in voice networks. To achieve this objective a voice call may have to be routed over multiple hops, with some of these hops being in the data network while others are in the voice network.
Internet telephony calls are created, managed, and torn down by signaling protocols. These signaling protocols, when combined with a method of routing the signaling messages and maintaining call state, allow the actual media (i.e. voice) to flow in packets between the endpoints. The standards organizations are currently evolving two Internet Telephony signaling protocols: H.323 and SIP. A call routing scheme can be developed separately for each signaling protocol, but it is highly desirable to separate the routing function from other control functions of the Internet, as has been done for the routing of IP data packets.
The routing of telephone calls in the public switched telephone network (PSTN) is accomplished by a combination of common channel signaling (CCS), such as Signaling System #7 (SS7), a number of translation facilities in elements called Service Control Points (SCPs), and static routing tables in elements called Service Switching Points (SSPs). While the CCS routing architecture is a reasonable solution for the PSTN, this architecture has a number of serious limitations, not the least of which is the use of static routing tables in the SSPs. It also suffers from poor separation of the name→address translation function (e.g. 800 number→destination port) from the configuration of the routing machinery of the PSTN.
Initial deployments of Internet telephony have been designed to be similar to the PSTN and use static routing tables in network endpoints, gateways, or centralized call control elements called gatekeepers. An example of a simplified internet telephony system architecture 100 is shown in FIG. 1.
In architecture 100, terminal 12 is connected to intranet 10 which has a gatekeeper 14 which acts as a routing agent for intranet 10. Terminal 22 is connected to intranet 20 which has a gatekeeper 24 which acts as a routing agent for intranet 20. Intranets 10 and 20 are each connected to Internet 30.
In the configuration 100 of FIG. 1, a call is routed from terminal 12 to terminal 22 using the routing tables in gatekeepers 14 and 24. One problem with the conventional internet telephony system 100 is that it has no distributed routing protocol to ease the maintenance and distribution of routing information among the elements of the system.
As mentioned above, two Internet Telephony protocols are presently evolving within the standards organizations: H.323 and SIP. The two protocols are discussed in the next two sections with emphasis on how they achieve multi-hop call routing. Then we describe how to achieve distributed multi-hop call routing. We also discuss the addressing formats used in Border Gateway Protocol (BGP), which is used for routing IP data packets in the backbone of the Internet.
The H.323 Architecture
Recommendation H.323 is a standard architecture for multimedia conferencing (voice, video, and data) in packet-based networks that was designed by the ITU-T. H.323 has been successfully applied as a suite of signaling protocols for Internet Telephony.
The main components involved in H.323 conferencing are:
        Terminals: an H.323 terminal is an endpoint capable of generating audio, video, and data streams or any combination thereof.
        Gatekeepers: a gatekeeper is an H.323 entity that provides address resolution and controls access for all types of H.323 endpoints. In addition, a gatekeeper may perform other services such as accounting and authentication.
        Multipoint Control Units (MCU): an MCU is an H.323 endpoint which provides the capability for three or more terminals to participate in a multipoint conference.
        Gateways: an H.323 gateway is an endpoint that translates between H.323 and another multimedia conferencing protocol, such as H.320 (conferencing on ISDN) or SIP. The gateways with the most relevance to Internet telephony are the voice gateways, which are H.323/PSTN gateways and which carry voice only.
        Proxies: while not part of the H.323 standard, Cisco provides for an H.323 proxy. It behaves like an H.323/H.323 gateway. Useful features of the proxy include quality of service (QoS), security, and Application Specific Routing (ASR), which forces multimedia streams to follow specific routes toward the destination.
The main signaling protocols required to implement the H.323 architecture are:
        RAS: Registration, Admission, and Status. It is a UDP-based protocol used for communication between H.323 endpoints and the gatekeeper and also for inter-gatekeeper communication. It is part of the H.225 recommendation.
        Q.931: the signaling protocol used for connection establishment between two endpoints. It is part of the H.225 recommendation.
        H.245: the signaling protocol responsible for call control between endpoints. It provides for capability exchange, channel and coder/decoder (codec) negotiation, and several other functions.
        RTP: the protocol used for carrying the real-time media streams over IP networks.
        T.120: the architecture used for sharing data between endpoints participating in a conference.
A gatekeeper administers one or more H.323 zones. Calls between endpoints in the same zone typically consist of a single hop. On the other hand, inter-zone calls will usually consist of multiple hops (called legs). Some examples will be described below which illustrate the operation of H.323 and the problems involved with multi-hop calls.
FIG. 2 illustrates an example 200 of H.323 call set-up. In FIG. 2, a call is established directly between the terminals 12 and 22 and therefore consists of only one hop.
However, the H.323 recommendation also defines a signaling model for gatekeeper routed calls. In the gatekeeper model, the Q.931 and H.245 signaling may flow through a gatekeeper while the RTP media streams still flow directly between the terminals, as in FIG. 2. In this case, the signaling part of the call consists of two call legs: one call leg from terminal 12 to gatekeeper 14; and one call leg from gatekeeper 14 to terminal 22.
Routing calls to H.323 terminals, or other entities, outside the caller's local area system requires multi-hop routing. Several solutions have been proposed which involve: manual configuration of gatekeepers; inter-gatekeeper communication; and the use of directory servers. Most of these solutions consider only calls consisting of one call leg, and none of them scales well to large networks or provides for dynamic update of call routes over time.
An Internet Service Provider (ISP) may also wish to enforce certain Quality of Service (QoS) and security policies on H.323 calls. To achieve this, the call may be directed through proxies 16 and 26, as shown in the architecture 300 of FIG. 3.
This call setup works as follows:
        Terminal 12 requests admission from its gatekeeper 14 to call Terminal 22. Admission typically includes at least: authorization of the call; resolution of the destination address; and accounting for the call.
        Gatekeeper 14 directs Terminal 12 to connect to Proxy 16.
        Terminal 12 connects to Proxy 16.
        Proxy 16 receives the call and queries Gatekeeper 14 on how to forward the call.
        Gatekeeper 14 instructs Proxy 16 to connect to Proxy 26.
        Proxy 16 connects to Proxy 26.
        Proxy 26 receives the call and queries Gatekeeper 24 on how to forward the call.
        Gatekeeper 24 instructs Proxy 26 to connect to Terminal 22.
        Proxy 26 connects to Terminal 22.
The Q.931 and H.245 signaling for the call, as well as the RTP streams, all pass through the proxies 16 and 26. In the example of FIG. 3, the call consists of three call legs (layer 7 hops). Cisco gatekeepers and proxies can implement a three-hop call, such as the one demonstrated in FIG. 3, by isolating the source zone, i.e. intranet 10, and the destination zone, i.e. intranet 20, from the rest of the network 30.
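The call setup steps for FIG. 3 can be sketched as a small simulation. This is only an illustrative model, not an H.323 implementation: the lookup tables, the function name `set_up_call`, and the simplification that each gatekeeper answers for a single destination are all assumptions made for the example.

```python
# Hypothetical model of the FIG. 3 call setup. The tables and names below are
# illustrative assumptions; real gatekeepers answer RAS admission requests.

# Which gatekeeper each element queries for forwarding instructions.
GATEKEEPER_OF = {
    "Terminal 12": "Gatekeeper 14",
    "Proxy 16": "Gatekeeper 14",
    "Proxy 26": "Gatekeeper 24",
}

# (gatekeeper, asking element) -> element the gatekeeper says to connect to.
# Simplification: only calls destined for Terminal 22 are modeled.
NEXT_HOP = {
    ("Gatekeeper 14", "Terminal 12"): "Proxy 16",
    ("Gatekeeper 14", "Proxy 16"): "Proxy 26",
    ("Gatekeeper 24", "Proxy 26"): "Terminal 22",
}


def set_up_call(caller, callee):
    """Follow gatekeeper instructions hop by hop and return the element path."""
    path = [caller]
    current = caller
    while current != callee:
        gatekeeper = GATEKEEPER_OF[current]          # element queries its gatekeeper
        current = NEXT_HOP[(gatekeeper, current)]    # gatekeeper names the next hop
        path.append(current)                         # element connects to that hop
    return path


path = set_up_call("Terminal 12", "Terminal 22")
call_legs = list(zip(path, path[1:]))  # each consecutive pair is one call leg
for leg_start, leg_end in call_legs:
    print(f"{leg_start} -> {leg_end}")
```

Under these assumptions the path contains four elements and therefore three call legs, matching the count given for FIG. 3.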
The situation can be further complicated by decomposing the Internet cloud 30 of FIG. 3 into multiple ISP networks as shown in FIG. 4. In multiple ISP networks, each ISP can have different policies. Therefore each ISP places proxies at the border of its network and forces all incoming H.323 calls to go through these proxies in order to enforce its specific policies on the calls.
In the network architecture 400 of FIG. 4, proxy 16 is coupled to ISP network 430 which includes gatekeeper 434. ISP network 430 is connected through proxy 436 to ISP network 440. ISP network 440 includes gatekeeper 444 and is connected to ISP 450 through proxy 446. ISP network 450 includes gatekeeper 454 and is connected to proxy 26.
In the example of FIG. 4, a call from Terminal 12 to Terminal 22 will consist of five call legs. In conventional internet telephony technology, there is no mechanism available that can realize such a multi-hop (more than three hops) scenario. Once the call leaves the source zone by passing through proxy 16 to ISP network 430 and then on to ISP network 440, the application layer addressing identifying terminal 22 is unavailable for routing. Neither inter-gatekeeper communication nor directory services are able to solve this application layer routing problem.
Inter-gatekeeper communication and directory services can only resolve a layer 7 destination address into a layer 3 address of a gateway to the layer 7 domain (e.g. the PSTN). Therefore, the layer 7 address, which is the actual desired destination, becomes irrelevant to the IP network routing which takes place. For instance, a layer 7 directory may have multiple entries for a given destination address, where each entry may include a gateway protocol type or gateway cost. However, directories are not dynamic enough to store current status information for the gateway represented by each entry.
For example, there may be two gateways, a primary and a secondary, available to reach the 408 area code through the internet. If the primary gateway is out of service, then the 408 area code is still physically accessible through the secondary gateway. However, once the directory resolves the layer 7/PSTN destination address to the layer 3 address of the primary gateway, the IP network routes only on the basis of the layer 3 address and not the layer 7 destination. All the IP network knows is that it can't reach the layer 3 address of the primary gateway. Once the layer 3 address is obtained from the static directory table and the call is sent out into the multiple ISP network, the call will be dependent upon the availability of that layer 3 address.
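The failure mode described in this example can be illustrated with a minimal sketch. The directory contents, the addresses (taken from the 192.0.2.0/24 documentation range), and the function names are hypothetical; the point is only that a static lookup returns the primary gateway's layer 3 address regardless of whether that gateway is in service.

```python
# Hypothetical static directory: layer 7 prefix -> gateway layer 3 addresses,
# primary first. Addresses are from the documentation range and are made up.
STATIC_DIRECTORY = {
    "408": ["192.0.2.1", "192.0.2.2"],
}

# Current gateway status, which the static directory knows nothing about.
GATEWAY_UP = {
    "192.0.2.1": False,  # primary gateway is out of service
    "192.0.2.2": True,   # secondary gateway is still available
}


def resolve_static(prefix):
    """Resolve a layer 7 prefix to the primary gateway's layer 3 address."""
    return STATIC_DIRECTORY[prefix][0]


def call_succeeds(prefix):
    """After resolution, the call depends only on the chosen layer 3 address."""
    return GATEWAY_UP[resolve_static(prefix)]


def destination_reachable(prefix):
    """True if any gateway serving the prefix is in service."""
    return any(GATEWAY_UP[address] for address in STATIC_DIRECTORY[prefix])


print(resolve_static("408"), call_succeeds("408"), destination_reachable("408"))
```

In this sketch the call fails even though the destination remains reachable through the secondary gateway, because the layer 7 destination is no longer available once the layer 3 address has been chosen.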
Similarly, there can be multiple ISPs with gateways to a layer 7 domain, such as the PSTN, where some gateways are better than others. For example, a gateway to the 408* area code from an ISP in San Jose will likely be cheaper in terms of telephony costs than a gateway to the 408* area code from an ISP in Oakland, even though the Oakland ISP may require fewer IP hops and therefore be cheaper in terms of IP network costs.
Therefore, the need remains for an inter-domain application-layer routing protocol that handles multi-hop inter-ISP calls.
FIG. 5 shows an example of a call routed through a voice gateway 516. In order for the H.323 terminal 522 to be able to call a telephone 512 on PSTN 510, the gatekeeper 534 routes the call to the appropriate gateway 516 given the telephone number of the call. However, there are likely to be multiple gateways to the PSTN 510 through which the call could be routed. The PSTN access costs, i.e. phone charges, are likely to be lower through some gateways than others. However, the call from the H.323 terminal 522 to gateway 516 will be routed by the internet 530 based upon the lowest internet cost, irrespective of the telephony costs involved in using a particular gateway.
A static routing table of the conventional art can be modified to reflect the PSTN access costs of the various available gateways and choose the one with the lowest cost. Then the internet 530 will route to that gateway using BGP based upon the least number of hops. However, the static routing table solution is unable to cope with a failure of gateway 516 when other gateways to the same destination phone number are still available. The system will then drop the call when it is unable to reach gateway 516. In addition, the static routing table will also typically require a high degree of manual configuration to set up and maintain the available routes in a large network.
In contrast, the present invention is able to route the call through the IP network to the appropriate gateway according to an aggregate cost function of both the layer 7 gateway cost and the IP network costs. Some possible cost functions are: minimize the total number of hops, minimize the distance traversed in the layer 7/PSTN domain, minimize the distance traversed in the IP data network, or minimize the monetary cost of the call. To make the appropriate call routing decision, each gatekeeper also needs sufficient status information about the reachability of E.164 prefixes in the PSTN. The PSTN call leg, between the gateway and the telephone, is treated no differently from the call legs within the data network.
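As a rough illustration of routing on an aggregate cost, the sketch below picks the reachable gateway with the lowest combined IP and PSTN cost. The weighted-sum form, the `Route` fields, the weights, and the cost figures are assumptions chosen for the example; the text does not prescribe a particular cost function.

```python
from dataclasses import dataclass


@dataclass
class Route:
    """One candidate route to a destination prefix (hypothetical fields)."""
    gateway: str
    ip_cost: float     # cost of the call legs in the IP data network
    pstn_cost: float   # cost of the PSTN call leg beyond the gateway
    reachable: bool    # current status of the gateway


def aggregate_cost(route, ip_weight=1.0, pstn_weight=1.0):
    """Assumed cost function: a weighted sum of IP and PSTN costs."""
    return ip_weight * route.ip_cost + pstn_weight * route.pstn_cost


def choose_route(routes):
    """Return the reachable route with the lowest aggregate cost, or None."""
    candidates = [route for route in routes if route.reachable]
    if not candidates:
        return None
    return min(candidates, key=aggregate_cost)


# Made-up figures: the Oakland gateway is closer in IP terms, but the
# San Jose gateway has the cheaper PSTN leg to the 408 area code.
routes = [
    Route("San Jose gateway", ip_cost=5, pstn_cost=1, reachable=True),
    Route("Oakland gateway", ip_cost=2, pstn_cost=8, reachable=True),
]
best = choose_route(routes)
print(best.gateway)
```

Because reachability is checked before costs are compared, marking the preferred gateway as out of service causes the selection to fall back to the remaining gateway rather than dropping the call.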
Therefore, the need remains for a routing system which is self-configuring, routes based upon an aggregate cost of the call, and which maintains current status information regarding each route to the destination.
Typically, the only addressing format PSTN telephones understand is E.164 numbers. Therefore, in order for the PSTN telephone 512 to be able to call the H.323 terminal 522 through the voice gateway, the H.323 terminal 522 must have an E.164 number assigned to it. In addition, the gateway 516, or the gatekeeper 534, must be able to route the call to the H.323 terminal 522 through the IP network 530 based on the H.323 terminal's E.164 number.
FIG. 6 shows how a call can be established from a PSTN telephone 512, through voice gateway 516, onto the IP network 630, then through another voice gateway 626, back to PSTN 620 and eventually to the called telephone 622. Here, once the call hops onto the IP network 630, the gateway 516, or the gatekeeper 634, must decide which hop-off gateway to route the call to. This decision is made based on the called E.164 number. Note that Cisco's voice gateways can be configured to operate without gatekeepers, and in this case will have to make the call routing decisions on their own. Remember also that the IP cloud 630 of FIG. 6 can be decomposed into multiple ISP networks similar to those in FIG. 4, and therefore the call leg across the IP network 630 may actually consist of multiple hops.
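One way a gateway or gatekeeper might select a hop-off gateway from the called E.164 number is a longest-prefix match against a table of number prefixes. The sketch below assumes that approach; the table contents, gateway names, and the function `select_hop_off_gateway` are hypothetical and are not taken from the text.

```python
# Hypothetical table mapping E.164 digit prefixes to hop-off gateways.
HOP_OFF_TABLE = {
    "1408": "gateway-a",  # more specific entry for one area code
    "1": "gateway-b",     # less specific entry for the whole country code
}


def select_hop_off_gateway(called_number, table):
    """Return the gateway for the longest matching prefix, or None."""
    # Keep only the digits, so "+1 408 555 0100" becomes "14085550100".
    digits = "".join(ch for ch in called_number if ch.isdigit())
    # Try the longest candidate prefix first, then progressively shorter ones.
    for length in range(len(digits), 0, -1):
        gateway = table.get(digits[:length])
        if gateway is not None:
            return gateway
    return None


print(select_hop_off_gateway("+1 408 555 0100", HOP_OFF_TABLE))
print(select_hop_off_gateway("+1 212 555 0100", HOP_OFF_TABLE))
```

A table like this is still static; it shows only the lookup step and does not address how the entries would be distributed or kept current.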
FIG. 7 shows a voice call between two H.323 terminals through PSTN 730. The problem here is to provide gatekeeper 714, or gateway 716, of the calling terminal's IP network 710 with call routing information about the called IP network 720. This is possible if the two IP networks 710 and 720 are connected because they are both part of the overall Internet. As far as the IP network as a whole is concerned, the PSTN cloud 730 will be represented as just a link between two IP nodes, e.g. gateways 716 and 726. FIG. 8 shows a similar topology to FIG. 7, wherein one of the H.323 terminals 722 is replaced with a LAN PBX 828 and telephone 822.
The SIP Architecture
The Session Initiation Protocol (SIP) is an Internet Conferencing Protocol being developed in the IETF. SIP is a signaling protocol for establishing connections between endpoints participating in a conference call. The endpoints advertise their capabilities and media channel information to other nodes in the network using the Session Description Protocol (SDP) format. These capabilities are included in the SIP connection establishment messages. RTP is used for carrying the actual media streams.
The main components involved in SIP conferencing are:
        Terminal: a SIP entity capable of creating media streams and of participating in SIP conferences.
        Proxy Server: receives SIP requests from a client and creates the corresponding requests for the next call leg of a SIP call.
        Redirect Server: receives SIP requests from a client and responds with addressing information about where the call should be forwarded.
        Gateway: for example, a gateway from SIP to PSTN, or from SIP to H.323.