The present invention relates to telecommunications systems and, particularly, to an improved system for restarting signaling entities such as an H.323 gatekeeper.
The International Telecommunications Union (ITU) H.323 standard allows building of local area network (LAN) attached communication equipment that can communicate via the Internet Protocol (IP). Typically, one or more zones are established, each zone being provided with a gatekeeper for address translation, admissions and bandwidth control, and zone management. Usually gatekeeper (GK) routed signaling is utilized. With gatekeeper routed signaling, an endpoint (e.g., client or gateway) does not send the call signaling (H.225) and media control (H.245) directly to the remote endpoint but to its gatekeeper which then sends the signaling messages to the remote endpoint via its gatekeeper. In calls over several zones and/or administrative domains, several gatekeepers are involved on the signaling path of the call.
As a result of monitoring and auditing their own operation, gatekeepers may perform a software restart, for example, upon detecting a software error. However, current LAN telephony systems, whether H.323-based or of other types, do not inform the other affected entities about the restart, which leads to problems such as unnecessary dropping of calls and/or inconsistent registration of clients with gatekeepers.
For example, since the gatekeepers also initialize the TCP/IP interface during a restart, where the gatekeepers are on the signaling path, a loss of all active calls can result because of the loss of the signaling path. In particular, the TCP/IP interface initialization causes clients having active calls on the signaling path to detect an error in the TCP/IP interface. As this is a severe error, the signaling protocol stack is restarted by the local client and by the remote client, thereby resulting in a loss of all calls that were active or in progress in these endpoints. As another example, a gatekeeper restart may result in inconsistent endpoint registrations to a gatekeeper. In particular, a client that is not engaged in a call is not able to detect that its gatekeeper has failed, because in H.323 the link between endpoint and gatekeeper is not consistently supervised. If the gatekeeper failure lasts only a short time, the failure does not have any effect on the registered endpoints' operation. However, if the gatekeeper failure lasts a long time, the client must find an alternate gatekeeper in order to maintain the client's readiness to establish calls. Many current H.323 implementations perform an audit of the gatekeeper connection either autonomously by clients or via gatekeepers by periodically causing all clients to re-register with their zone's gatekeeper (e.g., H.323 RAS procedures). The re-registration thus allows the clients to detect a gatekeeper failure, perform recovery with this gatekeeper, and if recovery is unsuccessful, register with an alternate gatekeeper. However, the H.323 client terminal is not able to recover the failed connection until the re-registration is due to take place. Typically, re-registration takes place only every few minutes (because more frequent re-registration may cause excessive load on an already overloaded network), and the user must then re-initiate call set-up.
As these re-registrations occur only periodically, inconsistent registrations may occur due to the time delay between re-registrations. In other implementations, a back-up gatekeeper may be provided which is continually updated so that the alternate can immediately replace the failed gatekeeper and take over the operation with the same IP addresses and ports as the master gatekeeper. However, this approach can be much more expensive relative to the average system cost per endpoint.
Accordingly, there is a need for an improved system and method for recovering from gatekeeper restarts.
These disadvantages in the prior art are overcome in large part by a system and method according to the present invention. In particular, a gatekeeper according to an embodiment of the invention is able to automatically recover signaling connections that were interrupted due to gatekeeper failure.
According to one embodiment, primary and secondary gatekeepers establish a supervisory link with one another while the H.323 calls and the associated H.225 signaling connections are set up between client terminals via the primary gatekeeper. The supervision is done by the secondary gatekeeper sending “keep alive” messages between gatekeepers. When the primary gatekeeper establishes a call, information about the ongoing call (i.e., calling party, called party, and other related information) is sent to the secondary gatekeeper or stored in a commonly accessible data store.
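The supervision and call-information sharing described above can be sketched as follows. This is an illustration only, not the implementation of any embodiment: the names CallRecord and SecondaryGatekeeper, the timeout value, and the in-memory dictionary standing in for the commonly accessible data store are all assumptions made for the example, and the sketch further assumes that the primary gatekeeper answers each keep-alive message.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CallRecord:
    """Information the primary gatekeeper shares about one ongoing call."""
    call_id: str          # hypothetical identifier, not named in the text
    calling_party: str
    called_party: str


@dataclass
class SecondaryGatekeeper:
    """Tracks keep-alive replies from the primary and mirrors its call state."""
    timeout_s: float = 3.0        # assumed value, not taken from the text
    last_reply_at: float = 0.0    # time of the most recent keep-alive reply
    calls: dict = field(default_factory=dict)  # stand-in for the shared data store

    def on_keep_alive_reply(self, now: float) -> None:
        # The primary answered a keep-alive message, so it is still running.
        self.last_reply_at = now

    def on_call_established(self, record: CallRecord) -> None:
        # The primary reports a newly established call.
        self.calls[record.call_id] = record

    def on_call_released(self, call_id: str) -> None:
        # Forget calls that have ended so they are not recovered later.
        self.calls.pop(call_id, None)

    def primary_alive(self, now: float) -> bool:
        # The primary is presumed failed once replies stop for longer than the timeout.
        return (now - self.last_reply_at) <= self.timeout_s
```

In this sketch the secondary gatekeeper holds enough state to decide when the primary has failed and to know which calls were active at that moment, which is the information a takeover would need.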
If the primary gatekeeper fails, the H.225/H.245 connections go down, but the media connections continue. The secondary gatekeeper then initiates takeover of the call and sends to the affected clients a failure notification message indicating that the primary gatekeeper has failed and that the secondary gatekeeper is ready to take over and is waiting for re-registration. Further, the secondary gatekeeper sends a similar notification message to all other affected remote parties to the call. The clients then reestablish the H.225/H.245 channel by using the original setup message with a new Reestablish parameter. The receiving client receives the message and continues using the existing resources for the call.
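The takeover sequence just described can be sketched as follows. This is a simplified illustration under stated assumptions, not the claimed implementation: plain dictionaries stand in for the actual H.225 messages, and the function names, field names, and return strings are hypothetical. Only the idea of resending the original setup message with a Reestablish parameter comes from the description.

```python
def failure_notifications(calls, secondary_address):
    """Build one notification per affected party, local or remote."""
    parties = []
    for call in calls:
        for party in (call["calling_party"], call["called_party"]):
            if party not in parties:  # notify each party only once
                parties.append(party)
    return [
        {
            "to": party,
            "event": "primary_gatekeeper_failed",
            "register_with": secondary_address,
        }
        for party in parties
    ]


def reestablish_setup(original_setup):
    """Copy the original setup message and mark it as a re-establishment."""
    message = dict(original_setup)
    message["reestablish"] = True
    return message


def handle_setup(message, active_calls):
    """Receiving client: keep existing resources if the call is being re-established."""
    call_id = message["call_id"]
    if message.get("reestablish") and call_id in active_calls:
        return "reuse-existing-resources"
    active_calls[call_id] = message
    return "allocate-new-resources"
```

Marking the resent setup message, rather than defining a new message type, lets the receiving client tell a re-established call apart from a new one while the media connection stays in place.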
A better understanding of these and other embodiments of the present invention is obtained when the following detailed description is considered in conjunction with the following drawings.