A simplified call flow for registration of user endpoint devices is illustrated by the sequence diagram 100 shown in FIG. 1. The timeline illustrates the messaging typically used in registering a user endpoint device (UE) 110 with a VoIP network 130 such as a Next Generation Network (NGN).
To prepare for registration, the UE 110 transmits two levels of Domain Name System (DNS) queries to a DNS server 124. First, an SRV query 151 against a high-level Fully Qualified Domain Name (FQDN) is transmitted to the DNS server 124. One or more FQDNs, each corresponding to a specific border element of the VoIP network, are returned in a response 152. Second, a DNS A query 153 is transmitted to determine the IP addresses of those FQDNs. The IP addresses are returned at 154 from the DNS server 124 to the UE 110.
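By way of illustration only, the two-level lookup described above may be sketched as follows. The sketch uses in-memory tables in place of the DNS server 124 so that it is self-contained; the record contents, the example.com names, the 192.0.2.x addresses, and the function names are all hypothetical. For simplicity it orders SRV targets by priority only and ignores SRV weights.

```python
from typing import Dict, List, Tuple

# Hypothetical zone data standing in for DNS server 124.
# SRV entries are (priority, weight, port, target FQDN).
SRV_RECORDS: Dict[str, List[Tuple[int, int, int, str]]] = {
    "_sip._udp.voip.example.com": [
        (20, 50, 5060, "sbc2.voip.example.com"),
        (10, 50, 5060, "sbc1.voip.example.com"),
    ],
}
A_RECORDS: Dict[str, List[str]] = {
    "sbc1.voip.example.com": ["192.0.2.10"],
    "sbc2.voip.example.com": ["192.0.2.20"],
}


def query_srv(name: str) -> List[Tuple[int, int, int, str]]:
    """SRV query 151 and response 152: targets, lowest priority value first."""
    return sorted(SRV_RECORDS.get(name, []))


def query_a(fqdn: str) -> List[str]:
    """A query 153 and response 154: IP addresses for one border-element FQDN."""
    return list(A_RECORDS.get(fqdn, []))


def resolve_border_elements(service_name: str) -> List[Tuple[str, str, int]]:
    """Run both query levels and return (fqdn, ip, port) candidates in order."""
    candidates = []
    for _priority, _weight, port, target in query_srv(service_name):
        for ip in query_a(target):
            candidates.append((target, ip, port))
    return candidates


if __name__ == "__main__":
    for fqdn, ip, port in resolve_border_elements("_sip._udp.voip.example.com"):
        print(fqdn, ip, port)
```

In a deployed endpoint the two helper functions would issue real DNS queries; only the ordering of the two query levels is taken from the description above.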
The UE 110 then registers via a border element 132 such as a Session Border Controller (SBC) in the VoIP network 130, by transmitting a Session Initiation Protocol (SIP) registration request 155 to the border element 132. The registration request includes at least one Internet Protocol (IP) address of an FQDN. The border element 132, in turn, registers with a registrar 134 on behalf of the UE 110 by forwarding the registration request at 156. The registrar 134 then populates a local registration cache.
During recovery from a catastrophic failure such as a wide-scale power outage, a large number of VoIP endpoints such as UE 110 may attempt to come online simultaneously, and a VoIP network can easily become overwhelmed as a result. In an environment based on SIP, the endpoints will all issue SIP REGISTER registration requests 155 within a small window of time. A large percentage of those REGISTER messages will fail with a SIP error or a timeout as a result of the overload. The endpoints may re-transmit in a timeout scenario, or they may attempt to re-register in an error scenario. In either case, the timers in the endpoints that control this process will, for the most part, be the same, so the globally synchronized attempts and failures will continue. Furthermore, the VoIP infrastructure may become so overwhelmed that even calls from already registered endpoints fail.
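The synchronization effect described above can be illustrated with a deliberately simplified simulation. The model is an assumption made for illustration and is not taken from any measured network: time advances in discrete ticks, the network accepts a fixed number of registrations per tick and rejects the rest, and every rejected endpoint retries after the same fixed interval. The function name and all parameter values are hypothetical.

```python
from typing import List, Tuple


def simulate_fixed_timer(
    num_endpoints: int,
    capacity_per_tick: int,
    retry_interval: int,
    ticks: int,
) -> Tuple[List[int], int]:
    """Return (REGISTER attempts seen in each tick, endpoints registered)."""
    next_attempt = [0] * num_endpoints      # all endpoints power up at tick 0
    registered = [False] * num_endpoints
    attempts_per_tick = []
    for tick in range(ticks):
        # Endpoints whose (identical) retry timer expires in this tick.
        due = [
            i for i in range(num_endpoints)
            if not registered[i] and next_attempt[i] == tick
        ]
        attempts_per_tick.append(len(due))
        for rank, i in enumerate(due):
            if rank < capacity_per_tick:
                registered[i] = True        # accepted within capacity
            else:
                next_attempt[i] = tick + retry_interval  # rejected, same timer
    return attempts_per_tick, sum(registered)


if __name__ == "__main__":
    attempts, done = simulate_fixed_timer(1000, 100, 4, 12)
    print(attempts)
    print(done)
```

With 1000 endpoints, a capacity of 100 registrations per tick, and a retry interval of 4 ticks, the model produces bursts of 1000, 900, and 800 attempts at ticks 0, 4, and 8, with no attempts in between, and only 300 endpoints registered after 12 ticks. The model is optimistic in that it assumes the network still accepts its full capacity during each burst, whereas the text above notes that an overloaded infrastructure may fail more broadly.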
A backoff scheme has been proposed to address the above problems, wherein a backoff algorithm is incorporated into the VoIP user endpoints 110. That approach, however, is static in nature. A given VoIP endpoint does not know whether a failure impacts only itself or many endpoints, how many endpoints are affected, or at what rate the network can allow them to come back online.
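The text above does not specify the backoff algorithm. As one common possibility, offered only as an assumption, an endpoint might use exponential backoff with random jitter, sketched below. The function name and the default base and cap values are hypothetical.

```python
import random
from typing import Optional


def backoff_delay(
    attempt: int,
    base_seconds: float = 2.0,
    cap_seconds: float = 300.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Seconds to wait before retry number `attempt` (0 for the first retry).

    The upper bound doubles with each attempt until it reaches the cap, and
    the actual delay is drawn uniformly between zero and that bound so that
    endpoints with identical settings do not retry in lockstep.
    """
    if rng is None:
        rng = random.Random()
    ceiling = min(cap_seconds, base_seconds * (2 ** attempt))
    return rng.uniform(0.0, ceiling)


if __name__ == "__main__":
    rng = random.Random(1)
    for attempt in range(5):
        print(attempt, round(backoff_delay(attempt, rng=rng), 2))
```

The sketch also shows the limitation noted above: the base and cap are fixed in the endpoint, so the computed delay is the same whether one endpoint or a very large number of endpoints lost registration, and it takes no account of the rate at which the network can actually accept registrations.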
Solutions implemented at the VoIP border element 132 have also been considered. In those solutions, however, the border elements must still process the entire message flood, even as they gracefully reject some of the requests.
A need therefore remains for a method and system capable of quickly and automatically reinstating a VoIP communications system after a catastrophic failure. The technique should re-register user endpoints as quickly as possible without overwhelming the VoIP network.