After recovery from a catastrophic failure (e.g., a wide-scale power outage), a VoIP network can easily become overloaded when many or all of its endpoints attempt to re-register with the network in a short window of time. For example, in a Session Initiation Protocol (SIP)-based environment, many of the endpoints may issue SIP REGISTER messages within a small window of time. The overload on the network will cause a large percentage of these SIP REGISTER messages to fail with a SIP error or a timeout.
An endpoint that receives a SIP error or a timeout may attempt to re-transmit the SIP REGISTER message. This re-transmission process is controlled by a timer in the endpoint. However, the timers in all of the endpoints are typically set to the same value, such that the endpoints are likely to attempt re-transmission within the same small window of time. As such, the pattern of globally synchronized registration attempts and failures can continue.
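The synchronization described above can be illustrated with a small simulation. The following is only a sketch under simplifying assumptions that are not part of the original description: time advances in one-second steps, the network successfully handles a fixed number of REGISTER messages per second and fails the rest, and every endpoint uses the same fixed retry timer. The function name, its parameters, and the numeric values are hypothetical.

```python
from collections import Counter


def simulate_fixed_timer(initial_times, retry_timer, capacity_per_second, max_seconds):
    """Return {second: number of REGISTER attempts made in that second}.

    Assumptions (illustrative only): the network accepts at most
    `capacity_per_second` registrations each second, every excess attempt
    fails, and each failed endpoint retries exactly `retry_timer` seconds later.
    """
    pending = Counter(initial_times)  # second -> endpoints scheduled to attempt
    attempts = {}
    for t in range(max_seconds):
        n = pending.pop(t, 0)
        if n == 0:
            continue
        attempts[t] = n
        failed = max(0, n - capacity_per_second)
        if failed:
            # Every failed endpoint uses the same timer value, so the whole
            # group of failures is rescheduled into one later second.
            pending[t + retry_timer] += failed
    return attempts


# 1000 endpoints power up within a two-second window after an outage.
initial = [0] * 500 + [1] * 500
result = simulate_fixed_timer(initial, retry_timer=30,
                              capacity_per_second=100, max_seconds=100)
for second, count in sorted(result.items()):
    print(f"t={second:3d}s  attempts={count}")
```

Under these assumptions the attempts never spread out: each burst reappears one timer period later in the same two-second window, and the network sits idle in between while the remaining endpoints wait on identical timers.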
To further complicate matters, the VoIP infrastructure may become so overwhelmed that even calls from registered endpoints may fail. These failed calls may in turn cause more endpoints to attempt re-registration, thereby exacerbating the ongoing registration flood. This snowballing effect can ultimately lead to a major failure of the VoIP network.