Communication systems often provide one or more communication-related services that require high availability and reliability. As a result, redundancy is often used in such systems to keep service interruption to a minimum in the event of equipment failure. Examples of systems that utilize redundant servers or other redundancy mechanisms are disclosed in U.S. Pat. No. 6,751,748 and U.S. Patent Application Publication Nos. 2003/0123635, 2004/0209580 and 2008/0304478.
Often, redundant systems require the primary and backup servers to communicate with each other so that each has complete knowledge of the service status of the other device. For instance, the primary server may send a message to the backup server that identifies its current service status, and the backup server may send a message to the primary server that identifies its current service status. In such systems, however, the backup server may not quickly learn of a failure of the primary server if the failure event is not communicated to the backup server in a timely manner, for example because of a communication failure or damage to the primary server. Such latency in the determination of a failure can result in undesirable time periods of poor service or time periods in which the service hosted by the primary server is unavailable to users.
A new system is needed for identifying a failure event that may require a backup device to take control of a process being overseen or managed by a primary device. We have determined that it would be preferable for embodiments of such a new system to provide redundancy without requiring an exchange of messages between the primary and backup devices for the backup device to deduce that it should take over the services hosted by the primary device. Additionally, we have determined that it would be preferable for embodiments of such a system to avoid redundancy “split-brain” problems, which occur when the redundancy communication system is broken and a primary server acts as a standalone device, or when a backup server wrongly takes control, causing double mastership on the network.