To meet reliability requirements, network devices often utilize hardware-based and software-based mechanisms to quickly detect, and in some cases attempt to recover from, link failures. Hardware-based mechanisms typically involve physical signaling and media-level fault detection. Software-based mechanisms typically take the form of link monitoring protocols, and may utilize exchanges of protocol packets (e.g., keepalives) over links between neighboring network devices to determine the operational status of the links. If one or more protocol packets for a link are not received at a network device during a protocol timeout period, a protocol state of a port coupled to the link may expire, and it may be assumed that the link has failed. Common link monitoring protocols that operate in this manner include the UniDirectional Link Detection (UDLD) protocol, the Bidirectional Forwarding Detection (BFD) protocol, and the Device Link Detection Protocol (DLDP), among others.
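The timeout-based detection described above can be illustrated with a minimal sketch. It is not the state machine of UDLD, BFD, DLDP, or any other real protocol; the class name `PortMonitor`, its fields, and the 3-second timeout are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PortMonitor:
    """Hypothetical per-port protocol state (illustration only)."""
    timeout: float          # protocol timeout period, in seconds (assumed value)
    last_rx: float = 0.0    # time the most recent protocol packet arrived
    link_up: bool = True

    def on_packet(self, now: float) -> None:
        # A protocol packet (e.g., a keepalive) from the neighbor
        # refreshes the protocol state of the port.
        self.last_rx = now
        self.link_up = True

    def poll(self, now: float) -> bool:
        # If no protocol packet arrived within the timeout period, the
        # protocol state expires and the link is assumed to have failed.
        if now - self.last_rx > self.timeout:
            self.link_up = False
        return self.link_up


monitor = PortMonitor(timeout=3.0)
monitor.on_packet(now=1.0)
print(monitor.poll(now=2.0))  # within the timeout period
print(monitor.poll(now=5.0))  # timeout period elapsed with no packet
```

Note that the monitor only observes the absence of packets; it cannot distinguish a failed link from packets that were merely delayed or dropped.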
One issue with link monitoring protocols, as well as other types of protocols, is that transient software, hardware, or network conditions may lead to “false positive” identifications of network problems, such as link failures. For example, due to high processor (e.g., CPU) load at a network device, or temporary traffic congestion in the network, protocol packets may be delayed, or otherwise not received, during a protocol timeout period, and a link may be mistakenly declared failed even though the link itself is operating normally. False positives may be particularly prevalent with protocols that implement sub-second timeout periods, as there is a greater likelihood that a transient condition will cause protocol packets not to be received within the allotted timeout period. As the reliability of computer networks becomes increasingly important, the existence of substantial numbers of false positives has become unacceptable.
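As a rough illustration of why short timeout periods are more exposed to transient conditions, the sketch below checks a sequence of packet arrival times against two timeout values. The function name `expiry_times`, the arrival times, and the timeout values are all hypothetical; the link in this scenario is assumed to be healthy, with one batch of packets delayed by a transient condition.

```python
def expiry_times(arrivals_ms, timeout_ms):
    """Return the times (in ms) at which the protocol state would expire.

    arrivals_ms: sorted arrival times of protocol packets, in milliseconds.
    timeout_ms:  protocol timeout period, in milliseconds.
    """
    expiries = []
    for prev, curr in zip(arrivals_ms, arrivals_ms[1:]):
        # A gap longer than the timeout period means the state expired
        # before the next packet arrived.
        if curr - prev > timeout_ms:
            expiries.append(prev + timeout_ms)
    return expiries


# Assumed scenario: packets normally arrive every 300 ms, but a transient
# condition (e.g., high CPU load) delays arrivals between 900 ms and 2000 ms.
arrivals = [300, 600, 900, 2000, 2300]

print(expiry_times(arrivals, timeout_ms=900))   # sub-second timeout
print(expiry_times(arrivals, timeout_ms=3000))  # multi-second timeout
```

Under these assumed numbers, the 900 ms timeout expires during the delay even though the link never failed, while the 3000 ms timeout does not expire at all.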