Traditional Local Area Networks (LANs) exchange data using Ethernet, a frame-based standard that allows high-speed transmission of data over a physical line. Since its initial implementation, the Ethernet standard has evolved rapidly and currently supports transmission rates in excess of 10 gigabits per second. Furthermore, because Ethernet is widely used, the hardware necessary to implement Ethernet data transfers has dropped significantly in price, making Ethernet a preferred standard for implementation of enterprise-level networks.
Given these benefits, telecommunications service providers have sought to expand the use of Ethernet into larger-scale networks, often referred to as Metropolitan Area Networks (MANs) or Wide Area Networks (WANs). By implementing so-called Carrier Ethernet, service providers may significantly increase the capacity of their networks at a minimal cost. This increase in capacity, in turn, enables provider networks to accommodate the large volume of traffic necessary for next-generation applications, such as Voice over Internet Protocol (VoIP), IP Television (IPTV), and Video On Demand (VoD).
Because Ethernet evolved in the context of local area networks, however, native Ethernet has a number of limitations when applied to larger-scale networks. One key deficiency is the lack of native support for Operations, Administration, and Maintenance (OAM) functionality. More specifically, because network operators can typically diagnose problems in a LAN on-site, the Ethernet standard lacks support for remote monitoring of connections and performance. Without support for such remote monitoring, network operators of large-scale networks would find it difficult, if not impossible, to reliably maintain their networks.
To address the lack of native Connectivity Fault Management (CFM) in the Ethernet standard, several organizations have developed additional standards describing this functionality. In particular, the International Telecommunication Union (ITU) has published Y.1731, entitled, “OAM Functions and Mechanisms For Ethernet-Based Networks,” the entire contents of which are hereby incorporated by reference. Similarly, the Institute of Electrical and Electronics Engineers (IEEE) has published 802.1ag, entitled “Connectivity Fault Management,” the entire contents of which are hereby incorporated by reference.
Y.1731 and 802.1ag describe a number of mechanisms used to detect, isolate, and remedy defects in Ethernet networks. Some of these mechanisms include the establishment of a Maintenance Association (MA) comprising at least two Maintenance Endpoints (MEPs) configured on different network nodes. The MEPs within an MA are typically fully meshed, meaning that any MEP may communicate with any other MEP within the MA. MEPs within an MA work together to monitor the connections between them. For example, each MEP may periodically transmit a Continuity Check Message (CCM) to other MEPs within the MA, thereby informing the network nodes within the MA of an individual node's status. Additionally, the receipt of a CCM by a MEP inherently affirms that the connection between the sending and receiving MEPs remains sufficiently intact.
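The fully meshed CCM exchange described above can be sketched as follows. This is an illustrative model only, not an implementation of Y.1731 or 802.1ag; the class and method names (`Mep`, `MaintenanceAssociation`, `transmit_interval`) are assumptions introduced for the example.

```python
class Mep:
    """Sketch of a Maintenance Endpoint (MEP) that records CCM receipt."""

    def __init__(self, mep_id):
        self.mep_id = mep_id
        self.last_ccm_from = {}  # peer MEP id -> time of most recent CCM

    def receive_ccm(self, peer_id, now):
        # Receipt of a CCM inherently affirms that the connection
        # to the sending MEP remains intact.
        self.last_ccm_from[peer_id] = now


class MaintenanceAssociation:
    """Sketch of a Maintenance Association (MA) of fully meshed MEPs."""

    def __init__(self, mep_ids):
        self.meps = {mep_id: Mep(mep_id) for mep_id in mep_ids}

    def transmit_interval(self, sender_id, now):
        # Per transmission interval, the sender attempts to transmit
        # one CCM to each other MEP within the MA (full mesh).
        for peer_id, peer in self.meps.items():
            if peer_id != sender_id:
                peer.receive_ccm(sender_id, now)
```

In this sketch, one call to `transmit_interval` models a single MEP's CCM fan-out for one interval; every peer's record of the sender is refreshed, while the sender does not message itself.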
Y.1731 provides seven possible intervals for CCM transmission ranging from 3.33 milliseconds to 10 minutes. A MEP will attempt to transmit one CCM to each MEP within the MA per transmission interval. At the same time, each MEP monitors received CCMs in order to detect any problems with the other network nodes or connections thereto. Y.1731 states that if a MEP has not received a CCM within a timeout period of 3.5 times the transmission interval (i.e., if there has been a loss of three consecutive CCMs), the MEP should declare a network fault and take remedial action such as, for example, rerouting traffic.
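The timeout rule stated above reduces to a simple comparison: a fault is declared when no CCM has arrived within 3.5 transmission intervals. The following sketch illustrates that arithmetic under the seven Y.1731 intervals; the function name and constants are illustrative, not taken from either standard.

```python
# The seven CCM transmission intervals provided by Y.1731, in seconds:
# 3.33 ms, 10 ms, 100 ms, 1 s, 10 s, 1 min, 10 min.
CCM_INTERVALS_SECONDS = [0.00333, 0.01, 0.1, 1.0, 10.0, 60.0, 600.0]

# A timeout is 3.5 times the transmission interval, i.e., a loss of
# three consecutive CCMs (plus half an interval of margin).
TIMEOUT_MULTIPLIER = 3.5

def ccm_timed_out(last_ccm_time, now, interval):
    """Return True if the receiving MEP should declare a network fault."""
    return (now - last_ccm_time) > TIMEOUT_MULTIPLIER * interval
```

For example, with a 1-second interval, a CCM last received at t=0 does not trigger a fault at t=3.4 but does at t=3.6.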
Not every failure of a MEP to receive expected CCMs is indicative of a network failure, however. For example, during a software upgrade of the control plane of a network node, a MEP on the network node may not transmit any CCMs even though the forwarding plane continues to operate normally. In such cases, while there is no actual network failure, any connected MEPs would likely falsely declare one. These MEPs may then proceed to waste resources in an attempt to cope with the non-existent network failure. In other cases, a temporary interruption in service may be anticipated, such as in the case of, for example, a reset of the network node. Here, because the interruption is known to be temporary, it may be undesirable to declare a network fault.
Y.1731 and 802.1ag further describe other periodic messages, such as the Alarm Indication Signal (AIS). A MEP may transmit an AIS periodically to suppress any alarms that may be raised by other MEPs at a higher level. Again, the standards provide that after a timeout period of 3.5 times the message interval has elapsed since the receipt of the most recent AIS, a MEP should declare a timeout. Once a timeout has occurred, the peer MEPs may determine that they are free to raise alarms. As with CCMs, certain conditions, such as a software upgrade, may prevent a MEP from transmitting an AIS for a period of time. Thus, a timeout may be declared and alarms may be raised, even though the MEP may wish to continue suppressing all alarms.
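The AIS mechanism mirrors the CCM timeout logic: alarms remain suppressed only while AIS messages keep arriving within 3.5 message intervals. The sketch below is an illustrative model of that suppression window, not code from either standard; the class name `AisState` and its methods are assumptions for the example.

```python
class AisState:
    """Sketch of a MEP's view of alarm suppression via periodic AIS."""

    TIMEOUT_MULTIPLIER = 3.5  # timeout is 3.5 times the message interval

    def __init__(self, interval):
        self.interval = interval      # AIS message interval, in seconds
        self.last_ais_time = None     # time of most recent AIS, if any

    def receive_ais(self, now):
        # Each received AIS restarts the suppression window.
        self.last_ais_time = now

    def alarms_suppressed(self, now):
        # Alarms stay suppressed until a timeout period of 3.5 message
        # intervals has elapsed since the most recent AIS; after that,
        # peer MEPs may determine that they are free to raise alarms.
        if self.last_ais_time is None:
            return False
        return (now - self.last_ais_time) <= self.TIMEOUT_MULTIPLIER * self.interval
```

With a 1-second AIS interval, an AIS received at t=10 suppresses alarms through t=13.5; a MEP silenced by, say, a software upgrade would see suppression lapse even though it wished to continue suppressing all alarms.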
In view of the foregoing, it would be desirable to avoid false or otherwise unnecessary determinations of a timeout. In particular, it would be desirable to provide connectivity fault management that is able to avoid undesirable determinations of timeouts such as, for example, during periods where a network node may temporarily stop sending periodic CFM messages to connected nodes.