Synchronous Optical Network (SONET) and Synchronous Digital Hierarchy (SDH) networks have been broadly deployed by telecommunications service providers to provide the broadband infrastructure needed for many advanced telecommunications services. In addition to providing exceptionally high transmission rates in excess of 10 gigabits per second, SONET and SDH networks provide sophisticated performance monitoring and network control capabilities.
By way of example, FIG. 1 illustrates some of the major components in a typical SONET network. A SONET network 100 comprises two subnetworks 110, 120 and a network surveillance manager (NSM) 130. Each subnetwork includes a set of interconnected network elements that are each connected to a single subnetwork controller (for example, subnetwork 110 includes a series of interconnected network elements (NEs) 112-1 through 112-n that are each connected to a subnetwork controller (SBNC) 116). SBNCs 116, 126 work in conjunction with the NSM 130 to monitor and control the NEs 112, 122.
One significant tool for SONET subnetwork maintenance is alarm monitoring. A substantial number of alarms are generated by NEs in response to a variety of detected conditions. Many of these alarms may reflect short duration transient conditions, anticipated maintenance actions (such as the installation of new equipment), or maintenance conditions detected and reported elsewhere. Such alarms have limited relevance for the purposes of monitoring subnetwork maintenance conditions.
A variety of filtering techniques have been used in the prior art to eliminate irrelevant alarms. For example, one technique employs "aging" to reduce the number of forwarded alarms. Using this method, alarms are stored at an associated SBNC for a pre-defined "aging period" before being reported. If an alarm is cleared during this period, it is suppressed.
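The aging technique described above can be illustrated with a minimal Python sketch. The `AlarmEvent` record, the `age_filter` function, and the choice to treat a clear arriving exactly at the end of the aging period as too late are all assumptions made for illustration; they are not taken from any SBNC implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlarmEvent:
    time: float     # arrival time at the SBNC, in seconds (hypothetical unit)
    alarm_id: str   # hypothetical identifier tying a "clear" to its "raise"
    kind: str       # "raise" or "clear"


def age_filter(events, aging_period):
    """Return sorted (report_time, alarm_id) pairs for alarms that survive aging.

    An alarm is held for `aging_period` seconds after it is raised. If its
    clear arrives before the period elapses, the alarm is suppressed.
    """
    pending = {}    # alarm_id -> time the alarm was raised
    reported = []
    for ev in sorted(events, key=lambda e: e.time):
        # Release every held alarm whose aging period has elapsed by now.
        for alarm_id, raised_at in list(pending.items()):
            if ev.time >= raised_at + aging_period:
                reported.append((raised_at + aging_period, alarm_id))
                del pending[alarm_id]
        if ev.kind == "raise":
            pending.setdefault(ev.alarm_id, ev.time)
        elif ev.kind == "clear":
            # Cleared while still aging: suppress. No-op if already reported.
            pending.pop(ev.alarm_id, None)
    # End of input: alarms still held are reported once their period elapses.
    for alarm_id, raised_at in pending.items():
        reported.append((raised_at + aging_period, alarm_id))
    return sorted(reported)
```

In this sketch a transient alarm that is raised and cleared within the aging period never reaches the report list, while a persistent alarm is reported with a delay equal to the aging period.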
A second technique used in the prior art is "alarm throttling." Using this technique, each NE is allowed to send at most a pre-defined number of alarm messages to the SBNC during a specified time period (for example, 100 alarm messages over a five minute period). All additional alarms produced during the period are suppressed.
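A minimal Python sketch of alarm throttling follows. It assumes a fixed window per NE that opens at the first message received after the previous window ends; the function name `throttle`, the message tuple layout, and the window policy are illustrative assumptions, since the description above does not specify them.

```python
from collections import defaultdict


def throttle(messages, max_per_window, window_seconds):
    """Split time-ordered (time, ne_id, text) messages into forwarded/suppressed.

    Each NE may have at most `max_per_window` messages forwarded within any
    one window of `window_seconds`; further messages in that window are dropped.
    """
    window_start = {}           # ne_id -> start time of the current window
    count = defaultdict(int)    # ne_id -> messages forwarded in current window
    forwarded, suppressed = [], []
    for msg in messages:
        t, ne_id, _text = msg
        # Open a new window for this NE if none exists or the old one expired.
        if ne_id not in window_start or t - window_start[ne_id] >= window_seconds:
            window_start[ne_id] = t
            count[ne_id] = 0
        if count[ne_id] < max_per_window:
            count[ne_id] += 1
            forwarded.append(msg)
        else:
            suppressed.append(msg)
    return forwarded, suppressed
```

Note that the decision depends only on message count, not on content: once an NE has used its quota, every later message in the window is dropped regardless of what it reports.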
A third technique employed in the prior art is "Access Identifier (AID) correlation." Using this technique, multiple alarms generated at the same SONET termination port (AID point) are suppressed so that only the highest severity alarm at the AID point is reported.
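AID correlation can be sketched in Python as a reduction that keeps one alarm per AID. The severity ranking below (CR, MJ, MN, NA, from most to least severe) and the tuple layout are assumptions made for illustration, as are the port names used in the usage note.

```python
# Assumed severity codes, ranked so that a larger number is more severe.
SEVERITY_RANK = {"CR": 3, "MJ": 2, "MN": 1, "NA": 0}


def correlate_by_aid(alarms):
    """Keep only the highest-severity alarm for each AID.

    `alarms` is an iterable of (aid, severity, condition) tuples. When two
    alarms at the same AID have equal severity, the earlier one is kept.
    """
    best = {}   # aid -> most severe alarm seen so far
    for alarm in alarms:
        aid, severity, _condition = alarm
        current = best.get(aid)
        if current is None or SEVERITY_RANK[severity] > SEVERITY_RANK[current[1]]:
            best[aid] = alarm
    return best
```

For example, given a minor alarm indication signal and a critical loss of signal at a hypothetical port "PORT-1", only the loss of signal alarm would be retained for that port.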
While each of these strategies can significantly reduce the number of forwarded alarm messages, each does so at some risk. For example, alarm throttling risks discarding a significant alarm if that alarm follows a period during which many symptomatic alarms were reported. Additionally, each of these techniques may be ineffective for eliminating irrelevant alarms under some conditions.
For example, a failure condition may be detected by an NE and reported as an autonomous message to an associated SBNC. In addition, the NE may alert other NEs to the condition it has detected. In turn, these NEs will send autonomous messages about this condition to their associated SBNCs. Because NEs may be alerted across a number of subnetwork boundaries, associated SBNCs and NSMs may be flooded by alarm messages produced by these NEs. Most of the messages sent are effectively "symptomatic," as they do not directly stem from the failure of interest. Notably, these symptomatic messages provide no additional maintenance-assisting information beyond that provided by the first autonomous message sent by the affected NE.
To address these shortcomings, another strategy has been proposed (see Intelligent Alarm Filtering for SONET, Bellcore Document No. SR-TSV-002672, Issue 1, Mar. 4, 1994). This scheme is illustrated in FIG. 1, where alarm filters 118, 128 are incorporated within SBNCs 116, 126 respectively. According to this Intelligent Alarm Filtering (IAF) scheme, all alarms generated by the NEs are reported to their associated SBNCs. Two classes of failures appear at the SBNCs. Directly Detected Failure Conditions (DDFCs) are considered directly indicative of a failure in the subnetwork. DDFCs indicate equipment failures (failures occurring within an NE) and facility failures (failures associated with facilities that interconnect NEs including, for example, loss of signal, loss of frame, out of frame, loss of pointer, signal label mismatch, automatic protection switching, data communications channel, and synchronization failures). In contrast, Symptomatic Conditions (SCs) are merely symptomatic indications of troubles detected at a reporting NE or at other NEs (for example, as indicated by alarm indication signal, remote failure indication, performance monitoring threshold crossing alert and successful protection switching completion alarms).
According to the Bellcore IAF requirements, each SBNC logs all autonomous messages received from NEs in the subnetwork, and reports all messages indicating a DDFC to the NSM. All messages reporting SCs that can be explained by a reported DDFC must be filtered out and not reported to the NSM. Messages associated with non-explainable SCs continue to be reported to the NSM.
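The reporting rule above can be sketched in Python as follows. This is only an illustration under stated assumptions: the abbreviations LOS, LOF, LOP (loss of signal, frame, pointer) and AIS, RFI (alarm indication signal, remote failure indication) stand in for a few of the conditions listed earlier, the `Message` record and its `path` field are hypothetical, and the `explains` predicate is supplied by the caller because deciding whether a DDFC explains an SC is the difficult part that this sketch does not attempt.

```python
from dataclasses import dataclass

# Illustrative subset of directly detected failure conditions; every other
# condition (for example AIS or RFI) is treated as symptomatic in this sketch.
DDFC_CONDITIONS = {"LOS", "LOF", "LOP"}


@dataclass(frozen=True)
class Message:
    ne: str          # reporting network element
    condition: str   # reported condition, e.g. "LOS" or "AIS"
    path: str        # hypothetical label for the affected path


def iaf_filter(messages, explains):
    """Return (log, reported) for a batch of autonomous messages.

    Every message is logged. DDFC messages are always reported. An SC message
    is reported only if no DDFC in the batch explains it, where
    `explains(ddfc, sc)` is a caller-supplied predicate.
    """
    log = list(messages)
    ddfcs = [m for m in log if m.condition in DDFC_CONDITIONS]
    reported = []
    for m in log:
        if m.condition in DDFC_CONDITIONS:
            reported.append(m)                       # always forward a DDFC
        elif not any(explains(d, m) for d in ddfcs):
            reported.append(m)                       # non-explainable SC
    return log, reported


def same_path(ddfc, sc):
    """Toy stand-in for correlation: a DDFC explains an SC on the same path."""
    return ddfc.path == sc.path
```

With `same_path` as the predicate, an AIS message on a path that also carries an LOS message is logged but not reported, whereas an AIS message on a path with no DDFC is still forwarded to the NSM.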
Because SONET and SDH subnetworks incorporate a large number of multiplexed communications paths, an SBNC must be able to trace the path between an SC message and a DDFC message specifically and directly in order to determine whether the SC is explainable or non-explainable. Tracing requires real-time knowledge of both the interconnections of NEs in the subnetwork and the provisioned cross-connections within the individual NEs. In SONET and SDH subnetworks of current proportion, a single DDFC message can generate thousands of SC messages. Thus, the potential volume of SC messages makes such direct tracing of correlated alarm messages prohibitive. Therefore, an improved method is desired for filtering redundant SC messages without directly tracing each SC to an associated DDFC.
Although correlated alarms may be generated almost simultaneously in affected NEs, alarm messages can be received by an associated SBNC over a widely varying time period (often referred to as the "alarm storm"). Alarms may be effectively correlated only if examined over a time period sufficient to ensure that all related alarms have been received by the SBNC. Therefore, an effective method is desired for establishing an appropriate time period for filtering alarms.