Timing loops are a known problem in the distribution of timing signals across a communications network. In SONET/SDH networks, for example, network elements (“NEs”) may each derive their timing from another NE. A timing loop exists when the chain of timing derivation closes on itself and is therefore isolated from any external reference timing source. In general, a properly configured SONET network will not suffer from timing loops. However, timing loops are sometimes difficult to avoid without sophisticated network management tools.
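The definition above can be illustrated with a minimal sketch. It assumes that each NE's provisioned timing source is available as a simple mapping, that every NE has exactly one entry, and that an external reference is represented by the placeholder name `EXT`. The mapping, the placeholder, and the function name are all hypothetical and are not taken from any particular network management system.

```python
from typing import Dict, List, Optional

# Hypothetical marker for an external reference timing source (e.g. a PRC).
EXTERNAL = "EXT"


def find_timing_loop(timing_source: Dict[str, str], start: str) -> Optional[List[str]]:
    """Follow the chain of timing derivation beginning at `start`.

    Returns the list of NEs that form a timing loop, or None if the chain
    reaches the external reference. Assumes every NE has one entry in
    `timing_source`.
    """
    path: List[str] = []            # NEs visited so far, in order
    position: Dict[str, int] = {}   # NE name -> index in `path`
    node = start
    while node != EXTERNAL:
        if node in position:
            # The chain has returned to an NE already visited: the NEs from
            # that point onward derive timing only from each other.
            return path[position[node]:]
        position[node] = len(path)
        path.append(node)
        node = timing_source[node]
    return None


# Illustrative topology: A and B are traceable to the external reference,
# while C, D and E time each other, and F hangs off that loop.
sources = {"A": EXTERNAL, "B": "A", "C": "D", "D": "E", "E": "C", "F": "C"}
print(find_timing_loop(sources, "B"))
print(find_timing_loop(sources, "F"))
```

In this example, NE `F` is not itself part of the loop, but its timing is still not traceable to the external reference because its chain ends in the loop formed by `C`, `D`, and `E`.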
Although conceptually simple, timing loops tend to be insidious problems in practice. They prevent the affected NEs from being synchronized to the primary reference clock (“PRC”) and cause mysterious bit errors that are difficult to analyze and correct. The clock frequencies become traceable to an unpredictable quantity, namely the hold-in frequency limit of one of the affected NE clocks. By design, this limit is well outside the expected accuracy of the clock after several days in holdover, so performance is almost certain to become severely degraded.
The importance of proper timing distribution and synchronization in a network is illustrated in the following situation. If two pieces of equipment that are synchronized to different clock sources are joined by a trunk, input buffers on the interfaces at each node periodically overflow at one end or underflow at the other end. This overflow or underflow condition is commonly known as a frame slip because an overflow condition usually causes one or more frames of data to be discarded. Clocking problems typically cause frame slips on circuit-line interfaces, especially circuit lines to TDM devices such as a PBX. Frame slips can occur on either or both ends of the line. In a TDM-based network, almost every frame slip causes data to be lost since there is likely to be data contained in at least one timeslot of every frame.
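The relationship between a clock offset and the resulting slip rate can be sketched with simple arithmetic. The sketch assumes the standard TDM frame period of 125 microseconds (8000 frames per second) and a buffer that slips by a whole number of frames. The function names and the default one-frame buffer depth are illustrative assumptions, not details from the text above.

```python
import math

# One TDM frame at 8000 frames per second.
FRAME_PERIOD_S = 125e-6


def seconds_between_slips(fractional_offset: float, buffer_frames: int = 1) -> float:
    """Estimate the time between frame slips on a trunk joining two nodes.

    `fractional_offset` is the relative frequency difference between the two
    clocks (for example 1e-11). `buffer_frames` is the assumed number of
    frames the buffer can absorb before it overflows or underflows.
    """
    offset = abs(fractional_offset)
    if offset == 0.0:
        return math.inf  # perfectly synchronized clocks never slip
    # The phase error grows by `offset` seconds per second, so the buffer
    # drifts by one frame period every FRAME_PERIOD_S / offset seconds.
    return buffer_frames * FRAME_PERIOD_S / offset


def slips_per_day(fractional_offset: float, buffer_frames: int = 1) -> float:
    """Convert the slip interval into an expected number of slips per day."""
    return 86400.0 / seconds_between_slips(fractional_offset, buffer_frames)


for offset in (1e-11, 1e-8, 4.6e-6):
    interval = seconds_between_slips(offset)
    print(f"offset {offset:g}: one slip every {interval:.1f} s")
```

Under these assumptions, an offset of 1e-11 corresponds to one slip roughly every 145 days, whereas an offset of a few parts per million produces a slip every few tens of seconds.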
Isolating the cause of a timing loop condition is difficult for at least two reasons. The first is that the cause is unintentional, e.g., a lack of diligence in analyzing all fault conditions, or an error in provisioning. The second is that there are no sync-specific alarms associated with timing loops, since each affected NE accepts the situation as normal. Consequently, the network administrator must carry out trouble isolation manually, relying on knowledge of the sync distribution topology and on an analysis of slip counts and pointer counts.