A communications network generally comprises a plurality of network elements connected by links, for example optical links or radio links. The communications network may carry many communications paths between end users of the network, with a plurality of communications paths multiplexed onto single links of the network. A communications network is generally managed by an operator at an Operations Centre (OC) site remote from the communications network, through a Data Communications Network (DCN). Performance monitoring of the links of the network may be carried out at elements of the network, with the elements reporting the results of performance monitoring over the DCN. In this way faults on the links can be identified by an operator at the OC and remedial action, such as restoration activity or initiation of repair of the faulty link, may be taken.
Performance monitoring may also be undertaken for the communications paths carried by the network, for example at a terminal network element of a communications path. Where many communications paths are multiplexed onto a link of the network, when a fault occurs on the link of the network, faults may occur on all of the paths multiplexed on that link. If path alarms are raised when a path fault occurs, then a fault on a link will generate many such alarms. This results in alarm floods or surges when a fault occurs on the network and this can impact the DCN and impair the performance of management platforms, such as a management platform responsible for processing alarm reports.
However, alarms from surveillance of communications paths are important for informing a service user and/or a service provider when a service goes down and when a service comes up again. Such alarms are of use when a specific action is required to be taken when a service goes down or up. An example of this might be the service provider sending an E-mail to the service user to inform them when a service goes up and/or down.
One way of overcoming the problem of alarm surges is to turn off reporting of some or all alarms arising from performance monitoring of communications paths. However, there is still a need for a historical log of defects on the communications paths so as to provide after-the-fact information about the impact of network faults on the communications paths as an aid to troubleshooting. Primarily, this log is used by the operator/carrier to provide a history of the quality of service (QoS) provided, to justify charges for use of the service.
Traditionally, such historical logs are stored within the network on a network element, where they are recorded in Performance Monitoring (PM) reports or in text logs, which can be inspected later.
Traditionally, performance monitoring has monitored the incoming signal continuously looking for bit errors in each successive one second time interval over which the signal is received. Where less than 30% of bits in a second are detected as bit errors, an Errored Second (ES) is detected. Where 30% or more of the bits received in one second are detected as bit errors a Severely Errored Second (SES) is detected. Where ten successive SESs occur an Unavailable Time (UAT) is detected. In a 24 hour period data is collected for 96 15 minute reports and one 24 hour report. In relation to a 15 minute report, over a 15 minute period, seconds of received signal containing errors are counted as ES, SES or UAT, so that at the end of each 15 minute period a 15 minute report is generated and stored as a count of ES, SES and UAT seconds within the 15 minute period. In addition a 24 hour report is generated and stored as a count of ES, SES and UAT seconds within a 24 hour period. This performance monitoring process is described in various standards, for example:
    ITU-T Recommendation G.826 (2002), End-to-end error performance parameters and objectives for international, constant bit-rate digital paths and connections;
    ITU-T Recommendation G.828 (2000), Error performance parameters and objectives for international, constant bit-rate synchronous digital paths (See also Corrigendum 1); and
    ITU-T Recommendation G.8201 (2003), Error performance parameters and objectives for multi-operator international paths within Optical Transport Network (OTN).
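The counting scheme just described can be illustrated with a minimal Python sketch. It follows the thresholds as stated in this description (not the full definitions in the cited Recommendations), and the function names, the "OK" label and the treatment of UAT as any run of ten or more successive SESs are assumptions made purely for illustration.

```python
from collections import Counter

SES_THRESHOLD = 0.30      # fraction of errored bits, as stated in the description
UAT_RUN = 10              # successive SESs that trigger unavailable time
REPORT_SECONDS = 15 * 60  # length of one 15 minute report


def classify_second(errored_bits, total_bits):
    """Return 'OK', 'ES' or 'SES' for one second of received signal."""
    if errored_bits == 0:
        return "OK"
    if errored_bits / total_bits >= SES_THRESHOLD:
        return "SES"
    return "ES"


def mark_uat(labels):
    """Relabel every run of ten or more successive SESs as 'UAT' (simplified)."""
    out = list(labels)
    i = 0
    while i < len(out):
        if out[i] != "SES":
            i += 1
            continue
        j = i
        while j < len(out) and out[j] == "SES":
            j += 1
        if j - i >= UAT_RUN:
            out[i:j] = ["UAT"] * (j - i)
        i = j
    return out


def fifteen_minute_reports(labels):
    """Split per-second labels into 15 minute bins and count ES, SES and UAT."""
    reports = []
    for start in range(0, len(labels), REPORT_SECONDS):
        counts = Counter(labels[start:start + REPORT_SECONDS])
        reports.append({k: counts.get(k, 0) for k in ("ES", "SES", "UAT")})
    return reports
```

Note that an error free 24 hour period (86400 seconds) still yields 96 reports from `fifteen_minute_reports`, each with all counts equal to zero.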
Whilst this process produces a fairly succinct log of defect activity within a given period, there are a few shortcomings.
Firstly, the 15 minute and 24 hour reports are still collected even if there are no defects for the period. So over an error free 24 hour period a total of 96 15 minute reports and one 24 hour report will be collected. In this circumstance the PM report log is not succinct.
Secondly, if over a period a small number of defects are detected, it is not easy to see at a glance what periods of continuous defect free time were enjoyed by a given path. In this case the 24 hour report will contain a small number of defects, as will some of the 15 minute reports, so that continuous defect free time has to be pieced together from defect free 15 minute reports. The 24 hour report does not have the granularity required, although it is concise given the period it covers. The 15 minute reports have improved granularity, although not sufficient to determine exactly when a service may have failed. The 15 minute reports have the added disadvantage that they require larger amounts of memory to store.
Thirdly, the precise time that a defect occurred is not known beyond the granularity of the report. For example if a 15 minute report counts several ESs, it is not known when in the 15 minute period they occurred or whether the ESs occurred together or at different times. Precise time can be important in assessing whether a service outage breaches the conditions set out in a service level agreement.
In addition, it is known to make a log of intervals of UAT, for example as a start time and a finish time for each interval of UAT. G.826 (referenced above) states that a period of unavailable time begins at the onset of ten consecutive SES events. These ten seconds are considered to be part of unavailable time. A new period of available time begins at the onset of ten seconds of consecutive non-SES events. These ten seconds are considered to be part of available time.
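The entry and exit criteria quoted above can be sketched as a small state machine. The following Python is an illustrative assumption rather than text from G.826: the function name `uat_intervals`, the boolean per-second input and the half-open (start, end) index convention are all hypothetical.

```python
def uat_intervals(is_ses, run=10):
    """Return (start, end) second indices of unavailable time, end exclusive.

    is_ses is a sequence of booleans, one per second, True for an SES.
    """
    intervals = []
    unavailable = False
    streak = 0    # current run of seconds that oppose the present state
    start = None
    for t, ses in enumerate(is_ses):
        # While available, SESs count towards entry; while unavailable,
        # non-SES seconds count towards exit.
        opposing = ses if not unavailable else not ses
        streak = streak + 1 if opposing else 0
        if streak == run:
            onset = t - run + 1  # the ten seconds belong to the new state
            if not unavailable:
                start = onset
            else:
                intervals.append((start, onset))
            unavailable = not unavailable
            streak = 0
    if unavailable:
        # Still unavailable when the data ends; close the interval there.
        intervals.append((start, len(is_ses)))
    return intervals
```

For instance, twelve SESs starting at second 5, followed by a few short bursts and then ten clean seconds starting at second 24, would be logged as the single interval (5, 24), because shorter clean runs do not end the unavailable period.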
As an alternative to the PM report scheme described above a text log can be utilised in which defects on the incoming signal are recorded as a simple time-stamped textual log, indicating the defect detected/raised. This preserves the detail of the defect but is associated with different problems.
Firstly, logs are typically stored in non-volatile memory so that the log can be recovered even after a power loss on a network element. The available memory for logs is typically limited within a network element. A goal is that it should be possible to store 24 hours of defect activity in a log. The amount of memory required to store 24 hours of defect activity varies depending on the stability of the network. However, it is generally accepted that log memory is unlikely to last 24 hours before information is lost by new text log entries overwriting old ones.
Secondly, when troubleshooting a rapidly toggling defect, a log containing hundreds (even thousands) of alternating "defect raised" and "defect cleared" entries does not help in detailing service performance. The likelihood is that entries in the log holding important information, for example indicating the first onset of the defect, will be lost as a result of new text log entries overwriting old ones.
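The overwriting problem can be illustrated with a short Python sketch of a fixed-capacity text log. The capacity, entry format and function name are hypothetical and chosen only to show how a toggling defect pushes its own first onset out of a bounded log.

```python
from collections import deque


def simulate_text_log(entries, capacity):
    """Keep only the newest `capacity` entries, as a bounded log would."""
    log = deque(maxlen=capacity)  # oldest entries are discarded when full
    for entry in entries:
        log.append(entry)
    return list(log)


# A defect toggling once per second: raised on even seconds, cleared on odd.
entries = [
    (t, "defect raised" if t % 2 == 0 else "defect cleared")
    for t in range(1000)
]
kept = simulate_text_log(entries, capacity=100)
first_onset_lost = entries[0] not in kept
```

Here `first_onset_lost` is true: only the last 100 entries survive, so the time at which the defect first appeared can no longer be read from the log.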
The present invention proposes a method of recording historical service quality information that provides useful, precise information on service quality, yet is succinct enough to be stored using the limited memory normally allocated on a network element for log data.