(1) Field of the Invention
The present invention relates to a fault information management apparatus and a fault information management method which are suitable for a system for monitoring information about faults arising in transmission apparatuses within a network.
(2) Description of the Related Art
In recent years, the degree of complexity and the capacity of a transmission apparatus used in a network, represented by an SDH (Synchronous Digital Hierarchy) transmission network, have increased. In association with this trend in the transmission apparatus, the accuracy of management and monitoring of faults has also been improved. Therefore, it is now essential to process, completely, correctly, and at high speed, a large amount of fault information arising in the transmission apparatuses.
FIG. 21 is a block diagram representing one example of a transmission apparatus monitoring system. As shown in FIG. 21, a transmission apparatus monitoring system 60 is comprised of a monitoring apparatus (network monitoring system (NMS)) 50, and transmission apparatuses (network element (NE)) 51-1 to 51-n (n is a natural number). The monitoring apparatus 50 monitors, in a centralized manner, fault information arising in the transmission apparatuses 51-1 to 51-n through use of a work station (WS), or the like. It is designed so that one monitoring apparatus 50 can monitor a plurality of transmission apparatuses 51-i (where i=1 to n) collectively through a network. The following descriptions will be based on the assumption that an SDH transmission scheme is applied to the above-described monitoring system 60.
As shown in FIG. 22, each transmission apparatus 51-i is comprised of a main signal unit 52, a main signal unit control section 53, and an interface processing section 54 which acts as a fault information management unit. The main signal unit 52 performs various kinds of processing, such as processing for multiplexing electrical signals received from an unillustrated exchange so as to obtain a multiplexed signal and converting it into an optical signal to be transmitted to another transmission apparatus 51-i. The main signal unit 52 is designed so as to be able to detect information regarding transmission failures, faults in the transmission apparatus, and the like that occur during the above-described processing operations.
More specifically, the main signal unit 52 is made up of, e.g., STM1 optical units (STM1 slots) 52a, multiplexing units (TSI slots) 52c, D12 channel units (D12 slots) 52e and 52f, a D12 channel changeover unit (CHSW slot) 52g, and power units (PWR slots) 52h.
Each of the STM1 optical units 52a carries out electricity-to-light conversion. Specifically, the STM1 optical unit 52a converts an electrical signal received from the exchange into an optical signal so that it can be transmitted to another transmission apparatus 51-i and the monitoring apparatus 50. The multiplexing unit (time slot interchange (TSI)) 52c provides line connection between the STM1 optical unit 52a and the D12 channel unit 52e (52f). This multiplexing unit 52c demultiplexes a multiplexed signal (i.e., an STM1 frame) received from the STM1 optical unit 52a so as to obtain separated signals corresponding to a plurality of channels. Each of the thus-separated signals is sent to the corresponding D12 channel unit 52e (or 52f). Conversely, the multiplexing unit 52c multiplexes signals received from the D12 channel units 52e (or 52f) and sends the thus-multiplexed signal to the STM1 optical unit 52a.
Each D12 channel unit 52e is an interface unit capable of handling C12 signals (2 Mb/s), which conform to a PDH (Plesiochronous Digital Hierarchy), for 21 channels (ch). Three D12 channel units 52e are provided in this example so that the C12 signals for up to 63 channels can be multiplexed. The D12 channel unit 52f operates as a replacement when any of the D12 channel units 52e becomes unable to operate because of a breakdown, etc. Namely, the D12 channel unit 52f is provided as a protection unit (or a spare unit) for the D12 channel units 52e.
The D12 channel changeover unit 52g carries out an operation to switch the faulty D12 channel unit 52e to the protection unit 52f, as previously described. The power units 52h supply the electrical current and voltage required to operate the transmission apparatus 51-i.
In the main signal unit 52 shown in FIG. 23, a unit which is identical with the STM1 optical unit 52a can be mounted into a free area (or an unoccupied slot) 52b. For instance, if the STM1 optical unit 52a is mounted into each free area 52b, it will become possible to process up to twice as many signals as the above-described number of channels (2 Mb/s × 63 ch). Each of the above-described STM1 optical units 52a, the TSI units 52c, and the D12 channel units 52e has the capability of detecting faults arising in itself.
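The channel figures quoted above can be checked with a short sketch. The constant names are illustrative only, and the 2 Mb/s value is the nominal per-channel rate given in this description rather than an exact line rate.

```python
# Nominal figures quoted in the description; all names are illustrative.
CHANNELS_PER_D12_UNIT = 21  # C12 channels handled by one D12 channel unit 52e
WORKING_D12_UNITS = 3       # working units 52e (protection unit 52f not counted)
C12_RATE_MBPS = 2           # nominal C12 rate in Mb/s

channels = CHANNELS_PER_D12_UNIT * WORKING_D12_UNITS
payload_mbps = channels * C12_RATE_MBPS
# Upper bound stated in the text when the free areas 52b are also equipped.
doubled_channels = 2 * channels

print(channels, payload_mbps, doubled_channels)
```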
Optical transmission between the transmission apparatuses 51-i is carried out over optical cables, etc. For example, if the transmission apparatus 51-i receives C12 signals (electrical signals for a maximum of 63 channels) complying with the PDH, the transmission apparatus 51-i multiplexes the C12 signals of the channels through the D12 channel units 52e and the TSI units 52c, as shown in FIG. 24, whereby a multiplexed frame is assembled. This multiplexed frame is converted into an optical signal by the STM1 optical unit 52a, and the optical signal is then sent to another transmission apparatus 51-i. An STM1 frame complying with the SDH is used in the optical transmission between the transmission apparatuses 51-i shown in FIG. 24, and therefore the multiplexed frame is sent as an optical signal having a bit rate of approximately 150 Mb/s (nominally 155.52 Mb/s for STM1).
The main signal unit control section 53 shown in FIG. 22 (corresponding to the MPL unit shown in FIG. 23) controls an operating state of each main signal unit 52 through a bus (indicated by arrow A in FIG. 22) by use of an internal CPU (Central Processing Unit; not shown). In this example, the main signal unit control section 53 is designed so as to collect fault information upon reception of a report indicating occurrence of a fault in the main signal unit 52.
The interface processing section 54 establishes connection with the monitoring apparatus 50 through the network. In this example, the interface processing section 54 prepares a fault history on the basis of the fault information which is sent from the main signal unit control section 53 every time a fault arises in the main signal unit 52. The interface processing section 54 sends the fault information in a format complying with the communication scheme of the monitoring apparatus 50 (i.e., a connection standard such as RS-232C or X.25). The interface processing section 54 is comprised of a SAC unit (SAC) 54a and an SV interface control unit (NML) 54b, as shown in FIG. 23.
To this end, the interface processing section 54 is comprised of, for example, an MPL communication processing section 54A, an alarm processing section 54B, TL1 processing sections 54C, a user management section 54D, a 232C communication section 54E, and an X.25 communication section 54F, as shown in FIG. 25.
Upon receipt of fault information (i.e., alarm information; namely, fault occurrence information and fault recovery information) from the main signal unit control section 53, the MPL communication processing section 54A prepares, for example, a mail 54A-i (i=1 to n; n is a natural number), as shown in FIG. 28. By virtue of the mail 54A-i, the MPL communication processing section 54A notifies the alarm processing section 54B of the received alarm information. If the MPL communication processing section 54A has received a plurality of alarm information items, a plurality of mails 54A-i will be prepared, as can be seen from FIG. 28. The thus-prepared mails 54A-i are sent to the alarm processing section 54B while being linked together.
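The linking of the mails 54A-i described above can be pictured as a singly linked chain in which each mail carries one alarm information item. The following is a minimal sketch under that assumption; the `Mail` class and the helper functions are hypothetical and are not taken from the apparatus itself.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class Mail:
    """Hypothetical stand-in for one mail 54A-i carrying one alarm item."""
    alarm: str
    next: Optional["Mail"] = None


def link_mails(alarms: List[str]) -> Optional[Mail]:
    """Prepare one mail per alarm item and chain them in arrival order."""
    head: Optional[Mail] = None
    tail: Optional[Mail] = None
    for alarm in alarms:
        mail = Mail(alarm)
        if tail is None:
            head = mail       # first mail becomes the head of the chain
        else:
            tail.next = mail  # later mails are linked behind the previous one
        tail = mail
    return head


def walk(head: Optional[Mail]) -> Iterator[str]:
    """Receiver side: follow the links and yield each alarm item in order."""
    mail = head
    while mail is not None:
        yield mail.alarm
        mail = mail.next


chain = link_mails(["fault K occurred", "fault L occurred", "fault K recovered"])
received = list(walk(chain))
```

Because every pending alarm item occupies its own mail, the memory held by the chain grows with the number of items that have not yet been processed.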
Upon receipt of the mail 54A-i (i.e., the alarm information) from the MPL communication processing section 54A, the alarm processing section 54B expands and stores the mail 54A-i in resources such as a memory within the interface processing section 54. The alarm processing section 54B prepares a fault history by sequentially registering the alarm information in a table 54G (i.e., a history table). The fault information is reported to the monitoring apparatus 50 on the basis of the thus-prepared fault history. The alarm processing section 54B is provided with a report communication section 54R. With reference to the table 54G, this report communication section 54R acquires alarm information to be reported to the monitoring apparatus 50 and provides such information to the user management section 54D.
The TL1 processing section 54C is prepared by the user management section 54D for every operator (user) who accesses (or logs in to) the user management section 54D through the network using a UNIX machine or the like. For example, if the user enters a password, one TL1 processing section 54C will be assigned to that user. The number of assignable TL1 processing sections 54C depends on the quantity of resources and the performance of the CPU.
The user management section 54D manages the state of communication between the transmission apparatuses 51-i and causes the TL1 processing sections 54C in the number corresponding to the number of users being in an accessed condition to operate. In cooperation with the TL1 processing sections 54C, the user management section 54D can simultaneously manage the communication with the plurality of users (including the monitoring apparatus 50).
The 232C communication section 54E and the X.25 communication section 54F receive the information (i.e., alarm information) from the user management section 54D and send the thus-received information to the monitoring apparatus 50 in a format that complies with the communication scheme of the monitoring apparatus 50. Further, the 232C communication section 54E and the X.25 communication section 54F receive a request for the alarm information from the monitoring apparatus 50, in a format that complies with that communication scheme, and pass the request to the user management section 54D. The information from the 232C communication section 54E and the X.25 communication section 54F and the information from the corresponding TL1 processing section 54C are exchanged via the user management section 54D.
More specifically, the above-described history table 54G has, for example, areas to be filled with information about "LOCATION OF FAULT," "FAULT (NAME OF THE FAULT)," "INFORMATION," "OCCURRENCE TIME," "RECOVERY TIME," "REPORT OF OCCURRENCE," and "REPORT OF RECOVERY," as shown in FIG. 26. When a fault occurs, the above-described areas are sequentially filled with the above-described various kinds of information from the first row.
The log capacity (N) of the history table 54G is set by the monitoring apparatus 50, as required. When the log capacity (N) has been fully filled, the history information may be deleted in chronological order from the oldest information. Alternatively, new history information may be written in the history table from the first row after the old information stored in the history table has been deleted completely.
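The two overflow policies mentioned above can be contrasted in a short sketch. The function names and the list-based table are assumptions made for illustration; the description does not state how the table is actually stored.

```python
from collections import deque


def make_rolling_log(capacity):
    """Policy 1: when the table is full, the oldest entry is dropped first."""
    return deque(maxlen=capacity)


def append_with_reset(log, capacity, entry):
    """Policy 2: when the table is full, delete everything and restart at row 1."""
    if len(log) >= capacity:
        log.clear()
    log.append(entry)


N = 3  # log capacity, set by the monitoring apparatus in the description
rolling = make_rolling_log(N)
reset_log = []
for entry in ["f1", "f2", "f3", "f4"]:
    rolling.append(entry)
    append_with_reset(reset_log, N, entry)

print(list(rolling), reset_log)
```

Under the first policy the table always holds the N most recent entries, whereas under the second policy the older entries are lost all at once when the table wraps.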
The "LOCATION OF FAULT" designates the location of the main signal unit 52 where a fault has occurred. The name, etc., of the transmission apparatus in which the fault occurred is written into the "LOCATION OF FAULT" column after having been converted into a corresponding numerical value. The above-described "FAULT" designates the detail of the fault. In this example, the name of a fault is written into the "FAULT" column after having been converted into a corresponding numerical value. The above-described "INFORMATION (RELATED INFORMATION)" designates features of the detail of the fault. Detailed information, for example, a level (i.e., a facility) and an attribute (i.e., a severity) of a line, and the direction of a signal are written into the "INFORMATION" column after having been converted into numerical values.
The above-described "OCCURRENCE TIME" designates the time at which the fault has occurred. The "RECOVERY TIME" designates the time at which the fault has been corrected. A date and a time are written into each of these columns.
The above-described "REPORT OF OCCURRENCE" designates whether or not the occurrence of a fault has already been reported to the monitoring apparatus 50. If the occurrence of the fault has already been reported to the monitoring apparatus 50, "REPORTED," for example, is written into the "REPORT OF OCCURRENCE" column as information which represents the completion of report of the fault to the monitoring apparatus 50. In contrast, if the occurrence of the fault has not yet been reported to the monitoring apparatus 50, "UNREPORTED," is written into the "REPORT OF OCCURRENCE" column.
The above-described "REPORT OF RECOVERY" designates whether or not the monitoring apparatus 50 has been informed of the correction of the fault. As is the case of the "REPORT OF OCCURRENCE," if the correction of the fault has been reported to the monitoring apparatus 50, "REPORTED" is written into the "REPORT OF RECOVERY" column. In contrast, if the correction of the fault has not yet been reported to the monitoring apparatus 50, "UNREPORTED" is written into the "REPORT OF RECOVERY" column. Usually, fault information in rows whose "REPORT OF OCCURRENCE" or "REPORT OF RECOVERY" column is still filled with "UNREPORTED" is reported to the monitoring apparatus 50. However, if the monitoring apparatus 50 has requested the history, all the details provided on the history table 54G are reported as history information (LOG) to the monitoring apparatus 50.
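The columns of the history table 54G and the two reporting modes described above may be sketched as follows. The field names, the use of booleans for "REPORTED"/"UNREPORTED," and the sample values are assumptions made for illustration. The selection rule follows the text literally (either report column still unreported).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HistoryRow:
    """One row of a history table modeled on FIG. 26 (field names assumed)."""
    location: int                  # "LOCATION OF FAULT", stored as a numerical value
    fault: int                     # "FAULT", stored as a numerical value
    information: int               # "INFORMATION", stored as a numerical value
    occurrence_time: str           # "OCCURRENCE TIME"
    recovery_time: Optional[str] = None  # "RECOVERY TIME", empty until corrected
    occurrence_reported: bool = False    # False corresponds to "UNREPORTED"
    recovery_reported: bool = False      # False corresponds to "UNREPORTED"


def rows_to_report(table: List[HistoryRow],
                   full_history_requested: bool = False) -> List[HistoryRow]:
    """Return the whole log on request, otherwise only rows with an unreported column."""
    if full_history_requested:
        return list(table)
    return [row for row in table
            if not row.occurrence_reported or not row.recovery_reported]


table = [
    HistoryRow(1, 10, 0, "T1", "T2", True, True),    # fully reported
    HistoryRow(2, 11, 0, "T3", "T4", True, False),   # recovery not yet reported
    HistoryRow(3, 12, 0, "T5"),                      # nothing reported yet
]
pending = rows_to_report(table)
full_log = rows_to_report(table, full_history_requested=True)
```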
More specifically, for example, if a fault having name K (hereinafter referred to as fault K) has occurred at time T1, the alarm processing section 54B writes "K" into the "FAULT" area and "T1" into the "OCCURRENCE TIME" area in the X-th row of the history table 54G, as shown in FIG. 27(a). In the state in which the occurrence of a fault has not yet been reported to the monitoring apparatus 50, "UNREPORTED" is written into the "REPORT OF RECOVERY" area as well as into the "REPORT OF OCCURRENCE" area.
When information about the correction of the fault K is reported from the main signal unit control section 53 at time T2, the alarm processing section 54B searches the history table 54G for the registration area related to the fault K in which the occurrence time (T1) has already been written. For example, as shown in FIG. 27(b), "T2" is written into the "RECOVERY TIME" area in the X-th row in which the information about the fault K has been written.
In the case where the occurrence of the fault K has been reported to the monitoring apparatus 50 but the correction of the fault K has not yet been reported to the monitoring apparatus 50, "REPORTED" is written into the "REPORT OF OCCURRENCE" area, and "UNREPORTED" is written into the "REPORT OF RECOVERY" area, as shown in FIG. 27(b). When the fault information is reported to the monitoring apparatus 50, the recovery information (recovery time "T2" and the like) which has not been reported is reported to the monitoring apparatus 50.
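The sequence for the fault K can be traced with a small sketch. The dictionary keys and function names are hypothetical; the row-by-row search reflects the search of the history table described above, not a confirmed implementation.

```python
def register_occurrence(table, fault, time):
    """Write a new row; both report columns start as "UNREPORTED"."""
    table.append({
        "fault": fault,
        "occurrence_time": time,
        "recovery_time": None,
        "report_of_occurrence": "UNREPORTED",
        "report_of_recovery": "UNREPORTED",
    })


def register_recovery(table, fault, time):
    """Search row by row for the open entry of this fault and fill in the time."""
    for index, row in enumerate(table):
        if row["fault"] == fault and row["recovery_time"] is None:
            row["recovery_time"] = time
            return index
    return None  # no matching open entry was found


history = []
register_occurrence(history, "K", "T1")            # state of FIG. 27(a)
history[0]["report_of_occurrence"] = "REPORTED"    # occurrence reported to the NMS
row_index = register_recovery(history, "K", "T2")  # state of FIG. 27(b)
```

After these steps the row holds "T2" as its recovery time, with the occurrence marked "REPORTED" and the recovery still "UNREPORTED," so only the recovery information remains to be sent.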
However, when a fault whose occurrence time has been written into the "OCCURRENCE TIME" area of the history table 54G is corrected, the above-described interface processing section (i.e., fault information management unit) 54 must search the history area in the history table 54G to which a recovery time is written. For this reason, as the log capacity (N) of the history table 54G becomes larger, it takes a longer time to search the history area, thereby resulting in delays in processing.
Further, if the main signal unit control section 53 is removed and inserted due to replacement or the like and is then restarted, there is a possibility that the main signal unit control section 53 finds, as a new fault, the fault that has already been reported to the interface processing section 54 and reports the fault again to the interface processing section 54. In this case, the information about the identical fault is written into the history table 54G of the interface processing section 54 in a duplicated manner. As a result, the identical fault information will be reported to the monitoring apparatus 50 in a duplicated manner.
To prevent the above-described duplication of the fault information, it is only necessary to check whether or not the fault information reported by the main signal unit 52 after restarting of the main signal unit control section 53 has already been written into the history table 54G. However, even in this case, the alarm processing section 54B must search all the fault information provided in the history table 54G. Similarly, as the log capacity (N) becomes larger, it takes a much longer time to search the fault information.
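The growth of the search time with the log capacity (N) can be illustrated by counting the comparisons a row-by-row duplicate check performs. The table layout and the function names are assumptions for illustration; the worst case shown is a fault that is not yet in the table, so every row must be examined.

```python
def is_duplicate(table, location, fault):
    """Scan every row for an open entry of the same fault at the same location.

    Returns (found, comparisons) so the cost of the scan can be observed.
    """
    comparisons = 0
    for row in table:
        comparisons += 1
        if (row["location"] == location and row["fault"] == fault
                and row["recovery_time"] is None):
            return True, comparisons
    return False, comparisons


def build_table(n):
    """Build a table of n open faults at distinct (hypothetical) locations."""
    return [{"location": i, "fault": 1, "recovery_time": None} for i in range(n)]


# Worst case: the reported fault is absent, so all N rows are compared.
costs = {n: is_duplicate(build_table(n), location=-1, fault=1)[1]
         for n in (10, 100, 1000)}
hit = is_duplicate(build_table(10), location=3, fault=1)
```

The number of comparisons in the worst case equals N, which is why the check becomes slower in direct proportion to the log capacity.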
If an abnormality such as stoppage of the communication between the interface processing section 54 and the main signal unit control section 53 or destruction of the history table 54G occurs, information regarding faults that occurred in the main signal units 52 during such an abnormal state is not written into the history table 54G. In such a case, at the time of correction of the fault, it is necessary to search all the fault information provided in the history table 54G in order to check for fault information missing from the history table 54G. Therefore, in this case as well, it takes a longer time to search the fault information as the log capacity (N) becomes larger.
Further, if a breakdown arises in the communication between the transmission apparatuses 51-i and the monitoring apparatus 50, or if any fault arises in the monitoring apparatus 50 itself, the interface processing section 54 sequentially stores in the history table 54G information regarding faults that occurred in the main signal unit 52 during the period of time in which communication with the monitoring apparatus 50 has been impossible. These information pieces, whose "REPORT OF OCCURRENCE" and "REPORT OF RECOVERY" columns are still filled with "UNREPORTED," are collectively reported to the monitoring apparatus 50 at the time of recovery.
Although the fault information provided on the history table 54G is recorded in order of occurrence (i.e., in order of the points in time when the faults occurred), recovery times are recorded in random order. As a result, when the rows are read out in table order, the recoveries are reported in an order differing from the order of the actual points in time when the faults were corrected. To prevent this problem, the interface processing section 54 must carry out complicated sorting operations, thereby resulting in considerable delay in reporting the recovery to the monitoring apparatus 50.
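The ordering problem can be seen in a small example. The table contents and numeric times are invented for illustration; the sort shown is simply one way of restoring recovery order and is not stated to be the method used by the apparatus.

```python
# Rows are stored in order of occurrence; recovery times arrive in any order.
table = [
    {"fault": "A", "occurrence_time": 1, "recovery_time": 9,
     "report_of_recovery": "UNREPORTED"},
    {"fault": "B", "occurrence_time": 2, "recovery_time": 5,
     "report_of_recovery": "UNREPORTED"},
    {"fault": "C", "occurrence_time": 3, "recovery_time": 7,
     "report_of_recovery": "UNREPORTED"},
]

# Recoveries still to be reported once communication is restored.
pending = [row for row in table
           if row["recovery_time"] is not None
           and row["report_of_recovery"] == "UNREPORTED"]

# Reading the table top to bottom yields occurrence order, not recovery order.
table_order = [row["fault"] for row in pending]

# An extra sort over all pending rows is needed to report in recovery order.
recovery_order = [row["fault"]
                  for row in sorted(pending, key=lambda r: r["recovery_time"])]

print(table_order, recovery_order)
```

The additional sort is the step that grows more expensive as the number of unreported rows accumulated during the outage increases.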
If the MPL communication processing section 54A of the interface processing section 54 receives a report (i.e., alarm information) from the main signal unit control section 53, as previously described with reference to FIG. 28, the thus-received alarm information is sent to the alarm processing section 54B by means of the mail 54A-i. However, if a large number of reports are received from the main signal unit control section 53, a large number of mails 54A-i are sent to the alarm processing section 54B while being linked together. As a result, memory resources for storing (or expanding) the mails 54A-i are exhausted due to the delays in such processing as previously described; i.e., processing from the preparation of a history carried out by the alarm processing section 54B to the report of the alarm information to the monitoring apparatus 50.
More specifically, in the above-described interface processing section 54, the alarm processing section 54B carries out the preparation of the history table 54G and the report of the fault information to the monitoring apparatus 50 as a single series of operations. Consequently, if the preparation of the history table 54G is delayed, the report of fault information to the monitoring apparatus 50 will also be delayed. Therefore, storage resources for holding the mails 54A-i are exhausted, which makes it difficult to completely notify the alarm processing section 54B of the information about all of the faults reported by the main signal unit control section 53.
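The exhaustion of the storage resources can be illustrated with a simple tick-based simulation. The fixed-size pool, the rates, and the treatment of a mail that cannot be stored as lost are all modeling assumptions made here; the description states only that the resources are exhausted and that complete notification becomes difficult.

```python
from collections import deque


def simulate(arrivals_per_tick, handled_per_tick, pool_size, ticks):
    """Return (backlog, lost) after the given number of ticks.

    Mails arrive at a fixed rate, are held in a bounded pool, and are removed
    only as fast as the serial history-and-report processing allows.
    """
    pool = deque()
    lost = 0
    for _ in range(ticks):
        for _ in range(arrivals_per_tick):
            if len(pool) < pool_size:
                pool.append("mail")
            else:
                lost += 1  # assumption: no memory left to hold this mail
        for _ in range(min(handled_per_tick, len(pool))):
            pool.popleft()
    return len(pool), lost


# Processing keeps pace with arrivals: the pool never fills.
keeps_up = simulate(arrivals_per_tick=3, handled_per_tick=3, pool_size=10, ticks=20)

# Processing is delayed: the pool fills and later mails cannot be stored.
falls_behind = simulate(arrivals_per_tick=3, handled_per_tick=1, pool_size=10, ticks=20)
```

In the second case the backlog stays pinned near the pool size and the count of mails that could not be stored keeps rising for as long as the processing delay persists.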