Field of the Invention
The present invention relates generally to telecommunications system fault location, and more particularly relates to a system and method for telecommunications system fault diagnostics employing a neural network.
Telecommunication systems are generally complex electrical systems which are subject to failure from a variety of fault modes. The rapid and accurate classification and isolation of a fault within a telecommunications system is desired to minimize dispatch and repair costs associated with such faults. Therefore, it is a long-standing objective within the telecommunications industry to provide a system which can use measured data to automatically diagnose one of several failure modes.
The accurate diagnosis of faults within a telecommunications system is hampered by the limited accessibility of test points within the system as well as the complex relationships between faults and measurable system parameters. An automated line test system (LTS) that is currently used to perform this function is illustrated in FIG. 1. In the LTS of FIG. 1, a remote test unit (RTU) 2 is employed at each local exchange (EX) 4 within the telecommunication system. The RTU 2 is a hardware device which generates test signals. These test signals are introduced into the system through the EX 4. The test signals propagate through a main distribution frame (MDF) 6 and into the telecommunications lines 8. The signals typically pass through a cross connect switch 10, to one or more distributing points (DP) 12. Ultimately the signals reach various customer apparatus (CA) 14 such as a modem, facsimile machine, telephone handset and the like.
The telecommunications system, when operating normally, exhibits characteristic parameters in response to the RTU 2 test signal. These parameters include voltage values, current values, resistance values, capacitance values and the like. The RTU 2 samples and evaluates these parameters through the use of software. During a fault condition, these parameters change in response to a given fault.
The diagnostic software 16 implements a simple heuristic algorithm. The algorithm includes decision rules which compare one or more measurements with threshold values predetermined by an engineer to determine whether a fault exists. As an example, the algorithm may compare measured resistance values between a pair of lines against a set of expected threshold values which are stored in the program to decide whether a fault exists in either an exchange 4 or customer apparatus 14. The algorithm uses linear decision rules to perform these functions.
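A fixed-threshold decision rule of the kind described above might be sketched as follows. This is only an illustration: the function name, the threshold values, and the result strings are hypothetical and are not taken from the LTS.

```python
# Hypothetical engineer-set thresholds in ohms; illustrative values only.
LOW_RESISTANCE_OHMS = 1_000.0
HIGH_RESISTANCE_OHMS = 1_000_000.0


def diagnose_by_threshold(line_resistance_ohms: float) -> str:
    """Compare one measured resistance against fixed, pre-stored thresholds."""
    if line_resistance_ohms < LOW_RESISTANCE_OHMS:
        return "fault suspected (low resistance)"
    if line_resistance_ohms < HIGH_RESISTANCE_OHMS:
        return "fault suspected (marginal resistance)"
    return "no fault detected"
```

Because the thresholds are constants in the program, changing the decision boundaries in such a scheme requires editing the stored values by hand.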
The LTS is also capable of recording the measured parameters in a database 18 for future reference. Additionally, the LTS has the capability of accepting manually entered data regarding each fault from an operator via a keyboard. This information may include customer fault reports and service personnel “clear off” codes indicating the actual location of a fault. In this way, a large amount of data is assembled regarding fault history and parameter values associated with various fault locations. However, the LTS is unable to use this data to improve its own operation. If desired, the data stored in database 18 may be evaluated by an engineer periodically and the decision thresholds employed by the algorithm may be manually updated. This is an extremely labor-intensive, and therefore expensive, operation. Therefore, it is a long-standing objective in the field of telecommunication system diagnostics to develop a system which can overcome this limitation.
In diagnostic and fault location systems unrelated to telecommunications systems, neural networks have been employed to improve system performance. A neural network is a data processing system largely organized in parallel. The neural network includes a collection of processing elements, or neurons, which are interconnected with one another. The various connections are known as neuronal interconnects. The network is typically formed with an input layer of neurons, an output layer of neurons and one or more hidden layers of neurons.
An important characteristic of neural networks is that they can “learn” by means of neural network training. During training, previously acquired measurement data is applied to the neural network input layer. An error signal is generated at the output layer and is back propagated through the hidden layers of the network. During this operation, the various weights associated with each neuronal interconnect are adjusted to minimize the error signal. If sufficient data is applied to the neural network, the neural network is able to classify unknown objects according to parameters established during training.
In U.S. Pat. No. 5,440,566 to Spence et al., a neural network is employed to perform fault detection and diagnosis for printed circuit boards. The neural network disclosed in the '566 patent is used to process thermal image data from an energized printed circuit board. The neural network is trained by applying data from a printed circuit board with known faults to the network. Once trained, the neural network is then able to analyze new data and classify the new data into one of a plurality of printed circuit board faults.
In U.S. Pat. No. 5,537,327 to Snow et al., a neural network is used in connection with a method and apparatus for detecting high impedance faults in electrical power transmission systems. The system disclosed in the '327 patent employs a trained neural network to evaluate fast Fourier transforms (FFT) of continuously acquired current measurements. The neural network continuously monitors the FFT data and activates a fault trigger output in the event a high impedance fault is detected.
In general, neural networks can be viewed as a powerful approach to representing complex nonlinear discriminant functions in the form y_k(x; W_k) where x is an input parameter and W_k is an optimizing parameter within the neural network. One form of neural network is referred to as a multilayer perceptron (MLP) network. The topology of an MLP neural network is illustrated in FIG. 2. The MLP network includes an input layer 24, an output layer 26 and at least one hidden layer 28. These layers are formed from a plurality of neurons 22. The input layer 24 receives input parameters and distributes these parameters to each neuron 22 in the first hidden layer 28. The hidden layers 28 process this data and establish probability estimates for each of a plurality of output neurons which make up the output layer 26.
Within the MLP network, each single neuron 22 is a discrete processing unit which performs the discriminant function by first performing a linear transformation and then a nonlinear transformation on the input variable x as follows:
\[ u_k = \phi\left(W_k^{T} x + w_{k0}\right) = \phi\left(\sum_{j=1}^{d} w_{kj} x_j + w_{k0}\right) \qquad \text{(Eq. 1)} \]
where φ is a nonlinear function having the form:
\[ \phi(\nu) = \frac{1}{1 + \exp(-\nu)} \qquad \text{(Eq. 2)} \]
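As an illustrative sketch of the single-neuron computation described above, the linear transformation followed by the logistic nonlinearity could be written as below. The function names are hypothetical, and the bias is passed as a separate argument purely for readability.

```python
import math


def sigmoid(v: float) -> float:
    """Logistic nonlinearity: 1 / (1 + exp(-v))."""
    return 1.0 / (1.0 + math.exp(-v))


def neuron_output(weights, bias, inputs):
    """Weighted sum of the inputs plus a bias, passed through the sigmoid."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(activation)
```

A zero activation maps to an output of 0.5, and large positive or negative activations saturate towards 1 or 0 respectively.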
The general network function for the MLP neural network of FIG. 2 is as follows:
\[ y_k = \phi\left\{ \sum_{s} w_{ks}^{(2)} \, \phi\left[ \sum_{q} w_{sq}^{(1)} \, \phi\left( \sum_{j} w_{qj}^{(0)} x_j + w_{q0}^{(0)} \right) + w_{s0}^{(1)} \right] + w_{k0}^{(2)} \right\} \qquad \text{(Eq. 3)} \]
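The nested network function can be read as the same layer operation applied three times in succession. A minimal sketch follows, assuming each layer is represented as a hypothetical `(weights, biases)` pair in which every row of `weights` feeds one neuron.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def layer_forward(weights, biases, inputs):
    """Apply one layer: each row of `weights` plus its bias drives one neuron."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def mlp_forward(layers, x):
    """Feed x through every layer in turn; the last result is the output vector."""
    signal = x
    for weights, biases in layers:
        signal = layer_forward(weights, biases, signal)
    return signal
```

With three entries in `layers` (two hidden layers and an output layer), the loop reproduces the three nested applications of the nonlinearity in the network function.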
Once a network topology is established, it is necessary to “train” the network by applying previously collected training data to the input layer 24 and output layer 26 of the neural network. Optimal network parameters, or interneural weights, are estimated from this known training data. Preferably this is accomplished using a back propagation method. In this process, known data is applied to the input of the neural network and is propagated forward by applying the network equation as previously stated in equation 3. The input data results in output vectors for each layer of the MLP network. The output vectors are evaluated for all output neurons and are propagated backward to determine errors for the hidden neurons. During this process, the weights associated with each interneural link are adjusted to minimize the resultant errors. This process is iterated until the weights stabilize over the set of training data.
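The forward and backward passes described above can be sketched for a deliberately reduced case. The sketch assumes a single hidden layer, a squared-error criterion, one training pattern per update, and a bias stored as the last weight in each row. None of these choices, nor the function names, come from the text.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def forward(x, w_hidden, w_out):
    """Forward pass; the last weight in each row multiplies a constant 1.0 (bias)."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x + [1.0]))) for row in w_hidden]
    output = [sigmoid(sum(w * h for w, h in zip(row, hidden + [1.0]))) for row in w_out]
    return hidden, output


def backprop_step(x, target, w_hidden, w_out, learning_rate=0.5):
    """One forward/backward pass for one pattern; weights are updated in place."""
    hidden, output = forward(x, w_hidden, w_out)
    # Output error terms: squared-error derivative passed through the sigmoid.
    delta_out = [(y - t) * y * (1.0 - y) for y, t in zip(output, target)]
    # Hidden error terms: output errors propagated backward through w_out.
    delta_hidden = [
        h * (1.0 - h) * sum(d * row[s] for d, row in zip(delta_out, w_out))
        for s, h in enumerate(hidden)
    ]
    for row, d in zip(w_out, delta_out):
        for s, h in enumerate(hidden + [1.0]):
            row[s] -= learning_rate * d * h
    for row, d in zip(w_hidden, delta_hidden):
        for j, xj in enumerate(x + [1.0]):
            row[j] -= learning_rate * d * xj
```

Calling `backprop_step` repeatedly over the training patterns corresponds to the iteration described above, which continues until the weights stabilize.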
The error function within the neural network may be defined by a sum of square difference function between the desired output, o_k(n), and the network's actual output, y_k(n). This equation may be stated as:
\[ E(W) = \frac{1}{2N} \sum_{n=1}^{N} \sum_{k=1}^{C} \left[ o_k(n) - y_k(n) \right]^{2} \qquad \text{(Eq. 4)} \]
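A direct transcription of this sum-of-squares error into code might look like the sketch below, where `desired` and `actual` are hypothetical lists holding one output vector per training pattern.

```python
def sum_squared_error(desired, actual):
    """E = 1/(2N) * sum over patterns and outputs of (desired - actual)^2."""
    n_patterns = len(desired)
    total = 0.0
    for o_n, y_n in zip(desired, actual):
        total += sum((o - y) ** 2 for o, y in zip(o_n, y_n))
    return total / (2.0 * n_patterns)
```

For example, two patterns with two outputs each, all off by 0.5, give a total squared difference of 1.0 and therefore an error of 0.25.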
To minimize this error function, a gradient descent method well known in the art may be used. In applying the gradient descent method, the adjustment made to a weight (Δw) at iteration n+1 is proportional in size, yet opposite in direction, to the partial derivative of the error function with respect to the weight at the previous iteration (n). This can be stated as:
\[ \Delta w_{kj}^{(1)}(n+1) = -\eta \, \frac{\partial E(n)}{\partial w_{kj}^{(1)}(n)} \qquad \text{(Eq. 5)} \]
where η is a small positive constant which is denoted as the learning rate. The learning rate is a critical parameter within the MLP network. Selecting η to be too large may cause the network to become unstable or oscillatory. On the other hand, if η is too small, the network's learning performance will be slow. To balance these effects, a portion of the previous delta weight is added to the current delta weight to give the following generalized delta rule:
\[ \Delta w_{kj}^{(1)}(n) = -\eta \, \frac{\partial E(n)}{\partial w_{kj}^{(1)}(n)} + \alpha \, \Delta w_{kj}^{(1)}(n-1) \qquad \text{(Eq. 6)} \]
where α is a small positive constant which is denoted as the momentum.
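The generalized delta rule can be illustrated on a single scalar weight. In the sketch below the function names, the default learning rate and momentum, and the quadratic test error are all assumptions chosen for illustration.

```python
def momentum_step(gradient, previous_delta, learning_rate=0.1, momentum=0.5):
    """New delta = -(learning rate) * gradient + (momentum) * previous delta."""
    return -learning_rate * gradient + momentum * previous_delta


def minimise(grad_fn, w, steps=200, learning_rate=0.1, momentum=0.5):
    """Repeatedly apply the delta rule to one scalar weight."""
    delta = 0.0
    for _ in range(steps):
        delta = momentum_step(grad_fn(w), delta, learning_rate, momentum)
        w += delta
    return w


# Illustrative use: E(w) = (w - 3)^2 has gradient 2 * (w - 3) and its minimum at w = 3.
best_w = minimise(lambda w: 2.0 * (w - 3.0), w=0.0)
```

Setting `momentum=0.0` reduces the update to plain gradient descent.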
A second neural network topology known in the prior art is referred to as a radial basis function (RBF) network. A typical RBF network is illustrated in FIG. 3. The RBF network models discrimination functions by performing a non-linear transformation on a linear combination of a set of local kernels or basis functions as follows:
\[ y_k(x) = \phi\left( \sum_{j} w_{kj} \, g_j(x) + w_{k0} \right) \qquad \text{(Eq. 7)} \]
where φ is the same as that in Equation 2 and g_j is a Gaussian basis function of the form
\[ g_j(x) = \exp\left( -\frac{\lVert x - \mu_j \rVert^{2}}{2 \sigma_j^{2}} \right) \qquad \text{(Eq. 8)} \]
where μ_j are centre vectors of the network and σ_j are widths associated with the network.
The pictorial diagram of FIG. 3 represents the above formulae graphically. The output from a hidden or RBF node is determined by the distance in Equation 8 from an input vector x to a centre or pattern vector xcexcj. The basis functions are combined and transformed at the output layer.
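A minimal sketch of one RBF output node follows. It assumes the conventional Gaussian kernel with a negative exponent and a logistic output nonlinearity. The function and argument names are hypothetical.

```python
import math


def gaussian_basis(x, centre, width):
    """exp(-||x - centre||^2 / (2 * width^2)); equals 1.0 when x is at the centre."""
    squared_distance = sum((xi - ci) ** 2 for xi, ci in zip(x, centre))
    return math.exp(-squared_distance / (2.0 * width ** 2))


def rbf_output(x, centres, widths, weights, bias):
    """One output node: logistic function of the weighted basis responses."""
    hidden = [gaussian_basis(x, c, s) for c, s in zip(centres, widths)]
    activation = sum(w * g for w, g in zip(weights, hidden)) + bias
    return 1.0 / (1.0 + math.exp(-activation))
```

Each hidden node responds most strongly to inputs near its own centre vector, which is why the basis functions are described as local kernels.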
For computational efficiency, a Moody-Darken learning method known in the art may be used for the optimization of network parameters. In this method, the network training involves both unsupervised and supervised stages. In the unsupervised learning stage, the centre vectors, μ_j, are determined by using an adaptive K-means clustering algorithm. The widths, σ_j, are estimated based on the distances between each centre vector and its nearest neighbours. The second, supervised learning stage determines the weights from the hidden layer to the output layer using a gradient descent method similar to that for MLP networks discussed above.
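The unsupervised stage can be sketched as follows. Two simplifications are assumed: a plain batch k-means loop stands in for the adaptive variant named above, and each width is set to the distance from a centre to its single nearest neighbouring centre. The supervised second stage is not shown, and all names are hypothetical.

```python
import math


def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def k_means(points, centres, iterations=20):
    """Plain batch k-means; returns the refined centre vectors."""
    centres = [list(c) for c in centres]
    for _ in range(iterations):
        clusters = [[] for _ in centres]
        for p in points:
            nearest = min(range(len(centres)),
                          key=lambda j: squared_distance(p, centres[j]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centre
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return centres


def nearest_neighbour_widths(centres):
    """Width of each centre = distance to its nearest other centre (needs >= 2 centres)."""
    return [
        min(math.sqrt(squared_distance(c, other))
            for i, other in enumerate(centres) if i != j)
        for j, c in enumerate(centres)
    ]
```

The resulting centres and widths would then be held fixed while the hidden-to-output weights are fitted by gradient descent.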
In accordance with a first aspect of the present invention, there is provided a diagnostic system for locating faults within a telecommunications system, the diagnostic system comprising:
a remote test unit, the remote test unit being operatively coupled to the telecommunications system and obtaining parametric data therefrom; and
a neural network, the neural network being responsive to the parametric data from the remote test unit, classifying the parametric data to at least one of a plurality of fault locations, and generating an output signal indicative of the fault location.
In accordance with a second aspect of the present invention, there is provided a method of locating faults within a telecommunications system, the method comprising the steps of:
a) measuring a plurality of parameters associated with the telecommunications system;
b) normalizing the measured parameters; and
c) classifying the normalized parameters as probabilities associated with a plurality of fault locations.
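Steps a) to c) might be arranged as in the following sketch. The min-max scaling scheme, the three location labels, the final rescaling of scores, and the `classifier` callable (a stand-in for a trained network) are all assumptions made for illustration. The text does not specify them.

```python
# Hypothetical labels for the candidate fault locations.
FAULT_LOCATIONS = ("exchange", "line", "customer apparatus")


def normalise(measurements, lower_bounds, upper_bounds):
    """Min-max scale each measured parameter into [0, 1] (assumed scheme)."""
    return [
        (m - lo) / (hi - lo)
        for m, lo, hi in zip(measurements, lower_bounds, upper_bounds)
    ]


def locate_fault(measurements, lower_bounds, upper_bounds, classifier):
    """Normalise the measured parameters, classify them, and label the scores."""
    scores = classifier(normalise(measurements, lower_bounds, upper_bounds))
    total = sum(scores)
    # Rescale so the reported values sum to one (an assumption for readability).
    return {loc: s / total for loc, s in zip(FAULT_LOCATIONS, scores)}
```

Any callable that maps a normalised parameter vector to one non-negative score per location can be passed as `classifier`.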
The present invention provides an apparatus and method which achieve improved fault location in a telecommunications system.
The present invention typically provides a system which uses previous fault data and present measured data to diagnose faults within a telecommunications system.
The present invention typically provides a system which can accurately classify a fault mode in a telecommunications system.
The present invention also typically provides a system which can use previous fault data to alter the boundaries of fault decisions within a telecommunications diagnostic system.
The telecommunications fault diagnostic system is formed having a remote test unit (RTU) operatively coupled to a neural network. The RTU, which is conventional in the field of telecommunications diagnostics, is operatively coupled to a telecommunications system through a local exchange. The RTU generates test signals and measures system parameters such as resistance, capacitance, voltage, etc.
The neural network is operatively coupled to the RTU and receives the system parameter data therefrom. The neural network is a trained, and dynamically trainable, processing system which is formed from a plurality of interconnected processing units, or neurons. The neurons are organized in one or more processing layers. System parameter data is applied to a first processing layer, or input layer. From the input layer, data is distributed to one or more hidden layers of neurons. Based on weights, which are learned by the neural network during “training,” each neuron makes a decision on the data which it receives. The decisions from the interconnected neurons are applied to an output layer which assigns the final probability for each fault type and location. Exemplary outputs from the output layer include the respective probabilities for a fault being located in one of the exchange, the lines, or the customer apparatus within a telecommunication system.
The neural network is “trained” using historical fault data which is collected from an RTU and stored in a database. By evaluating many measurements, along with associated fault types and locations, the neural network is able to assign the proper weights to attribute to each neuronal interconnect within the neural network. The neural network may also be easily retrained to adapt to new data.
These and other objects, features and advantages of the present invention will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings.