1. Field of the Invention
The present invention relates to a method for detecting anomalies in a digitized signal.
2. Discussion of the Related Art
In industrial processes or during the operation of machines, it may be useful to monitor whether operation is normal or abnormal. For example, in an industrial process, several sensors are placed at various points of a manufacturing line to record quantities such as flow rates, pressures, and temperatures, and to follow their variations. In an automobile, a plane, or a rocket, for example, sensors may be arranged on or near various propulsion elements to analyze their operating characteristics. Here again, the sensors may measure flow rate, temperature, pressure, speed, etc. Similarly, it may be desired to analyze the characteristics of a product being manufactured, for example in the chemical, pharmaceutical, or agricultural-produce industries. The occurrence of possible structural anomalies in a building or a structure may also be monitored by vibration sensors. In all these cases, sensors provide a continuous analysis of several manufacturing, operation, structure, or formulation parameters.
More generally, it is of interest to detect anomalies in any signal likely to fluctuate or in any computerized transmission of digitized data, such as passenger traffic or mobile telephony traffic indicators.
In the state of the art, a known method for detecting anomalies in a signal consists of performing many preliminary tests, storing a large number of signals, analyzing the operation of the associated process, identifying signals including one or several anomalies corresponding to malfunctions of the process, and storing normal signals, which do not include these anomalies and correspond to a normal process operation. An iterative calculation based on an artificial neural network enables learning to discriminate a signal including anomalies from a normal signal. Abnormal signal phases can then be identified by using this experimental learning.
A major drawback of this method is that the learning is very slow and requires a large number of tests and a large number of example signals containing the anomalies to be detected. In particular, such a method cannot be used to analyze anomalies or signals that occur only rarely, for example the first instants of a rocket launch, or signals that only very exceptionally include anomalies, for example nuclear plant cooling circuit monitoring signals.
Another disadvantage of this method is that the calculation program of the neural network is specific to the analyzed signal and cannot be applied to detecting an anomaly in a signal of another nature. Thus, for each signal to be analyzed, a specific calculation program must be written, with the corresponding programming time.
An object of the present invention is to provide a method for detecting anomalies overcoming the disadvantages of prior art methods.
A more specific object of the present invention is to provide such a method that is applicable to detecting anomalies in signals for which no previous sample of anomaly is available.
Another object of the present invention is to provide such a method in which anomalies of a signal can be discovered by automatic unsupervised learning, without using previous tests on possible malfunctions of the process with which the signal is associated, and without requiring the user to provide examples or lists of possible anomalies.
Another object of the present invention is to provide such a method in which an initial learning of signal characteristics can be performed by an unsupervised automatic parameterizing to then detect anomalies by a statistical analysis or by comparison with memorized types of anomalies.
The present invention applies to a continuous signal as well as to an oscillating signal. This signal may originate in real time from a sensor recording a system variable (temperature, pressure, vibration, noise, spectrographic analysis signal, x-ray analysis signal, etc.). The signal may also be a signal recorded by such a sensor and stored in a digital databank. It may also be another type of signal stored in a digital databank, for example, as indicated, a signal characterizing the evolution of a data sequence representing any information whose variations are to be analyzed, for example data resulting from statistical analyses or from passenger or vehicle traffic indicators.
Generally, according to the present invention, complex geometric characteristics, in frequency or time, of an initial signal portion are statistically analyzed so that an anomaly can subsequently be recognized on any later portion of the same signal. This later portion may also be a separate sequence of a signal of the same type. For example, if the takeoff of a rocket has been analyzed, the information gathered for each of the considered signals during the analysis of this first rocket may be used to set the initial analysis parameters for the takeoff of the next rocket.
More specifically, to achieve the above-mentioned objects, the present invention provides a method for detecting anomalies in a digitized complex signal analyzed by a detection unit, including a machine learning step including a parameterizing of an automatic compression system, and a step of diagnosis of the intensity and/or the rarity of an anomaly,
the learning including the steps of:
1.1 selecting a succession of sequences of values of the analyzed signal corresponding to a succession of time windows (Fk);
1.2 transforming the signal of each of the windows to extract therefrom characteristics of a type easily extracted by a human eye to form a first vector (Dk) of dimension n; and
1.3 reducing number n of digital data by an automatic compression of the first vector (Dk) to provide a second vector with coordinates substantially independent in probabilistic terms, of dimension p smaller than n;
the diagnosis including the steps of:
2.1 applying steps 1.1 to 1.3 to a polling window (Fk) likely to include an anomaly;
2.2 comparing the obtained vector with a reference defined according to the same transformation and compression structure.
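Steps 1.1 to 1.3 and 2.1 above can be illustrated by the following minimal Python sketch. It is only one possible reading of the method: it assumes NumPy, takes the FFT magnitude spectrum as the transformation of step 1.2, and uses a principal component analysis (computed by singular value decomposition) as the automatic compression of step 1.3. The function names and the window width, step, and p values are hypothetical. Note that a principal component analysis yields uncorrelated coordinates, which only approximates the "substantially independent" coordinates mentioned above.

```python
import numpy as np

def windows(signal, width, step):
    """Step 1.1: cut the signal into a succession of time windows Fk."""
    starts = range(0, len(signal) - width + 1, step)
    return np.array([signal[s:s + width] for s in starts])

def features(window):
    """Step 1.2: first vector Dk (here, the FFT magnitude spectrum)."""
    return np.abs(np.fft.rfft(window))

def learn(signal, width, step, p):
    """Steps 1.1 to 1.3: fit a compression from dimension n down to p."""
    D = np.array([features(w) for w in windows(signal, width, step)])
    mean = D.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(D - mean, full_matrices=False)
    basis = vt[:p]
    compressed = (D - mean) @ basis.T  # second vectors, one row per window
    return mean, basis, compressed

def compress(window, mean, basis):
    """Step 2.1: same transformation and compression on a polling window."""
    return (features(window) - mean) @ basis.T
```

The vector returned by `compress` for a polling window can then be compared with a reference built from the rows of `compressed`, as in step 2.2.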
According to an embodiment of the present invention, the transformation intended for extracting signal characteristics associated with the human eye vision system is selected from the group including a fast Fourier transform (FFT), a transform on a Gabor-wavelet base, a maxima and/or minima extraction, and the like.
According to an embodiment of the present invention, the reference is an anomaly of predefined type of a signal such as a hump, a hiccup, a jolt, a trend change, a frequency shift or the like and an anomaly diagnosis signal is provided when there is a coincidence between the obtained vector and the reference.
According to an embodiment of the present invention, the reference results from a histogram of each coordinate of the second vector and an anomaly diagnosis is provided when a signal analyzed during a polling window deviates from said reference.
According to an embodiment of the present invention, applied to a digitized vibrating signal,
the learning includes the steps of:
3.1 selecting a succession of sequences of values of the analyzed signal corresponding to a succession of time windows (Fk);
3.2 calculating, for each time window Fk, a first vector (Dk) of dimension n representing the spectral density of the analyzed signal; and
3.3 reducing number n of digital data by an automatic compression of the first vector (Dk) to obtain a third spectral density vector with independent coordinates (IDk) and of dimension p smaller than n;
the diagnosis includes the steps of:
4.1 applying steps 3.1 to 3.3 to a polling window (Fk) likely to include an anomaly.
According to an embodiment of the present invention, the method further includes:
during the learning, the step of calculating, for j varying from 1 to p, the histogram Hj of each coordinate of the third vectors (IDk), calculating for each of these coordinates the probability Pj(a) for this coordinate to be greater than a (if a is greater than the median of histogram Hj) or smaller than a (if a is smaller than the median of histogram Hj), and determining a function Zj(a) = −log[Pj(a)],
during the diagnosis, the steps of:
4.2 calculating the sum over j, R = Σ Zj(IDkj), for this polling window (Fk); and
4.3 comparing said sum (R) with an intensity or rarity threshold predefined by the user.
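The rarity score of steps 4.2 and 4.3 can be sketched as follows in Python. This is a hedged illustration, not the patented implementation: it assumes NumPy, estimates Pj(a) directly from the sorted learning samples rather than from a binned histogram, and floors the count at 0.5 so that a value never seen during learning does not produce log(0). The function names and the floor value are hypothetical.

```python
import numpy as np

def learn_tails(ID):
    """ID: array (num_windows, p) of compressed vectors from the learning."""
    return np.sort(ID, axis=0), np.median(ID, axis=0)

def rarity(x, sorted_cols, medians):
    """Return R, the sum over j of Zj(x[j]) = -log Pj(x[j])."""
    m = sorted_cols.shape[0]
    total = 0.0
    for j, a in enumerate(x):
        col = sorted_cols[:, j]
        if a > medians[j]:
            # Number of learning samples strictly greater than a.
            count = m - np.searchsorted(col, a, side="right")
        else:
            # Number of learning samples strictly smaller than a.
            count = np.searchsorted(col, a, side="left")
        prob = max(count, 0.5) / m  # assumed floor, avoids log(0)
        total += -np.log(prob)
    return total
```

A polling window would be flagged when `rarity(...)` exceeds the threshold chosen by the user; values far in either tail of any coordinate raise R.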
According to an embodiment of the present invention, the method further includes a step of smoothing the first vector (Dk) to define a second vector of smoothed spectral density (LDk) and applying steps 3.2 and 3.3 to the vector (LDk).
According to an embodiment of the present invention, the method further includes the steps of:
determining a spectral noise (Bk = Dk − LDk);
calculating, for each spectral noise (Bk), an apse vector (EXTRk), the coordinates of which are greater than a value a·σk, where a is a predetermined integer greater than 4 depending on the detection unit used and σk is the standard deviation of the spectral noise (Bk); and
processing this vector like said second vector.
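The smoothing, spectral noise, and extrema extraction described above can be illustrated by the short Python sketch below. It assumes NumPy, uses a sliding average as the smoothing, and interprets the vector EXTRk as the spectral noise Bk with every coordinate not exceeding a·σk set to zero; that interpretation, the function names, and the default values (a = 5, window width 5) are assumptions made for illustration.

```python
import numpy as np

def smooth(D, width=5):
    """Sliding average of the spectral density Dk, giving LDk."""
    kernel = np.ones(width) / width
    return np.convolve(D, kernel, mode="same")

def extrema_vector(D, a=5, width=5):
    """Return (LDk, Bk, EXTRk) for one spectral density vector Dk."""
    LD = smooth(D, width)
    B = D - LD                     # spectral noise Bk = Dk - LDk
    sigma = B.std()                # standard deviation of the noise
    # Keep only the coordinates of Bk above a * sigma; zero the others.
    EXTR = np.where(B > a * sigma, B, 0.0)
    return LD, B, EXTR
```

With a flat spectrum containing a single sharp peak, only the coordinate of the peak remains non-zero in `EXTR`.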
According to an embodiment of the present invention, applied to a digitized continuous signal,
the learning includes the steps of:
5.1 selecting a succession of sequences of n digital values of the analyzed signal corresponding to a succession of time windows (Fk), each sequence of n values defining a first vector (Sk) of dimension n;
5.2 smoothing, for example by sliding averages, the vector (Sk) to extract the points representing the non-linear tendency of the analyzed signal, which defines a second vector (LSk);
5.3 calculating the scalar product of each second vector (LSk) with a GABOR wavelet base, to associate with each polling window (Fk) a third vector (RGSk) of GABOR coefficients of dimension smaller than that of the vector (LSk);
5.4 reducing number n of digital data by an automatic compression of the third vector (RGSk) to obtain a fourth vector with independent coordinates (IRGVk) of dimension p smaller than n;
the diagnosis includes the steps of:
6.1 applying steps 5.1 to 5.4 to a polling window (Fk) likely to include an anomaly.
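Steps 5.2 and 5.3 above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: it uses NumPy, real-valued Gabor wavelets (a Gaussian envelope multiplied by a cosine), and a small hypothetical base defined by a few centers, frequencies, and one scale. The function names and all numerical values are illustrative choices, not taken from the method described here.

```python
import numpy as np

def moving_average(S, width=5):
    """Step 5.2: smooth Sk by a sliding average to obtain LSk."""
    return np.convolve(S, np.ones(width) / width, mode="same")

def gabor_base(n, centers, freqs, scale):
    """Build unit-norm real Gabor wavelets of length n, one per row."""
    t = np.arange(n)
    rows = []
    for c in centers:
        envelope = np.exp(-0.5 * ((t - c) / scale) ** 2)
        for f in freqs:
            g = envelope * np.cos(2 * np.pi * f * (t - c))
            rows.append(g / np.linalg.norm(g))
    return np.array(rows)

def gabor_coefficients(S, base, width=5):
    """Step 5.3: scalar products of LSk with each wavelet, giving RGSk."""
    LS = moving_average(S, width)
    return base @ LS
```

The resulting vector has one coefficient per wavelet, so its dimension is smaller than n whenever the base contains fewer than n wavelets; it can then be passed to the automatic compression of step 5.4.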
According to an embodiment of the present invention, the method further includes:
during the learning, the step of calculating, for j varying from 1 to p, the histogram Hj of each coordinate of each fourth vector (IRGVk), calculating for each of these coordinates (IRGVkj) the probability Pj(a) for this coordinate to be greater than a (if a is greater than the median of histogram Hj) or smaller than a (if a is smaller than the median of histogram Hj), and determining a function Zj(a) = −log[Pj(a)],
during the diagnosis, the steps of:
6.2 calculating the sum over j, R = Σ Zj(IRGVkj), for this polling window (Fk); and
6.3 comparing said sum (R) with an intensity or rarity threshold predefined by the user.
According to an embodiment of the present invention, the automatic compression is of the principal component analysis type.
According to an embodiment of the present invention, the automatic compression is a compression by a diabolo neural network.
According to an embodiment of the present invention, the automatic compression is a compression by extraction of independent components.
According to an embodiment of the present invention, the method is applied to vibration sensor signals.