The present invention relates to a system and method of evaluating a correlation. More particularly, the invention relates to a system and method of evaluating a correlation between a plurality of time-series data.
Analyzing correlations among observed data obtained from the individual parts of a given observation object is effective for detecting a failure occurring inside that object. When an automobile is under observation, for example, it is hard to detect a failure by observing only the gear position. A failure occurring inside the automobile can, however, often be detected by also observing other data, e.g., the value of the engine speed when the gear is at a given position.
A technique that has been widely used to date analyzes a correlation between a plurality of variables taking continuous values on the basis of a covariance matrix. This technique permits analysis of a linear correlation, in which a large value of one variable tends to accompany a large value of another variable. Specifically, provided that the time-series data of observed data are N-dimensional vector variables x, an empirical distribution is defined as Equation (1) below. A covariance matrix is then defined as $\langle x x^{\mathsf{T}} \rangle$, where $\langle \cdot \rangle$ denotes the expectation value over the data. Hereinafter, the mean of the data is assumed to be normalized to zero in advance. Each element of a correlation coefficient matrix C is defined as Equation (2) below.
$$p_{\mathrm{emp}}(x) = \frac{1}{N} \sum_{t=1}^{T} \delta\bigl(x - x(t)\bigr) \tag{1}$$

$$C_{i,j} = \frac{\langle x_i x_j \rangle}{\sqrt{\langle x_i^2 \rangle \langle x_j^2 \rangle}} \tag{2}$$
Here and hereinafter, δ denotes a Dirac delta function when the observed data take continuous values, and a Kronecker delta function when the observed data take discrete values.
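As a minimal sketch of Equations (1) and (2), the following Python code centers a small time series, forms the sample average of the outer product, and normalizes it into a correlation coefficient matrix. The function names and the toy observations are assumptions introduced here for illustration only. The expectation is taken as a plain average over the observations; the normalization factor cancels in Equation (2) in any case.

```python
import math

def center(series):
    """Subtract the per-variable mean so that each variable has zero mean."""
    T = len(series)
    n = len(series[0])
    means = [sum(x[i] for x in series) / T for i in range(n)]
    return [[x[i] - means[i] for i in range(n)] for x in series]

def expectation_outer(series):
    """Sample average <x x^T> over the observations x(1), ..., x(T)."""
    T = len(series)
    n = len(series[0])
    return [[sum(x[i] * x[j] for x in series) / T for j in range(n)]
            for i in range(n)]

def correlation_matrix(series):
    """C[i][j] = <x_i x_j> / sqrt(<x_i^2> <x_j^2>) after centering."""
    cov = expectation_outer(center(series))
    n = len(cov)
    return [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(n)]
            for i in range(n)]

# Hypothetical toy observations of two nearly proportional variables.
data = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8]]
C = correlation_matrix(data)
```

For these toy values the off-diagonal element of C is close to 1, which is the linear correlation that the covariance-based technique is able to capture.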
Even if the corresponding element of a covariance matrix is zero for a given pair of variables, however, it does not necessarily follow that the variables have no correlation. For example, when an empirical distribution is expressed by Equation (3) below, p(xi|xj) is an even function of xi. The correlation coefficient is therefore zero, as expressed by Equation (4) below.
$$p(x_i \mid x_j) = \frac{1}{2}\left[\delta\!\left(x_i + \sqrt{r^2 - x_j^2}\right) + \delta\!\left(x_i - \sqrt{r^2 - x_j^2}\right)\right] \tag{3}$$

$$\langle x_i x_j \rangle = \int dx_i \, dx_j \, p(x_j) \, p(x_i \mid x_j) \, x_i x_j = 0 \tag{4}$$
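The zero in Equation (4) can be traced to the integral over xi alone. As a short worked step, assuming only that p(xj) is supported on |xj| ≤ r so that the square root in Equation (3) is real, the two delta functions contribute values of equal magnitude and opposite sign:

$$\int dx_i \, p(x_i \mid x_j) \, x_i = \frac{1}{2}\left[-\sqrt{r^2 - x_j^2} + \sqrt{r^2 - x_j^2}\right] = 0$$

$$\langle x_i x_j \rangle = \int dx_j \, p(x_j) \, x_j \int dx_i \, p(x_i \mid x_j) \, x_i = 0$$

The result holds for any such p(xj), since the inner integral vanishes for every value of xj.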
As can be seen from Equation (3), however, the values of the variables are distributed on the circumference of a circle of radius r. Thus, a very strong correlation may exist between the variables even if the corresponding element of the covariance matrix is zero.
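The circle example can be checked numerically. The sketch below generates pairs that follow Equation (3) and computes two correlation coefficients: one between xi and xj, and one between their squares. The evenly spaced grid used for xj, the helper names, and the choice of the squared variables as a probe of the nonlinear relationship are assumptions made for illustration; they are not part of the method described here.

```python
import math

def circle_samples(r=1.0, steps=200):
    """Pairs (x_i, x_j) following Equation (3): for each x_j in (-r, r),
    x_i takes +sqrt(r^2 - x_j^2) and -sqrt(r^2 - x_j^2) with equal weight."""
    samples = []
    for k in range(1, steps):
        x_j = -r + 2.0 * r * k / steps   # evenly spaced grid, an assumed p(x_j)
        s = math.sqrt(r * r - x_j * x_j)
        samples.append((+s, x_j))
        samples.append((-s, x_j))
    return samples

def correlation(a, b):
    """Correlation coefficient of two equally long sequences after centering."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    da = [v - ma for v in a]
    db = [v - mb for v in b]
    num = sum(p * q for p, q in zip(da, db))
    return num / math.sqrt(sum(p * p for p in da) * sum(q * q for q in db))

pts = circle_samples()
x_i = [p[0] for p in pts]
x_j = [p[1] for p in pts]

# Linear correlation between x_i and x_j: vanishes, as in Equation (4).
linear = correlation(x_i, x_j)

# Correlation between the squares: x_i^2 = r^2 - x_j^2, so it is close to -1.
squared = correlation([v * v for v in x_i], [v * v for v in x_j])
```

The linear coefficient is zero up to rounding, while the coefficient between the squared variables is close to -1 because the two squares sum to r². The covariance matrix of the raw variables therefore misses a relationship that is in fact deterministic.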
Furthermore, kernel methods have recently come into use in the field of machine learning as an approach for incorporating nonlinear correlations. A kernel method uses a kernel function K and substitutes $\langle K(x', x) \rangle$ for $\langle x x^{\mathsf{T}} \rangle$, which is the base quantity of covariance structure analysis. This permits kernel principal component analysis and the like. For example, a polynomial kernel is defined as $(x^{\mathsf{T}} y)^d$, where x and y denote vectors and d denotes a natural number. In this example, setting d = 2 yields a kernel function that defines the nonlinear mapping expressed by Equation (5).

$$\Phi(x) = \left[x_1^2,\; x_2^2,\; \ldots,\; x_n^2,\; \sqrt{2}\,x_1 x_2,\; \ldots,\; \sqrt{2}\,x_{n-1} x_n\right]^{\mathsf{T}} \tag{5}$$
In other words, the inner product $\Phi(x)^{\mathsf{T}} \Phi(y)$ matches the kernel function. The quantity expressed as $\langle \Phi(x) \Phi(y)^{\mathsf{T}} \rangle$ can therefore be regarded as an extension of the covariance matrix, and principal component analysis can be performed on this matrix. Consequently, the matrix can yield features that reflect nonlinear correlations. Incidentally, Equation (5) is merely illustrative; Φ(x) is not limited to this example and may be a map that cannot be expressed explicitly.
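The agreement between the inner product of the mapped vectors and the polynomial kernel for d = 2 can be verified with a short sketch. The function names and the two example vectors below are arbitrary choices made for illustration.

```python
import math
from itertools import combinations

def poly_kernel(x, y, d=2):
    """Polynomial kernel (x^T y)^d."""
    return sum(a * b for a, b in zip(x, y)) ** d

def phi(x):
    """Explicit feature map of Equation (5) for d = 2: all squares x_k^2,
    followed by sqrt(2) * x_k * x_l for every pair k < l."""
    squares = [v * v for v in x]
    cross = [math.sqrt(2.0) * x[k] * x[l]
             for k, l in combinations(range(len(x)), 2)]
    return squares + cross

def dot(u, v):
    """Ordinary inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))

x = [1.0, 2.0, 3.0]    # arbitrary example vectors
y = [0.5, -1.0, 4.0]

lhs = dot(phi(x), phi(y))   # Phi(x)^T Phi(y)
rhs = poly_kernel(x, y)     # (x^T y)^2
```

Both quantities evaluate to 110.25 up to rounding for these vectors. The factor √2 on the cross terms is what makes the expansion of the squared inner product match term by term.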
However, the kernel method incorporates nonlinearity by means of a black box called the kernel function, and the incorporation takes place inside that black box. In other words, the kernel method offers only a general way of incorporating nonlinearity, and it is incapable of detecting a particular nonlinearity for the purpose of detecting a correlation between variables.