In some cases, when multiple mixed signals are observed by multiple sensors, a technique that recovers the original signals using only the observed signals is more useful than conventional noise reduction methods, and extensions of its field of application can be expected. These fields include, for example, speech enhancement for reducing unwanted acoustics during speech recognition, demodulation of digital communications in complex signal environments such as QAM (Quadrature Amplitude Modulation), medical signal restoration for extracting necessary organ information, and data analysis methods whereby an independent component (factor) hidden in statistical data can be extracted.
FIG. 1 depicts a conceptual diagram of the signal separation problem, in which original signals are separated by assuming only the mutual statistical independence of the signals when multiple signals are observed in a mixed state. FIG. 1 is formulated as follows. First, assume that there are m scalar-valued signals s1(t), . . . , sm(t) for each index t, which are mutually statistically independent and have zero mean. FIG. 1 shows two signal sources s1 and s2 by way of example. From these, n linear weighted sums x1(t), . . . , xn(t) are observed by an observation apparatus, which is expressed as follows.

x(t) = A s(t)  [Equation 1]

where each element is represented as follows.

x(t) = [x1(t) x2(t) . . . xn(t)]^T
s(t) = [s1(t) s2(t) . . . sm(t)]^T  [Equation 2]

where it is assumed that n≧m. Furthermore, the n×m mixing matrix A is assumed to be of full rank, i.e., a matrix for which the inverse of the m×m matrix A^H A exists. Hereinafter, a lowercase letter with an underline represents a vector, an uppercase letter with an underline represents a matrix, a superscript T represents transposition, and a superscript H represents the Hermitian conjugate (i.e., conjugate transposition).
The problem of estimating a separation matrix W for obtaining a separation signal y(t) from this observed signal x(t) is the so-called signal separation problem. That is, when a signal separation apparatus obtains a separation signal y(t) = W^H x(t) from an observed signal x(t) observed by the observation apparatus shown in FIG. 1, the estimation of the separation matrix W becomes the problem.
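As a minimal numerical sketch of the mixing and separation model above (not part of the original disclosure): the mixing matrix, source distribution, and sizes below are all hypothetical choices, and since the mixing is known here, the ideal real-valued separation matrix is simply the inverse of A.

```python
import numpy as np

rng = np.random.default_rng(0)

m, n, T = 2, 2, 1000           # sources, sensors, samples
s = rng.laplace(size=(m, T))   # independent, zero-mean source signals
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])     # hypothetical n x m mixing matrix (full rank)
x = A @ s                      # observed mixtures: x(t) = A s(t)

# With a separation matrix W, the estimates are y(t) = W^H x(t).
# Because A is known in this sketch, the ideal W^H is just A^{-1}.
W_H = np.linalg.inv(A)
y = W_H @ x
print(np.allclose(y, s))       # → True
```

In the actual signal separation problem, of course, A is unknown and only x is available, which is what makes estimating W nontrivial.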
Next, a summary of the concept for estimating a separation matrix W will be described. Assuming that the multivariate probability density function of a signal vector u serving as the observed signal x(t) is pu(u) and the probability density function of each element of the vector is pi(ui), the mutual information of the observed vector is represented by the following Kullback-Leibler divergence.

I(u) = ∫ pu(u) log( pu(u) / Π(i=1 to n) pi(ui) ) du  [Equation 3]

where the mutual information is always nonnegative, and a value of zero shows that the elements of the signal vector are independent. In fact, if the signal vector elements are independent of each other, the density function of the signal vector is represented by the following equation, so that the above equation becomes zero.

pu(u) = Π(i=1 to n) pi(ui)  [Equation 4]
Therefore, one of the rationales of the signal separation technique is that the original signals can be restored from the mixed observed signals by finding a transformation matrix that minimizes the mutual information of the signal vectors obtained from the observed signal vectors.
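To make Equations 3 and 4 concrete, the Kullback-Leibler form can be checked numerically for a small discrete distribution; this is an illustrative sketch only (the distributions below are hypothetical), with the integral replaced by a sum:

```python
import numpy as np

# Discrete stand-in for Equation 3: I(u) = sum p(u) log( p(u) / prod_i p_i(u_i) )
def mutual_information(p_joint):
    p1 = p_joint.sum(axis=1)          # marginal of u1
    p2 = p_joint.sum(axis=0)          # marginal of u2
    prod = np.outer(p1, p2)           # product of the marginals
    mask = p_joint > 0
    return np.sum(p_joint[mask] * np.log(p_joint[mask] / prod[mask]))

# Independent pair: the joint equals the product of marginals (Equation 4),
# so the mutual information is zero.
p_indep = np.outer([0.3, 0.7], [0.4, 0.6])
# Dependent pair: u2 always copies u1, so the mutual information is positive.
p_dep = np.array([[0.3, 0.0],
                  [0.0, 0.7]])

print(mutual_information(p_indep))    # ~0 (up to rounding)
print(mutual_information(p_dep))      # > 0
```

The dependent case returns the entropy of the shared variable, since knowing u1 fully determines u2.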
However, as the probability distribution of the original signals is practically unknown, the mutual information cannot directly be made the object of a minimization operation. Therefore, signal separation is often performed by optimizing an evaluation quantity that is equal or approximately equal to the mutual information. For example, Reference 1 ("International Journal of Neural Systems", vol. 8, Nos. 5 & 6, pp. 661-678, October/December 1997) describes that the mutual information can be minimized by finding a transformation matrix W that optimizes the sum of the fourth-order cumulants with zero time delay of each separated signal (i.e., maximizing it if the kurtosis is positive, or minimizing it if the kurtosis is negative), on the conditions that the observed signals have kurtoses of the same sign, the covariance matrix is bounded, whitening has been performed, and the separation matrix W is a unitary matrix (i.e., W^H W = I (unit matrix)). Note that the kurtosis refers to the value obtained by the following calculation for a signal ui.

E{ui^4} − 3[E{ui^2}]^2  [Equation 5]

where E{·} represents the expectation operation. Whitening means making the signal vector elements mutually uncorrelated and setting their variances to 1. The fourth-order cumulant is a statistic represented by the following equation.
c4(k1, k2, k3) = E{ui(t) ui(t+k1) ui(t+k2) ui(t+k3)}
  − E{ui(t) ui(t+k1)} E{ui(t+k2) ui(t+k3)}
  − E{ui(t) ui(t+k2)} E{ui(t+k1) ui(t+k3)}
  − E{ui(t) ui(t+k3)} E{ui(t+k1) ui(t+k2)}  [Equation 6]
The zero time delay means that k1, k2 and k3 are zero in the above equation.
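The relation between Equations 5 and 6 can be seen numerically: with zero time delay, the cross terms of Equation 6 all collapse to [E{ui^2}]^2, so the cumulant reduces exactly to the kurtosis of Equation 5. A sketch with a hypothetical zero-mean test signal:

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.laplace(size=100_000)          # hypothetical zero-mean signal

# Kurtosis as in Equation 5: E{u^4} - 3 [E{u^2}]^2
kurtosis = np.mean(u**4) - 3 * np.mean(u**2)**2

# Sample fourth-order cumulant c4(k1, k2, k3) as in Equation 6.
def c4(u, k1, k2, k3):
    T = len(u) - max(k1, k2, k3)
    u0, u1, u2, u3 = u[:T], u[k1:k1+T], u[k2:k2+T], u[k3:k3+T]
    return (np.mean(u0*u1*u2*u3)
            - np.mean(u0*u1) * np.mean(u2*u3)
            - np.mean(u0*u2) * np.mean(u1*u3)
            - np.mean(u0*u3) * np.mean(u1*u2))

# With zero time delay (k1 = k2 = k3 = 0) the cumulant equals the kurtosis.
print(np.isclose(c4(u, 0, 0, 0), kurtosis))   # → True
```

A Laplacian signal is super-Gaussian, so its kurtosis is positive, matching the "same sign" condition discussed above.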
However, since the computational load of calculating higher-order statistics such as cumulants is generally heavy, techniques are employed such as calculating an approximation of another information quantity equivalent to the mutual information, or minimizing a cost function equivalent to optimizing the sum of cumulants by introducing a nonlinear function that approximates the fourth-order cumulants. U.S. Pat. No. 5,706,402 discloses a method for finding a separation matrix by the gradient method using an unsupervised learning algorithm that optimizes the output entropy instead of minimizing the mutual information.
Though Reference 2 (Signal Processing, vol. 24, No. 1, pp. 1-10, July 1991) does not explicitly use mutual information or cumulants, it discloses a method based on a similar approach, wherein the square of the residual obtained by subtracting a linear sum of the estimated signals from the observed signal is used as a cost function, and a separation filter that minimizes this cost function is found by the gradient method. Moreover, Japanese Unexamined Patent Publication No. 2000-97758 discloses a method for improving the convergence by normalizing the update amounts of the above method.
Reference 3 (IEEE Transactions on Signal Processing, vol. 44, No. 12, pp. 3017-3030, December 1996) proposes an estimation method wherein a nonlinear function that approximates the fourth-order cumulants is introduced, and the update amounts that optimize the cost function in an adaptive algorithm based on that nonlinear function are determined from the relative gradient. This technique, which is equivalent to the natural gradient that may be derived from information-geometric considerations, improves the convergence speed over the conventional adaptive algorithm, which uses the plain gradient of the cost function as the update amount.
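The general shape of such a relative (natural) gradient update can be sketched as follows. This is a minimal batch illustration, not the exact algorithm of Reference 3: the mixing matrix, step size, iteration count, and the choice of tanh as a cumulant-approximating nonlinearity for super-Gaussian sources are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
m, T = 2, 5000
s = rng.laplace(size=(m, T))          # super-Gaussian independent sources
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])            # hypothetical mixing matrix
x = A @ s                             # observed mixtures

W = np.eye(m)                         # separation matrix estimate
mu = 0.05                             # step size (user-chosen in gradient methods)

for _ in range(500):
    y = W @ x
    # Relative (natural) gradient update: dW = mu * (I - E{g(y) y^T}) W,
    # with g = tanh standing in for the cumulant-approximating nonlinearity.
    C = (np.tanh(y) @ y.T) / T
    W += mu * (np.eye(m) - C) @ W

# At the fixed point, E{tanh(y) y^T} approaches the identity matrix,
# and y approaches a scaled permutation of the original sources.
```

Multiplying the gradient term by W on the right is what distinguishes the relative/natural gradient from the plain gradient, and is the source of the improved convergence speed noted above.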
Stability in the convergence process of the separation matrix is important when restoring signals that are not in a steady state. In fact, in the series of gradient methods described above, the convergence speed and the stability are often in an inverse relation. Thus, U.S. Pat. No. 5,999,956 achieves stable convergence by adding, in addition to a signal estimation module and a separation coefficient estimation module, a module that reduces the effect on the estimation process even when there is a large change of power between estimated signals, and that outputs stable results.
Furthermore, Reference 4 (International Journal of Neural Systems, vol. 8, Nos. 5 & 6, pp. 601-612, October/December 1997) derives an adaptive algorithm based on the least squares method instead of the gradient method when optimizing a cost function into which a nonlinear function has been introduced. With this approach, since the step size is not chosen by the user as in the gradient method but the optimal value is determined automatically, the convergence speed is enhanced and stability is achieved under given conditions.
As in the technique of Reference 4 above, within the framework of the least squares method, fast and appropriate convergence is often achieved, since the step size is calculated to be optimal under the cost function. However, the format of the cost function employed by the above prior art techniques, including the gradient methods, does not necessarily conform to the situations where signal separation is required, so that there are cases where even the framework of the least squares method does not seem to be the best choice.
For example, in a portable information device, it is assumed that the signal observation apparatuses are close to each other because a large area cannot be secured for their installation. In this case, it can easily be assumed that the original signals are mixed at similar ratios at each observation apparatus. When these mixing ratios are represented as matrix elements, the elements in each column (or each row) have substantially the same values.
In such a case, since the condition number of the mixing matrix becomes large, perturbations in the estimation process of the separation matrix have large effects on the estimates. Note that the condition number refers to the quantity defined by ∥Z∥·∥Z^−1∥ using some norm ∥·∥ for a matrix Z, where Z^−1 represents the inverse matrix of Z.
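The effect of closely spaced sensors on the condition number can be illustrated with two hypothetical 2×2 mixing matrices, using the spectral norm for ∥·∥:

```python
import numpy as np

# Well-separated sensors: mixing ratios differ strongly.
A_good = np.array([[1.0, 0.2],
                   [0.2, 1.0]])
# Closely spaced sensors: the rows (mixing ratios) are nearly identical.
A_bad = np.array([[1.0, 0.95],
                  [0.98, 1.0]])

def cond(Z):
    # Condition number ||Z|| * ||Z^-1|| in the spectral (2-)norm.
    return np.linalg.norm(Z, 2) * np.linalg.norm(np.linalg.inv(Z), 2)

print(cond(A_good))   # small: estimates tolerate perturbation
print(cond(A_bad))    # large: small perturbations are strongly amplified
```

For the nearly identical rows above, the determinant is close to zero, so the inverse has large entries and the condition number grows accordingly.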
Therefore, with the conventional format of the cost function, much time is spent obtaining correct estimates when the perturbation is large, which is likely to be a problem. A further problem is that, when the condition number is not large, the convergence speed in the stage where errors still remain in the estimation process becomes slower than with the conventional cost function.