A blind source separation technique separates an audio source signal of interest from a mixed signal picked up by two or more microphones.
Now, the related art of the blind source separation technique will be described.
[Independent Component Analysis Based on Non-Gaussianity]
First, an independent component analysis method based on non-gaussianity is described.
In general, the independent component analysis can be explained by using the following model.
$$x = As$$  [Mathematical Formula 1]
$$y = \hat{s} = Wx = WAs \cong A^{-1}As$$  [Mathematical Formula 2]
In Mathematical Formulas 1 and 2, y is the output vector of the independent component analysis, x is the input vector observed at the microphones, and s is the vector of audio sources to be found.
The task of the independent component analysis is to find a pseudo-inverse matrix W of the mixing matrix A when A is unknown. Here, A represents how the audio source signals are mixed on their way to the microphones.
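The model of Mathematical Formulas 1 and 2 can be illustrated with a minimal NumPy sketch. The mixing matrix `A`, the Laplacian test sources, and the helper names `mix` and `unmix` are assumptions chosen for illustration only. In the blind setting A is not available, so W cannot be computed this way.

```python
import numpy as np

def mix(A, s):
    """Mathematical Formula 1: x = A s, with s of shape (n_sources, n_samples)."""
    return A @ s

def unmix(W, x):
    """Mathematical Formula 2: y = W x."""
    return W @ x

rng = np.random.default_rng(0)
s = rng.laplace(size=(2, 1000))           # two hypothetical audio-like sources
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])                # hypothetical mixing matrix
x = mix(A, s)                             # microphone signals

# Ideal case only: when A is known, W is its (pseudo-)inverse and y equals s.
W = np.linalg.pinv(A)
y = unmix(W, x)
```

The remainder of the related art concerns how to estimate W from x alone.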
The above-described problem of the independent component analysis can be solved by measuring non-gaussianity based on the central limit theorem, according to Aapo Hyvarinen, “Fast and robust fixed-point algorithms for independent component analysis”, IEEE Trans. on Neural Networks, vol. 10, no. 3, 1999. Namely, when independent audio source signals are mixed into a noise signal, the mixed noise signal is a superposition of plural independent signals and is therefore closer to a gaussian distribution than any single independent audio source signal of interest. Therefore, individual independent components can be separated by maximizing the non-gaussianity of the output signal.
FIG. 1 illustrates a histogram of an audio source signal spoken by one person and a histogram of babble noise, that is, the voices of many people speaking at once. The mixed noise signal is closer to a gaussian distribution than the independent audio source signal of interest.
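The central-limit effect described above can be checked numerically. The sketch below is not the data of FIG. 1. It assumes Laplacian samples as a stand-in for a single speaker and sums 20 such independent signals as a stand-in for babble noise, then compares the excess kurtosis, which is 0 for a gaussian distribution.

```python
import numpy as np

def excess_kurtosis(u):
    """Excess kurtosis of a 1-D sample; 0 for a gaussian distribution."""
    u = (u - u.mean()) / u.std()
    return float(np.mean(u ** 4) - 3.0)

rng = np.random.default_rng(0)
n = 100_000
single = rng.laplace(size=n)                   # stand-in for one speaker
babble = rng.laplace(size=(20, n)).sum(axis=0) # stand-in for 20 mixed speakers

k_single = excess_kurtosis(single)
k_babble = excess_kurtosis(babble)
```

Under these assumptions, `k_babble` is much closer to 0 than `k_single`, which is the property that non-gaussianity maximization exploits.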
[Independent Component Analysis Method Based on Negentropy Maximization]
In order to measure the non-gaussianity of the output signal $y_i$, the definition of the negentropy expressed by Mathematical Formula 3 is used.
$$J(y_i) = H(y_i^{gauss}) - H(y_i)$$  [Mathematical Formula 3]
Herein, $y_i^{gauss}$ is a gaussian random variable having the same variance as $y_i$. The entropy H of a random variable $y_i$ whose probability density function is $p_{y_i}$ is expressed by Mathematical Formula 4.
$$H(y_i) = -\int p_{y_i}(u) \log p_{y_i}(u)\, du$$  [Mathematical Formula 4]
Since, among random variables having the same variance, a gaussian random variable has the highest entropy, the non-gaussianity of an estimated output signal $y_i$ can be maximized by maximizing the negentropy, and, according to the central limit theorem, the estimated output signal then approximates the original audio source signal. Since direct calculation of the negentropy is very complicated, the negentropy is approximated, for a random variable having a symmetric distribution, as expressed by Mathematical Formula 5.
$$J(y_i) \propto \left[E\{G(y_i)\} - E\{G(y_i^{gauss})\}\right]^2$$  [Mathematical Formula 5]
This approximation method is a generalization of the higher-order cumulant approximation method and utilizes the expectation of a nonquadratic nonlinear function G of the output signal $y_i$. When the approximation is based on kurtosis, the nonlinear function is $G(y) = y^4$. Besides the kurtosis-based nonlinear function, the following nonlinear functions are also effective.
$$G_1(y) = \frac{1}{a_1} \log \cosh(a_1 y)$$  [Mathematical Formula 6]
$$G_2(y) = -\exp(-y^2/2)$$  [Mathematical Formula 7]
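The approximation of Mathematical Formula 5 with the nonlinear functions of Mathematical Formulas 6 and 7 can be sketched as follows. The function name `negentropy_proxy`, the choice $a_1 = 1$, and the Monte Carlo estimate of the gaussian reference term are assumptions made for illustration. The result is only proportional to the negentropy, as the formula states.

```python
import numpy as np

def G1(y, a1=1.0):
    """Mathematical Formula 6: (1 / a1) * log cosh(a1 * y)."""
    return np.log(np.cosh(a1 * y)) / a1

def G2(y):
    """Mathematical Formula 7: -exp(-y^2 / 2)."""
    return -np.exp(-y ** 2 / 2.0)

def negentropy_proxy(y, G, rng, n_ref=200_000):
    """[E{G(y)} - E{G(v)}]^2 with v ~ N(0, 1), estimated from samples."""
    y = (y - y.mean()) / y.std()          # zero mean, unit variance
    v = rng.standard_normal(n_ref)        # gaussian reference samples
    return float((np.mean(G(y)) - np.mean(G(v))) ** 2)

rng = np.random.default_rng(0)
speech_like = rng.laplace(size=100_000)        # super-gaussian stand-in
gauss_like = rng.standard_normal(100_000)      # gaussian stand-in

j_speech = negentropy_proxy(speech_like, G1, rng)
j_gauss = negentropy_proxy(gauss_like, G1, rng)
```

Under these assumptions, the proxy is clearly positive for the super-gaussian sample and close to zero for the gaussian sample.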
In Mathematical Formulas 6 and 7, it is assumed that the output signal y has an average of 0 and a variance of 1. If the input x is whitened, its average becomes 0 and its correlation matrix becomes an identity matrix. Therefore, an output y having an average of 0 and a variance of 1 can be estimated by transforming the whitened input with a unitary matrix, and a simple training formula for the transform matrix can be derived by using the characteristics of the unitary matrix. When the whitening transformation matrix is denoted by V, the whitened output z with respect to the input signal x is expressed by Mathematical Formula 8.
$$z = Vx = D^{-1/2}E^{T}x$$  [Mathematical Formula 8]
In the above Mathematical Formula 8, $D = \mathrm{diag}(d_1, \ldots, d_n)$ denotes the diagonal matrix of eigenvalues of the input covariance matrix, and E denotes the matrix of eigenvectors of the input covariance matrix.
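A minimal sketch of the whitening of Mathematical Formula 8 is given below. It assumes real-valued signals of shape (channels, samples) and a full-rank sample covariance matrix. The function name `whiten` and the test mixture are hypothetical.

```python
import numpy as np

def whiten(x):
    """Return (z, V) with z = V x and V = D^{-1/2} E^T (Mathematical Formula 8)."""
    x = x - x.mean(axis=1, keepdims=True)   # remove the average
    cov = x @ x.T / x.shape[1]              # input covariance matrix
    d, E = np.linalg.eigh(cov)              # eigenvalues d, eigenvectors E
    V = np.diag(d ** -0.5) @ E.T            # assumes all eigenvalues > 0
    return V @ x, V

rng = np.random.default_rng(0)
s = rng.laplace(size=(2, 5000))
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])                  # hypothetical mixing matrix
z, V = whiten(A @ s)
cov_z = z @ z.T / z.shape[1]                # should be the identity matrix
```

After whitening, only a rotation remains to be estimated, which is why the separation vector w can be constrained to unit norm.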
The negentropy is expressed by the following Mathematical Formula 9 by using the random variable z, which is whitened according to Mathematical Formula 8.
$$J_G(w) = \left[E\{G(w^{T}z)\} - E\{G(v)\}\right]^2$$  [Mathematical Formula 9]
In the above Mathematical Formula 9, w denotes a vector having a norm of 1, and v denotes a gaussian random variable having an average of 0 and a variance of 1. Since $E\{G(w^{T}z)\}$ is always smaller than $E\{G(v)\}$, the maximization of the negentropy is equivalent to the minimization of $E\{G(w^{T}z)\}$. In order to maximize the negentropy, an algorithm of the steepest ascent method with respect to w can be derived. The resulting training rule is given by Mathematical Formula 10 and Mathematical Formula 11.
$$\Delta w \propto \gamma E\{z\, g(w^{T}z)\}$$  [Mathematical Formula 10]
$$w \leftarrow w/\|w\|$$  [Mathematical Formula 11]
Herein, $\gamma = E\{G(w^{T}z)\} - E\{G(v)\}$, and g denotes the derivative of G.
In the differentiation process, the term $E\{G(v)\}$ disappears because, for a gaussian random variable v having an average of 0 and a variance of 1, its value is constant with respect to w having a norm of 1. Since the sign of γ affects the stability of the training process, the algorithm can be further simplified by fixing the sign. In particular, the sign of γ can be set according to prior information on the independent components. For example, since an audio signal has a super-gaussian distribution, when $g(u) = \tanh(u)$, an audio source signal can be found by fixing γ to −1. Therefore, instead of maximizing the negentropy, a specific audio source signal can be recovered by minimizing $E\{G(w^{T}z)\}$.
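The training rule of Mathematical Formulas 10 and 11 with γ fixed to −1 and g = tanh can be sketched as follows. This is an illustrative sketch, not the cited reference implementation. The step size, iteration count, random initialization, Laplacian test sources, and mixing matrix are all assumptions.

```python
import numpy as np

def whiten(x):
    """Whitening as in Mathematical Formula 8 (real-valued, full rank assumed)."""
    x = x - x.mean(axis=1, keepdims=True)
    d, E = np.linalg.eigh(x @ x.T / x.shape[1])
    return np.diag(d ** -0.5) @ E.T @ x

def one_unit_gradient_ica(z, n_iter=1000, step=0.5, seed=0):
    """Minimize E{G(w^T z)} for G = log cosh, i.e. g = tanh, with gamma = -1."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(z.shape[0])
    w /= np.linalg.norm(w)
    gamma = -1.0                                   # fixed sign for super-gaussian sources
    for _ in range(n_iter):
        y = w @ z                                  # current output w^T z
        grad = (z * np.tanh(y)).mean(axis=1)       # E{z g(w^T z)}
        w = w + step * gamma * grad                # Mathematical Formula 10
        w /= np.linalg.norm(w)                     # Mathematical Formula 11
    return w

rng = np.random.default_rng(1)
s = rng.laplace(size=(2, 20_000))                  # hypothetical super-gaussian sources
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])                         # hypothetical mixing matrix
z = whiten(A @ s)
w = one_unit_gradient_ica(z)
y = w @ z
corr = [abs(np.corrcoef(y, s_i)[0, 1]) for s_i in s]
```

Under these assumptions, the output `y` correlates strongly with one of the two sources. Which source is recovered depends on the initialization, and the sign and scale of the output remain ambiguous.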
Since the methods in the related art require analysis in the frequency domain, the separation vector w, the input signal, and the output signal are complex-valued. If a cost function is complex-valued, its values cannot be ordered by magnitude, and thus the cost function cannot be minimized. Therefore, the cost function with respect to w is expressed as a function of the squared absolute value of $w^{H}z$ as follows.
$$J_G'(w) = E\{G(|w^{H}z|^2)\}$$  [Mathematical Formula 12]
Herein, the following functions, which differ from those used in the real domain, are used for G.
$$G_1(y) = \sqrt{a_1 + y}$$  [Mathematical Formula 13]
$$G_2(y) = \log(a_2 + y)$$  [Mathematical Formula 14]
$$G_3(y) = \frac{1}{2}y^2$$  [Mathematical Formula 15]
The algorithm of the steepest ascent method with respect to w can be derived by differentiating the cost function, where g denotes the derivative of G and * denotes complex conjugation. The resulting training rule is given by Mathematical Formula 16 and Mathematical Formula 17.
$$\Delta w \propto -E\{z\,(w^{H}z)^{*}\,g(|w^{H}z|^2)\}$$  [Mathematical Formula 16]
$$w \leftarrow w/\|w\|$$  [Mathematical Formula 17]
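A minimal sketch of the complex-valued training rule follows. It assumes the nonlinear function $G(u) = \sqrt{a_1 + u}$, one of the contrast functions in the cited Bingham and Hyvarinen paper, so that $g(u) = 1/(2\sqrt{a_1 + u})$. The value $a_1 = 0.1$, the step size, and the synthetic complex test data are illustrative assumptions.

```python
import numpy as np

def g1(u, a1=0.1):
    """Derivative of the assumed G(u) = sqrt(a1 + u)."""
    return 0.5 / np.sqrt(a1 + u)

def cost(w, z, a1=0.1):
    """Cost E{G(|w^H z|^2)} of Mathematical Formula 12."""
    y = np.conj(w) @ z
    return float(np.mean(np.sqrt(a1 + np.abs(y) ** 2)))

def complex_step(w, z, step=0.1, a1=0.1):
    """One update following Mathematical Formulas 16 and 17."""
    y = np.conj(w) @ z                                   # w^H z for every frame
    grad = (z * np.conj(y) * g1(np.abs(y) ** 2, a1)).mean(axis=1)
    w = w - step * grad                                  # Mathematical Formula 16
    return w / np.linalg.norm(w)                         # Mathematical Formula 17

def whiten_complex(x):
    """Complex whitening: z = D^{-1/2} E^H x."""
    x = x - x.mean(axis=1, keepdims=True)
    d, E = np.linalg.eigh(x @ x.conj().T / x.shape[1])
    return np.diag(d ** -0.5) @ E.conj().T @ x

rng = np.random.default_rng(0)
n = 10_000
s = rng.laplace(size=(2, n)) + 1j * rng.laplace(size=(2, n))   # hypothetical sources
A = np.array([[1.0 + 0.2j, 0.5 - 0.3j],
              [0.3 + 0.4j, 1.0 - 0.1j]])                       # hypothetical mixing
z = whiten_complex(A @ s)

w = rng.standard_normal(2) + 1j * rng.standard_normal(2)
w /= np.linalg.norm(w)
j_start = cost(w, z)
for _ in range(300):
    w = complex_step(w, z)
j_end = cost(w, z)
```

The real-valued cost makes the comparison `j_end <= j_start` meaningful, which is the reason the cost is defined on $|w^{H}z|^2$ rather than on the complex output itself.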
[Independent Component Analysis Method Based on Negentropy Maximization Using Independent Vector Analysis]
The independent vector analysis algorithm is a conceptual extension of the independent component analysis to multivariate components in the frequency domain. The independent vector analysis can be regarded as an independent component analysis problem in which all of the underlying independent components and observed signals are multivariate signals, that is, vector signals.
In the independent vector analysis model, it is assumed that the signal source vectors are statistically independent of each other, whereas the components within each signal source vector are not independent of each other but statistically correlated with each other.
When these assumptions are applied to an algorithm in the frequency domain, each signal source vector corresponds to a vector along the frequency axis, and the components of the vector, that is, the frequency components, are correlated with each other.
FIG. 2 is a schematic diagram comparing the frequency-domain independent component analysis and the independent vector analysis for a two-channel input/output frequency signal.
In the independent vector analysis, unlike the above-described independent component analysis, the nonlinear function G included in the cost function receives a multivariate vector along the frequency axis as its argument, and thus the cost function with respect to the frequency-dependent separation vector w(k) is expressed by Mathematical Formula 18.
$$J_G''(w(k)) = E\left\{ G\left( \sum_k \left| (w(k))^{H} z(k,\tau) \right|^2 \right) \right\}$$  [Mathematical Formula 18]
In the above Mathematical Formula 18, k and τ denote a frequency index and a time frame index, respectively. Mathematical Formula 18 shows that, in the independent vector analysis, the argument of the nonlinear function G becomes a multivariate vector. By differentiating the cost function, a steepest ascent method algorithm with respect to w(k) can be derived. The resulting training rules are expressed by Mathematical Formulas 19 and 20.
$$\Delta w(k) \propto -E\left\{ z(k,\tau) \left[ (w(k))^{H} z(k,\tau) \right]^{*} g\left( \sum_k \left| (w(k))^{H} z(k,\tau) \right|^2 \right) \right\}$$  [Mathematical Formula 19]
$$w(k) \leftarrow w(k)/\|w(k)\|$$  [Mathematical Formula 20]
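The independent vector analysis training rule of Mathematical Formulas 19 and 20 can be sketched as below. The array layout (frequency bins K, channels M, frames T), the assumed nonlinear function $G(u) = \sqrt{a + u}$ with $g(u) = 1/(2\sqrt{a + u})$, the step size, and the synthetic sources with a shared envelope across frequency bins are all illustrative assumptions.

```python
import numpy as np

def iva_cost(w, z, a=0.1):
    """Cost of Mathematical Formula 18 for the assumed G(u) = sqrt(a + u)."""
    y = np.einsum('km,kmt->kt', np.conj(w), z)         # (w(k))^H z(k, tau)
    return float(np.mean(np.sqrt(a + np.sum(np.abs(y) ** 2, axis=0))))

def iva_step(w, z, step=0.1, a=0.1):
    """One update of every w(k); w has shape (K, M), z has shape (K, M, T)."""
    y = np.einsum('km,kmt->kt', np.conj(w), z)         # outputs, shape (K, T)
    r = np.sum(np.abs(y) ** 2, axis=0)                 # sum over frequency k, shape (T,)
    g = 0.5 / np.sqrt(a + r)                           # g(.) shared by all bins
    grad = np.mean(z * (np.conj(y) * g)[:, None, :], axis=2)
    w = w - step * grad                                # Mathematical Formula 19
    return w / np.linalg.norm(w, axis=1, keepdims=True)  # Mathematical Formula 20

def whiten_bin(x):
    """Complex whitening of one frequency bin, x of shape (M, T)."""
    x = x - x.mean(axis=1, keepdims=True)
    d, E = np.linalg.eigh(x @ x.conj().T / x.shape[1])
    return np.diag(d ** -0.5) @ E.conj().T @ x

rng = np.random.default_rng(0)
K, M, T = 4, 2, 2000
envelope = rng.exponential(size=(1, M, T))             # shared across bins per source
s = envelope * (rng.standard_normal((K, M, T)) + 1j * rng.standard_normal((K, M, T)))
A = rng.standard_normal((K, M, M)) + 1j * rng.standard_normal((K, M, M))
x = np.einsum('kij,kjt->kit', A, s)                    # per-bin mixing
z = np.stack([whiten_bin(x[k]) for k in range(K)])

w = np.ones((K, M), dtype=complex) / np.sqrt(M)
j_start = iva_cost(w, z)
for _ in range(100):
    w = iva_step(w, z)
j_end = iva_cost(w, z)
```

The only difference from the per-bin complex rule is that the argument of g is summed over all frequency bins, which couples the updates of the separation vectors w(k).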
[Separation Algorithm for an Audio Source of Interest]
Although the audio source of interest can be effectively estimated by maximizing the negentropy of the output signal through the above-described separation algorithms of the related art, in theory, as many microphone inputs as there are mixed audio sources are needed in order to perform the estimation.
However, in a real environment, it is impractical to prepare as many microphones as there are mixed audio sources, and even if the microphones are prepared, there is a problem in that the number of parameters to be estimated becomes large.
The related art of the above-described audio source separation includes Aapo Hyvarinen, “Fast and robust fixed-point algorithms for independent component analysis”, IEEE Trans. on Neural Networks, vol. 10, no. 3, 1999; E. Bingham and A. Hyvarinen, “A fast fixed-point algorithm for independent component analysis of complex valued signals”, International Journal of Neural Systems, vol. 10, no. 1, 2000; and I. Lee, T. Kim, and T. Lee, “Fast fixed-point independent vector analysis algorithms for convolutive blind source separation”, Signal Processing, vol. 87, no. 8, 2007.