1. Field of the Invention
The present invention relates to a sound source separation system.
2. Description of the Related Art
Blind source separation (BSS) has been proposed as a method of separating sound sources with an inverse-filter-based approach that requires no information about the transfer system (see Reference Documents 1 to 4). Known BSS methods include decorrelation-based source separation (DSS), independent component analysis (ICA), and higher-order DSS (HDSS), as well as the variants obtained by adding geometric information to these methods: geometric constrained source separation (GSS), geometric constrained ICA (GICA), and geometric constrained HDSS (GHDSS). An overview of the BSS follows.
Let the frequency characteristics of M sound source signals be $s(\omega)=[s_1(\omega), s_2(\omega), \ldots, s_M(\omega)]^T$ (T denotes transposition). The characteristics of the input signals of N (≦M) microphones, $x(\omega)=[x_1(\omega), x_2(\omega), \ldots, x_N(\omega)]^T$, are then expressed by Equation (1) using a transfer function matrix $H(\omega)$. The element $H_{ij}$ of the transfer function matrix $H(\omega)$ represents the transfer function from the sound source i to the microphone j.
$$x(\omega) = H(\omega)\,s(\omega) \tag{1}$$
The sound source separation problem is expressed by Equation (2) using a separation matrix $W(\omega)$.
$$y(\omega) = W(\omega)\,x(\omega) \tag{2}$$
The sound source separation process is formalized as obtaining the separation matrix $W(\omega)$ for which $y(\omega)=s(\omega)$. If the transfer function matrix $H(\omega)$ is known, the separation matrix $W(\omega)$ is computed using the pseudo-inverse matrix $H^{+}(\omega)$. In practice, however, $H(\omega)$ is rarely known. The BSS obtains $W(\omega)$ without knowledge of $H(\omega)$.
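The known-transfer-function case of Equations (1) and (2) can be sketched numerically as follows. This is a minimal illustration at a single frequency bin, assuming NumPy, randomly generated complex values, and hypothetical sizes M = N = 3; none of these choices come from the text above.

```python
import numpy as np

rng = np.random.default_rng(0)
M = N = 3  # hypothetical number of sources and microphones (N <= M)

# Hypothetical complex transfer function matrix and source vector
# at one frequency bin.
H = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))
s = rng.standard_normal(M) + 1j * rng.standard_normal(M)

x = H @ s                # Equation (1): microphone inputs
W = np.linalg.pinv(H)    # separation matrix from the pseudo-inverse of H
y = W @ x                # Equation (2): separated outputs

# With H known and invertible, y reproduces s up to rounding error.
separation_error = np.linalg.norm(y - s)
```

The BSS methods described next address the case in which `H` is not available and `W` has to be estimated from `x` alone.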
1. BSS (Offline Process)
The general BSS method is described by Equation (3) as a process of obtaining the separation matrix W that minimizes a cost function J(y) for evaluating the degree of separation.
$$W_{BSS} = \arg\min_{W}\,[J(y)] = \arg\min_{W}\,[J(Wx)] \tag{3}$$
The cost function J(y) depends on the method. According to the DSS, it is calculated by Equation (4) using the Frobenius norm (whose square is the sum of the squared absolute values of all elements of the matrix) on the basis of the correlation matrix $R_{yy}=E[yy^{H}]$ of y.
$$J_{DSS}(W) = \left\| R_{yy} - \mathrm{Diag}[R_{yy}] \right\|^{2} \tag{4}$$
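As an illustration of Equation (4), the following sketch evaluates the DSS cost from a block of separated outputs. It assumes NumPy and replaces the expectation E[yy^H] with a sample average over T frames; the function name `dss_cost` and the array layout are hypothetical.

```python
import numpy as np

def dss_cost(Y):
    """DSS cost of Equation (4) for one frequency bin.

    Y is an (N, T) complex array: N separated outputs over T frames.
    The correlation matrix E[y y^H] is approximated by a sample average.
    """
    T = Y.shape[1]
    Ryy = Y @ Y.conj().T / T                 # sample estimate of Ryy
    off_diagonal = Ryy - np.diag(np.diag(Ryy))
    return np.linalg.norm(off_diagonal, "fro") ** 2

# Outputs that never overlap in time are uncorrelated: cost 0.
cost_uncorrelated = dss_cost(np.array([[1.0, 0.0], [0.0, 1.0]]))
# Identical outputs are fully correlated: cost 2.
cost_correlated = dss_cost(np.array([[1.0, 1.0], [1.0, 1.0]]))
```

The cost is zero exactly when the estimated correlation matrix is diagonal, which is the decorrelation condition that the DSS aims for.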
According to the ICA using the Kullback-Leibler (K-L) divergence, the cost function J is calculated by Equation (5) on the basis of the joint probability density function (PDF) $p(y)$ of y and the product of the marginal PDFs $q(y)=\prod_{k} p(y_k)$ of y (see Reference Document 5).
$$J_{ICA}(W) = \int p(y)\,\log\frac{p(y)}{q(y)}\,dy \tag{5}$$
The matrix W satisfying Equation (3) is determined by iterative computation according to the gradient method expressed by Equation (6), on the basis of a matrix $J'(W_k)$ representing the direction in which J(W) changes most steeply in the neighborhood of $W_k$ (k is the iteration count) and a step-size parameter μ.
$$W_{k+1} = W_k - \mu\,J'(W_k) \tag{6}$$
The matrix $J'(W_k)$ is calculated by a complex gradient calculation method (see Reference Document 6). According to the DSS, the matrix $J'(W)$ is expressed by Equation (7).
$$J'^{\,off}_{DSS}(W) = 2\left[ R_{yy} - \mathrm{Diag}[R_{yy}] \right] W R_{xx} \tag{7}$$
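The offline iteration of Equation (6) with the DSS gradient of Equation (7) can be sketched as below. This is a minimal example under stated assumptions: NumPy, a hypothetical 2x2 input correlation matrix `Rxx`, an identity initial value for W, and illustrative values for the step size and iteration count. The helper names are not from the text.

```python
import numpy as np

def offdiag(A):
    """Return A with its diagonal set to zero, i.e. A - Diag[A]."""
    return A - np.diag(np.diag(A))

def dss_cost(W, Rxx):
    """Equation (4), using Ryy = W Rxx W^H."""
    Ryy = W @ Rxx @ W.conj().T
    return np.linalg.norm(offdiag(Ryy), "fro") ** 2

def dss_gradient(W, Rxx):
    """Equation (7): J'(W) = 2 [Ryy - Diag[Ryy]] W Rxx."""
    Ryy = W @ Rxx @ W.conj().T
    return 2.0 * offdiag(Ryy) @ W @ Rxx

def offline_dss(Rxx, mu=0.1, n_iter=100):
    """Iterate Equation (6) from an identity initial matrix (an assumption)."""
    W = np.eye(Rxx.shape[0], dtype=complex)
    for _ in range(n_iter):
        W = W - mu * dss_gradient(W, Rxx)
    return W

# Hypothetical correlation matrix of two correlated microphone inputs.
Rxx = np.array([[1.0, 0.5], [0.5, 1.0]], dtype=complex)
cost_before = dss_cost(np.eye(2, dtype=complex), Rxx)
W = offline_dss(Rxx)
cost_after = dss_cost(W, Rxx)
```

For this small example the off-diagonal correlation of the outputs shrinks at every step, so `cost_after` is far below `cost_before`. Larger step sizes can make the iteration diverge, which is why the choice of μ matters.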
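According to the ICA, the matrix $J'(W)$ is expressed by Equation (8) using the matrix $R_{\varphi(y)y}=E[\varphi(y)y^{T}]$ and the function $\varphi(y)$ defined by Equations (9) and (10).
$$J'^{\,off}_{ICA}(W) = \left[ R_{\varphi(y)y} - I \right]\left[ W^{-1} \right]^{T} \tag{8}$$
$$\varphi(y) = [\varphi(y_1), \varphi(y_2), \ldots, \varphi(y_N)]^{T} \tag{9}$$
$$\varphi(y_i) = -\frac{\partial}{\partial y_i}\log p(y_i) \tag{10}$$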
2. Adaptive BSS
According to the adaptive BSS, the expectation calculation is omitted and instantaneous data is used. In more detail, $E[yy^{H}]$ is replaced by $yy^{H}$. The update equation is the same as Equation (6), and the iteration count k also serves as a time index. In an offline process, precision may be improved by increasing the number of iterations with a small step size, but if this approach is employed in the adaptive process, the adaptation time increases and performance deteriorates. Accordingly, the adjustment of the step-size parameter μ is more important in the adaptive BSS than in the offline BSS. The matrices $J'$ of the DSS and of the ICA for the adaptive BSS are expressed by Equations (11) and (12), respectively. The ICA is described using an update rule based on the natural gradient, in a form that focuses only on the off-diagonal elements of the correlation matrix (see Reference Document 7).
$$J'_{DSS}(W) = 2\left[ yy^{H} - \mathrm{Diag}[yy^{H}] \right] W xx^{H} \tag{11}$$
$$J'_{ICA}(W) = \left[ \varphi(y)y^{H} - \mathrm{Diag}[\varphi(y)y^{H}] \right] W \tag{12}$$
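One adaptive update step per input frame, following Equations (6), (11), and (12), might look like the sketch below. It assumes NumPy. The nonlinearity `phi` is an assumption: the text defines φ only through Equation (10), and tanh(|y|)·exp(j·arg y) is one choice often used for complex-valued speech spectra, not necessarily the one intended here. All function names are hypothetical.

```python
import numpy as np

def offdiag(A):
    """Return A with its diagonal set to zero, i.e. A - Diag[A]."""
    return A - np.diag(np.diag(A))

def adaptive_dss_step(W, x, mu):
    """One step of Equation (6) with the instantaneous gradient of Equation (11)."""
    y = W @ x
    yyH = np.outer(y, y.conj())
    xxH = np.outer(x, x.conj())
    return W - mu * 2.0 * offdiag(yyH) @ W @ xxH

def phi(y):
    """Assumed nonlinearity (not specified in the text)."""
    return np.tanh(np.abs(y)) * np.exp(1j * np.angle(y))

def adaptive_ica_step(W, x, mu):
    """One step of Equation (6) with the natural-gradient form of Equation (12)."""
    y = W @ x
    return W - mu * offdiag(np.outer(phi(y), y.conj())) @ W

W0 = np.eye(2, dtype=complex)
x_frame = np.array([1.0, 1.0], dtype=complex)   # hypothetical input frame
W_dss = adaptive_dss_step(W0, x_frame, mu=0.1)
W_ica = adaptive_ica_step(W0, x_frame, mu=0.1)
```

Because each step uses a single frame, the update is noisy; a smaller μ smooths it at the price of slower adaptation, which is the trade-off described above.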
3. BSS (GBSS) with Constraint Condition Using Geometric Information
Methods of solving the permutation problem and the scaling problem that occur in the ICA by using geometric information (the positions of the microphones and the sound sources) have been proposed (see Reference Documents 8 to 11). According to the GSS, a value obtained by combining a geometric constraint error and a separation error is used as the cost function. For example, the cost function J(W) is determined according to Equation (13) on the basis of a linear constraint error $J_{LC}(W)$ based on the geometric information, a separation system error $J_{SS}(W)$, and a normalization coefficient λ.
$$J(W) = J_{LC}(W) + \lambda\,J_{SS}(W) \tag{13}$$
As the linear constraint error $J_{LC}(W)$, either a difference $J_{LCDS}(W)$ from the coefficients of the delay-and-sum beamforming method, expressed by Equation (14), or a difference $J_{LCNULL}(W)$ from the coefficients of the null beamforming method, expressed by Equation (15), is used.
$$J_{LCDS}(W) = \left\| \mathrm{Diag}[WD - I] \right\|^{2} \tag{14}$$
$$J_{LCNULL}(W) = \left\| WD - I \right\|^{2} \tag{15}$$
In the GSS, $J_{DSS}(W)$ of Equation (4) is employed as the separation system error $J_{SS}(W)$ (see Reference Document 12). Alternatively, $J_{ICA}(W)$ of Equation (5) may be employed as the separation system error $J_{SS}(W)$. In this case, an adaptive ICA (GICA) with a linear constraint using the geometric information is obtained. This adaptive GICA is a weak-constraint method that permits a linear constraint error, and it differs from the strong-constraint method described in Reference Document 11, which uses the linear constraint as an absolute condition.
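The combined cost of Equation (13), with the constraint errors of Equations (14) and (15) and $J_{DSS}$ of Equation (4) as the separation error, can be sketched as follows. This assumes NumPy. The matrix D is not defined in the text above; here it is assumed to be a matrix of transfer functions predicted from the geometry, so that WD = I corresponds to ideal separation. The function names and numeric values are hypothetical.

```python
import numpy as np

def j_lc_ds(W, D):
    """Equation (14): only the diagonal of WD - I is penalized."""
    E = W @ D - np.eye(W.shape[0])
    return np.linalg.norm(np.diag(np.diag(E)), "fro") ** 2

def j_lc_null(W, D):
    """Equation (15): every element of WD - I is penalized."""
    E = W @ D - np.eye(W.shape[0])
    return np.linalg.norm(E, "fro") ** 2

def j_dss(Ryy):
    """Equation (4): squared Frobenius norm of the off-diagonal part of Ryy."""
    return np.linalg.norm(Ryy - np.diag(np.diag(Ryy)), "fro") ** 2

def j_gss(W, D, Ryy, lam, constraint=j_lc_null):
    """Equation (13): J(W) = J_LC(W) + lambda * J_SS(W)."""
    return constraint(W, D) + lam * j_dss(Ryy)

W = np.eye(2)
D = np.array([[1.0, 0.5], [0.5, 1.0]])      # assumed geometric transfer matrix
Ryy = np.array([[1.0, 0.5], [0.5, 1.0]])    # hypothetical output correlation
cost_ds = j_lc_ds(W, D)
cost_null = j_lc_null(W, D)
cost_total = j_gss(W, D, Ryy, lam=2.0)
```

In this example the delay-and-sum constraint is already satisfied (`cost_ds` is 0) because the diagonal of WD equals 1, while the null constraint is not, since it also penalizes the off-diagonal leakage.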
[Reference Document 1] L. Parra and C. Spence, Convolutive blind source separation of non-stationary sources, IEEE Trans. on Speech and Audio Processing, vol. 8, no. 3, 2000, pp. 320-327
[Reference Document 2] F. Asano, S. Ikeda, M. Ogawa, H. Asoh and N. Kitawaki, Combined Approach of Array Processing and Independent Component Analysis for Blind Separation of Acoustic Signals, IEEE Trans. on Speech and Audio Processing, vol. 11, no. 3, 2003, pp. 204-215
[Reference Document 3] M. Miyoshi and Y. Kaneda, Inverse Filtering of Room Acoustics, IEEE Trans. on Acoustic Speech and Signal Processing, vol. ASSP-36, no. 2, 1988, pp. 145-152
[Reference Document 4] H. Nakajima, M. Miyoshi and M. Tohyama, Sound field control by Indefinite MINT Filters, IEICE Trans., Fundamentals, vol. E-80A, no. 5, 1997, pp. 821-824
[Reference Document 5] S. Ikeda and M. Murata, A method of ICA in time-frequency domain, Proc. Workshop Indep. Compon. Anal. Signal. 1999, pp. 365-370
[Reference Document 6] D. H. Brandwood, A complex gradient operator and its application in adaptive array theory, IEE Proc., vol. 130, Pts. F and H, no. 1, 1983, pp. 11-16
[Reference Document 7] S. Amari, Natural gradient works efficiently in learning, Neural Comput., vol. 10, 1998, pp. 251-276
[Reference Document 8] L. Parra and C. Alvino, Geometric Source Separation: Merging Convolutive Source Separation with Geometric Beamforming, IEEE Trans. on Speech and Audio Processing, vol. 10, no. 6, 2002, pp. 352-362
[Reference Document 9] R. Mukai, H. Sawada, S. Araki and S. Makino, Blind Source Separation of many signals in the frequency domain, in Proc. of ICASSP2006, vol. V, 2006, pp. 969-972
[Reference Document 10] H. Saruwatari, T. Kawamura, T. Nishikawa, A. Lee and K. Shikano, Blind Source Separation Based on a Fast Convergence Algorithm Combining ICA and Beamforming, IEEE Trans. on Speech and Audio Processing, vol. 14, no. 2, 2006, pp. 666-678
[Reference Document 11] M. Knaak, S. Araki and S. Makino, Geometrically Constrained Independent Component Analysis, IEEE Trans. on Speech and Audio Processing, vol. 15, no. 2, 2007, pp. 715-726
[Reference Document 12] J. Valin, J. Rouat and F. Michaud, Enhanced Robot Audition Based on Microphone Array Source Separation with Post-Filter, Proc. of 2004 IEEE/RSJ IROS, 2004, pp. 2123-2128