Recently, a technique for extracting only the user's speech in hands-free operation by using a microphone array has been developed. In a system to which such a speech extraction technique is applied, the user speech to be extracted is generally mixed with speech uttered by others (interference sound) and with diffuse noise called ambient noise. Such noise must therefore be suppressed in order to recognize the user speech correctly.
As a processing technique for suppressing noise, frequency domain independent component analysis is effective. This technique assumes that the sound sources are independent, applies a learning rule to filters in the frequency domain, and separates the sound sources. Because a filter is designed in each frequency band, each filter must be classified as either a filter for extracting the user speech or a filter for extracting noise. Such classifying is called “solving the permutation problem”. When the permutation problem is not solved correctly, a sound in which user speech and noise are mixed is eventually output, even if the user speech to be extracted and the noise are appropriately separated in each frequency band by the independent component analysis.
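The effect of an unsolved permutation can be illustrated with a small sketch. The mixing matrices, the choice of which bin comes out swapped, and all function names below are hypothetical values chosen for illustration; in a real system the mixing matrices are unknown and the row order produced by the independent component analysis is arbitrary.

```python
def matmul2(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse2(m):
    """Invert a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def swap_rows(m):
    """Exchange the two rows, i.e. exchange the two output channels."""
    return [m[1], m[0]]

# Hypothetical mixing matrices for three frequency bins.
# Rows are microphones; column 0 is the user speech, column 1 is the noise.
mixing = [
    [[1.0, 0.5], [0.4, 1.0]],
    [[1.0, -0.3], [0.6, 1.0]],
    [[1.0, 0.2], [-0.5, 1.0]],
]

# ICA finds a separating matrix in each bin only up to row order.
# Assumption for this toy example: bin 1 happens to come out swapped.
swapped = [False, True, False]
separating = [swap_rows(inverse2(a)) if s else inverse2(a)
              for a, s in zip(mixing, swapped)]

def source_on_channel(w, a, channel):
    """Index of the source that dominates one output channel of a bin."""
    row = matmul2(w, a)[channel]  # row of the overall (separation x mixing) response
    return max(range(2), key=lambda j: abs(row[j]))

# Every bin is separated correctly, yet output channel 0 carries
# speech in bins 0 and 2 and noise in bin 1.
before = [source_on_channel(w, a, 0) for w, a in zip(separating, mixing)]

# Solving the permutation means reordering rows so that the same source
# appears on the same channel in all bins. The true mixing matrix is used
# here only because this is a toy example.
aligned = [swap_rows(w) if source_on_channel(w, a, 0) != 0 else w
           for w, a in zip(separating, mixing)]
after = [source_on_channel(w, a, 0) for w, a in zip(aligned, mixing)]

print(before)
print(after)
```

The first list shows channel 0 switching between sources from bin to bin, which corresponds to the mixed output described above; the second shows the same channel after the rows have been reordered consistently.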
For example, a technique for solving the permutation problem is proposed in Patent Document 1. In the system disclosed in this document, a short-time Fourier transform is performed on observed signals, and a separating matrix is obtained at each frequency by independent component analysis. The arrival direction of the signal extracted by each row of the separating matrix is estimated at each frequency, and it is determined whether the estimated values are sufficiently reliable. Further, the similarity of the separated signals between frequencies is calculated. After the separating matrices have been obtained at each frequency, the permutation is solved.
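A minimal sketch of this kind of two-stage procedure is given below. It is not the implementation of Patent Document 1. It assumes that the reliability decision has already been made (unreliable bins are marked with None), that similarity is measured as the correlation of amplitude envelopes, and that an unresolved bin is compared with an adjacent bin that is already resolved. The function names and the data are hypothetical.

```python
from itertools import permutations
from math import sqrt

def correlation(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

def solve_permutation(doa, env):
    """Return, per bin, a tuple p where p[k] is the output channel
    assigned to aligned source k.

    doa[f] holds one direction estimate per channel, or None when the
    estimate for bin f was judged unreliable (assumed to be given).
    env[f][i] is the amplitude envelope of channel i in bin f.
    """
    num_bins = len(env)
    perms = [None] * num_bins

    # Stage 1: bins with reliable directions are aligned by sorting the angles.
    for f, angles in enumerate(doa):
        if angles is not None:
            perms[f] = tuple(sorted(range(len(angles)), key=lambda i: angles[i]))
    if all(p is None for p in perms):
        raise ValueError("at least one bin needs a reliable direction estimate")

    # Stage 2: remaining bins take the permutation that maximizes envelope
    # correlation with an adjacent bin that is already resolved.
    while any(p is None for p in perms):
        for f in range(num_bins):
            if perms[f] is not None:
                continue
            neighbours = [g for g in (f - 1, f + 1)
                          if 0 <= g < num_bins and perms[g] is not None]
            if not neighbours:
                continue
            g = neighbours[0]
            ref = [env[g][i] for i in perms[g]]  # neighbour envelopes in aligned order
            perms[f] = max(
                permutations(range(len(env[f]))),
                key=lambda p: sum(correlation(env[f][i], r) for i, r in zip(p, ref)),
            )
    return perms

def scaled(values, gain):
    return [gain * v for v in values]

# Hypothetical envelopes: the user (near -40 degrees) and an interferer (near 30 degrees).
speech = [1.0, 0.2, 0.9, 0.1, 0.8, 0.3]
noise = [0.3, 0.7, 0.2, 0.8, 0.4, 0.6]

env = [
    [scaled(speech, 1.0), scaled(noise, 1.0)],
    [scaled(noise, 0.8), scaled(speech, 0.9)],   # channels swapped
    [scaled(speech, 0.7), scaled(noise, 1.1)],
    [scaled(noise, 0.9), scaled(speech, 0.6)],   # channels swapped
]
doa = [(-40.0, 30.0), None, None, (28.0, -42.0)]

result = solve_permutation(doa, env)
print(result)
```

In this toy data, bins 0 and 3 are resolved from their direction estimates, and bins 1 and 2 are resolved by comparison with a neighbouring bin. Comparing only with one adjacent bin keeps the sketch short; a practical system would likely combine several neighbouring bins to limit error propagation.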
FIG. 6 shows an exemplary configuration of a permutation solving unit. The permutation solving unit 24 includes a sound source direction estimation unit 243 and a classifying determination unit 242. The sound source direction estimation unit 243 estimates the arrival direction of the signal extracted by each row of the separating matrix at each frequency. For frequencies at which the arrival directions estimated by the sound source direction estimation unit 243 are determined to be sufficiently reliable, the classifying determination unit 242 determines the permutation by aligning those directions. For the other frequencies, it determines the permutation so as to increase the similarity of the separated signals with those at neighboring frequencies.
[Patent Document 1] Japanese Unexamined Patent Application Publication No. 2004-145172