1. Field of the Invention
The present invention relates to a source sound separator and a method therefor, and more particularly to a source sound separator and a method for separating a target voice from an interfering sound coming from a direction other than that of the target voice.
2. Description of the Background Art
Such a source sound separator is applicable to, e.g. a mobile device, such as a mobile phone, or a vehicle-mounted device, such as a car navigation system. In exploiting voice recognition or telephone message recording, a problem may arise in that ambient noise severely degrades the precision with which the voice captured by a microphone is recognized, or renders the recorded voice hardly perceptible. Under such circumstances, attempts have been made to control the directivity characteristics of a microphone array so as to selectively capture only the voice of interest. However, simply controlling the directivity characteristics is not sufficient when it is intended to take out the voice of interest in a state separated from the background noise.
The solution of controlling the directivity characteristics with a microphone array is known per se. For example, solutions have been known to date for directivity characteristic control by a delayed sum array (DSA) or beam forming (BF), and for directivity characteristic control by a directionally constrained minimization of power (DCMP) adaptive array.
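The delayed sum array mentioned above may be sketched as follows. This is an illustrative example only, not the control taught by any of the cited references; the function name, the microphone geometry (a linear array) and all parameter values are assumptions introduced for illustration. Each microphone channel is delayed so that a plane wave arriving from the look direction adds coherently across channels, and the delayed channels are then summed.

```python
import numpy as np

# Nominal speed of sound in air (m/s); an assumed constant.
SPEED_OF_SOUND = 343.0

def delay_and_sum(signals, mic_positions, look_angle_deg, fs):
    """Illustrative delayed sum array (DSA) beamformer.

    signals       : array of shape (n_mics, n_samples), one row per microphone
    mic_positions : 1-D array of microphone positions (m) along a line
    look_angle_deg: look direction, measured from broadside of the array
    fs            : sampling rate in Hz
    """
    angle = np.deg2rad(look_angle_deg)
    # Arrival delay of the look-direction plane wave at each microphone.
    delays = mic_positions * np.sin(angle) / SPEED_OF_SOUND
    # Delay each channel by the difference to the latest arrival, so that
    # all copies of the look-direction wave line up before summation.
    shifts = np.round((delays.max() - delays) * fs).astype(int)
    n = signals.shape[1]
    out = np.zeros(n)
    for sig, s in zip(signals, shifts):
        if s > 0:
            out[s:] += sig[:n - s]
        else:
            out += sig
    return out / len(signals)
```

A wave from the look direction is reinforced by the coherent sum, while waves from other directions are misaligned across channels and partially cancel, which is the directivity effect the passage above refers to.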
As a solution for separating a voice uttered remotely, there has been known a solution, termed SAFIA, in which the signals output from a plurality of fixed microphones are subjected to narrow-band spectral analysis, and the microphone having yielded the maximum amplitude in each of the frequency bands is allotted to capturing sound in that band, as disclosed in Japanese Patent Laid-Open Publication No. 313497/1998. In this solution of voice separation based on band selection (BS), in order to obtain a target voice, the microphone residing closest to the sound source uttering the target voice is selected, and the sound in the frequency bands allocated to that microphone is used to synthesize the voice.
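The band selection described above may be sketched as follows for the simplest case of two microphones, each placed close to one of two sound sources. This is an illustrative approximation, not the exact procedure of the cited publication; the function name, the frame length and the use of a plain FFT per frame are assumptions introduced for illustration. In each analysis frame, every narrow frequency band is credited to the microphone showing the larger amplitude in that band, and only the bands credited to the selected microphone are kept when re-synthesizing its source's signal.

```python
import numpy as np

def band_select(mic_a, mic_b, frame_len=256):
    """Illustrative SAFIA-style band selection with two microphones.

    Re-synthesizes the source nearest to mic_a by keeping, per frame,
    only the frequency bands in which mic_a yields the larger amplitude.
    """
    n_frames = len(mic_a) // frame_len
    out = np.zeros(n_frames * frame_len)
    for i in range(n_frames):
        sl = slice(i * frame_len, (i + 1) * frame_len)
        spec_a = np.fft.rfft(mic_a[sl])
        spec_b = np.fft.rfft(mic_b[sl])
        # Allot each band to the microphone with the maximum amplitude;
        # keep mic_a's bands and zero out the rest.
        mask = np.abs(spec_a) >= np.abs(spec_b)
        out[sl] = np.fft.irfft(np.where(mask, spec_a, 0), n=frame_len)
    return out
```

Since each microphone is dominated by its nearby source in most bands, the kept bands approximately reconstruct that source, which is why the scheme separates two overlapping signals but degrades when three or more sources compete for the same bands, as the following paragraph notes.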
In the SAFIA described above, two overlapping signals may be separated from each other. If there are three or more sound sources, the signals could theoretically be separated from one another, but the separation performance would severely deteriorate. Thus, if there are plural noise sources, it becomes extremely difficult to separate the target sound with high precision from a received sound signal corrupted with multiple noise signals.
A further solution, improved over the band selection, has been proposed in U.S. Patent Application Publication No. US 2009/0323977 A1 to Kobayashi et al. In the method taught by Kobayashi et al., as will be described in detail later on, frequency characteristics are calculated with which the sound signals, e.g. voice or acoustic signals, from the respective sound sources are properly emphasized. However, the signal captured by a microphone may contain an interfering sound in addition to the target sound, so that those solutions are unsuitable for use near the last stage of eliminating the interfering sound. Under such a situation, the sound quality is deteriorated after the ultimate source sound separation.