Field of the Invention
The present invention relates to a sound source separating device and a sound source separating method.
Description of Related Art
Speech recognition systems have been proposed for operating an in-vehicle navigation system or the like by voice command. In such a speech recognition system, the speech recognition rate is likely to decrease when loud ambient noise is mixed into the speech, for example, when the vehicle travels on an expressway or when music is played in the vehicle.
Accordingly, a speech recognition system has been proposed in which a sound source position of speech of a speaker sitting in a seat is stored in advance as preset information for each seat position. This speech recognition system retrieves the preset information on the basis of the seat position detected by a sensor, separates the speech of the speaker with reference to the retrieved preset information, and recognizes the speech (for example, see Republished Japanese Translation No. WO2006/025106 of the PCT International Publication for Patent Application).
A sound source separating device according to the related art that separates sound sources will be described below in brief.
FIG. 14 is a block diagram illustrating a schematic configuration of a sound source separating device 900 according to the related art. As illustrated in FIG. 14, the sound source separating device 900 according to the related art includes a sound collecting unit 911, a sound signal acquiring unit 912, a sound source localizing unit 913, and a sound source separating unit 914.
The sound collecting unit 911 is a microphone array including N (where N is an integer equal to or greater than 2) microphones. The sound collecting unit 911 collects sound signals and outputs the N collected sound signals to the sound signal acquiring unit 912.
The sound signal acquiring unit 912 acquires the N sound signals output from the N microphones of the sound collecting unit 911 and outputs the N acquired sound signals to the sound source localizing unit 913 and the sound source separating unit 914. The sound source localizing unit 913 estimates a direction of a sound source (this estimation is also referred to as sound source localization) from the N sound signals output from the sound signal acquiring unit 912, for example, using a multiple signal classification (MUSIC) method, and outputs information indicating the estimated direction of the sound source to the sound source separating unit 914. The number of sound sources localized by the sound source localizing unit 913 varies dynamically depending on the environment in which the sound source separating device 900 is used.
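As an illustration of MUSIC-based sound source localization, the following is a minimal sketch and not the implementation of the sound source localizing unit 913. It assumes a narrowband signal, a uniform linear microphone array with half-wavelength spacing, and a known number of sound sources; none of these assumptions comes from the related art described above, and all function and parameter names are hypothetical.

```python
import numpy as np


def steering_vector(theta_rad, n_mics, spacing_wavelengths=0.5):
    """Array response of an assumed uniform linear array for direction theta."""
    k = np.arange(n_mics)
    return np.exp(-2j * np.pi * spacing_wavelengths * k * np.sin(theta_rad))


def music_spectrum(X, n_sources, angles_rad, spacing_wavelengths=0.5):
    """MUSIC pseudo-spectrum from X (n_mics x n_snapshots, complex)."""
    n_mics, n_snap = X.shape
    # Spatial correlation matrix of the N microphone signals.
    R = X @ X.conj().T / n_snap
    # eigh returns eigenvalues in ascending order, so the first
    # (n_mics - n_sources) eigenvectors span the noise subspace.
    _, eigvecs = np.linalg.eigh(R)
    noise_subspace = eigvecs[:, : n_mics - n_sources]
    spectrum = np.empty(len(angles_rad))
    for i, theta in enumerate(angles_rad):
        a = steering_vector(theta, n_mics, spacing_wavelengths)
        proj = noise_subspace.conj().T @ a
        # Steering vectors of true sources are nearly orthogonal to the
        # noise subspace, which produces a sharp peak.
        spectrum[i] = 1.0 / max(np.real(np.vdot(proj, proj)), 1e-12)
    return spectrum


def simulate_and_localize(true_deg=20.0, n_mics=8, n_snap=400, seed=0):
    """Simulate one source plus sensor noise and return the peak angle in degrees."""
    rng = np.random.default_rng(seed)
    a = steering_vector(np.deg2rad(true_deg), n_mics)
    s = rng.standard_normal(n_snap) + 1j * rng.standard_normal(n_snap)
    noise = 0.1 * (rng.standard_normal((n_mics, n_snap))
                   + 1j * rng.standard_normal((n_mics, n_snap)))
    X = np.outer(a, s) + noise
    angles_deg = np.arange(-90, 91)
    P = music_spectrum(X, 1, np.deg2rad(angles_deg))
    return float(angles_deg[np.argmax(P)])
```

In this sketch the number of sources is passed in as a fixed argument, whereas a practical localizer would also have to estimate how many sources are present in each frame, since that number varies with the environment.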
The sound source separating unit 914 separates the sound sources from the sound signals output from the sound signal acquiring unit 912 on the basis of the information indicating the direction of the sound source which is output from the sound source localizing unit 913, for example, using a geometrically constrained high-order decorrelation-based source separation with adaptive step-size control (GHDSS-AS) method, which is a hybrid of blind source separation and beam forming. In the GHDSS-AS method, a separation signal is estimated from the collected sound signals using a separation matrix W. When sound sources are separated using the separation matrix W in this way, the stability of the separation matrix W is known to affect sound source separation performance. The sound source separating unit 914 updates the separation matrix W only when a direction of a sound source is detected by the sound source localizing unit 913.
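The role of the separation matrix can be illustrated with a minimal sketch in which the separation signal is computed as y = W x and the matrix is refreshed only when the localizer reports at least one direction. This is not the GHDSS-AS update rule. As a stand-in, the sketch sets W to the pseudo-inverse of an assumed steering matrix for a uniform linear array with half-wavelength spacing; the array geometry, the pseudo-inverse update, and all names are assumptions made for illustration.

```python
import numpy as np


def steering_matrix(angles_deg, n_mics, spacing_wavelengths=0.5):
    """Columns are assumed array responses for each detected direction."""
    k = np.arange(n_mics)[:, None]
    theta = np.deg2rad(np.asarray(angles_deg, dtype=float))[None, :]
    return np.exp(-2j * np.pi * spacing_wavelengths * k * np.sin(theta))


def maybe_update_separation_matrix(W, detected_angles_deg, n_mics):
    """Return a refreshed W only when a source direction was detected."""
    if len(detected_angles_deg) == 0:
        # No localization result: keep the previous matrix unchanged.
        return W
    D = steering_matrix(detected_angles_deg, n_mics)
    # Stand-in update (assumption): geometric solution with W @ D = I.
    return np.linalg.pinv(D)


def separate(W, x):
    """Estimate the separation signal y = W x for one frequency bin."""
    return W @ x


def demo():
    """Mix two hypothetical sources, then separate them again."""
    n_mics = 4
    angles = [-30.0, 40.0]
    s = np.array([1.0 + 0.0j, 2.0 - 1.0j])       # source spectra at one bin
    x = steering_matrix(angles, n_mics) @ s      # noise-free microphone signals
    W = maybe_update_separation_matrix(None, angles, n_mics)
    return s, separate(W, x)
```

Because W changes only when a direction is detected, frames without a localization result reuse the previous matrix, which is one way the stability of W ties into separation performance.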