Recent years have seen the development of voice processing apparatuses, such as teleconferencing systems and telephones equipped with hands-free talking capability, that capture voices by using a plurality of microphones. For such voice processing apparatuses, technologies are being developed that suppress voice coming from any direction other than a specific direction and thereby make voice coming from the specific direction easier to hear.
For example, Japanese Laid-open Patent Publication No. 2007-318528 discloses a directional sound-capturing device which converts a sound received from each of a plurality of sound sources, each located in a different direction, into a frequency-domain signal, calculates a suppression function for suppressing the frequency-domain signal, and corrects the frequency-domain signal by multiplying the amplitude component of the original frequency-domain signal by the suppression function. The directional sound-capturing device calculates the phase components of the respective frequency-domain signals on a frequency-by-frequency basis, calculates the difference between the phase components, and determines, based on the difference, a probability value which indicates the probability that a sound source is located in a particular direction. Then, the directional sound-capturing device calculates, based on the probability value, a suppression function for suppressing the sound arriving from any sound source other than the sound source located in that particular direction.
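The per-frequency flow described above (phase difference, probability value, suppression function, amplitude correction) might be sketched as follows. This is only an illustrative sketch and not the method of the cited publication: the linear probability mapping, the function names, and the `tolerance` and `floor` parameters are all assumptions introduced here.

```python
import cmath
import math


def wrap_phase(x):
    """Wrap an angle in radians to the range [-pi, pi)."""
    return (x + math.pi) % (2 * math.pi) - math.pi


def suppression_gains(spec_a, spec_b, expected_diff, tolerance=0.5, floor=0.1):
    """Return one gain per frequency bin from the inter-channel phase difference.

    spec_a, spec_b: complex frequency-domain signals of the two channels.
    expected_diff:  phase difference (radians) expected per bin for a source
                    in the particular direction (assumed to be known).
    tolerance, floor: hypothetical tuning parameters.
    """
    gains = []
    for a, b, expected in zip(spec_a, spec_b, expected_diff):
        diff = wrap_phase(cmath.phase(a) - cmath.phase(b))
        deviation = abs(wrap_phase(diff - expected))
        # Assumed mapping: probability falls linearly with the deviation.
        probability = max(0.0, 1.0 - deviation / tolerance)
        gains.append(floor + (1.0 - floor) * probability)
    return gains


def apply_gains(spec, gains):
    """Scale only the amplitude component; the phase component is kept."""
    return [cmath.rect(abs(x) * g, cmath.phase(x)) for x, g in zip(spec, gains)]


# Bin 0 matches the expected phase difference; bin 1 is off by pi/2.
spec_a = [1 + 0j, 1j]
spec_b = [1 + 0j, 1 + 0j]
gains = suppression_gains(spec_a, spec_b, expected_diff=[0.0, 0.0])
corrected = apply_gains(spec_a, gains)
```

In this sketch a bin whose phase difference matches the expected value passes unchanged, while a bin that deviates by more than `tolerance` is attenuated to `floor`.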
On the other hand, Japanese Laid-open Patent Publication No. 2010-176105 discloses a noise suppressing device which isolates the sound sources of sounds received by two or more microphones and estimates the direction of the sound source of the target sound from among the thus isolated sound sources. Then, the noise suppressing device detects the phase difference between the microphones by using the direction of the sound source of the target sound, updates the center value of the phase difference by using the detected phase difference, and suppresses noise received by the microphones by using a noise suppressing filter generated using the updated center value.
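The cited publication is not quoted here as to how the center value is updated. One plausible reading, shown purely as an assumption, is a smoothed update that moves the center a fraction of the way toward each newly detected phase difference; `update_center` and `rate` are hypothetical names.

```python
import math


def wrap_phase(x):
    """Wrap an angle in radians to the range [-pi, pi)."""
    return (x + math.pi) % (2 * math.pi) - math.pi


def update_center(center, detected, rate=0.1):
    """Move the center value a fraction `rate` toward the detected phase
    difference, taking the shortest way around the circle.

    Assumed update rule for illustration only.
    """
    return wrap_phase(center + rate * wrap_phase(detected - center))


center = 0.0
for detected in (0.2, 0.2, 0.2):
    center = update_center(center, detected, rate=0.5)
```

Wrapping the difference before scaling keeps the update well behaved when the center and the detected value lie on opposite sides of the +/- pi boundary.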
Japanese Laid-open Patent Publication No. 2011-99967 discloses a voice signal processing method which identifies a voice section and a noise section from a first input voice signal and determines whether the magnitude of power of the first input voice signal in the noise section is larger than a first threshold value. When the magnitude of power of the first input voice signal is not larger than the first threshold value, the voice signal processing method suppresses noise in the voice section and noise section of the first input voice signal, based on the magnitude of power in the noise section. On the other hand, when the magnitude of power of the first input voice signal is larger than the first threshold value, the voice signal processing method suppresses the first input voice signal based on the phase difference between the first and second input voice signals.
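The threshold-based branching described above can be illustrated with a small sketch. The mean-square power estimate and the names `mean_power` and `choose_suppression` are assumptions made for illustration; the publication's actual power measure is not stated here.

```python
def mean_power(samples):
    """Mean-square power of a section of samples (assumed power measure)."""
    return sum(s * s for s in samples) / len(samples)


def choose_suppression(noise_section, first_threshold):
    """Select the suppression path from the power in the noise section.

    Returns "phase_difference" when the power exceeds the first threshold,
    otherwise "noise_power" (suppression based on the noise-section power).
    """
    if mean_power(noise_section) > first_threshold:
        return "phase_difference"
    return "noise_power"


quiet = choose_suppression([0.1, -0.1, 0.1], first_threshold=0.5)
loud = choose_suppression([1.0, -1.0, 1.0], first_threshold=0.5)
```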
Further, Japanese Laid-open Patent Publication No. 2003-78988 discloses a sound collecting device which divides two-channel sound signals captured by microphones into a plurality of frequency bands on a frame-by-frame basis, calculates a level or phase for each channel and for each frequency band, and calculates weighted averages of the levels and phases over a plurality of frames from the past to the present. Then, based on the difference in weighted average level or phase between the channels, the sound collecting device identifies the sound source to which the corresponding frequency band component belongs, and combines the frequency band component signals identified as belonging to the same sound source across the plurality of frequency bands.
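As a rough illustration of the weighted averaging over frames and the level-difference decision described above, the following sketch uses exponentially decaying weights and a fixed margin. Both choices, along with the source labels and the names `weighted_average`, `assign_band`, `decay`, and `margin_db`, are assumptions and are not taken from the cited publication.

```python
def weighted_average(history, decay=0.5):
    """Weighted average over frames ordered from past to present.

    The newest frame gets weight 1 and each older frame is scaled by
    `decay` once more (assumed weighting scheme).
    """
    n = len(history)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    return sum(w * x for w, x in zip(weights, history)) / sum(weights)


def assign_band(levels_ch1, levels_ch2, margin_db=3.0):
    """Label one frequency band by the inter-channel level difference (dB)."""
    diff = weighted_average(levels_ch1) - weighted_average(levels_ch2)
    if diff > margin_db:
        return "source_near_ch1"
    if diff < -margin_db:
        return "source_near_ch2"
    return "source_center"


label = assign_band([10.0, 10.0, 10.0], [0.0, 0.0, 0.0])
```

Bands that receive the same label would then be combined into one output signal per source.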
On the other hand, Japanese Laid-open Patent Publication No. 2011-33717 discloses a noise suppressing device which calculates a cross spectrum from sound signals captured by two microphones, measures the variation over time of the phase component of the cross spectrum, and determines that a frequency component having a small variation is a voice component and a frequency component having a large variation is a noise component. Then, the noise suppressing device calculates a correction coefficient that suppresses the amplitude of the noise component.
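The cross-spectrum approach described above can be sketched as follows. The circular variance used here is only one possible measure of phase variation over time, and the `threshold` and `noise_gain` values are hypothetical; none of these details are taken from the cited publication.

```python
import cmath


def cross_spectrum(spec_a, spec_b):
    """Per-bin cross spectrum; its phase is the inter-channel phase difference."""
    return [a * b.conjugate() for a, b in zip(spec_a, spec_b)]


def phase_variation(cross_frames, k):
    """Circular variance of the phase at bin k across frames.

    0 means the phase is steady over time; values near 1 mean it is scattered.
    """
    units = [cmath.rect(1.0, cmath.phase(frame[k])) for frame in cross_frames]
    return 1.0 - abs(sum(units)) / len(units)


def correction_coefficients(cross_frames, threshold=0.3, noise_gain=0.2):
    """Keep steady-phase bins (treated as voice), attenuate the others (noise)."""
    bins = len(cross_frames[0])
    return [
        1.0 if phase_variation(cross_frames, k) <= threshold else noise_gain
        for k in range(bins)
    ]


# Two frames, two bins: bin 0 has a steady phase, bin 1 flips by pi.
frames = [
    cross_spectrum([1 + 0j, 1 + 0j], [1 + 0j, 1 + 0j]),
    cross_spectrum([2 + 0j, -1 + 0j], [1 + 0j, 1 + 0j]),
]
coeffs = correction_coefficients(frames)
```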
However, depending on differences in characteristics between the individual microphones used to capture the sounds, or on the environment where the microphones are installed, the phase difference actually measured between the sounds received by the respective microphones from the sound source located in the specific direction may not agree with the theoretical value of the phase difference. As a result, the direction of the sound source may not be correctly estimated. Therefore, with any of the above prior art techniques, the sound desired to be enhanced may be mistakenly suppressed or, conversely, the sound to be suppressed may not be suppressed.
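For reference, the theoretical phase difference mentioned above is commonly written using a far-field, two-microphone model. The symbols below are assumptions introduced here for illustration and do not appear in the cited publications: $d$ is the microphone spacing, $c$ the speed of sound, $f$ the frequency, $\theta$ the arrival angle measured from the direction perpendicular to the line joining the microphones, and $\varepsilon(f)$ a lumped error term.

```latex
\[
  \Delta\phi_{\mathrm{theory}}(f) = \frac{2\pi f\, d \sin\theta}{c},
  \qquad
  \Delta\phi_{\mathrm{measured}}(f) = \Delta\phi_{\mathrm{theory}}(f) + \varepsilon(f)
\]
```

Under this model, microphone mismatch and the installation environment both contribute to $\varepsilon(f)$, so a direction estimate that assumes $\varepsilon(f) = 0$ can be biased.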