1. Field of the Invention
The present invention relates to an audio signal processing method, and an apparatus therefor, for producing a speech signal in which a target speech signal contained in an input audio signal is emphasized.
2. Description of the Related Art
When speech recognition technology is used in a real environment, ambient noise has a great influence on the recognition rate. In a car interior, for example, there are many sounds other than speech, such as the engine sound of the car, wind noise, the sound of oncoming and overtaking cars, and the sound of car audio equipment. These noises are mixed with the speech of the speaker and input to the speech recognizer, causing the recognition rate to decrease greatly.
One method for solving this noise problem is to use a microphone array, which is one of several noise suppression techniques. A microphone array is a system that applies signal processing to audio signals input from plural microphones and outputs an emphasized target speech. Noise suppression using a microphone array is effective in hands-free devices.
Directivity is one of the characteristics of noise in an acoustic environment. For example, the voice of an interfering speaker is a directivity noise, and has the characteristic that its arrival direction is perceivable. On the other hand, non-directivity noise (also referred to as diffuse noise) is noise whose arrival direction is not fixed to a specific direction. In many cases, noise in a real environment has a character intermediate between directivity noise and diffuse noise. An engine sound, for example, may be heard generally from the direction of the engine room, but its directivity is not strong enough for the sound to be attributed to a single direction.
Since a microphone array performs noise suppression by using the difference between the arrival times of the audio signals of plural channels, a great noise suppression effect can be expected for directivity noise even with few microphones. On the other hand, the noise suppression effect is not great for diffuse noise. For example, diffuse noise can be suppressed by synchronous addition, but a large number of microphones is necessary to obtain sufficient noise suppression, so that synchronous addition is impractical.
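The trade-off described above for synchronous addition can be illustrated with a minimal sketch. It is not taken from the cited art; it assumes an idealized case in which the diffuse noise is modeled as independent Gaussian noise of unit variance on each channel, the target is a sine wave, and the inter-microphone delays are known integer sample counts. The names `delay_and_sum` and `residual_noise_power`, and all numeric values, are hypothetical.

```python
import math
import random


def delay_and_sum(channels, delays):
    """Synchronous addition: undo each channel's delay, then average."""
    usable = min(len(ch) - d for ch, d in zip(channels, delays))
    count = len(channels)
    return [
        sum(ch[t + d] for ch, d in zip(channels, delays)) / count
        for t in range(usable)
    ]


def power(signal):
    """Mean squared value of a sequence of samples."""
    return sum(v * v for v in signal) / len(signal)


def residual_noise_power(num_mics, num_samples=4000, seed=0):
    """Noise power left in the output after synchronous addition.

    Assumptions (illustrative only): microphone m receives the target
    m samples late, and each channel carries independent unit-variance
    Gaussian noise standing in for diffuse noise.
    """
    rng = random.Random(seed)
    length = num_samples + num_mics
    target = [math.sin(2.0 * math.pi * 0.01 * t) for t in range(length)]
    delays = list(range(num_mics))
    channels = []
    for d in delays:
        channels.append([
            (target[t - d] if t >= d else 0.0) + rng.gauss(0.0, 1.0)
            for t in range(length)
        ])
    output = delay_and_sum(channels, delays)
    # After alignment the target adds coherently, so subtracting it
    # leaves only the averaged noise.
    residual = [o - target[t] for t, o in enumerate(output)]
    return power(residual)


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, round(residual_noise_power(n), 3))
```

Under these assumptions the residual noise power falls roughly in proportion to 1/N for N microphones, that is, about 3 dB per doubling of the microphone count, which is why a large array is needed before the suppression becomes substantial. Real diffuse noise is partly correlated between closely spaced microphones, so this sketch should be read as a best case.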
Further, there is a problem of sound reverberation in a real environment. Sound emitted in a closed space is reflected many times by the wall surfaces before it is observed. The target signal therefore also arrives at a microphone from directions different from the arrival direction of the direct wave, so that the apparent direction of the sound source becomes unstable. As a result, suppression of directivity noise by the microphone array becomes difficult, and part of the target speech signal, which should not be suppressed, is eliminated as if it were directivity noise. In other words, a problem of “target speech elimination” occurs.
JP-A 2007-10897 (KOKAI) discloses a microphone array technique for use under such sound reverberation. Filter coefficients of the microphone array, which include the influence of sound reverberation in an acoustic environment assumed beforehand, are learned in advance. In actual use of the microphone array, a filter coefficient is selected based on a feature quantity derived from the input signal. In other words, JP-A 2007-10897 (KOKAI) discloses a so-called learning type array. This method can sufficiently suppress directivity noise under sound reverberation, and can also avoid the problem of “target speech elimination”. However, the prior art disclosed in JP-A 2007-10897 (KOKAI) cannot suppress diffuse noise by means of directivity. Consequently, the noise suppression effect remains insufficient even when the technique disclosed in JP-A 2007-10897 (KOKAI) is used.
The present invention is directed to enabling emphasis of a target speech signal by a microphone array while suppressing diffuse noise.