1. Field of the Invention
The present invention relates to microphone devices and audio players and, more specifically, to a microphone device and an audio player which detect a desired sound coming from a specific direction while suppressing noise.
2. Description of the Background Art
The configurations of conventional microphone devices are described with reference to FIGS. 24 through 26.
FIG. 24 is an illustration showing the configuration of a conventional microphone device of Example 1. In FIG. 24, the conventional microphone device includes a first microphone unit 1010, a second microphone unit 1020, a signal adding section 1030, a first signal subtracting section 1031, a signal amplifying section 1050, an adaptive filter section 1060, and a second signal subtracting section 1062. Each of the microphone units 1010 and 1020 is placed so as to be oriented to the front (left in FIG. 24). The signal adding section 1030 adds a signal output from the first microphone unit 1010 and a signal output from the second microphone unit 1020. The first signal subtracting section 1031 subtracts the signal output from the second microphone unit 1020 from the signal output from the first microphone unit 1010. The signal amplifying section 1050 multiplies a signal output from the signal adding section 1030 by ½. The adaptive filter section 1060 is supplied with a signal output from the first signal subtracting section 1031, and outputs a signal obtained through filtering performed by an adaptive filter included therein. The second signal subtracting section 1062 subtracts a signal output from the adaptive filter section 1060 from a signal output from the signal amplifying section 1050. An output from the second signal subtracting section 1062 is an output from the microphone device. The adaptive filter section 1060 learns a filter coefficient from the signal output from the second signal subtracting section 1062 and the signal output from the first signal subtracting section 1031.
The operation of the conventional microphone device of Example 1 is described below. For a sound coming from the front, the microphone units 1010 and 1020 output approximately the same signal. For a sound coming from any other direction, the microphone units 1010 and 1020 output signals that differ in phase. The output signals from the microphone units 1010 and 1020 are added together by the signal adding section 1030. The resultant sum is then normalized in level by the signal amplifying section 1050. That is, the amplitude of the signal is multiplied by ½. With this, a main signal having components of the sound coming from the front can be obtained. Also, the output from the first signal subtracting section 1031 has a directivity characteristic such that the main axis of directivity is oriented at 90 degrees with respect to the front and the sensitivity is minimum in the front direction. That is, the signal output from the first signal subtracting section 1031 serves as a noise reference signal which does not include the components of the sound coming from the front. The adaptive filter section 1060 uses the main signal output from the signal amplifying section 1050 and the noise reference signal output from the first signal subtracting section 1031 to achieve adaptive directivity. That is, the direction of minimum sensitivity in the directivity is uniquely determined so as to point toward a noise sound coming from a direction other than the front.
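The signal flow of Example 1 can be illustrated with a minimal sketch. The description does not state how the adaptive filter section 1060 learns its coefficients, so the normalized LMS (NLMS) update used here is an assumption, as are the function names, the tap count, the step size, and the one-sample inter-unit delay used to model a side-arriving noise tone.

```python
import math


def sum_and_difference(x1, x2):
    """Return (main, ref): half the sum and the difference of two unit signals."""
    main = [0.5 * (a + b) for a, b in zip(x1, x2)]  # sections 1030 and 1050
    ref = [a - b for a, b in zip(x1, x2)]           # section 1031
    return main, ref


def adaptive_cancel(main, ref, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptively filtered noise reference from the main signal.

    The NLMS update is an assumption; the source does not name the learning rule.
    """
    w = [0.0] * taps      # filter coefficients (section 1060)
    buf = [0.0] * taps    # most recent reference samples, newest first
    out = []
    for m, r in zip(main, ref):
        buf = [r] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # adaptive filter output
        e = m - y                                    # section 1062 output
        norm = eps + sum(b * b for b in buf)
        # Coefficients are learned from the device output and the reference.
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        out.append(e)
    return out


def side_noise_demo(n=3000, freq=0.05):
    """Hypothetical side-arriving tone: unit 2 hears it one sample after unit 1.

    Returns the energy of the last 200 samples before and after cancellation.
    """
    x1 = [math.sin(2 * math.pi * freq * t) for t in range(n)]
    x2 = [math.sin(2 * math.pi * freq * (t - 1)) for t in range(n)]
    main, ref = sum_and_difference(x1, x2)
    out = adaptive_cancel(main, ref)

    def tail_energy(s):
        return sum(v * v for v in s[-200:])

    return tail_energy(main), tail_energy(out)
```

When both units receive identical signals (a sound from the front), the reference is zero and the main signal passes through unchanged. With a single side-arriving tone, the filter converges and the residual becomes small, which is consistent with the single-direction behavior described above.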
FIG. 25 is an illustration showing the configuration of a conventional microphone device of Example 2. In FIG. 25, the conventional microphone device includes a first microphone unit 1010, a second microphone unit 1020, a first adaptive filter section 1040, a first signal delaying section 1041, a first signal subtracting section 1042, a second adaptive filter section 1060, a second signal delaying section 1061, and a second signal subtracting section 1062.
The first adaptive filter section 1040 is supplied with an output signal from the second microphone unit 1020 and then outputs the filtering results obtained by an adaptive filter included therein. The first signal delaying section 1041 delays a signal output from the first microphone unit 1010. The first signal subtracting section 1042 subtracts a signal output from the first adaptive filter section 1040 from a signal output from the first signal delaying section 1041. The first adaptive filter section 1040 learns a filter coefficient from a signal output from the first signal subtracting section 1042 and a signal output from the second microphone unit 1020. The second signal delaying section 1061 delays the signal output from the first signal delaying section 1041. The second adaptive filter section 1060 is supplied with a signal output from the first signal subtracting section 1042, and then outputs the filtering results obtained by an adaptive filter included therein. The second signal subtracting section 1062 subtracts a signal output from the second adaptive filter section 1060 from a signal output from the second signal delaying section 1061. The subtraction result is an output from the microphone device. The second adaptive filter section 1060 learns a filter coefficient from a signal output from the second signal subtracting section 1062 and a signal output from the first signal subtracting section 1042.
The operation of the conventional microphone device of Example 2 is described below. The first adaptive filter section 1040, the first signal delaying section 1041, and the first signal subtracting section 1042 perform a canceling operation on sound waves arriving at the microphone units 1010 and 1020. As a result, the signal output from the first signal subtracting section 1042 serves as a noise reference signal for the second adaptive filter section 1060. That is, this signal serves a purpose similar to that of the signal output from the first signal subtracting section 1031 in FIG. 24. However, the conventional microphone device of Example 2 differs from that of Example 1 in that the directivity is fixed in Example 1, whereas the directivity can be changed by using the adaptive filters in Example 2.
FIG. 26 is an illustration showing the configuration of a conventional microphone device of Example 3. The conventional microphone device illustrated in FIG. 26 includes a first unidirectional microphone unit 1011, a second unidirectional microphone unit 1012, a first FFT section 1070, a second FFT section 1080, a two-input-type spectrum subtraction section 1090, and a voice recognition section 2000.
In FIG. 26, the first unidirectional microphone unit 1011 is placed so that the main axis of its directivity is oriented to the front. The second unidirectional microphone unit 1012 is placed so that the main axis of its directivity is oriented to the back. The first FFT section 1070 is supplied with a signal output from the first unidirectional microphone unit 1011 to find a frequency spectrum. The second FFT section 1080 is supplied with a signal output from the second unidirectional microphone unit 1012 to find a frequency spectrum. The two-input-type spectrum subtraction section 1090 is supplied with signals output from both of the FFT sections 1070 and 1080 to subtract, in a power spectrum region, the signal spectrum derived by the second FFT section 1080 from the signal spectrum derived by the first FFT section 1070, thereby outputting a spectrum of a target signal. The voice recognition section 2000 is supplied with the spectrum of the target signal output from the two-input-type spectrum subtraction section 1090 for voice recognition.
The operation of the conventional microphone device of Example 3 is described below. In Example 3, the first unidirectional microphone unit 1011 has a directivity characteristic of collecting a desired sound (target sound) from the front. The second unidirectional microphone unit 1012 has a directivity characteristic of mainly collecting noise. Therefore, a main signal m1 is obtained from the first unidirectional microphone unit 1011, while a noise reference signal m2 is obtained from the second unidirectional microphone unit 1012. Then, a spectrum of the main signal m1 is found by the first FFT section 1070, while a spectrum of the noise reference signal m2 is found by the second FFT section 1080. The power spectrum of the noise reference signal is subtracted from the power spectrum of the main signal by the two-input-type spectrum subtraction section 1090. With this, the power spectrum of the signal components is estimated. Note that, in a one-input-type spectrum subtraction scheme, a noise spectrum is estimated during a time section in which the target sound has not yet arrived, on the assumption that the noise is stationary. Therefore, in the one-input-type spectrum subtraction scheme, only suppression of stationary noise is possible. On the other hand, according to the configuration of the microphone device of Example 3 adopting a two-input-type spectrum subtraction scheme, the spectrum of the noise reference signal can always be obtained by the second unidirectional microphone unit 1012. Therefore, suppression of non-stationary noise is possible. As such, according to the microphone device of Example 3, the recognition rate of the voice recognition section 2000 at a later stage can be improved by suppressing both stationary noise and non-stationary noise. Note that, although the device illustrated in FIG. 26 is dedicated to voice recognition, the device can be used as a microphone device by performing an IFFT at the last stage to convert the spectrum to a time signal and then joining the frames with overlap into a waveform signal.
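The two-input-type spectrum subtraction of Example 3 can be sketched for a single frame as follows. This is an illustration under stated assumptions rather than the device's actual processing: the function names, the naive DFT standing in for the FFT sections 1070 and 1080, and the flooring of negative results at zero are all choices made here, and the sketch stops at the power spectrum supplied to recognition (no IFFT or frame overlap).

```python
import cmath
import math


def tone(k, n, phase=0.0):
    """Hypothetical test signal: one frame of a cosine centered on DFT bin k."""
    return [math.cos(2 * math.pi * k * t / n + phase) for t in range(n)]


def dft(x):
    """Naive O(n^2) DFT, used here in place of an FFT to stay dependency-free."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def two_input_spectral_subtraction(main_frame, ref_frame, floor=0.0):
    """Estimate the target power spectrum for one frame.

    Subtracts the power spectrum of the noise reference (m2) from that of the
    main signal (m1), bin by bin. Flooring at `floor` is an assumption added
    to keep the estimate non-negative.
    """
    main_spec = dft(main_frame)   # first FFT section 1070
    ref_spec = dft(ref_frame)     # second FFT section 1080
    return [
        max(abs(m) ** 2 - abs(r) ** 2, floor)
        for m, r in zip(main_spec, ref_spec)
    ]
```

Because the subtraction operates on power spectra, the noise in the reference does not need to be phase-aligned with the noise in the main signal; only its per-bin power matters.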
In the microphone device of Example 1, a large noise suppressing effect can be achieved in an environment where noise is coming from a single direction. However, the microphone device of Example 1 does not handle noise coming from a plurality of directions. Therefore, in an actual noisy environment where noise sources simultaneously exist in various directions, the microphone device of Example 1 achieves a noise suppressing effect merely equivalent to that obtained by conventional unidirectional microphone devices.
In the microphone device of Example 2, the noise reference signal is obtained by using the first adaptive filter. Here, in order to operate the first adaptive filter stably in an actual environment, it is required to cause the first adaptive filter to learn a filter coefficient only when the voice from the talker is sufficiently louder than the surrounding noise. Therefore, the microphone device of Example 2 cannot achieve a noise suppression effect until filter convergence has been completed. Moreover, in a noisy environment, filter convergence is difficult. Further, as with Example 1, the microphone device of Example 2 cannot handle a plurality of noise sources. Still further, since the microphone device of Example 2 was devised with the aim of suppressing wind noise, which has no correlation between unit signals, the direction of the target sound cannot be restricted. In other words, the loudest of the sounds that have arrived at the microphone device is regarded as the target sound. Therefore, it is impossible to perform a process of collecting sounds in which a sound from a specific direction is enhanced.
In the microphone device of Example 3, the main signal and the noise reference signal are converted into spectra. Then, noise is suppressed based on the power spectra by using a spectrum subtraction scheme. With this, even if noise sources exist in a plurality of directions, their noise can be simultaneously suppressed. In the microphone device of Example 3, however, inclusion of even the slightest amount of components of the target sound in the noise reference signal will significantly deteriorate the sound quality of the processed sound or, at worst, may cancel the target sound itself. Moreover, in an actual sound field, a reflected wave may be diffracted and enter the microphone device even if the direction of minimum sensitivity in the directivity of the unidirectional microphone unit is oriented toward the target sound. Further, in normal microphone units, the amount of attenuation in the direction of minimum sensitivity in the directivity is not infinite but on the order of 10 to 15 dB. Therefore, the direct wave of the target sound may not be completely eliminated and may be included in the noise reference signal. Still further, in the spectrum subtraction scheme, a process delay will occur due to frame processing. Therefore, the microphone device using the spectrum subtraction scheme is not suitable for use in simultaneous calls or with loudspeakers.
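To give a sense of scale for the 10 to 15 dB figure, the following sketch converts an attenuation in decibels into the fraction of the target sound that remains in the noise reference signal. It assumes the attenuation is a sound-pressure (amplitude) ratio expressed as 20·log10, which is the usual convention for microphone directivity but is not stated in the description; the function name is hypothetical.

```python
def leakage_ratios(attenuation_db):
    """Return (amplitude_ratio, power_ratio) of the target leaking into the reference.

    Assumes attenuation_db is defined as 20*log10 of a sound-pressure ratio.
    """
    amplitude_ratio = 10.0 ** (-attenuation_db / 20.0)
    power_ratio = 10.0 ** (-attenuation_db / 10.0)
    return amplitude_ratio, power_ratio
```

Under this assumption, 10 dB of attenuation leaves about 32% of the target amplitude (10% of its power) in the noise reference signal, and 15 dB leaves about 18% of the amplitude (roughly 3% of the power), so part of the target power is subtracted along with the noise.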
Moreover, the above conventional microphone devices focus on suppressing additive noise, which is different from the target sound. The above conventional microphone devices cannot suppress multiplicative noise, which arrives after being reflected off a reflecting surface, such as a wall, a desk, or a floor. Therefore, the frequency characteristic of the target sound may be distorted due to, for example, the influence of reflection in a sound field where the microphone device is actually used. For this reason, particularly for the purpose of voice recognition, a mismatch in recognition may occur, leading to erroneous recognition.