1. Technical Field of the Invention
The present invention relates to an apparatus for detecting the direction of a sound source, and to an image pick-up apparatus equipped with the sound source detection apparatus, applicable to video conferences and video phones.
2. Description of the Prior Art
In conventional video conference systems, the direction of a narrator is detected by using a plurality of microphones, as disclosed in JP 4-049756 A (1992), JP 4-249991 A (1992), JP 6-351015 A (1994), JP 7-140527 A (1995) and JP 11-041577 A (1999).
The voice of a narrator reaches each of the microphones with a different time delay. Therefore, the direction of the narrator, i.e., the sound source, is detected by converting the time delay information into angle information.
FIG. 4 is a front view of a conventional apparatus for the video conference, which comprises image input unit 200 including camera lens 103 for photographing a narrator, microphone unit 170 including microphones 110a and 110b, and rotation means 101 for rotating image input unit 200.
The video conference apparatus shown in FIG. 4 picks up the voice of the narrator and detects the direction of the narrator, thereby turning the camera lens 103 toward the narrator. Thus, the voice and image of the narrator are transmitted to another video conference apparatus.
FIG. 5 is an illustration for explaining a principle of detecting the narrator direction by using microphones 110a and 110b. There is a delay between the time when microphone 110b picks up the voice of the narrator and the time when microphone 110a picks up the voice of the narrator.
The narrator direction angle θ is equal to sin⁻¹(V·d/L), where V is the speed of sound, L is the distance between the microphones and "d" is the delay time period, as shown in FIG. 5.
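The conversion from delay to angle, θ = sin⁻¹(V·d/L), can be sketched as follows. This is a minimal illustration only: the function name, the default speed of sound of 343 m/s and the clamping step are assumptions made for the example and are not taken from the cited prior art.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at about 20 degrees C


def direction_from_delay(delay_s, mic_distance_m, speed=SPEED_OF_SOUND):
    """Return the direction angle theta (radians) for a given inter-microphone delay.

    Implements theta = asin(V * d / L), with V the speed of sound,
    d the delay in seconds and L the microphone distance in metres.
    """
    ratio = speed * delay_s / mic_distance_m
    # Assumption: clamp to [-1, 1], since measurement noise can push
    # the ratio slightly outside the domain of asin.
    ratio = max(-1.0, min(1.0, ratio))
    return math.asin(ratio)
```

A delay of zero gives θ = 0, i.e., the sound source lies on the perpendicular bisector of the line joining the two microphones.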
However, the accuracy of determining the direction θ decreases as the delay, and hence θ, becomes large.
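One way to see this loss of accuracy: differentiating θ = sin⁻¹(V·d/L) with respect to d gives V/(L·cos θ), so a fixed timing error produces an angular error that grows as 1/cos θ. The sketch below evaluates this first-order estimate; the function name and the default speed of sound are assumptions made for illustration.

```python
import math


def angle_error_deg(theta_deg, timing_error_s, mic_distance_m, speed=343.0):
    """First-order angular error (degrees) caused by a small timing error.

    Uses d(theta)/d(d) = V / (L * cos(theta)), obtained by differentiating
    theta = asin(V * d / L). Valid only for |theta| < 90 degrees.
    """
    theta = math.radians(theta_deg)
    error_rad = speed * timing_error_s / (mic_distance_m * math.cos(theta))
    return math.degrees(error_rad)
```

Under this approximation, the same timing error causes twice the angular error at θ = 60° as it does at θ = 0°, because cos 60° = 0.5.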
Further, the voice of the narrator reflected by a floor and walls is also picked up by the microphones, as is background noise. Therefore, the narrator direction may be detected incorrectly.
An object of the present invention is to provide an apparatus for detecting a direction of a sound source such as a narrator, thereby turning an image pick-up apparatus toward the sound source.
Another object of the present invention is to provide an apparatus for detecting the direction of sound sources which move quickly or are switched rapidly.
Still another object of the present invention is to provide a sound source detection apparatus which is not easily affected by reflections and background noise.
The apparatus for detecting the direction of a sound source comprises a microphone pair, narrator direction detection means for detecting a delay between the sound waves detected by the two microphones, rotation means for rotating the microphone pair, and driving means for driving the rotation means on the basis of the output from the narrator direction detection means, so that the microphones become equidistant from the sound source.
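The closed loop described above, in which the microphone pair is rotated until the delay vanishes, can be illustrated by a small simulation. This is a sketch under stated assumptions, not the claimed implementation: the measured delay is simulated from a known source angle, and the proportional gain, tolerance, step limit and all names are hypothetical.

```python
import math


def steer_to_source(source_deg, mic_distance_m=0.3, speed=343.0,
                    gain=0.5, tolerance_s=1e-6, max_steps=100):
    """Rotate a simulated microphone pair until both microphones hear the source at once.

    Returns the final orientation of the pair in degrees.
    """
    pair_deg = 0.0  # current orientation of the microphone pair
    for _ in range(max_steps):
        # Simulated measurement: delay seen by the pair for the source
        # at its current relative angle (stands in for the detection means).
        relative = math.radians(source_deg - pair_deg)
        delay = mic_distance_m * math.sin(relative) / speed
        if abs(delay) < tolerance_s:
            break  # microphones are (nearly) equidistant from the source
        # Convert the delay to an angle and rotate by a fraction of it
        # (hypothetical proportional drive).
        ratio = max(-1.0, min(1.0, speed * delay / mic_distance_m))
        pair_deg += gain * math.degrees(math.asin(ratio))
    return pair_deg
```

Because the loop drives the delay toward zero, the final measurement is always taken near θ = 0, where the conversion from delay to angle is least sensitive to timing errors.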
The apparatus for detecting the sound direction of the present invention may further comprise another, fixed microphone pair for quickly turning the rotatable microphone pair toward the direction of the sound source.
The narrator direction detection means may comprise mutual correlation calculation means for calculating a mutual correlation between the signals picked up by the left and right microphones of the microphone pair, and delay calculation means for calculating the delay on the basis of the mutual correlation. Further, the delay may be calculated in a plurality of frequency ranges and averaged with weights chosen so that the lower frequency components contribute less to the averaged result.
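A delay estimate based on mutual correlation (cross-correlation) can be sketched as follows: the lag that maximizes the correlation between the left and right signals is taken as the delay. The function name, the sample-list inputs and the brute-force search over lags are assumptions for illustration; the source does not specify how the correlation is computed.

```python
def estimate_delay_samples(left, right, max_lag):
    """Return the lag, in samples, at which `right` best matches `left`.

    A positive result means the right-microphone signal trails the left one.
    Both inputs are equal-length sequences of samples.
    """
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Correlation of left[i] with right[i + lag] over the overlapping part.
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

Dividing the returned lag by the sampling rate gives the delay in seconds, which can then be converted to an angle or used directly to drive the rotation means.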
According to the present invention, the first microphone pair is turned toward a narrator so that the sound wave arrives at both microphones simultaneously. Accordingly, the microphone pair is directed squarely at the sound source.
Further, according to the present invention, the second, fixed microphone pair allows the microphone direction to be turned quickly. Furthermore, according to the present invention, when the sound source such as a narrator changes, the direction of the new sound source is quickly detected by directing the second microphone pair toward the center of the sound sources.
Furthermore, according to the present invention, the detection result is hardly affected by the reflections from floors and walls in the lower frequency range, because the outputs from a plurality of band-pass filters are averaged with smaller weight coefficients assigned to the lower frequency components.
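The weighted averaging over frequency bands can be sketched as below. The source states only that the lower frequency components receive smaller weight coefficients; the function name and any particular weight values are hypothetical.

```python
def weighted_delay(band_delays, band_weights):
    """Combine per-band delay estimates into a single weighted average.

    band_delays:  delay estimated in each band-pass filter output,
                  ordered from the lowest to the highest band.
    band_weights: non-negative weights in the same order; smaller values
                  for the low bands reduce their influence on the result.
    """
    if len(band_delays) != len(band_weights):
        raise ValueError("one weight is required per band")
    total = sum(band_weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(d * w for d, w in zip(band_delays, band_weights)) / total
```

For example, with per-band delays of 4.0, 1.0 and 1.0 (arbitrary units) and assumed weights of 1, 2 and 2, the weighted result is 1.6, whereas an unweighted mean would be 2.0, so an outlying low-band estimate has less influence on the result.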