Field of the Invention
The present invention relates to an audio signal processing apparatus for capturing sound as a surround audio signal, a movie capturing apparatus having the audio signal processing apparatus, and control methods for the same.
Description of the Related Art
In recent years, the spread of digital media such as DVD and Blu-ray™ discs, network video distribution, digital TV broadcasting, and the like has made it easy even for ordinary households to obtain video content containing surround sound. It has therefore become common for ordinary households to have an environment capable of reproducing surround audio signals. As a result, consumer movie capturing apparatuses such as camcorders and digital cameras that can also capture and reproduce surround sound have entered the market.
In some of these movie capturing apparatuses, the upper portion of the casing is provided with a sound capturing unit that has three or more microphones arranged adjacent to each other in a geometric layout, and sound is picked up from all directions, mainly in the horizontal plane. Also, some movie capturing apparatuses can use an external microphone array for sound capturing. The multi-channel audio signals captured by these sound capturing units are subjected to appropriate audio processing and converted into a surround audio signal of the 5.1ch format or the like.
When a surround audio signal is captured using this type of sound capturing unit, the videographer's voice is captured by all of the microphone units, and can therefore be heard as coming from above or behind in the reproduced sound field. Also, because the videographer is close to the microphones, the videographer's voice is captured at a high sound pressure, and can sound unpleasant in the reproduced sound field in some cases.
Regardless of whether or not a surround audio function is provided, the distance between the videographer and the built-in microphone of general-use camcorders and the like is short, and therefore the voice level of the videographer tends to be higher than that of the subject in the recorded sound. For this reason, various techniques have conventionally been devised that take this characteristic into consideration so that voice can be recorded as the user intends.
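By way of illustration only, the level difference described above can be estimated with the inverse distance law, under which the sound pressure level of a point source in a free field falls by about 6 dB for each doubling of distance. The following Python sketch is not taken from any cited document; the function name and the example distances of 0.1 m and 2 m are assumptions chosen for illustration, and real rooms and real microphones will deviate from this idealization.

```python
import math


def level_difference_db(near_m: float, far_m: float) -> float:
    """Estimate how many dB louder a source at near_m is than an equally
    loud source at far_m, assuming a point source in a free field
    (inverse distance law). Both distances are hypothetical inputs."""
    if near_m <= 0 or far_m <= 0:
        raise ValueError("distances must be positive")
    # Sound pressure is inversely proportional to distance,
    # so the level difference is 20 * log10 of the distance ratio.
    return 20.0 * math.log10(far_m / near_m)


if __name__ == "__main__":
    # Assumed example: videographer 0.1 m from the microphone,
    # subject 2 m from the microphone.
    print(f"{level_difference_db(0.1, 2.0):.1f} dB")
```

Under these assumptions, a videographer at 0.1 m is recorded roughly 26 dB louder than an equally loud subject at 2 m.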
For example, Japanese Patent Laid-Open No. 2009-122370 (referred to hereinafter as “Patent Document 1”) discloses a technique in which voice recognition is performed on captured audio, and processing is performed using the voice recognition results. Specifically, if the videographer's voice is recognized, that specific voice is suppressed, for example, by lowering the volume, applying filtering processing, or controlling the directionality of the microphones.
Also, Japanese Patent Laid-Open No. 2007-005849 (referred to hereinafter as “Patent Document 2”) discloses a configuration in which, in a video camera capable of capturing surround sound, a surround sound field suited to the video is captured by changing the directionality, sound capturing level, frequency characteristic, delay characteristic, or the like of surround sound capturing according to the shooting mode. In a narration mode, which is one example of a shooting mode, the videographer's voice is intensified by raising the volume of the two back channels and intensifying the rearward directionality, and thus the videographer's voice is captured as the principal voice.
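By way of illustration only, the volume-raising part of such a narration mode can be sketched as a gain applied to the two back channels of a 5.1ch sample frame. The following Python sketch is not the processing disclosed in Patent Document 2; the channel ordering, the function name, the 6 dB default gain, and the clipping to the range [-1.0, 1.0] are all assumptions, and the change in rearward directionality is not modeled.

```python
# Assumed 5.1ch channel order for this sketch only.
FRONT_L, FRONT_R, CENTER, LFE, SURR_L, SURR_R = range(6)


def apply_narration_gain(frame, back_gain_db=6.0):
    """Return a copy of one 5.1ch sample frame with the two back
    (surround) channels raised by back_gain_db.

    frame is a sequence of six floats in [-1.0, 1.0]; the name and
    the default gain are hypothetical."""
    if len(frame) != 6:
        raise ValueError("expected a 6-channel frame")
    gain = 10.0 ** (back_gain_db / 20.0)  # convert dB to linear gain
    out = list(frame)
    for ch in (SURR_L, SURR_R):
        # Clip so that the boosted sample stays within full scale.
        out[ch] = max(-1.0, min(1.0, out[ch] * gain))
    return out


if __name__ == "__main__":
    print(apply_narration_gain([0.1, 0.1, 0.1, 0.1, 0.1, 0.1]))
```

Because only the back channels are boosted, the videographer's voice becomes louder while remaining located rearward in the sound field.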
However, Patent Document 1 is premised on recognizing the videographer's voice, which increases the number of constituent elements, such as a voice recognition apparatus and a voice database, and makes processing complicated and heavy. Also, in Patent Document 1, the videographer's voice is suppressed by adjusting the microphone amplifier gain or directionality, and no consideration is given to the position of the sound image that corresponds to the videographer's voice. For this reason, when a surround sound function is used, the videographer's voice will be located rearward, and it will be difficult to adjust the sound image in the surround sound field. Also, although Patent Document 2 provides a shooting mode for emphasizing the videographer's voice, the videographer's voice is located rearward in the surround sound field in this case as well, and is furthermore emphasized, which can make the sound field seem unnatural.