1. Field of the Invention
The present invention relates to an audiovisual (AV) data recording device and method for recording both a stereo motion image and sound.
2. Description Related to the Prior Art
Digital video cameras (camcorders) for recording audiovisual (AV) data are widely used, and various types have been developed, ranging from commercial use to home use. A digital video camera is provided with an imaging unit and a microphone, and records motion image data captured by the imaging unit together with sound data converted by the microphone on a recording medium.
According to U.S. Patent Application Publication No. 2002/057347 and U.S. Pat. No. 6,714,238, the directionality of the microphone, including its spread, direction, and sensitivity, is controlled in response to panning, tilting, and zooming operations of the digital video camera. Upon zooming in on a main object with a narrow angle of view, for example, the directionality of the microphone is narrowed accordingly, so that only the sound or voice from the main object is captured to add a sense of realism.
In reproduction of the AV data, the sound is output from two speakers disposed side by side. A viewer listens to the composite sound and perceives where it comes from, in other words, where a sound image, being a virtual sound source, is located. The position of the sound image perceived by the viewer is referred to as a sound image location. The sound image is located by, for example, varying the volume levels of the sound output from the left and right speakers. If the sound is output at the same volume level from both speakers, the sound image is located midway between the two speakers. If the sound is output only from the left speaker, the sound image is located near the left speaker. If the sound is output only from the right speaker, the sound image is located near the right speaker. Where a reference line is defined as the line that connects the viewer to the midpoint between the two speakers, a location angle refers to the angle that a line connecting the viewer to the located sound image forms with the reference line.
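As an illustrative sketch only, the level-difference approach described above can be expressed as a pair of speaker gains. The constant-power pan law, the function name `pan_gains`, and the `position` parameter are assumptions made for this example and are not taken from the cited publications.

```python
import math
from typing import Tuple


def pan_gains(position: float) -> Tuple[float, float]:
    """Return (left_gain, right_gain) for a hypothetical pan position.

    position = -1.0 places the sound image at the left speaker,
    position =  0.0 places it midway between the speakers,
    position = +1.0 places it at the right speaker.
    A constant-power (sine/cosine) pan law is assumed here.
    """
    if not -1.0 <= position <= 1.0:
        raise ValueError("position must be within [-1.0, 1.0]")
    # Map the position onto an angle between 0 and pi/2.
    theta = (position + 1.0) * math.pi / 4.0
    return math.cos(theta), math.sin(theta)


if __name__ == "__main__":
    for pos in (-1.0, 0.0, 1.0):
        left, right = pan_gains(pos)
        print(f"position={pos:+.1f}  left={left:.3f}  right={right:.3f}")
```

Under this assumed pan law, equal gains place the sound image in the middle, while a gain on only one side places it near that speaker, which corresponds to the three cases described above.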
The sound that is output from the speakers and reaches the viewer is composed of a reverberation sound and a direct sound. The reverberation sound reaches the viewer after being reflected by surrounding walls and the like, while the direct sound reaches the viewer directly. If the ratio of the reverberation sound to the direct sound is high, the sound image is located on a back side, and the viewer perceives the sound as if it emerged from a distant sound source. If the ratio of the reverberation sound to the direct sound is low, on the other hand, the sound image is located on a front side, and the viewer perceives the sound as if it emerged from a near sound source. Thus, increasing the ratio of the reverberation sound or lowering the volume level of the output sound locates the sound image on the back side, while decreasing the ratio of the reverberation sound or raising the volume level of the output sound locates the sound image on the front side.
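The front-to-back relationship described above can be sketched as a simple mix of a direct signal and a reverberation signal. This is a minimal example under stated assumptions: the `depth` parameter (0.0 for the front side, 1.0 for the back side), the linear mapping, the `min_level` floor, and both function names are hypothetical and serve only to illustrate the principle.

```python
from typing import List, Sequence, Tuple


def depth_gains(depth: float, min_level: float = 0.5) -> Tuple[float, float]:
    """Return (direct_gain, reverb_gain) for a hypothetical depth value.

    depth = 0.0 locates the sound image on the front side (direct sound only,
    full level); depth = 1.0 locates it on the back side (reverberation only,
    level reduced to min_level).
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be within [0.0, 1.0]")
    # The overall level falls linearly as the sound image moves back.
    level = 1.0 - (1.0 - min_level) * depth
    # The share of reverberation rises linearly as the sound image moves back.
    return level * (1.0 - depth), level * depth


def mix_direct_reverb(direct: Sequence[float],
                      reverb: Sequence[float],
                      depth: float) -> List[float]:
    """Mix equal-length direct and reverberation sample sequences."""
    if len(direct) != len(reverb):
        raise ValueError("direct and reverb must have the same length")
    direct_gain, reverb_gain = depth_gains(depth)
    return [direct_gain * d + reverb_gain * r for d, r in zip(direct, reverb)]
```

For instance, `mix_direct_reverb(direct, reverb, 0.0)` returns the direct signal unchanged, whereas larger `depth` values raise the reverberation share and lower the overall level, both of which the passage above associates with a sound image on the back side.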
In an audio signal processing apparatus of U.S. Patent Application Publication No. 2007/0189551, when zooming in to telephoto on three persons out of five persons, the location angle of the sound image of each object person is changed. Only sounds or voices from the three object persons are recorded, while sounds or voices from the two excluded persons are not recorded. This increases the sense of togetherness between the motion image and the sounds in reproduction. The sound image of the object person positioned in the middle of the image is located midway between the left and right speakers. The sound image of the object person positioned at the left of the image is located near the left speaker, and the sound image of the object person positioned at the right of the image is located near the right speaker.
In a three-dimensional space reproduction system according to Japanese Patent Laid-Open Publication No. 6-105400, a stereo image is produced from two images having disparity, that is, an L viewpoint image seen by a viewer's left eye and an R viewpoint image seen by a viewer's right eye. Output of the sound data is controlled in accordance with the depth of the stereo image. In the stereo image, the amount of disparity is at a maximum at the point that looks nearest to the viewer, and at a minimum at the point that looks farthest from the viewer. If the difference between the maximum disparity and the minimum disparity is small, the stereo image has a shallow depth. In this case, the sound image is located on the front side by reducing the ratio of the reverberation sound to the direct sound, and the depth of the sound is made shallow to increase a sense of realism in the scene. If the difference between the maximum disparity and the minimum disparity is large, the stereo image has a large depth. In this case, the ratio of the reverberation sound is increased to locate the sound image on the back side and deepen the depth of the sound.
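The disparity-based control described above can be sketched as follows. This is a rough illustration only: the linear mapping, the clamping, the `max_range` constant, and the function name are assumptions introduced for this example, and the cited publication is not stated to use them.

```python
from typing import Sequence


def reverb_ratio_from_disparity(disparities: Sequence[float],
                                max_range: float = 64.0) -> float:
    """Map the disparity spread of a stereo image to a reverberation share.

    disparities: per-point disparity amounts (for example, in pixels).
    max_range:   hypothetical spread at which the share saturates at 1.0.
    Returns a value in [0.0, 1.0]; a small value corresponds to a sound
    image on the front side, a large value to one on the back side.
    """
    if not disparities:
        raise ValueError("disparities must not be empty")
    if max_range <= 0.0:
        raise ValueError("max_range must be positive")
    # Depth of the stereo image: maximum disparity minus minimum disparity.
    depth_range = max(disparities) - min(disparities)
    return min(depth_range / max_range, 1.0)


if __name__ == "__main__":
    print(reverb_ratio_from_disparity([10.0, 12.0, 14.0]))  # shallow scene
    print(reverb_ratio_from_disparity([0.0, 32.0, 64.0]))   # deep scene
```

In this sketch the result depends only on the spread between the maximum and minimum disparity of the whole image, not on the disparity or image size of any particular object.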
According to U.S. Patent Application Publication No. 2002/057347, U.S. Pat. No. 6,714,238, and U.S. Patent Application Publication No. 2007/0189551, the obtained sound data is processed in response to variation in the image size of the main object caused by the panning, tilting, and zooming operations of the digital video camera, in order to increase the senses of realism and togetherness between the motion image and the sound. The sound data, however, is not processed when the main object itself approaches or moves away from the digital video camera without any panning, tilting, or zooming operation.
According to Japanese Patent Laid-Open Publication No. 6-105400, the depth of the sound depends on the difference between the maximum disparity and the minimum disparity. Thus, even if the image size of the main object is large, when the difference between the maximum disparity and the minimum disparity is large and the stereo image has a large depth, the sound is also given a deep depth. This causes a lack of a sense of realism.