Field of the Invention
The present invention relates to a stereoscopic video and audio recording method, a stereoscopic video and audio reproducing method, a stereoscopic video and audio recording apparatus, a stereoscopic video and audio reproducing apparatus, and a stereoscopic video and audio recording medium.
Description of the Background Art
Along with the progress of encoding techniques for compressing digital video signals, the recording of compressed video signals on an optical disk has enabled implementation of an optical disk apparatus with excellent searchability and operability. Since such an optical disk apparatus records video as a digital signal, the optical disk apparatus is free from deterioration in dubbing, unlike recording of an analog video signal, and achieves higher reliability with its non-contact optical recording and reproduction. Examples of encoding methods for compressing data of such a digital video signal include the MPEG (Moving Picture Experts Group) method. This MPEG compression method achieves improved compression efficiency as compared with intra compression methods such as motion JPEG, but the MPEG compression method is still subject to the following constraints. For example, since the MPEG compression method uses motion-compensated prediction in the temporal direction, the MPEG compression method allows searching only in units of groups of pictures (GOPs), and in order to access a disk, it is necessary to initially access one picture made by intraframe compression.
At least in terms of audio information, Japanese Patent No. 3935507, Japanese Patent Application Laid-Open No. H11-191895, Japanese Patent No. 4036239, Japanese Patent No. 4124268, and Japanese Patent Application Laid-Open No. 2008-236757 use the same data format as that used for a conventional two-dimensional planar video. In this respect, a method for allowing sound-video localization in synchronization with viewing of a stereoscopic video is disclosed in Japanese Patent Application Laid-Open No. 2006-128818 or Japanese Patent Application Laid-Open No. 2006-128816, but Japanese Patent Application Laid-Open Nos. 2006-128818 and 2006-128816 merely relate to a method for selectively reproducing audio that has been recorded with a plurality of microphones. Meanwhile, Japanese Patent Application Laid-Open No. H6-105400 describes a method for allowing stereoscopic sound-video localization, but is silent on a format made with special consideration paid to recording on and reproduction from a medium.
For example, NTT DOCOMO Technical Journal, VOL II, No. 1, relates to stereoscopic sound-video, and explains, e.g., a mechanism of surround techniques for enhancing realistic sensation. However, NTT DOCOMO Technical Journal, VOL II, No. 1, does not describe sound processing performed in synchronization with stereoscopic videos. Accordingly, it has been desired to achieve a method for enhancing realistic sensation in an acoustic system used for viewing stereoscopic video. It is to be noted that the above cited patent publications, i.e., Japanese Patent No. 3935507, Japanese Patent Application Laid-Open No. H11-191895, Japanese Patent No. 4036239, Japanese Patent No. 4124268, Japanese Patent Application Laid-Open No. 2008-236757, Japanese Patent Application Laid-Open No. 2006-128818, Japanese Patent Application Laid-Open No. 2006-128816, and Japanese Patent Application Laid-Open No. H6-105400, describe neither a later-described stereoscopic-viewing audio file easily generated during content production nor a medium format made with consideration paid to viewing safety, and do not describe a more specific method for, e.g., determining the portion of a video and an OSD on which sound-video localization in synchronization with videos should be focused.
As screens become larger and attain a higher resolution, more realistic viewing has been achieved. Meanwhile, in the field of audio processing, pseudo surround techniques have emerged, which allow listening to more realistic audio even in a case of 5.1 ch surround or 2 ch audio. With the advent of a TV capable of showing stereoscopic videos, it has been desired to reproduce the sound in such a manner as to allow a user to perceive audio-video sensations differently for each of a two-dimensional video and a three-dimensional video, because the two-dimensional video and the three-dimensional video appear differently in terms of perception of persons and objects in the depth direction. In particular, when a medium or distributed content information includes both two-dimensional videos and three-dimensional videos but includes only one set of audio information that is shared by both the two-dimensional video and the three-dimensional video, there is a problem in that the audio information cannot be optimized in order to provide realistic sensation suitable for listening to audio with each of the two-dimensional video and the three-dimensional video.
In addition, there is a problem in that a user cannot view stereoscopic videos with sound localization and sound-video synchronized with the amount of depth perceived in the stereoscopic videos.
Further, in a case where localization in the depth direction is performed in synchronization with a stereoscopic video in sound-video processing, it is impossible to determine the portion of the stereoscopic video with reference to which the position of sound in the depth direction should be determined, which causes a problem in that a user cannot listen to realistic sounds.
Further, in a case where localization in the depth direction is performed in sound-video processing in synchronization with a stereoscopic video upon determining a position of the face or the mouth of a person appearing in a stereoscopic screen, it takes a long time to perform the processing for determining the position of the face or the mouth of the person appearing in the stereoscopic screen, and this processing is likely to fail to keep up with real-time processing.
Further, in a case where audio information suitable for a stereoscopic video is further added to a medium including a stereoscopic video stream, it is necessary to have a format that allows an apparatus to immediately determine the content of audio information suitable for the stereoscopic video and recognize the sequence of menu operation and reproduction based on disk information management information.
Further, in a case where audio information suitable for a stereoscopic video is further added to a video and audio distribution stream including a stereoscopic video stream, it is necessary to have a format that allows an apparatus to immediately determine the content of audio information suitable for the stereoscopic video.