1. Field of the Invention
The present invention relates to a method, disk and apparatus for system-encoding bitstreams so that they connect seamlessly and, more particularly, to bitstreams for use in an authoring system for variously processing a data bitstream comprising video data, audio data, and sub-picture data constituting each of plural program titles containing related video data, audio data, and sub-picture data content to generate a bitstream from which a new title containing the content desired by the user can be reproduced, and efficiently recording and reproducing the generated bitstream using a particular recording medium.
2. Description of the Prior Art
Authoring systems used to produce program titles comprising related video data, audio data, and sub-picture data by digitally processing, for example, multimedia data comprising video, audio, and sub-picture data recorded to laser disk or video CD formats are currently available.
Systems using Video-CDs in particular are able to record video data to a CD format disk, which was originally designed with an approximately 600 MB recording capacity for storing digital audio data only, by using such high efficiency video compression techniques as MPEG. As a result of the increased effective recording capacity achieved using data compression techniques, karaoke titles and other conventional laser disk applications are gradually being transferred to the video CD format.
Users today expect both sophisticated title content and high reproduction quality. To meet these expectations, each title must be composed from bitstreams with an increasingly deep hierarchical structure. The data size of multimedia titles written with bitstreams having such deep hierarchical structures, however, is ten or more times greater than the data size of less complex titles. The need to edit small image (title) details also makes it necessary to process and control the bitstream using low order hierarchical data units.
It is therefore necessary to develop and prove a bitstream structure and an advanced digital processing method including both recording and reproduction capabilities whereby a large volume, multiple level hierarchical digital bitstream can be efficiently controlled at each level of the hierarchy. Also needed are an apparatus for executing this digital processing method, and a recording medium to which the bitstream digitally processed by said apparatus can be efficiently recorded for storage and from which said recorded information can be quickly reproduced.
Means of increasing the storage capacity of conventional optical disks have been widely researched to address the recording medium aspect of this problem. One way to increase the storage capacity of the optical disk is to reduce the spot diameter D of the optical (laser) beam. If the wavelength of the laser beam is λ and the numerical aperture of the objective lens is NA, then the spot diameter D is proportional to λ/NA, and the storage capacity can be efficiently improved by decreasing λ and increasing NA.
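The proportionality between spot diameter, wavelength, and numerical aperture can be illustrated with a minimal sketch. The function names are hypothetical, the wavelength and NA figures are illustrative values that do not come from this text, and the step from spot diameter to areal density (density taken as proportional to 1/D squared) is an added assumption rather than a statement of the source.

```python
def spot_diameter_ratio(wavelength_a_nm, na_a, wavelength_b_nm, na_b):
    """Return D_a / D_b, using only the relation D proportional to wavelength / NA."""
    return (wavelength_a_nm / na_a) / (wavelength_b_nm / na_b)


def areal_density_gain(wavelength_a_nm, na_a, wavelength_b_nm, na_b):
    """Return the density gain of optics b over optics a.

    ASSUMPTION (not from the source): areal density scales as 1 / D**2.
    """
    return spot_diameter_ratio(wavelength_a_nm, na_a, wavelength_b_nm, na_b) ** 2


# Illustrative optics only: a longer-wavelength, lower-NA system (a)
# compared with a shorter-wavelength, higher-NA system (b).
ratio = spot_diameter_ratio(780.0, 0.45, 650.0, 0.60)
gain = areal_density_gain(780.0, 0.45, 650.0, 0.60)
print(f"spot diameter ratio: {ratio:.2f}, assumed density gain: {gain:.2f}")
```

Under these assumed values, shortening the wavelength and raising NA each shrink the spot, and the two effects multiply.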
As described, for example, in U.S. Pat. No. 5,235,581, however, coma caused by a relative tilt between the disk surface and the optical axis of the laser beam (hereafter “tilt”) increases when a large aperture (high NA) lens is used. To prevent tilt-induced coma, the transparent substrate must be made very thin. The problem is that the mechanical strength of the disk is low when the transparent substrate is very thin.
MPEG1, the conventional method of recording and reproducing video, audio, and graphic signal data, has also been replaced by the more robust MPEG2 method, which can transfer large data volumes at a higher rate. It should be noted that the compression method and data format of the MPEG2 standard differ somewhat from those of MPEG1. The specific content of and differences between MPEG1 and MPEG2 are described in detail in the ISO-11172 and ISO-13818 MPEG standards, and further description thereof is omitted below.
Note, however, that while the structure of the encoded video stream is defined in the MPEG2 specification, the hierarchical structure of the system stream and the method of processing lower hierarchical levels are not defined.
As described above, it is therefore not possible in a conventional authoring system to process a large data stream containing sufficient information to satisfy many different user requirements. Moreover, even if such a processing method were available, the processed data recorded thereto cannot be repeatedly used to reduce data redundancy because there is no large capacity recording medium currently available that can efficiently record and reproduce high volume bitstreams such as described above.
More specifically, significant hardware and software requirements must be satisfied in order to process a bitstream using a data unit smaller than the title. The hardware requirements include significantly increasing the storage capacity of the recording medium and increasing the speed of digital processing; the software requirements include inventing an advanced digital processing method including a sophisticated data structure.
Therefore, the object of the present invention is to provide an effective authoring system for controlling a multimedia data bitstream with advanced hardware and software requirements using a data unit smaller than the title to better address advanced user requirements.
To share data between plural titles and thereby efficiently utilize optical disk capacity, multi-scene control is desirable, whereby scene data common to plural titles, and desired scenes on the same time-base chosen from within multi-scene periods containing plural scenes unique to particular reproduction paths, can be freely selected and reproduced.
However, when plural scenes unique to a reproduction path within the multi-scene period are arranged on the same time-base, the scene data must be contiguous. Unselected multi-scene data is therefore unavoidably inserted between the selected common scene data and the selected multi-scene data. The problem this creates when reproducing multi-scene data is that reproduction is interrupted by this unselected scene data.
When one of the multiple scenes is connected to common scene data, the difference between the video reproduction time and the audio reproduction time differs on each of the reproduction paths because of the offset between the audio and video frame reproduction times. As a result, the audio or video buffer underflows at the scene connection, causing video reproduction to stop (“freeze”) or audio reproduction to stop (“mute”), and thus preventing seamless reproduction. It will also be obvious that the difference between the audio and video reproduction times can cause a buffer underflow state even when common scene data is connected 1:1.
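The offset between audio and video frame reproduction times can be made concrete with a minimal sketch. The frame durations below are assumptions chosen for illustration (a video frame of 1001/30000 s and an audio frame of 1536 samples at 48 kHz); the source does not specify them, and the function name is hypothetical. The sketch computes how far the last whole audio frame falls short of the end of the video for scenes of different lengths.

```python
from fractions import Fraction

# ASSUMED frame durations, not taken from the source text.
VIDEO_FRAME = Fraction(1001, 30000)  # about 33.37 ms per video frame
AUDIO_FRAME = Fraction(1536, 48000)  # 32 ms per audio frame


def av_gap_at_scene_end(video_frames):
    """Return the time by which whole audio frames fall short of the video end.

    The scene is taken to last exactly `video_frames` video frames, and the
    audio is truncated to the largest whole number of audio frames that fits.
    """
    video_end = video_frames * VIDEO_FRAME
    whole_audio_frames = video_end // AUDIO_FRAME  # floor division gives an int
    audio_end = whole_audio_frames * AUDIO_FRAME
    return video_end - audio_end


# Scenes of different lengths end with different audio/video gaps.
for frames in (100, 250, 400):
    gap_ms = float(av_gap_at_scene_end(frames) * 1000)
    print(f"{frames} video frames: gap of {gap_ms:.3f} ms")
```

Because the two frame durations are not integer multiples of each other, the gap varies from scene to scene, which is why each reproduction path can arrive at a connection point with a different audio/video time difference.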
Therefore, the object of the present invention is to provide a data structure whereby multi-scene data can be naturally reproduced as a single title without the video presentation stopping (“freezing”) at one-to-one, one-to-many, or many-to-many scene connections in the system stream; a method for generating a system stream having said data structure; a recording apparatus and a reproduction apparatus for recording and reproducing said system stream; and a medium to which said system stream can be recorded and from which said system stream can be reproduced by said recording apparatus and reproduction apparatus.
The present application is based upon Japanese Patent Application No. 7-252735 and 8-041581, which were filed on Sep. 29, 1995 and Feb. 28, 1996, respectively, the entire contents of which are expressly incorporated by reference herein.
The present invention has been developed with a view to substantially solving the above described disadvantages and has for its essential object to provide an optical disk for recording more than one system stream containing audio data and video data, wherein the audio data and video data of the plural system streams recorded to the optical disk are interleaved such that the difference between the input start times of the video data and audio data to the video buffer in the video decoder and the audio buffer in the audio decoder is less than the reproduction time of the number of audio frames that can be stored in the audio buffer plus one audio frame.
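The interleaving condition stated above can be written as a simple inequality check. The following is a minimal sketch, not the method of the invention: the function and parameter names are hypothetical, the difference is taken as an absolute value (an assumption about how the sign is treated), and the example figures (a 32 ms audio frame and a buffer holding four audio frames) are illustrative only.

```python
from fractions import Fraction


def input_start_gap_ok(video_input_start, audio_input_start,
                       audio_frame_duration, audio_buffer_frames):
    """Check the stated condition on the video/audio input start times.

    The condition holds when the difference between the two input start
    times is less than the reproduction time of
    (audio_buffer_frames + 1) audio frames.
    ASSUMPTION: the difference is compared as an absolute value.
    """
    limit = (audio_buffer_frames + 1) * audio_frame_duration
    return abs(video_input_start - audio_input_start) < limit


# Illustrative figures only: 32 ms audio frames, buffer holds 4 frames,
# so the limit is 5 * 32 ms = 160 ms.
frame = Fraction(32, 1000)
print(input_start_gap_ok(Fraction(0), Fraction(100, 1000), frame, 4))
print(input_start_gap_ok(Fraction(0), Fraction(200, 1000), frame, 4))
```

Exact fractions are used so that a gap equal to the limit is rejected reliably, since the condition is a strict inequality.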