1. Field of the Invention
The present invention relates to a digital signal reproduction method and apparatus for reproducing a digital signal in which a digital audio signal and a digital video signal are multiplexed and in particular, to a digital signal reproduction method and apparatus preferable for a high-speed reproduction including a double-speed reproduction of a digital signal recorded on a recording medium.
2. Description of the Prior Art
A so-called digital video disc (hereinafter, referred to as a DVD) is known as a recording medium such as an optical disc on which a digital video signal, a digital audio signal, and the like are recorded.
In the DVD format, a cell is used as a basic unit for reproducing the contents of information. This cell further consists of video object units (VOBU), each of which is a reproduction unit of 0.4 to 1.2 seconds and is the smallest reproduction unit. At the head of this smallest reproduction unit VOBU is arranged a navigation pack (NV_PCK) which is a control information pack. This NV_PCK contains presentation control information (PCI) and data search information (DSI). These information items are used, for example, as the VOBU address information for scanning before and after the aforementioned smallest reproduction unit VOBU.
Here, FIG. 1 shows a data configuration of the aforementioned DVD format. As shown in FIG. 1, a video object set (VOBS) unit is used for management of main video data, sub-video data, and audio data. This VOBS, for example, corresponds to one movie work. This VOBS consists of a plurality of video objects (VOB). The VOB is a unit of a group of data recorded on a disc. The VOB consists of a plurality of cells. The cell corresponds to, for example, one scene or one cut of a movie. Moreover, the DVD has a multi-story format offering one movie in a plurality of story arrangements and a parental lock function for skipping an undesirable scene such as a violent scene. These functions are created by combination of the cells.
One cell consists of a plurality of video object units (VOBU). The VOBU corresponds to 0.4 to 1.2 seconds in a moving picture and this VOBU contains, for example, one GOP (group of pictures) in the so-called MPEG format. It should be noted that the MPEG is a hybrid data compression specification using the motion compensative predictive coding and the discrete cosine transform (DCT) in combination which has been discussed and suggested by the ISO-IEC/JTC1/SC2/WG11. The GOP (group of pictures) of the MPEG specification contained in a VOBU includes an intra-frame coding picture (I picture), a forward frame-to-frame predictive coding picture (P picture), and a bidirectional frame-to-frame predictive coding picture (B picture).
FIG. 2A shows a configuration example of a frame-to-frame prediction in the MPEG method in a case when one GOP consists of, for example, 15 frames.
In FIG. 2A, the I picture is an intra-frame coding picture which has been subjected to a predictive coding within one frame; the P picture is a forward frame-to-frame coding picture which predicts referencing a temporally preceding frame (I picture or P picture) which has been coded; and the B picture is a bidirectional predictive coding picture which predicts referencing two frames, i.e., temporally preceding and following pictures.
That is, as shown by the arrows in the figure, the I picture I2 is coded with prediction within the frame; the P picture P5 is coded with prediction referencing the I picture I2; and the P picture P8 is coded with prediction referencing the P picture P5. Furthermore, the B pictures B3 and B4 are coded each referencing two pictures, i.e., I picture I2 and P picture P5; and the B pictures B6 and B7 are coded each referencing two pictures, i.e., P picture P5 and P picture P8. Thus, predictive coding is carried out for creating the remaining pictures. It should be noted that the subscript in each of the pictures represents a temporal reference (hereinafter, referred to as TR). Here, the TR indicates the picture sequence in the GOP and during a normal picture reproduction, the frames are reproduced in this TR sequence.
When decoding these pictures which have been predictive-coded, various pictures are required depending on the picture type. The I pictures which have been coded with prediction within a frame can be decoded with the I pictures alone. However, the P pictures which have been coded referencing a temporally preceding I picture or P picture require the temporally preceding I picture or P picture for decoding. The B pictures which have been coded referencing temporally preceding and following I picture or P picture require the temporally preceding and following I picture or P picture for decoding.
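The decoding dependencies described above can be illustrated with a minimal sketch, assuming the 15-frame GOP of FIG. 2A with the picture types named in the text. The function name `required_references` and the placeholder strings "previous GOP" and "next GOP" are hypothetical and are introduced only for illustration.

```python
# Presentation order of the 15-frame GOP; the list index equals the TR.
GOP = ["B0", "B1", "I2", "B3", "B4", "P5", "B6", "B7",
       "P8", "B9", "B10", "P11", "B12", "B13", "P14"]


def required_references(gop, tr):
    """Return the reference pictures needed to decode the picture at index tr."""
    kind = gop[tr][0]
    if kind == "I":
        # An I picture can be decoded alone.
        return []
    # Nearest temporally preceding I or P picture (may lie in the previous GOP).
    preceding = next((p for p in reversed(gop[:tr]) if p[0] in "IP"),
                     "previous GOP")
    if kind == "P":
        return [preceding]
    # A B picture also needs the nearest temporally following I or P picture.
    following = next((p for p in gop[tr + 1:] if p[0] in "IP"), "next GOP")
    return [preceding, following]


for tr in (2, 5, 3, 0):
    print(GOP[tr], "requires", required_references(GOP, tr))
```

In this sketch, B0 reports a reference in the previous GOP, which corresponds to the statement that the leading B pictures require an I picture or P picture of the preceding GOP.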
In order that those pictures required for decoding can be decoded in advance, the pictures are rearranged as shown in FIG. 2B. That is, the B pictures B0 and B1, during decoding, require an I picture or P picture of the preceding GOP and the I picture I2 and accordingly, the I picture I2 is arranged prior to the B pictures B0 and B1. The B pictures B3 and B4, during decoding, require the I picture I2 and the P picture P5 and accordingly, the P picture P5 is arranged prior to the B pictures B3 and B4. The B pictures B6 and B7, during decoding, require the P pictures P5 and P8 and accordingly, the P picture P8 is arranged prior to the B pictures B6 and B7. The B pictures B9 and B10, during decoding, require the P pictures P8 and P11 and accordingly, the P picture P11 is arranged prior to the B pictures B9 and B10. The B pictures B12 and B13, during decoding, require the P pictures P11 and P14 and accordingly, the P picture P14 is arranged prior to the B pictures B12 and B13. Thus, in the MPEG method, the decoding order is different from the presentation order of the pictures displayed.
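The rearrangement described above may be sketched as follows. This is a minimal illustration, assuming the 15-frame GOP of FIG. 2A; the function names `display_to_decode_order` and `decode_to_display_order` are hypothetical. Each B picture is held back until the next reference picture (I or P) has been emitted, and the presentation order is recovered by sorting on the TR.

```python
def display_to_decode_order(gop):
    """Reorder picture labels from presentation order to decoding order."""
    decode_order = []
    pending_b = []
    for picture in gop:
        if picture[0] == "B":
            # Hold the B picture until its following reference is emitted.
            pending_b.append(picture)
        else:
            # Emit the reference picture first, then the B pictures it enables.
            decode_order.append(picture)
            decode_order.extend(pending_b)
            pending_b = []
    # Any trailing B pictures would depend on a reference in the next GOP.
    decode_order.extend(pending_b)
    return decode_order


def decode_to_display_order(pictures):
    """Restore presentation order by sorting on the TR in each label."""
    return sorted(pictures, key=lambda label: int(label[1:]))


presentation = ["B0", "B1", "I2", "B3", "B4", "P5", "B6", "B7",
                "P8", "B9", "B10", "P11", "B12", "B13", "P14"]
decoding = display_to_decode_order(presentation)
print(decoding)
# ['I2', 'B0', 'B1', 'P5', 'B3', 'B4', 'P8', 'B6', 'B7',
#  'P11', 'B9', 'B10', 'P14', 'B12', 'B13']
```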
Back to FIG. 1, one VOBU (video object unit) consists of: a navigation pack NV_PCK which is a control data pack containing the VOBU management information and the like; a video pack V_PCK containing main video data; an audio pack A_PCK containing audio data; and a sub-picture pack SP_PCK containing sub-video data. The V_PCK, A_PCK, and SP_PCK are respectively compressed according to a format such as MPEG2 and recorded on a recording medium, i.e., an optical disc.
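The containment relations of FIG. 1 can be summarized in a minimal sketch. The class names, field names, and the `nv_pck_at_head` check below are hypothetical and model only what the text states (a VOBS holds VOBs, a VOB holds cells, a cell holds VOBUs, and a VOBU begins with an NV_PCK); they do not represent the on-disc layout.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Pack:
    kind: str  # "NV_PCK", "V_PCK", "A_PCK", or "SP_PCK"


@dataclass
class Vobu:
    packs: List[Pack] = field(default_factory=list)

    def nv_pck_at_head(self) -> bool:
        # The navigation pack is arranged at the head of the VOBU.
        return bool(self.packs) and self.packs[0].kind == "NV_PCK"


@dataclass
class Cell:
    vobus: List[Vobu] = field(default_factory=list)


@dataclass
class Vob:
    cells: List[Cell] = field(default_factory=list)


@dataclass
class Vobs:
    vobs: List[Vob] = field(default_factory=list)


vobu = Vobu([Pack("NV_PCK"), Pack("V_PCK"), Pack("A_PCK"), Pack("SP_PCK")])
movie = Vobs([Vob([Cell([vobu])])])
print(movie.vobs[0].cells[0].vobus[0].nv_pck_at_head())
```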
FIG. 3 shows a configuration of the navigation pack NV_PCK which is a control data pack arranged at the head of a VOBU. As shown in this FIG. 3, the NV_PCK has a pack header and a system header which is followed by a PCI (presentation control information) packet containing a PCI data and a DSI (data search information) packet containing a DSI data.
Furthermore, the DSI data contains DSI_GI (not depicted) which is general information of the DSI. This DSI_GI contains the end addresses of the reference pictures (I pictures and P pictures) among the pictures in the aforementioned MPEG format. More specifically, the DSI_GI contains data on the end address of the first reference picture (I picture) in the VOBU, VOBU_1STREF_EA; the end address of the second reference picture (the first P picture) in the VOBU, VOBU_2NDREF_EA; and the end address of the third reference picture (the second P picture) in the VOBU, VOBU_3RDREF_EA.
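The three end-address items may be pictured with the following minimal sketch. The class `DsiGi`, the use of `None` to mark a reference picture that is absent from the VOBU, and the numeric addresses are assumptions made for illustration only; the encoding actually used in the DSI_GI is not described here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DsiGi:
    # End addresses of the first three reference pictures in the VOBU.
    # None is an assumed marker for "no such reference picture".
    vobu_1stref_ea: Optional[int]  # first reference picture (I picture)
    vobu_2ndref_ea: Optional[int]  # second reference picture (first P picture)
    vobu_3rdref_ea: Optional[int]  # third reference picture (second P picture)

    def reference_end_addresses(self) -> List[int]:
        """List the end addresses that are present, stopping at the first gap."""
        ends = []
        for ea in (self.vobu_1stref_ea, self.vobu_2ndref_ea, self.vobu_3rdref_ea):
            if ea is None:
                break
            ends.append(ea)
        return ends


# Hypothetical addresses, relative to the head of the VOBU.
full = DsiGi(vobu_1stref_ea=40, vobu_2ndref_ea=75, vobu_3rdref_ea=110)
short = DsiGi(vobu_1stref_ea=40, vobu_2ndref_ea=None, vobu_3rdref_ea=None)
print(full.reference_end_addresses())   # [40, 75, 110]
print(short.reference_end_addresses())  # [40]
```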
When carrying out a forward or backward high-speed reproduction such as double-speed reproduction, if only the pictures at the addresses obtained from the aforementioned NV_PCK are reproduced, it is often difficult for a user of the reproduction apparatus to find a target picture while scanning a scene containing no persons or a scene containing no moving objects.
It is therefore an object of the present invention to provide a digital signal reproduction method and apparatus capable of simultaneously reproducing a video data and an audio data at a high-speed reproduction so as to facilitate search not only by the video information but by the audio information.
In order to achieve the aforementioned object, the present invention is characterized in that when reproducing a digital signal containing a plurality of signal types including an audio signal multiplexed in blocks as the smallest reproduction unit, at a higher speed than a standard reproduction speed, an audio signal corresponding to at least a part of the area in the aforementioned smallest reproduction unit is successively reproduced.
The aforementioned digital signal contains a compressed video signal which has been compression-coded by way of predictive coding in the time axis direction. During a high-speed reproduction, only at least a part of the reference pictures of the compressed video signal in the smallest reproduction unit is successively reproduced. Simultaneously with this, it is preferable that an audio signal corresponding to a continuous area containing the aforementioned part of reference pictures in the smallest reproduction unit be successively reproduced or, if the smallest reproduction unit does not contain predetermined picture data of the compressed video signal, an audio signal corresponding to approximately half the area of the smallest reproduction unit be successively reproduced.
Here, the aforementioned smallest reproduction unit is, for example, a VOBU (video object unit) in the so-called DVD format. The aforementioned continuous area is a continuous area containing the aforementioned part of reference pictures as well as pictures sandwiched by these reference pictures. The aforementioned reference pictures are pictures which are referenced during predictive coding in the time axis direction, and more specifically, an intra-frame prediction-coded picture (I picture) and a frame-to-frame forward direction predictive coding picture (P picture). Moreover, the predetermined picture data is, more specifically, the data on the first, second, and third reference pictures in the VOBU. When the data on the third reference picture is missing, audio data in half the area of the VOBU is read out to be reproduced.
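The selection of the audio area described above may be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function name `audio_read_extent`, the use of `None` for a missing third reference picture, and addresses counted from the head of the VOBU are all hypothetical.

```python
from typing import Optional


def audio_read_extent(vobu_end_address: int,
                      third_ref_end_address: Optional[int]) -> int:
    """Return how far into the VOBU audio data is read during high-speed
    reproduction, as an address relative to the head of the VOBU."""
    if third_ref_end_address is None:
        # No third reference picture: read approximately half of the VOBU.
        return vobu_end_address // 2
    # Otherwise read the continuous area up to the end of the third
    # reference picture, which also covers the pictures sandwiched between
    # the reference pictures.
    return third_ref_end_address


print(audio_read_extent(300, 110))   # 110
print(audio_read_extent(300, None))  # 150
```

Under these assumptions, the audio packs multiplexed within the returned extent are the ones reproduced together with the reference pictures.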
Thus, during a high-speed reproduction, a video signal in a part of the area of the smallest reproduction unit is reproduced together with an audio signal of that part of the area, so as to enable search not only by the picture information but also by the audio information.