1. Field of the Invention
The present invention relates to a video signal reproduction apparatus and method for reproducing a video signal which has been prediction coded in the time axis direction and recorded on a recording medium, and in particular to a video signal reproduction apparatus and method that can preferably be used for high-speed (fast) reproduction of a video signal recorded on a recording medium.
2. Description of the Prior Art
A so-called digital video disc (hereinafter referred to as DVD) is known as a recording medium, such as an optical disc, having a digital video signal and a digital audio signal recorded thereon.
The DVD format uses a cell as a basic unit for reproducing contents of information. The cell consists of smaller reproduction units of 0.4 to 1.2 seconds, each of which is called a Video Object Unit (VOBU). This smallest reproduction unit VOBU has at its head a control information pack called a navigation pack, NV_PCK. The NV_PCK contains presentation control information (PCI) and data search information (DSI). These information items are used as VOBU address information for scanning before and after the smallest reproduction unit, VOBU.
FIG. 1 shows a data configuration of the DVD format. As shown in FIG. 1, main video data, sub video data, and audio data are managed in the video object set (VOBS). The VOBS is, for example, a unit of one movie work. A VOBS consists of a plurality of video objects (VOB). The VOB is a group of data recorded on a disc. Moreover, the VOBS consists of a plurality of Cells. The Cell, for example, corresponds to one scene or one cut of a movie. One Cell lasts several minutes to several tens of minutes. Moreover, the DVD can use a multi-story function which enables one movie to be presented in a plurality of stories, or a parental lock function which skips a violent scene or other scene undesirable from an educational standpoint. These functions are created by combination of the Cells.
As has been described above, one Cell consists of a plurality of video object units (VOBU). This VOBU contains, for example, one GOP (group of pictures) in the MPEG format. It should be noted that the MPEG has been discussed in the ISO-IEC/JTC1/SC2/WG11 and suggested as a standard specification. The MPEG is a hybrid data compression specification using the motion prediction coding and the discrete cosine transform (DCT) coding in combination. The GOP of the MPEG specification contained in the aforementioned VOBU contains an intra-frame prediction coding picture (I picture), inter-frame forward direction prediction coding pictures (P pictures), and inter-frame bidirectional prediction coding pictures (B pictures). For example, a GOP consisting of 15 frames contains one frame of I picture, four frames of P picture, and ten frames of B picture.
FIG. 2A shows a configuration example of the inter-frame prediction in the MPEG method when a GOP consists of 15 frames.
In FIG. 2A, the I picture is an intra-frame prediction coding picture which has been prediction coded within one frame; the P picture is an inter-frame forward prediction coding picture which performs prediction by referencing a temporally preceding frame (I picture or P picture); and the B picture is a bidirectional prediction coding picture which performs prediction by referencing temporally preceding and following frames.
That is, as indicated by arrows, the I picture I2 is prediction coded within that frame; the P picture P5 is prediction coded by referencing the I picture I2; the P picture P8 is prediction coded by referencing the P picture P5. Furthermore, the B pictures B3 and B4 are inter-frame prediction coded by referencing two pictures, i.e., the I picture I2 and the P picture P5; the B pictures B6 and B7 are inter-frame prediction coded by referencing the two P pictures P5 and P8. Similarly, the other pictures are prediction coded and created. It should be noted that each subscript numeral represents a temporal reference (hereinafter, referred to as TR). This TR indicates the picture sequence in the GOP, and in a normal video reproduction, the frames are reproduced in this TR sequence.
The pictures thus prediction coded are decoded as follows. An I picture which has been prediction coded within a frame can be decoded with the I picture alone. Decoding of a P picture which has been prediction coded by referencing a temporally preceding I picture or P picture requires the temporally preceding I picture or P picture. Decoding of a B picture which has been prediction coded by referencing temporally preceding and following I pictures or P pictures requires the temporally preceding and following I pictures or P pictures.
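The decoding dependencies described above can be illustrated by a short sketch. It assumes the 15-frame GOP pattern of the example (I picture at TR 2, P pictures at TR 5, 8, 11 and 14, B pictures elsewhere); the function names `picture_type` and `references` are hypothetical and serve only for illustration.

```python
def picture_type(tr: int) -> str:
    """Return 'I', 'P' or 'B' for a temporal reference in the assumed pattern."""
    if tr == 2:
        return "I"
    return "P" if tr % 3 == 2 else "B"


def references(tr: int) -> list[str]:
    """Names of the pictures that must already be decoded to decode picture `tr`."""
    kind = picture_type(tr)
    if kind == "I":
        return []  # an I picture is decoded on its own
    if kind == "P":
        prev = tr - 3  # the temporally preceding I or P picture
        return [f"{picture_type(prev)}{prev}"]
    # B picture: nearest reference picture before and after it
    following = tr + (2 - tr % 3)
    preceding = following - 3
    if preceding < 0:
        before = "reference picture of preceding GOP"
    else:
        before = f"{picture_type(preceding)}{preceding}"
    return [before, f"{picture_type(following)}{following}"]


for tr in (2, 5, 8, 3, 6, 0):
    print(f"{picture_type(tr)}{tr} needs {references(tr)}")
```

For B0 and B1 the preceding reference lies in the previous GOP, which is why the sketch returns a placeholder string rather than a picture name.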
To cope with this, the pictures are rearranged as shown in FIG. 2B, so that the pictures required for decoding can be decoded in advance. That is, decoding of B pictures B0, B1 requires an I picture or P picture in the preceding GOP and the I picture I2 in the current GOP. Accordingly, the I picture I2 is arranged before the B pictures B0, B1. Decoding of B pictures B3, B4 requires the I picture I2 and the P picture P5. Accordingly, the P picture P5 is arranged before the B pictures B3, B4. Decoding of the B pictures B6, B7 requires P pictures P5 and P8. Accordingly, the P picture P8 is arranged before the B pictures B6, B7. Decoding of B pictures B9, B10 requires P pictures P8 and P11. Accordingly, the P picture P11 is arranged to precede the B pictures B9, B10. Decoding of B pictures B12, B13 requires P pictures P11 and P14. Accordingly, the P picture P14 is arranged to precede the B pictures B12, B13. Thus, in the MPEG method, the decoding order is different from the presentation order.
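The rearrangement from presentation order to decoding order may be sketched as follows. This is only an illustration under the assumption of the 15-frame GOP of the example; the function name `decoding_order` is hypothetical.

```python
def decoding_order(presentation: list[str]) -> list[str]:
    """Place each reference picture (I or P) ahead of the B pictures that
    precede it in presentation order."""
    out = []
    pending_b = []  # B pictures waiting for their following reference picture
    for pic in presentation:
        if pic[0] == "B":
            pending_b.append(pic)
        else:
            out.append(pic)        # reference picture goes first
            out.extend(pending_b)  # then the B pictures that depend on it
            pending_b = []
    out.extend(pending_b)  # trailing B pictures, if any, are left at the end here
    return out


# Presentation (TR) order of the assumed GOP: B0 B1 I2 B3 B4 P5 ... B12 B13 P14
presentation = [
    f"{'I' if tr == 2 else 'P' if tr % 3 == 2 else 'B'}{tr}" for tr in range(15)
]
print(decoding_order(presentation))
```

Applied to the example GOP, the sketch yields I2, B0, B1, P5, B3, B4, P8, B6, B7, P11, B9, B10, P14, B12, B13, which is the arrangement described for FIG. 2B.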
For coding, the aforementioned prediction coding and Huffman coding are used. Moreover, recording is performed with a variable bit rate (variable transfer rate). For example, a scene with a rapid motion or a complicated image requires a large data amount, whereas a simple image or an almost still image requires a small data amount. Taking this into consideration, with a variable bit rate that can vary up to a transfer rate of, for example, 9.8 Mbps, an average bit rate of 3.5 Mbps is sufficient to realize a picture quality which would require twice that rate if a fixed rate were used.
Referring back to FIG. 1, one VOBU is constituted by a navigation pack NV_PCK containing control information including VOBU management information; a video pack V_PCK containing main video data; an audio pack A_PCK containing audio data; and a sub picture pack SP_PCK containing sub video data. The V_PCK, A_PCK, and SP_PCK are respectively compressed by a format such as MPEG2 and recorded on an optical disc as a recording medium.
FIG. 3 shows a configuration of the navigation pack NV_PCK which is a control information pack contained at the head of a VOBU. As shown in this FIG. 3, the NV_PCK contains a pack header and system header which are followed by a PCI (presentation control information) packet and DSI (data search information) packet. The PCI packet and the DSI packet contain a PCI data and a DSI data, respectively.
Furthermore, although not depicted, the DSI data contains DSI_GI which is a DSI general information. This DSI_GI contains end addresses of the reference pictures (I picture and P picture) for each of the pictures of the aforementioned MPEG. More specifically, the DSI_GI contains VOBU_1STREF_EA which is the end address of the first reference picture within the VOBU, VOBU_2NDREF_EA which is the end address of the second reference picture within the VOBU, and VOBU_3RDREF_EA which is the end address of the third reference picture within the VOBU.
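As an illustration only, the three end addresses named above may be held in a small record from which a reader looks up how far to read in order to obtain the first one, two, or three reference pictures. The class `DsiGi`, its method `read_end_for`, and the numeric values are hypothetical; the sketch does not represent the on-disc layout of the DSI packet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DsiGi:
    """Hypothetical in-memory view of the three DSI_GI fields named in the text."""
    vobu_1stref_ea: int  # end address of the first reference picture in the VOBU
    vobu_2ndref_ea: int  # end address of the second reference picture
    vobu_3rdref_ea: int  # end address of the third reference picture

    def read_end_for(self, n_refs: int) -> int:
        """End address up to which data is read to obtain the first n_refs reference pictures."""
        ends = (self.vobu_1stref_ea, self.vobu_2ndref_ea, self.vobu_3rdref_ea)
        if not 1 <= n_refs <= len(ends):
            raise ValueError("only the first three reference pictures are described")
        return ends[n_refs - 1]


# Made-up addresses, for illustration only.
dsi = DsiGi(vobu_1stref_ea=40, vobu_2ndref_ea=75, vobu_3rdref_ea=110)
print(dsi.read_end_for(3))
```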
When performing a fast reproduction, for example, at twice the standard reproduction speed, the aforementioned NV_PCK address information can be used to reproduce some of the reference pictures (I picture and P pictures) within one GOP of the MPEG specification consisting of a plurality of video packs (V_PCK) in a VOBU. For example, Japanese Patent Application 7-32944 (Specification and Drawings) filed by the applicant of the present invention discloses a technique for fast reproduction by reproducing one I picture and two P pictures following the I picture within a GOP before proceeding to the next GOP. This corresponds to, in a DVD case, reproduction of three reference pictures having addresses up to the address obtained by the end address VOBU_3RDREF_EA of the third reference picture in the aforementioned NV_PCK.
Here, one video data in a VOBU corresponds to one GOP. If the GOP is constituted by one frame of I picture, four frames of P picture, and ten frames of B picture, among the five reference pictures (I picture and four P pictures), three reference pictures up to the aforementioned address VOBU_3RDREF_EA are reproduced, while the remaining two reference pictures are track-jumped without being reproduced. Such a fast reproduction can realize a certain high speed by performing a track jump but cannot perform reproduction with a smooth motion. Accordingly, when a smooth motion is desired, such a fast reproduction cannot be used.
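The conventional fast reproduction described above can be sketched for the example GOP as follows. The function name `split_for_fast_play` is hypothetical, and B pictures are simply ignored in this sketch.

```python
def split_for_fast_play(decode_order: list[str], n_refs: int = 3):
    """Return (reproduced, skipped) reference pictures for one GOP.

    The first n_refs reference pictures (I or P) are reproduced; the
    remaining reference pictures are skipped by the track jump.
    """
    refs = [p for p in decode_order if p[0] in "IP"]
    return refs[:n_refs], refs[n_refs:]


# Example GOP in decoding order (one I, four P, ten B pictures).
gop = ["I2", "B0", "B1", "P5", "B3", "B4", "P8", "B6", "B7",
       "P11", "B9", "B10", "P14", "B12", "B13"]
reproduced, skipped = split_for_fast_play(gop)
print(reproduced, skipped)
```

With three reference pictures per GOP, I2, P5 and P8 are reproduced and P11 and P14 are skipped, so the displayed motion has a gap at the end of every GOP.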
It is therefore an object of the present invention to provide a video signal reproduction apparatus and method capable of performing a fast reproduction with a smooth motion.
The video signal reproduction apparatus according to the present invention is for performing a reproduction processing by reading a video signal which has been prediction coded in the time axis direction and recorded on a recording medium, wherein, during a fast reproduction, according to a reproduction state, switching is performed between a plurality of fast reproduction modes requiring reproduction of different numbers of reference pictures from the video signal recorded on the recording medium.
Here, as an example of the switching, a reproduction state is detected, and the detection output is compared to a threshold value so as to identify an appropriate fast reproduction mode, to which the switching is performed. The reproduction state may be, for example, a reproduction speed or a reproduction bit rate.
When the video signal has been coded and recorded on the recording medium with a variable bit rate, as a higher speed is detected, a fast reproduction mode requiring reproduction of a greater number of reference pictures is set. Alternatively, as a lower bit rate is detected, a fast reproduction mode requiring reproduction of a greater number of reference pictures is set.
That is, when a high reproduction speed or a low reproduction bit rate is detected, i.e., when a sufficient reproduction capacity is available, a fast reproduction mode using a greater number of reference pictures is selected so as to obtain a fast reproduction with a smooth motion.
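The threshold comparison may be sketched as follows for the case where the detected reproduction state is the reproduction bit rate. The threshold values, the number of modes, and the name `select_mode` are assumptions made only for illustration and are not specified in the description above.

```python
import bisect

# Hypothetical thresholds in Mbps, in ascending order.
BIT_RATE_THRESHOLDS_MBPS = [3.0, 6.0]
# Reference pictures reproduced per GOP in each mode (hypothetical):
# below 3.0 Mbps -> 3, from 3.0 up to 6.0 Mbps -> 2, 6.0 Mbps and above -> 1.
REFS_PER_MODE = [3, 2, 1]


def select_mode(bit_rate_mbps: float) -> int:
    """Compare the detected bit rate with the thresholds and return the
    number of reference pictures of the selected fast reproduction mode."""
    idx = bisect.bisect_right(BIT_RATE_THRESHOLDS_MBPS, bit_rate_mbps)
    return REFS_PER_MODE[idx]


for rate in (2.0, 3.5, 9.8):
    print(rate, "Mbps ->", select_mode(rate), "reference pictures")
```

A lower detected bit rate selects a mode with more reference pictures, in line with the description; a selection based on the detected reproduction speed could be written in the same way with its own thresholds.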