1. Field of the Invention
The present invention relates to a video signal reproduction apparatus and method for reproducing a video signal from a recording medium containing a video signal prediction-coded in the time axis direction and in particular, to a video signal reproduction apparatus and method preferably used for performing a forward direction slow reproduction and forward direction stepped reproduction of a video signal recorded on a recording medium.
2. Description of Prior Art
A so-called digital video disc (hereinafter, referred to as DVD) is known as an optical disc recording medium on which a digital video signal and a digital audio signal are recorded.
This DVD is formatted in cells as basic units for reproducing information contents. Each of the cells consists of reproduction units of 0.4 to 1.2 seconds, each called a video object unit (VOBU). At the head of each VOBU, a navigation pack NV_PCK is allocated as a control information pack. This NV_PCK contains presentation control information (PCI) and data search information (DSI). These information items are used as VOBU address information for scanning before and after a VOBU, which is the aforementioned minimum reproduction unit.
Here, FIG. 3 shows a data configuration of the aforementioned DVD format. As shown in FIG. 3, a unit called a video object set (VOBS) is used for management of main video data, sub video data, and audio data. This VOBS, for example, includes one whole work. The VOBS consists of a plurality of video objects (VOBs). The VOB consists of a group of data items recorded on a disc. Each VOB consists of a plurality of cells. Each cell represents a single scene or one cut in a movie. One cell lasts 10 seconds or so. Moreover, the DVD has a multi-story format function for showing one movie in a plurality of stories, and a so-called parental lock function for skipping an educationally unacceptable scene such as violence. These functions are created by combining these cells.
One cell consists of a plurality of video object units (VOBUs). Each VOBU is a unit of 0.4 to 1.2 seconds of a moving picture. This VOBU contains, for example, one GOP (Group of Pictures) in a so-called MPEG format. It should be noted that the MPEG is a hybrid data compression standard suggested by ISO-IEC/JTC1/SC2/WG11 and uses a motion compensation predictive coding in combination with discrete cosine transform (DCT). The GOP (group of pictures) of MPEG standard contained in the VOBU consists of intra-frame predictive coding pictures (I pictures), inter-frame forward predictive coding pictures (P pictures), and inter-frame bidirectional predictive coding pictures (B pictures). For example, if one GOP consists of 15 frames, the GOP contains one frame of I picture, four frames of P picture, and 10 frames of B picture.
FIG. 6A shows a configuration example of inter-frame prediction in the MPEG method when one GOP consists of 15 frames, for example.
In this FIG. 6A, the I picture is an intra-frame predictive coding picture which is predictive-coded within a frame; the P picture is an inter-frame forward predictive coding picture which performs prediction referencing a temporally preceding coded picture (I picture or P picture); and the B picture is a bidirectional predictive coding picture which performs prediction referencing two frames, i.e., temporally preceding and following pictures.
That is, as shown by arrows, I picture I2 is intra-frame prediction coded within the frame; P picture P5 is inter-frame prediction coded by referencing the I picture I2; and a P picture P8 is inter-frame prediction coded by referencing the P picture P5. Furthermore, B pictures B3 and B4 are inter-frame prediction coded by referencing two pictures, I picture I2 and P picture P5; and B pictures B6 and B7 are inter-frame prediction coded by referencing two pictures, P picture P5 and P picture P8. In the same way, the other pictures are prediction coded. It should be noted that the subscript number of each picture represents a temporal reference (hereinafter, referred to as TR). This TR indicates the frame sequence within the GOP. In a normal video reproduction, the frames are reproduced in this TR sequence.
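The prediction relationships described for FIG. 6A can be sketched as a small table builder. This is only an illustration under stated assumptions: the function name gop_references, the parameters n_frames and m, and the placeholder PREV_GOP are hypothetical, and the layout assumes one reference picture every third frame, as in the 15-frame example.

```python
# Hypothetical placeholder for a reference picture belonging to the preceding GOP.
PREV_GOP = "previous-GOP reference"

def gop_references(n_frames=15, m=3):
    """Map each picture label (kind + TR) to the pictures it is predicted from.

    Assumes a reference picture at TR = m-1, 2m-1, ...; the first one is the
    I picture and the rest are P pictures. n_frames is assumed to be a
    multiple of m, as in the 15-frame example.
    """
    refs = {}
    prev_ref = PREV_GOP
    pending_b = []  # B pictures waiting for their following reference picture
    for tr in range(n_frames):
        if tr % m == m - 1:
            kind = "I" if tr == m - 1 else "P"
            label = f"{kind}{tr}"
            # An I picture needs no reference; a P picture uses the preceding one.
            refs[label] = [] if kind == "I" else [prev_ref]
            # Each waiting B picture uses the preceding and following references.
            for b in pending_b:
                refs[b] = [prev_ref, label]
            pending_b = []
            prev_ref = label
        else:
            pending_b.append(f"B{tr}")
    return refs

REFS = gop_references()
```

Under these assumptions, REFS["B3"] lists I2 and P5, and REFS["P8"] lists P5, in agreement with the arrows described above.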
When decoding the pictures thus coded, the decoding is performed as follows. In the case of I picture which has been prediction coded within a frame, the I picture alone can be decoded. However, in the P picture which has been prediction coded by referencing a temporally preceding I picture or P picture, a temporally preceding I picture or P picture is required for decoding. In the case of B picture which has been prediction coded by referencing two pictures temporally preceding and following, its decoding requires an I picture or P picture temporally preceding and following.
For this, pictures are arranged as shown in FIG. 6B so that pictures required for decoding are decoded beforehand. That is, decoding of B pictures B0 and B1 requires an I picture or P picture in a preceding GOP and the I picture I2 in the current GOP. Accordingly, the I picture I2 is arranged prior to the B pictures B0 and B1. Decoding of B pictures B3 and B4 requires the I picture I2 and P picture P5. Accordingly, the P picture P5 is arranged prior to the B pictures B3 and B4. Decoding of B pictures B6 and B7 requires the P picture P5 and the P picture P8. Accordingly, the P picture P8 is arranged prior to the B pictures B6 and B7. Decoding of B pictures B9 and B10 requires P pictures P8 and P11. Accordingly, the P picture P11 is arranged prior to the B pictures B9 and B10. Decoding of B pictures B12 and B13 requires P pictures P11 and P14. Accordingly, the P picture P14 is arranged prior to the B pictures B12 and B13. Thus, in the MPEG method, the decoding order is different from the presentation order of the frames.
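The rearrangement described for FIG. 6B can be illustrated with a short sketch. The function name to_decode_order and the list names are hypothetical; the picture labels are those of the 15-frame example, and the handling of trailing B pictures is an assumption that does not arise in that example.

```python
def to_decode_order(display):
    """Place each reference picture (I or P) ahead of the B pictures that
    precede it in presentation order."""
    decode, pending_b = [], []
    for label in display:
        if label[0] == "B":
            pending_b.append(label)   # must wait for the following reference
        else:
            decode.append(label)      # reference picture goes first
            decode.extend(pending_b)  # then the B pictures that depend on it
            pending_b = []
    # Assumption: B pictures left over at the end are simply appended.
    decode.extend(pending_b)
    return decode

# Presentation (TR) order of the 15-frame GOP example.
DISPLAY = ["B0", "B1", "I2", "B3", "B4", "P5", "B6", "B7", "P8",
           "B9", "B10", "P11", "B12", "B13", "P14"]
DECODE = to_decode_order(DISPLAY)

# Sorting the decoded pictures by TR restores the presentation order.
RESTORED = sorted(DECODE, key=lambda label: int(label[1:]))
```

Here DECODE begins I2, B0, B1, P5, B3, B4, and sorting by the TR number recovers the presentation order.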
Referring back to FIG. 3, each VOBU contains a navigation pack NV_PCK which is a control information having a VOBU management information, a video pack V_PCK which is a pack having a main video data, an audio pack A_PCK having an audio data, and a sub picture pack SP_PCK which is a pack having a sub video data. The V_PCK, A_PCK, and SP_PCK are compressed in the MPEG2 format or the like and recorded on an optical disc as a recording medium.
FIG. 8 shows a configuration of a navigation pack NV_PCK, which is a control information pack arranged at the head of a VOBU. As shown in FIG. 8, the NV_PCK contains a pack header and a system header followed by a presentation control information (PCI) packet and a data search information (DSI) packet. The PCI packet contains PCI data, and the DSI packet contains DSI data.
Furthermore, the DSI data contains DSI general information DSI_GI (not depicted). This DSI_GI contains end addresses of the reference pictures (I picture and P picture) in the MPEG pictures, i.e., VOBU_1STREF_EA which is the end address of the first reference picture in the VOBU, VOBU_2NDREF_EA which is the end address of the second reference picture in the VOBU, and VOBU_3RDREF_EA which is the end address of the third reference picture in the VOBU.
On the other hand, as shown in FIG. 9, a video signal consists of a top field and a bottom field that constitute a frame as a unit. In a normal Play mode, reproduction of the top field is performed before the bottom field.
This video signal, which uses the pair of two pictures, i.e., the top field and the bottom field, as a unit, is applied to, for example, a skip scan, the so-called interlace method. It should be noted that there is also a progressive method which scans the scan lines one after another without skipping a scan line.
In a conventional video signal reproduction apparatus, in the forward slow reproduction (SlowF) or forward stepped reproduction (StepF), as shown in FIG. 9, only the top fields are reproduced one after another. That is, the top field of one frame is followed by the top field of the next frame, skipping the bottom field of the current frame.
Accordingly, in the conventional forward slow reproduction (SlowF) or the forward stepped reproduction (StepF), only half of the fields are reproduced, so that the representation slightly lacks smoothness.
Furthermore, in the conventional forward slow reproduction (SlowF) or the forward stepped reproduction (StepF), there are fields not displayed and accordingly, an unnatural feeling may be caused in correspondence between the video and the audio.
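The conventional behavior can be made concrete with a minimal sketch that lists which fields are shown. The function names and the tuple representation (frame index, field name) are hypothetical and serve only to illustrate the statement that half of the fields are skipped.

```python
def conventional_step_fields(n_frames):
    """Fields shown by conventional SlowF/StepF: only the top field of each frame."""
    return [(frame, "top") for frame in range(n_frames)]

def all_fields(n_frames):
    """Every field in normal Play order: top field first, then bottom field."""
    return [(frame, field)
            for frame in range(n_frames)
            for field in ("top", "bottom")]

SHOWN = conventional_step_fields(4)
AVAILABLE = all_fields(4)
SKIPPED = [f for f in AVAILABLE if f not in SHOWN]
```

For four frames, SHOWN holds four fields while AVAILABLE holds eight, and SKIPPED contains the bottom field of every frame.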
It is therefore an object of the present invention to provide a video signal reproduction apparatus and method capable of a video signal reproduction giving a smooth motion representation while maintaining a proper correspondence between the audio and the video even in the forward slow reproduction (SlowF) or in the forward stepped reproduction (StepF).
In order to achieve the aforementioned object, the video signal reproduction apparatus according to the present invention is for reproducing a video signal from a bit stream of coded video signal in which each frame consists of a first field and a second field, the apparatus comprising a controller operating in such a way that in a forward slow reproduction mode for reproducing a video signal from the bit stream in a forward direction at a lower speed than a standard reproduction speed, or in a forward stepped reproduction mode for reproducing a video signal from the bit stream in a forward direction picture by picture, if a currently reproduced field is the second field, a reproduction position pointer is shifted to the next frame to be reproduced.
The video signal reproduction method according to the present invention is for reproducing a video signal from a bit stream of coded video signal in which each frame consists of a first field and a second field, wherein in a forward slow reproduction mode for reproducing a video signal from the bit stream in a forward direction at a lower speed than a standard reproduction speed, or in a forward stepped reproduction mode for reproducing a video signal from the bit stream in a forward direction picture by picture, if a currently reproduced field is the second field, a reproduction position pointer is shifted to the next frame to be reproduced.
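The pointer rule stated above can be sketched as a small state machine. This is an illustrative sketch and not the claimed implementation: the class name Player, the method step_forward, and the integer frame index are hypothetical. The text specifies only that the reproduction position pointer is shifted to the next frame when the currently reproduced field is the second field; showing the second field of the same frame after the first field is an assumption made here.

```python
from dataclasses import dataclass

FIRST, SECOND = 1, 2

@dataclass
class Player:
    pointer: int = 0    # hypothetical reproduction position pointer (frame index)
    field: int = FIRST  # field currently being reproduced

    def step_forward(self):
        """Advance by one field in SlowF or StepF mode."""
        if self.field == SECOND:
            # Rule from the text: after the second field, shift the pointer
            # to the next frame to be reproduced.
            self.pointer += 1
            self.field = FIRST
        else:
            # Assumption: after the first field, reproduce the second field
            # of the same frame without moving the pointer.
            self.field = SECOND
        return (self.pointer, self.field)

player = Player()
STEPS = [player.step_forward() for _ in range(4)]
```

Starting from the first field of frame 0, four steps visit the second field of frame 0, both fields of frame 1, and the first field of frame 2, so no field is skipped.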