1. Field of the Invention
The present invention relates to a video decoding device, a video decoding method, and a program product therefor. More particularly, the present invention relates to a video decoding device which decodes a video stream containing MPEG-compressed picture frames, each consisting of first and second fields, while inserting extra fields into the video stream a specified number of times during processing. The present invention further relates to a video decoding method and a computer program which perform the same.
2. Description of the Related Art
MPEG, short for Moving Picture Experts Group, is the name of a family of international standard specifications for video compression systems. MPEG-based video coding and decoding systems play an essential role in today's multimedia processing environments, and various types of MPEG coders and decoders have been developed.
FIG. 15 shows a typical configuration of a conventional MPEG video decoding device. As seen, this device comprises the following elements: a buffer memory 50, a video decoder 51, a decoding controller 52, a frame memory 53, a display controller 54, and a v-sync generator 55.
The buffer memory 50 serves as a temporary storage space for buffering an incoming bit stream. The video decoder 51 decodes video data stored in the buffer memory 50 in response to a decoding start command given from the decoding controller 52. Here, the video data includes: intra-coded pictures (I pictures), predictive-coded pictures (P pictures), and bidirectionally predictive-coded pictures (B pictures). The resulting decoded pictures are stored into the frame memory 53. The decoding controller 52 controls video decoding processes, including issuance of decoding start commands to the video decoder 51.
The frame memory 53 has the capacity of four pictures (or four frames), the space of which is segmented into four sections to store individual pictures reproduced by the video decoder 51. Those memory sections are referred to herein as the “banks.”
The display controller 54 determines the direction of playback operation according to a playback direction flag which indicates whether it is forward playback or reverse playback. The display controller 54 also provides the frame memory 53 with display start commands in synchronization with a vertical synchronization (v-sync) signal. Further, the display controller 54 determines in what order to read pictures when displaying a video stream, consulting display parameters stored in the frame memory 53.
The operation of the conventional video decoding device of FIG. 15 will be explained below, assuming a short video bitstream containing four pictures I2, B0, B1, and P3. The explanation begins with forward playback operation, and then proceeds to reverse playback operation.
(1) Forward Playback
The buffer memory 50 stores and forwards an incoming bitstream to the video decoder 51. The decoding controller 52 issues a decoding start command, which causes the video decoder 51 to reproduce motion pictures by decoding the source bitstream supplied from the buffer memory 50 in accordance with the syntax of MPEG video specifications. The resultant decoded pictures are supplied to the frame memory 53 for display.
During the decoding process, various display parameters are also reproduced along with the motion pictures themselves and stored in their relevant parameter banks of the frame memory 53. Suppose, for example, that the picture I2 has been decoded and stored into the second bank (bank #2). Display parameters for the picture I2 are written into its associated parameter bank #2, which is shown on the right of the bank #2 in FIG. 15. In this way, four decoded pictures I2, B0, B1, and P3 are stored in the banks #2, #0, #1, and #3, respectively, and their corresponding parameters are stored in the associated parameter areas.
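By way of illustration only, the bank assignment described above can be sketched as follows. The function name bank_index and the rule that the digit in a picture name selects its bank are assumptions made for this example; the text states only which picture ends up in which bank.

```python
# Decode order of the example bitstream, as given in the text.
decode_order = ["I2", "B0", "B1", "P3"]


def bank_index(picture: str) -> int:
    """Hypothetical rule: the digit in the picture name selects the bank."""
    return int(picture[1:])


# The four banks of the frame memory, modeled as a list of slots.
banks = [None] * 4
for picture in decode_order:
    banks[bank_index(picture)] = picture

# Reading the banks in ascending order yields the forward display order.
display_order = [p for p in banks if p is not None]
print(banks)
```

Under this assumption, the pictures arrive in decode order but sit in the banks in display order, which is why the display controller can read bank #0 first in forward playback.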
The pictures decoded in the above process will then be displayed as follows. As mentioned earlier, the display controller 54 operates in synchronization with the v-sync signal supplied from the v-sync generator 55. Upon detection of every falling edge of the v-sync signal, the display controller 54 reads display parameters out of the parameter bank relevant to the next picture to be displayed. In forward playback mode, the four pictures should be displayed in the order of B0, B1, I2, and P3, and accordingly, the display controller 54 begins the display process by fetching the parameters for the first picture B0. Such parameters describe the intended display format of each picture in the video stream. The display controller 54 uses them to determine how the picture of interest should be presented on a television screen. Suppose here that the display parameters of picture B0 are:
display_horizontal_size_value=704
display_vertical_size_value=480
closed_gop=1
top_field_first=1
Those parameters tell the display controller 54 that the decoded picture B0 is 704×480 pixels in size and that its top field (described later) has to be displayed first.
Having parsed the display parameters, the display controller 54 retrieves picture data from the frame memory 53 for display. In the present example, the display controller 54 reads the data of picture B0 out of the bank #0 and outputs it for display purposes, since the first picture is B0.
Here, each picture frame is interlaced into two fields called “top field” and “bottom field.” The display controller 54 obtains one frame of 704×480 pixels by actually reading its top field first and then its bottom field.
Subsequent to the above processing for B0, the display controller 54 starts working on the next picture B1. As in the case of picture B0, it first reads the relevant parameters out of the parameter bank #1, parses them, and retrieves the picture data of B1 from the bank #1 of the frame memory 53 for display. The display controller 54 processes and outputs the other pictures I2 and P3 in the same way.
FIG. 16 is a timing diagram which explains the operation of the conventional video decoding device of FIG. 15. The topmost two rows (A) and (B) of FIG. 16 show a v-sync signal and a sequence of reproduced pictures. The sequence starts with the top field of picture B0 (B0t), which is followed by the bottom field of the same (B0b). Note here that the lower-case letters “t” and “b” are used to mean “top field” and “bottom field,” respectively. Similarly, the fields of other pictures are read and displayed in the order of: B1t, B1b, I2t, I2b, P3t, and P3b. 
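The forward field order described above can be sketched as follows. This is a minimal illustration, not the device's implementation: the function name forward_fields is hypothetical, and all four pictures are assumed to carry top_field_first=1 (the text gives this value only for picture B0).

```python
def forward_fields(pictures):
    """Return the field sequence for forward playback.

    `pictures` is a list of (name, top_field_first) pairs in display order.
    """
    fields = []
    for name, top_field_first in pictures:
        # top_field_first=1 means the top field is read before the bottom field.
        order = ("t", "b") if top_field_first else ("b", "t")
        fields.extend(name + suffix for suffix in order)
    return fields


# Assumption: every picture in the example has top_field_first=1.
pictures = [("B0", 1), ("B1", 1), ("I2", 1), ("P3", 1)]
sequence = forward_fields(pictures)
print(sequence)
```

With these inputs the sketch yields B0t, B0b, B1t, B1b, I2t, I2b, P3t, and P3b, matching the sequence described for part (B) of FIG. 16.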
(2) Reverse Playback
In reverse playback mode, the video decoding device outputs pictures backward. The decoding phase of this process, however, is the same as that in the forward playback mode described in (1). That is, the video decoder 51 decodes a given video bitstream and stores the resultant pictures into the frame memory 53.
The subsequent read operation is different from that in the forward playback mode. The display controller 54 makes read access to the frame memory 53 in the reverse order as will be described below.
To play back a video in the reverse direction, the display controller 54 has to be so notified. This is accomplished by setting the playback direction flag to “reverse.” Recognizing the requested playback direction from this flag, the display controller 54 starts reading video data in synchronization with the falling edge of the v-sync signal. In the present example, the four pictures are read in the order of P3, I2, B1, and B0, which is the opposite to how they are read out in forward playback mode.
The display controller 54 also uses display parameters in reverse playback mode, interpreting the parameter “top_field_first” according to the playback direction. More specifically, this parameter “top_field_first” specifies whether to read the top field first (“1”) or the bottom field first (“0”) in forward playback mode. When “top_field_first” is set to “1” in reverse playback mode, the display controller 54 interprets it in the opposite way, thus reading out the bottom field first and then the top field. For smooth playback, it is important for the video decoding device to reverse the reading order of fields, because that order is defined on the assumption that video frames are played back in the forward direction.
Parts (C) and (D) of FIG. 16 show the v-sync signal and the sequence of pictures I2, B0, B1, and P3 which are produced in the reverse direction according to the above-described rules. As this diagram shows, the sequence starts with P3b, and it is followed by the fields of: P3t, I2b, I2t, B1b, B1t, B0b, and B0t. Note that this order is exactly opposite to that in the forward playback mode.
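The reverse-playback rule can be sketched in the same manner. The function name reverse_fields is hypothetical, and all four pictures are again assumed to carry top_field_first=1.

```python
def reverse_fields(pictures):
    """Return the field sequence for reverse playback of an ordinary video.

    `pictures` is a list of (name, top_field_first) pairs in forward
    display order; the pictures are visited last to first.
    """
    fields = []
    for name, top_field_first in reversed(pictures):
        # In reverse mode the flag is interpreted the opposite way:
        # top_field_first=1 now means "bottom field first".
        order = ("b", "t") if top_field_first else ("t", "b")
        fields.extend(name + suffix for suffix in order)
    return fields


# Assumption: every picture in the example has top_field_first=1.
pictures = [("B0", 1), ("B1", 1), ("I2", 1), ("P3", 1)]
sequence = reverse_fields(pictures)
print(sequence)
```

For an ordinary video, inverting the flag in this way produces exactly the mirror image of the forward field sequence.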
(3) Forward Playback of 3:2 Pulldown Video
When recording a movie, a motion picture camera captures images at 24 frames every second. Frame rate conversion is therefore required to play a 24-fps motion picture on 30-fps television systems. This is known as “telecine conversion.” A telecine converter inserts one extra frame for every four frames to increase the frame rate to 30 fps. The resultant sequence of pictures is referred to as “3:2 pulldown video.” The following will describe how a 3:2 pulldown video is played back in the forward direction.
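The arithmetic behind the conversion can be checked with a short sketch. The constant names are chosen for this example, and the cadence is assumed to start with a two-field frame.

```python
from fractions import Fraction

FILM_FPS = 24          # frames per second captured by the camera
TV_FPS = 30            # frames per second of the television system
FIELDS_PER_FRAME = 2   # each interlaced frame has a top and a bottom field

film_fields = FILM_FPS * FIELDS_PER_FRAME   # fields available per second
tv_fields = TV_FPS * FIELDS_PER_FRAME       # fields required per second
ratio = Fraction(tv_fields, film_fields)    # required expansion factor

# Assumed cadence: frames alternately contribute 2 and 3 fields.
cadence = [2, 3, 2, 3]
fields_per_four_frames = sum(cadence)
fields_per_second = fields_per_four_frames * FILM_FPS // len(cadence)

print(ratio, fields_per_four_frames, fields_per_second)
```

Four film frames thus become ten fields, or five television frames, which is the one-extra-frame-in-four expansion stated above.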
FIG. 17 is a timing diagram which explains how the conventional video decoding device plays a 3:2 pulldown video, where the topmost two parts (A) and (B) show the result of forward playback. As seen from the illustrated sequence, the fifth field is a copy of the third field “B1t,” and the tenth field is a copy of the eighth field “P3b.” In this way, some field images in the original motion picture are repeatedly used in the playback sequence just to increase the frame rate so as to adhere to the existing television standards.
Part (B) of FIG. 17 also demonstrates that the video decoding device maintains the normal field order of top, bottom, top, bottom, and so on, even if it has to insert extra fields to achieve the telecine conversion. In this way, the conventional video decoding device can smoothly play back 3:2 pulldown videos.
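The forward playback of a 3:2 pulldown video can be sketched as follows. The per-picture flag values are assumptions inferred from the field sequence described for FIG. 17, not values stated in the text. The repeat flag is also an assumption of this example; MPEG-2 video defines a repeat_first_field flag for this purpose, but the text above does not name it.

```python
def forward_pulldown_fields(pictures):
    """Return the forward field sequence of a 3:2 pulldown video.

    `pictures` is a list of (name, top_field_first, repeat_first_field)
    tuples in display order.
    """
    fields = []
    for name, top_field_first, repeat_first_field in pictures:
        first, second = ("t", "b") if top_field_first else ("b", "t")
        fields += [name + first, name + second]
        if repeat_first_field:
            # The extra field is a copy of the first field of the frame.
            fields.append(name + first)
    return fields


# Assumed flags, chosen so that the result matches the described sequence.
pictures = [("B0", 1, 0), ("B1", 1, 1), ("I2", 0, 0), ("P3", 0, 1)]
sequence = forward_pulldown_fields(pictures)
print(sequence)
```

Under these assumptions the fifth field repeats B1t and the tenth repeats P3b, and the parity alternates top, bottom, top, bottom throughout.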
(4) Reverse Playback of 3:2 Pulldown Video
When playing a 3:2 pulldown video backward, the video decoding device reads and outputs the pictures P3, I2, B1, and B0 in that order, as in the case of ordinary videos. When the aforementioned parameter “top_field_first” is set to “1” (i.e., read the top field first in forward playback mode), the conventional video decoding device interprets it in the opposite way, thus outputting the bottom field first and then the top field.
The bottommost two parts (C) and (D) of FIG. 17 show the field sequence in the present example. As seen, the first field is P3t, and the second is P3b. The third field is a copy of “P3t.” This is followed by I2t, I2b, B1b, and then B1t. The next field is a copy of B1b. Finally, the sequence ends with B0b and B0t. 
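The conventional reverse-playback rule applied to a 3:2 pulldown video can be sketched as follows. As before, the per-picture flag values and the repeat flag are assumptions inferred from the described field sequences, and the function name is hypothetical.

```python
def reverse_pulldown_fields(pictures):
    """Return the field sequence produced by the conventional reverse rule.

    `pictures` is a list of (name, top_field_first, repeat_first_field)
    tuples in forward display order; the pictures are visited last to first.
    """
    fields = []
    for name, top_field_first, repeat_first_field in reversed(pictures):
        # The flag is simply inverted, without regard to inserted fields.
        first, second = ("b", "t") if top_field_first else ("t", "b")
        fields += [name + first, name + second]
        if repeat_first_field:
            fields.append(name + first)
    return fields


# Assumed flags, chosen so that the result matches the described sequence.
pictures = [("B0", 1, 0), ("B1", 1, 1), ("I2", 0, 0), ("P3", 0, 1)]
sequence = reverse_pulldown_fields(pictures)
print(sequence)
```

With these inputs the sketch reproduces the ten-field sequence described for part (D) of FIG. 17, including the repeated P3t and B1b fields.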
Referring to FIG. 17, we have a forward-playback picture sequence (B) and a reverse-playback picture sequence (D) which are produced from the same 3:2 pulldown video and the same display parameters. It should be pointed out that the reverse-playback picture sequence (D) has some inconsistency in terms of the arrangement of neighboring fields.
In every output picture sequence, the system requires that top fields be sent out when the v-sync signal is low, while bottom fields be sent out when the v-sync signal is high. As for the reverse-playback picture sequence (D), however, a top field I2t comes in the fourth slot, which is assigned to a bottom field. Likewise, the fifth slot is occupied by the bottom field I2b although that slot is for a top field. Similar violations occur in the ninth and tenth slots, too.
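The slot violations noted above can be verified mechanically. In this sketch, odd-numbered slots are assumed to be top-field slots and even-numbered slots bottom-field slots, which is consistent with the fourth slot being assigned to a bottom field and the fifth to a top field. The function name parity_violations is hypothetical.

```python
# Reverse-playback field sequence (D) of FIG. 17, as described in the text.
reverse_sequence = ["P3t", "P3b", "P3t", "I2t", "I2b",
                    "B1b", "B1t", "B1b", "B0b", "B0t"]

# Forward-playback field sequence (B), assumed from the described copies
# and the alternating top/bottom order.
forward_sequence = ["B0t", "B0b", "B1t", "B1b", "B1t",
                    "I2b", "I2t", "P3b", "P3t", "P3b"]


def parity_violations(fields):
    """Return 1-based slot numbers whose field parity does not match the slot."""
    violations = []
    for slot, field in enumerate(fields, start=1):
        # Assumption: odd slots expect a top field, even slots a bottom field.
        expected = "t" if slot % 2 == 1 else "b"
        if field[-1] != expected:
            violations.append(slot)
    return violations


print(parity_violations(forward_sequence))
print(parity_violations(reverse_sequence))
```

The forward sequence passes the check, whereas the reverse sequence is flagged at the fourth, fifth, ninth, and tenth slots.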
As seen from the above, the conventional method fails to maintain the consistent timing relationships between bottom fields and top fields. This inability causes an artifact called “jaggies” illustrated in FIG. 18. The left part (A) of FIG. 18 shows an original picture encoded in a video stream. The conventional video decoding device, however, would reproduce it as shown in the right part (B) of FIG. 18 when playing the video stream backward. Viewers might perceive such jaggies as an irritating flicker of images.
The above-described problem of jaggies could be solved by employing a filter to correct the temporal alignment of top and bottom fields. It is difficult, however, to implement such a correction filter in hardware because it requires costly field memories to store picture data. Another problem is that a filter adds an extra delay time to the video stream. To compensate for that delay, the video decoding device has to prefetch a certain amount of data from the frame memory, which increases the complexity of video decoding tasks.