In countries that use 525-line interlaced display systems, such as the United States and Canada, television video signals are sampled and transmitted at approximately 59.94 fields per second (fps). For such countries, digital television video streams are generally encoded and transmitted using a particular Moving Picture Experts Group (MPEG) standard (e.g., MPEG-2 Video) at approximately 29.97 frames per second (FPS).
Hereinafter, an integral value of fps or an integral value of FPS may be an approximation including, within its scope, a range of equivalent values. Thus, for example, the expression 30 FPS may be used to refer to rates such as, for example, approximately 29.97 FPS or approximately 30 FPS. Furthermore, the expression 24 FPS may be used to refer to rates such as, for example, approximately 23.976 FPS or approximately 24 FPS. Similarly, the expression 60 fps may be used to refer to rates such as, for example, approximately 59.94 fps or approximately 60 fps.
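The relationship between the nominal and approximate rates can be illustrated with a short sketch. It assumes the commonly used exact values of 24000/1001, 30000/1001 and 60000/1001 for the approximate rates; the text above gives only the rounded figures, and the name NTSC_RATES is hypothetical.

```python
from fractions import Fraction

# Assumed exact values behind the rounded figures quoted in the text.
NTSC_RATES = {
    "24 FPS": Fraction(24000, 1001),  # approximately 23.976 frames per second
    "30 FPS": Fraction(30000, 1001),  # approximately 29.97 frames per second
    "60 fps": Fraction(60000, 1001),  # approximately 59.94 fields per second
}

for label, rate in NTSC_RATES.items():
    # Each approximate rate rounds to the integral value used as its label.
    print(label, "covers approximately", round(float(rate), 3))

# The field-to-film-frame ratio is the same for nominal and approximate rates.
fields_per_film_frame = NTSC_RATES["60 fps"] / NTSC_RATES["24 FPS"]
print("fields per film frame:", fields_per_film_frame)
```

Under this assumption, the ratio of fields to film frames is 5/2 whether the nominal or the approximate rates are used, which is why the integral labels can stand in for either.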
Film material produced at 24 FPS is routinely converted to 60 fps in many applications. Broadcast networks usually encode and transmit movies that were originally filmed at 24 FPS at that rate, rather than at 60 fps. However, at the receiver, the decoded video at 24 FPS is often converted to 60 fps for interlaced display. A conventional process for converting 24 FPS to 60 fps sampling is the Telecine Process (named after the original type of machine used to perform the conversion from film to video), also known as the 3:2 pull-down process. The Telecine Process inserts repeated fields derived from the original film frames in such a way that 5 video frames (i.e., 10 fields) are produced for every 4 original film frames. FIG. 1 illustrates one example of a process 12 that performs a 3:2 pull-down. The original film sequence 10 filmed at 24 FPS is converted to a video sequence 14 at 30 FPS, or equivalently 60 fps. A mechanism for handling 3:2 pull-down for film material in digital video systems is usually referred to as film mode.
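The 4-frame to 10-field expansion can be sketched as follows. This is a minimal illustration only, not the process 12 of FIG. 1: the function names are hypothetical, and the cadence is assumed to start with a two-field frame and with the top field.

```python
def pulldown_3_2(film_frames):
    """Expand film frames into (frame, parity) fields using a 2,3,2,3 cadence.

    Assumption: the first frame contributes two fields and the first field
    is a top field; other starting phases are equally possible.
    """
    fields = []
    parity = "top"
    for index, frame in enumerate(film_frames):
        count = 2 if index % 2 == 0 else 3  # every other frame gets a repeated field
        for _ in range(count):
            fields.append((frame, parity))
            parity = "bottom" if parity == "top" else "top"
    return fields


def pair_fields(fields):
    """Group consecutive fields into interlaced video frames of two fields each."""
    return [tuple(fields[i:i + 2]) for i in range(0, len(fields), 2)]


fields = pulldown_3_2(["A", "B", "C", "D"])
video_frames = pair_fields(fields)
print(len(fields), "fields,", len(video_frames), "video frames")
for frame in video_frames:
    print(frame)
```

With four film frames as input, the sketch yields 10 fields grouped into 5 video frames, two of which mix fields from adjacent film frames.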
The Telecine Process or 3:2 pull-down process is supported in the MPEG-2 Video compression standard. When using the MPEG-2 Video standard with the film mode, the frame rate encoded in the sequence header is 30 FPS for interlaced display, even though the video is actually coded as a 24 FPS film sequence. The encoder also conveys, to the decoder, proper display timing based on the frame rate of 30 FPS. The flags top_field_first and repeat_first_field in the picture coding extension header are used for indicating how a picture should be displayed. These two flags are mandatory MPEG-2 syntax elements that are always carried and are followed by the decoder. However, such inflexibility may not be desirable, particularly when the type of display device can vary from, for example, an interlaced television to a progressive monitor. Furthermore, the encoder does not know the type of display employed at the decoder end.
In MPEG-2 Video elementary streams, the flags top_field_first and repeat_first_field are used to indicate the current film state. Four film states A, B, C and D are illustrated in FIG. 1. The four possible film mode states are generally repeated in the same order every four pictures. FIG. 2 illustrates the mapping between the film states and these 3:2 pull-down flags in MPEG-2 Video.
In MPEG-2, the decoder generally follows the top_field_first and repeat_first_field flags to display film state B and D frames for three field times to reconstruct the 3:2 pull-down pattern. The decoder re-displays the first field to create the third field. This is because, in the 3:2 pull-down algorithm, the first field is repeated every other picture to convert film material at 24 FPS to video mode at 30 FPS. Film state A and C pictures are displayed for only two field times. A film mode sequence of four pictures will therefore be displayed for a total of 10 field times. In this way, the decoded video is displayed at the correct video picture rate of 30 FPS.
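The decoder behavior described above can be sketched as a function of the two flags. The mapping of film states to flag values shown here is an assumption chosen to be consistent with the text (states B and D repeat the first field); FIG. 2 is not reproduced here, and the names displayed_fields and ASSUMED_STATE_FLAGS are hypothetical.

```python
def displayed_fields(top_field_first, repeat_first_field):
    """Return the field parities displayed for one coded frame picture."""
    if top_field_first:
        first, second = "top", "bottom"
    else:
        first, second = "bottom", "top"
    fields = [first, second]
    if repeat_first_field:
        fields.append(first)  # the first field is re-displayed as a third field
    return fields


# Assumed mapping: state -> (top_field_first, repeat_first_field).
ASSUMED_STATE_FLAGS = {
    "A": (1, 0),
    "B": (1, 1),
    "C": (0, 0),
    "D": (0, 1),
}

sequence = []
for state, (tff, rff) in ASSUMED_STATE_FLAGS.items():
    shown = displayed_fields(tff, rff)
    sequence.extend(shown)
    print(state, shown)
print("total field times:", len(sequence))
```

Under the assumed mapping, the four pictures occupy 2 + 3 + 2 + 3 = 10 field times, and the displayed field parity alternates without interruption across picture boundaries.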
In MPEG-2, the flags top_field_first and repeat_first_field along with the frame rate can also be applied to derive Decoding Time Stamps (DTS) and Presentation Time Stamps (PTS) for some pictures. The flags (i.e., top_field_first and repeat_first_field) are used to achieve proper timing for decoding and displaying the coded 24 FPS film material to generate output video at 30 FPS.
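One way such timing could be derived is sketched below. It is an illustration under stated assumptions rather than the procedure of any particular decoder: the pictures are taken in display order with no reordering, time stamps are expressed in units of a 90 kHz clock as in MPEG-2 Systems, the frame rate is assumed to be exactly 30000/1001, and the function name presentation_times is hypothetical.

```python
from fractions import Fraction

CLOCK_HZ = 90_000  # time stamp resolution assumed for this sketch


def presentation_times(flags, frame_rate=Fraction(30000, 1001), first_pts=0):
    """Return (pts_list, end_pts) for pictures given in display order.

    flags is a sequence of (top_field_first, repeat_first_field) pairs.
    Each picture lasts two field periods, or three when the first field
    is repeated. Only repeat_first_field affects the duration.
    """
    field_period = Fraction(CLOCK_HZ) / (2 * frame_rate)  # ticks per field
    pts = Fraction(first_pts)
    pts_list = []
    for _top_field_first, repeat_first_field in flags:
        pts_list.append(pts)
        pts += (3 if repeat_first_field else 2) * field_period
    return pts_list, pts


# Assumed flag values for film states A, B, C and D.
film_flags = [(1, 0), (1, 1), (0, 0), (0, 1)]
pts_list, end_pts = presentation_times(film_flags)
for state, pts in zip("ABCD", pts_list):
    print(state, float(pts))
print("duration of four pictures in ticks:", float(end_pts))
```

In this sketch the four film pictures span 15015 ticks, which equals five video frame periods of 3003 ticks each, so the coded 24 FPS material fills the 30 FPS output timeline.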
However, for compressed video formats without these flags (or similar flags), the 3:2 pull-down process or the film mode must be supported in a different manner. Such flag-based support is not provided by, for example, newer video compression standards (e.g., MPEG-4 Advanced Video Coding (AVC)) or by some of the existing video transport standards (e.g., MPEG-2 Systems).
In formats other than those following the MPEG-2 Systems standard (i.e., ISO/IEC 13818-1), decoding time and presentation time may be indicated via syntax elements that differ from DTS specifications and PTS specifications found in MPEG-2 Systems. As used herein, the terms DTS and PTS may be interpreted as including, within their meaning, decoding time or buffer removal time and presentation time or display time, respectively, regardless of how they may be encoded in the bitstream.
Further limitations and disadvantages of conventional and traditional approaches will become apparent to one of ordinary skill in the art through comparison of such systems with some aspects of the present invention as set forth in the remainder of the present application with reference to the drawings.