In countries that use 525-line interlaced display systems, such as the United States and Canada, television video signals are sampled and transmitted at approximately 59.94 fields per second (fps). For such countries, digital television video streams are generally encoded and transmitted by using a particular Moving Picture Experts Group (MPEG) standard (i.e., MPEG-2 video) at approximately 29.97 frames per second (FPS).
Hereinafter, an integral value of fps or an integral value of FPS may be an approximation including, within its scope, a range of equivalent values. Thus, for example, the expression 30 FPS may be used to refer to rates such as, for example, approximately 29.97 FPS or approximately 30 FPS. Furthermore, the expression 24 FPS may be used to refer to rates such as, for example, approximately 23.976 FPS or approximately 24 FPS. Similarly, the expression 60 fps may be used to refer to rates such as, for example, approximately 59.94 fps or approximately 60 fps.
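By way of illustration only, the approximate rates named above are commonly expressed as the nominal integral rates scaled by a factor of 1000/1001. The following minimal sketch shows that arithmetic. The names NTSC_SCALE and ntsc_rate are hypothetical and are introduced here solely for illustration.

```python
from fractions import Fraction

# Commonly cited scale factor relating nominal integral rates to the
# approximate rates (e.g., 30 -> 29.97). Hypothetical name for illustration.
NTSC_SCALE = Fraction(1000, 1001)


def ntsc_rate(nominal: int) -> Fraction:
    """Return the 1000/1001-scaled rate for a nominal integral rate."""
    return nominal * NTSC_SCALE


if __name__ == "__main__":
    for nominal in (24, 30, 60):
        exact = ntsc_rate(nominal)
        print(f"{nominal} -> {exact} (approximately {float(exact):.3f})")
```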
Film material produced at 24 FPS is routinely converted to 60 fps in many applications. Broadcast networks usually encode and transmit movies that were originally filmed at 24 FPS in that 24 FPS form rather than at 60 fps. However, at the receiver, the decoded video at 24 FPS is often converted to 60 fps for interlaced display. A conventional process for converting 24 FPS material to 60 fps sampling is the Telecine Process (named after the original type of machine used to perform the conversion from film to video), which is also known as the 3:2 pull-down process. The Telecine Process inserts repeated fields derived from the original film frames in such a way that 5 video frames (i.e., 10 fields) are produced for every 4 original film frames. FIG. 1 illustrates one example of a process 12 that performs a 3:2 pull-down. The original film sequence 10 filmed at 24 FPS is converted to a video sequence 14 at 30 FPS or, equivalently, 60 fps.
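The 3:2 pull-down described above may be sketched as follows. This is a minimal illustration and not a definitive implementation. It assumes that the sequence begins with a top field, that field parity strictly alternates, and that every second film frame contributes three fields. The function name pulldown_32 and the representation of a field as a (frame label, parity) pair are hypothetical.

```python
from typing import List, Tuple

# A field is modeled as (film frame label, parity), where parity is
# "T" (top) or "B" (bottom). This representation is an assumption.
Field = Tuple[str, str]


def pulldown_32(frames: List[str]) -> List[Field]:
    """Expand film frames into interlaced fields using a 2-3-2-3 cadence."""
    fields: List[Field] = []
    parity = "T"  # assumption: the output sequence starts with a top field
    for index, frame in enumerate(frames):
        # Even-indexed frames yield two fields, odd-indexed frames yield three.
        count = 2 if index % 2 == 0 else 3
        for _ in range(count):
            fields.append((frame, parity))
            parity = "B" if parity == "T" else "T"
    return fields


if __name__ == "__main__":
    out = pulldown_32(["A", "B", "C", "D"])
    print(len(out), "fields:", out)
```

In this sketch, four film frames yield ten fields, which pair into five interlaced video frames.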
For film material that has been converted to video, it is often desirable to restore the film sequence to a 24 FPS form prior to compression by eliminating, for example, the repeated fields inserted by the Telecine Process. Such a process reduces the amount of data for compression, thereby improving the quality of video or reducing the bit rate for transmission. The process of eliminating the repeated fields is commonly known as the inverse Telecine Process or the inverse 3:2 pull-down process. FIG. 1 also illustrates one example of the process 16 that performs an inverse 3:2 pull-down. The video sequence 14 at 30 FPS is restored or converted into the film sequence 18 at 24 FPS. The mechanism for handling the 3:2 pull-down and/or the inverse 3:2 pull-down for film material in digital video systems is usually referred to as film mode.
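The inverse 3:2 pull-down may likewise be sketched under simplifying assumptions. Here each field carries a label identifying its source film frame, so a repeated field can be recognized by exact equality with the field of the same parity two positions earlier. A practical detector would instead compare picture content, which is not shown. The names TELECINED, inverse_pulldown and weave are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Field = Tuple[str, str]  # (film frame label, parity "T" or "B"); an assumption

# Ten fields produced from four film frames A, B, C and D by a 2-3-2-3 cadence.
TELECINED: List[Field] = [
    ("A", "T"), ("A", "B"),
    ("B", "T"), ("B", "B"), ("B", "T"),
    ("C", "B"), ("C", "T"),
    ("D", "B"), ("D", "T"), ("D", "B"),
]


def inverse_pulldown(fields: List[Field]) -> List[Field]:
    """Drop each field that repeats the same-parity field two positions back."""
    kept: List[Field] = []
    for index, field in enumerate(fields):
        if index >= 2 and field == fields[index - 2]:
            continue  # repeated field inserted by the pull-down
        kept.append(field)
    return kept


def weave(fields: List[Field]) -> List[str]:
    """Return, in order, the frame labels for which both parities are present."""
    parities: Dict[str, Set[str]] = {}
    for label, parity in fields:
        parities.setdefault(label, set()).add(parity)
    return [label for label, seen in parities.items() if seen == {"T", "B"}]


if __name__ == "__main__":
    restored = inverse_pulldown(TELECINED)
    print(len(restored), "fields ->", weave(restored))
```

Under these assumptions, two of every ten fields are discarded and the remaining eight fields pair into the four original film frames.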
A film mode for encoding, decoding and displaying converted film material exists in MPEG-2 video. However, the use of the film mode in MPEG-2 video results in encoded streams that are specifically adapted for 30 FPS, interlaced display devices. Such adaptations may be disadvantageous for decoders that otherwise would benefit from having the content in 24 FPS form, for decoders that are coupled to progressive (non-interlaced) display devices, and for decoders that perform format conversion to, for example, high definition display devices.
When using the MPEG-2 video standard with the film mode, the frame rate encoded in the sequence header is 30 FPS for interlaced display, even though the video is actually coded as a 24 FPS film sequence. The encoder also conveys, to the decoder, proper display timing based on the frame rate of 30 FPS. The flags top_field_first and repeat_first_field in the picture coding extension header are used for indicating how a picture should be displayed. These two flags are mandated as MPEG-2 syntax elements that are carried all the time and are followed by the decoder. However, such inflexibility may not be desirable, particularly when the type of display device can vary from, for example, an interlaced television to a progressive monitor. Furthermore, the encoder does not know the type of display employed at the decoder end.
In MPEG-2 video elementary streams, the flags top_field_first and repeat_first_field are used to indicate the current film state. Four film states A, B, C and D are illustrated in FIG. 1. The four possible film mode states are generally repeated in the same order every four pictures. FIG. 2 illustrates the mapping between the film states and these 3:2 pull-down flags in MPEG-2 video.
Film mode encoding may refer to a situation in which an encoder directly compresses a 24 FPS sequence or in which an encoder uses an inverse 3:2 pull-down process to convert a 30 FPS video to a 24 FPS sequence and subsequently performs compression. If the input sequence prior to compression by an encoder is a 24 FPS film sequence, top_field_first and repeat_first_field flags indicate the “fields” that need to be repeated for a 30 FPS display device. If the input sequence is a 30 FPS video sequence that was converted from a film sequence, then a 3:2 pull-down detector is used to restore the film sequence prior to encoding. In this case, two repeated fields are removed from each ten-field sequence by the 3:2 pull-down detector as illustrated in FIG. 1.
In MPEG-2, the decoder generally follows the top_field_first and repeat_first_field flags to display film state B and D frames for three field times to re-construct the 3:2 pull-down pattern. The decoder re-displays the first field to create the third field. This is because, in the 3:2 pull-down algorithm, the first field is repeated every other picture to convert film material at 24 FPS to video mode at 30 FPS. Film state A and C pictures are displayed for only two field times. A film mode sequence of four pictures will therefore be displayed as a total of 10 field times. In this way, the decoded video is displayed at the correct video picture rate of 30 FPS. However, this may be undesirable for decoding systems that employ progressive displays or decoding systems that would otherwise benefit from direct 24 FPS progressive sequences.
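The display behavior described above may be sketched as follows. The association of particular top_field_first values with film states A through D is an assumption made for illustration, since the mapping of FIG. 2 is not reproduced in this text. Only the repeat_first_field values for states B and D follow directly from the description. The names Picture, FILM_STATES and display_fields are hypothetical.

```python
from typing import List, NamedTuple


class Picture(NamedTuple):
    state: str
    top_field_first: bool
    repeat_first_field: bool


# Assumed flag values per film state (not taken from FIG. 2): states B and D
# repeat their first field, and field parity alternates across the sequence.
FILM_STATES: List[Picture] = [
    Picture("A", top_field_first=True, repeat_first_field=False),
    Picture("B", top_field_first=True, repeat_first_field=True),
    Picture("C", top_field_first=False, repeat_first_field=False),
    Picture("D", top_field_first=False, repeat_first_field=True),
]


def display_fields(picture: Picture) -> List[str]:
    """Return the order in which a decoded frame's fields are displayed."""
    if picture.top_field_first:
        first, second = "top", "bottom"
    else:
        first, second = "bottom", "top"
    fields = [first, second]
    if picture.repeat_first_field:
        fields.append(first)  # the first field is re-displayed as a third field
    return fields


if __name__ == "__main__":
    total = 0
    for picture in FILM_STATES:
        shown = display_fields(picture)
        total += len(shown)
        print(picture.state, shown)
    print("total field times:", total)
```

With these assumed flag values, one cycle of four pictures occupies ten field times, consistent with the 30 FPS output described above.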
In MPEG-2, the flags top_field_first and repeat_first_field, along with the frame rate, can also be applied to derive Decoding Time Stamps (DTS) and Presentation Time Stamps (PTS) for some pictures. The flags (i.e., top_field_first and repeat_first_field) are used to achieve proper timing for decoding and displaying the coded 24 FPS film material to generate output video at 30 FPS. However, this may not be desirable when the display device is not an interlaced television (e.g., when it is a progressive monitor). In general, the encoder does not know the type of display employed at the decoder end. The problem may be further compounded because, in broadcast systems, there may be many decoders decoding the same signal and a number of different types of monitors being employed to display the same signal. In markets where 60 fps interlaced televisions are most common, current broadcast systems commonly use the MPEG-2 film mode flags and therefore create compressed bit streams that are optimized for display only on 60 fps interlaced displays. However, the assumption that the display is a 60 fps interlaced device may no longer be valid in light of the massive deployment of progressive displays. Furthermore, the increased proliferation of high definition, 60 fps interlaced displays also challenges the assumptions made in conventional systems. Methods of converting standard definition content for display on such devices use progressive frame-based video signals where possible, instead of fields of video.
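The manner in which presentation timing may follow from repeat_first_field and the field rate can be sketched as follows. This is a simplified illustration and not the derivation used by any particular encoder. It assumes that the pictures are listed in display order, that time stamps are counted in units of a 90 kHz clock, and that the field rate is 60000/1001 fields per second. Decoding time stamps and picture reordering are not modeled. The names SYSTEM_CLOCK_HZ, FIELD_RATE, TICKS_PER_FIELD and presentation_times are hypothetical.

```python
from fractions import Fraction
from typing import Iterable, List

SYSTEM_CLOCK_HZ = 90_000               # assumed time stamp resolution
FIELD_RATE = Fraction(60_000, 1001)    # assumed field rate (about 59.94 fps)
TICKS_PER_FIELD = SYSTEM_CLOCK_HZ / FIELD_RATE  # 1501.5 ticks per field


def presentation_times(repeat_flags: Iterable[bool],
                       start: int = 0) -> List[Fraction]:
    """Return the presentation time of each picture in 90 kHz ticks.

    A picture occupies three field times when its repeat_first_field flag
    is set and two field times otherwise.
    """
    times: List[Fraction] = []
    current = Fraction(start)
    for repeat_first_field in repeat_flags:
        times.append(current)
        current += (3 if repeat_first_field else 2) * TICKS_PER_FIELD
    return times


if __name__ == "__main__":
    # One cycle of film states A, B, C, D followed by the next state A.
    for ticks in presentation_times([False, True, False, True, False]):
        print(float(ticks))
```

In this sketch, the fifth picture is presented ten field times after the first, which corresponds to four film frames occupying the time of five video frames.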
Further limitations and disadvantages of conventional and traditional approaches will become apparent to one of ordinary skill in the art through comparison of such systems with some aspects of the present invention as set forth in the remainder of the present application with reference to the drawings.