For most applications, digital video is displayed at a known frame rate or at a known field rate. For example, in countries that use 525-line interlaced display systems, such as the United States and Canada, television video signals are sampled and transmitted at approximately 59.94 fields per second (fps). In such countries, digital television video streams are generally encoded and transmitted using a Moving Picture Experts Group (MPEG) standard (e.g., MPEG-2 Video) at approximately 29.97 frames per second (FPS).
Hereinafter, an integral value of fps or of FPS may be an approximation that includes, within its scope, a range of equivalent values. Thus, for example, the expression 30 FPS may be used to refer to rates such as approximately 29.97 FPS or approximately 30 FPS. Furthermore, the expression 24 FPS may be used to refer to rates such as approximately 23.976 FPS or approximately 24 FPS. Similarly, the expression 60 fps may be used to refer to rates such as approximately 59.94 fps or approximately 60 fps.
A given frame rate implies a fixed inter-picture display time interval. Therefore, the frame rate can be used to derive the decoding and presentation times for some or all of the pictures in a video sequence.
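The derivation described above can be illustrated with a minimal sketch. It assumes pictures are presented in display order at a constant rate, and it takes 30000/1001 as the exact value behind "approximately 29.97 FPS"; the function name and the constant are hypothetical and do not come from any standard.

```python
from fractions import Fraction

# Assumption: "approximately 29.97 FPS" is taken to be exactly 30000/1001.
ASSUMED_FRAME_RATE = Fraction(30000, 1001)

def presentation_times(frame_rate, num_pictures, first_time=Fraction(0)):
    """Return display times, in seconds, for pictures shown at a constant rate."""
    # A constant frame rate implies one fixed inter-picture display interval.
    interval = 1 / Fraction(frame_rate)
    # Picture n (in display order) is presented n intervals after the first.
    return [first_time + n * interval for n in range(num_pictures)]

times = presentation_times(ASSUMED_FRAME_RATE, 4)
```

Exact rational arithmetic is used so that repeated addition of the interval does not accumulate rounding error over a long sequence.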
In the MPEG-1 Systems and MPEG-2 Systems standards, there are syntax elements such as a decoding time stamp (DTS) and a presentation time stamp (PTS), which specify the decoding time and presentation time, respectively, of some pictures in terms of a hypothetical reference decoder model called a System Target Decoder (STD). The decoding time refers to the time that a compressed picture is removed from a buffer in the STD model. This is not necessarily the exact time that a practical decoder decodes the picture. The presentation time in the STD model is the time a picture is presented. This may be taken to mean the display time in a real decoder, although practical decoders may display pictures at slightly different times than those indicated by the video stream. The MPEG-2 Video standard (i.e., ISO/IEC 13818-2) specifies a VBV_delay syntax element that conveys information similar to that contained in the DTS. In the hypothetical video decoder model, VBV_delay represents the delay from the time certain compressed data enters a buffer in the model to the time a given picture is extracted from that buffer. In the MPEG-1 and MPEG-2 standards, the DTS, PTS and VBV_delay syntax elements specify times in units of a 90 kHz reference clock. The principle of providing decoding and presentation times can apply to any video coding and decoding standard regardless of the syntax specification and the units of time reference. Hereinafter, the terms presentation time and display time are used interchangeably, and the terms decoding time and buffer removal time are used interchangeably.
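As a hedged illustration of the 90 kHz time base, the following sketch converts between seconds and clock ticks. The helper names are hypothetical, the 33-bit width is the per-time-stamp size this description attributes to MPEG-2, and the wraparound is modeled as a simple modulo rather than as any standard's exact bitstream encoding.

```python
from fractions import Fraction

CLOCK_HZ = 90_000      # 90 kHz reference clock for DTS, PTS and VBV_delay
TIMESTAMP_BITS = 33    # assumed width of a PTS or DTS value

def seconds_to_ticks(seconds):
    """Convert a time in seconds to 90 kHz ticks, wrapped to 33 bits."""
    ticks = round(Fraction(seconds) * CLOCK_HZ)
    return ticks % (1 << TIMESTAMP_BITS)

def ticks_to_seconds(ticks):
    """Convert 90 kHz ticks back to an exact time in seconds."""
    return Fraction(ticks, CLOCK_HZ)

# One frame interval at an assumed exact rate of 30000/1001 FPS.
frame_ticks = seconds_to_ticks(Fraction(1001, 30000))
```

Under these assumptions one frame interval at 30000/1001 FPS is a whole number of ticks (3003), and a 33-bit counter at 90 kHz wraps after roughly 26.5 hours.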
However, for some video applications, the intervals between successive pictures may not be constant or fixed. It may be desirable for a video coding standard to support dynamically variable frame rates or different types and durations of pictures interspersed to form a video sequence. Thus, variable inter-picture decoding and display times need to be supported, and the inter-picture display time can no longer be represented properly by a frame rate. It may be considered necessary to encode the presentation times of all pictures, which is not necessary in MPEG-2 because MPEG-2 uses constant frame rates. The inclusion of a presentation time stamp such as PTS for every picture, or of a decoding time stamp such as DTS for every picture, is undesirable because of the number of bits (e.g., 33 bits per time stamp) and the syntax layers used for encoding in the MPEG-2 standard. Thus, conventional systems and methods may lack an efficient method to represent and to encode presentation times with either fixed or variable inter-picture display intervals.
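The following sketch illustrates why a single frame rate no longer suffices once display intervals vary, and what a per-picture time stamp costs in raw bits. The function names and the sample durations are hypothetical, and the bit count covers only the 33-bit value itself, ignoring any surrounding header syntax.

```python
from fractions import Fraction

TIMESTAMP_BITS = 33  # per-time-stamp size cited above for MPEG-2

def display_times(durations, start=Fraction(0)):
    """Return each picture's display time given per-picture display durations."""
    times = []
    t = Fraction(start)
    for d in durations:
        times.append(t)   # picture is shown once all earlier durations elapse
        t += Fraction(d)
    return times

def timestamp_overhead_bits(num_pictures, stamps_per_picture=1):
    """Raw bits spent if every picture carries its own time stamp(s)."""
    return num_pictures * stamps_per_picture * TIMESTAMP_BITS

# Hypothetical sequence that switches from 1/24 s to 1/30 s pictures.
durations = [Fraction(1, 24), Fraction(1, 24), Fraction(1, 30), Fraction(1, 30)]
times = display_times(durations)
overhead = timestamp_overhead_bits(num_pictures=30, stamps_per_picture=2)
```

In this example the display times cannot be regenerated from one rate value, and carrying both a PTS and a DTS on each of 30 pictures would spend 1980 bits on time stamp values alone.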
The inter-picture decoding interval is, in general, not the same as the inter-picture display interval. For example, in MPEG-2 Video, while the frame rate is constant for a given sequence, the interval between the decoding times of successive pictures, represented by DTS or by the VBV_delay field, may vary independently for every picture. However, in MPEG-1 and MPEG-2, the decoding time is specified with reference to the time at which specific data elements enter a buffer in a hypothetical model, which requires precise specifications of when each data element enters the hypothetical buffer. Precise specifications may not be available in all cases. Precise specifications also tend to require large syntax elements to specify the decoding time and to add complexity in specifying and interpreting the DTS and VBV_delay values.
Further limitations and disadvantages of conventional and traditional approaches will become apparent to one of ordinary skill in the art through comparison of such systems with some aspects of the present invention as set forth in the remainder of the present application with reference to the drawings.