a. Field of the Invention
The present invention concerns the communication, or distribution, of encoded data, such as MPEG (i.e., Moving Picture Experts Group) or MPEG-2 encoded video data for example, from a server (such as an MPEG transport stream server for example) to one or more decoders.
b. Related Art
The MPEG-2 standard focuses on the encoding and transport of video and audio data. In general, the MPEG-2 standard uses compression algorithms such that video and audio data may be more efficiently stored and communicated.
The International Organisation for Standardisation (or the Organisation Internationale De Normalisation) (hereinafter referred to as "the ISO/IEC") has produced drafts of the MPEG-2 standard for the coding of moving pictures and associated audio. This standard is set forth in four documents. The document ISO/IEC 13818-1 (systems) specifies the system coding of the specification. It defines a multiplexed structure for combining audio and video data and means of representing the timing information needed to replay synchronized audio and video sequences in real-time. The document ISO/IEC 13818-2 (video) specifies the coded representation of video data and the decoding process required to reconstruct pictures. The document ISO/IEC 13818-3 (audio) specifies the coded representation of audio data and the decoding process required to reconstruct the audio data. Lastly, the document ISO/IEC 13818-4 (conformance) specifies procedures for determining the characteristics of coded bitstreams and for testing compliance with the requirements set forth in the ISO/IEC documents 13818-1, 13818-2, and 13818-3. These four documents, hereinafter referred to, collectively, as "the MPEG-2 standard" or simply "the MPEG standard", are incorporated herein by reference.
A bit stream, multiplexed in accordance with the MPEG-2 standard, is either a "transport stream" or a "program stream". Both program and transport streams are constructed from "packetized elementary stream" (or PES) packets and packets containing other necessary information. A "packetized elementary stream" (or PES) packet is a data structure used to carry "elementary stream data". An "elementary stream" is a generic term for one of (a) coded video, (b) coded audio, or (c) other coded bit streams carried in a sequence of PES packets with one and only one stream identifier (or "ID"). Both program and transport streams support multiplexing of video and audio compressed streams from one program with a common time base.
Transport streams permit one or more programs with one or more independent time bases to be combined into a single stream. Transport streams are useful in instances where data storage and/or transport means are lossy or noisy. The rate of transport streams, and of their constituent packetized elementary streams (PESs), may be fixed or variable. This rate is defined by the values and locations of program clock reference (or PCR) fields within the transport stream.
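As a rough illustration of how the values and locations of PCR fields fix the rate, the following sketch estimates the rate between two PCR-bearing positions in a stream. It assumes the 27 MHz system clock of ISO/IEC 13818-1; the function and parameter names are hypothetical, and PCR wraparound is not handled.

```python
SYSTEM_CLOCK_HZ = 27_000_000  # assumed: MPEG-2 system clock frequency


def transport_rate_bps(byte_index_a, pcr_a, byte_index_b, pcr_b):
    """Estimate the stream rate in bits per second between two PCR samples.

    byte_index_a, byte_index_b: byte offsets of the two PCR fields in the stream
    pcr_a, pcr_b: the corresponding PCR values, in 27 MHz clock ticks
    """
    ticks = pcr_b - pcr_a
    if ticks <= 0:
        raise ValueError("PCR values must increase (wraparound not handled)")
    bytes_sent = byte_index_b - byte_index_a
    # bits transferred divided by elapsed seconds (ticks / clock frequency)
    return bytes_sent * 8 * SYSTEM_CLOCK_HZ / ticks
```

For instance, 100 transport stream packets (18,800 bytes) delivered between PCR values that are 1,080,000 ticks (40 ms) apart correspond to a rate of 3.76 Mbit/s.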
FIG. 1 illustrates the packetizing of compressed video data 106 of a video sequence 102 into a stream of PES packets 108, and then, into a stream of transport stream packets 112. Specifically, a video sequence 102 includes various headers 104 and associated compressed video data 106. The video sequence 102 is parsed into variable length segments, each having an associated PES packet header 110 to form a PES packet stream 108. The PES packet stream 108 is then parsed into segments, each of which is provided with a transport stream header 114 to form a transport stream 112. Each transport stream packet of the transport stream 112 is 188 bytes in length.
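The second parsing step of FIG. 1 can be sketched as follows. This is a simplified illustration, not the packetizer of any particular server: the header bit layout, the continuity counter, and the use of adaptation field stuffing to fill the last packet are taken from ISO/IEC 13818-1 rather than from the description above, and the function name is hypothetical.

```python
TS_PACKET_SIZE = 188
TS_HEADER_SIZE = 4
SYNC_BYTE = 0x47


def packetize_pes(pes: bytes, pid: int) -> list[bytes]:
    """Split one PES packet into 188 byte transport stream packets."""
    packets = []
    max_payload = TS_PACKET_SIZE - TS_HEADER_SIZE  # 184 bytes
    for counter, offset in enumerate(range(0, len(pes), max_payload)):
        chunk = pes[offset:offset + max_payload]
        starts_pes = 1 if offset == 0 else 0  # payload_unit_start_indicator
        if len(chunk) == max_payload:
            afc = 0b01  # payload only
            adaptation = b""
        else:
            afc = 0b11  # adaptation field followed by payload
            af_len = max_payload - 1 - len(chunk)
            adaptation = bytes([af_len])
            if af_len > 0:
                # one flags byte (all clear), then 0xFF stuffing bytes
                adaptation += b"\x00" + b"\xff" * (af_len - 1)
        header = bytes([
            SYNC_BYTE,
            (starts_pes << 6) | ((pid >> 8) & 0x1F),
            pid & 0xFF,
            (afc << 4) | (counter & 0x0F),  # 4 bit continuity counter
        ])
        packets.append(header + adaptation + chunk)
    return packets
```

Under these assumptions a 400 byte PES packet yields three transport stream packets carrying 184, 184, and 32 payload bytes, the last one padded to 188 bytes by stuffing.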
Although the syntax of the transport stream 112 and transport stream packets is described in the MPEG-2 standard, the fields of the transport stream packet pertaining to the present invention will be described below with reference to FIG. 2 for the reader's convenience. As shown in FIG. 2, a transport stream 112 includes one or more 188 byte transport stream packets 200, each of the transport stream packets 200 having a header 114 and an associated payload 216.
Each header 114 includes an eight (8) bit synch byte field 218 and a packet identification (or PID) field 220. The synch byte field 218 has a value of "01000111" (or 47 hex) and identifies the start of a 188 byte transport stream packet 200. The PID field 220 indicates the type of data (e.g., audio, video, secondary audio program (or "SAP"), private, etc.) stored in the payload 216 of the 188 byte transport stream packet. Certain PID values are reserved.
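By way of illustration, the synch byte check and PID extraction might be sketched as below. The 13 bit width of the PID field and its position in the second and third header bytes are taken from ISO/IEC 13818-1, not from the description above; the function name is hypothetical.

```python
def parse_ts_header(packet: bytes) -> dict:
    """Return the synch byte and PID of one 188 byte transport stream packet."""
    if len(packet) != 188:
        raise ValueError("a transport stream packet is 188 bytes long")
    if packet[0] != 0x47:
        raise ValueError("synch byte 0x47 not found at start of packet")
    # PID: low 5 bits of byte 1 followed by all 8 bits of byte 2
    pid = ((packet[1] & 0x1F) << 8) | packet[2]
    return {"sync_byte": packet[0], "pid": pid}
```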
The payloads 216 of one or more transport stream packets 200 may carry "packetized elementary stream" (or PES) packets 300. To reiterate, a "packetized elementary stream" (or PES) packet 300 is a data structure used to carry "elementary stream data" and an "elementary stream" is a generic term for one of (a) coded video, (b) coded audio, or (c) other coded bit streams carried in a sequence of PES packets with one and only one stream ID.
FIG. 3 is a diagram which illustrates the syntax of a PES packet 300. As FIG. 3 shows, a PES packet 300 includes a 24 bit start code prefix field 302, an eight (8) bit stream identifier field 304, a sixteen (16) bit PES packet length field 306, an optional PES header 308, and a payload section 106. Each of these fields is described in the MPEG-2 standard. However, for the reader's convenience, the fields particularly relevant to the present invention are described below.
The sixteen (16) bit PES packet length field 306 specifies the number of bytes in the PES packet 300 following this field 306. A value of 0 in this field 306 indicates that the PES packet length is neither specified nor bounded. Such an unspecified and unbounded PES packet 300 is only allowed in PES packets whose payload is a video elementary stream contained in transport stream packets. As can be deduced from the description of the PES packet length field 306, the PES packet 300 can be much longer (e.g., 4000 bytes) than the length of the payload 216 of a 188 byte transport stream packet. Thus, as shown in FIG. 1, a PES packet 300 is typically carried in consecutive payloads 216 of a series of transport stream packets. The payload 106 of a PES packet 300 may carry a sequence of video frames or audio frames, for example.
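The fixed fields at the start of a PES packet, and the number of transport stream packets a PES packet spans, might be sketched as follows. The start code prefix value 0x000001 is taken from ISO/IEC 13818-1; the helper names are hypothetical, and the packet count assumes 184 payload bytes per transport stream packet (a four byte header and no adaptation field).

```python
def parse_pes_header(data: bytes):
    """Return (stream_id, length) from the first six bytes of a PES packet.

    length is None when the PES packet length field is 0, i.e. the
    length is neither specified nor bounded.
    """
    if len(data) < 6:
        raise ValueError("need at least 6 bytes of PES header")
    if data[0:3] != b"\x00\x00\x01":
        raise ValueError("24 bit start code prefix 0x000001 not found")
    stream_id = data[3]
    length = int.from_bytes(data[4:6], "big")  # bytes following this field
    return stream_id, (None if length == 0 else length)


def ts_packets_needed(pes_total_bytes: int, payload_bytes: int = 184) -> int:
    """Transport stream packets needed to carry a PES packet (ceiling division)."""
    return -(-pes_total_bytes // payload_bytes)
```

Under these assumptions, a PES packet of 4000 bytes in total occupies the payloads of 22 consecutive transport stream packets.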
FIG. 4 is a high level block schematic showing a system 400 for communicating and decoding video and audio data in accordance with the MPEG-2 standard. This system 400 basically includes an MPEG stream server 402 which provides data to an MPEG decoder 404 via a communications link 406.
The MPEG stream server 402 includes a storage system 408, a timing and control unit 410, a transfer buffer 412, and an interface unit 414. The storage system 408 stores files of packetized encoded data, such as PES or transport stream packets of MPEG data for example. The encoded data has been encoded, by an MPEG encoder for example, at an encoder rate. The storage system 408 may include a disk or array of disks which are well-known in the art. The timing and control unit 410 controls a reading out of one or more files stored in the storage system 408 based on a clock signal CLK. The files of packetized encoded data read out from the storage system 408 are buffered in the transfer buffer 412. Under the control of the timing and control unit 410, the interface unit 414 provides data, stored in the transfer buffer 412, to the communications link 406. The interface unit 414 multiplexes packets of encoded audio and video data to form a program stream or a transport stream 112. The control signals provided by the timing and control unit 410 to the interface unit 414 may be based on the clock signal CLK or may be based on an independent clock signal.
At a remote end of the communications link 406, the MPEG decoder 404 includes a transport stream demultiplexer 416, a video decoder 418, an audio decoder 420, and a clock control unit 422. The transport stream demultiplexer 416 receives the output of the stream server 402 in the form of a transport stream 112. Based on the packet identification (or PID) number 220 of a particular transport stream packet 200, the transport stream demultiplexer 416 separates the encoded audio and video packets and provides the video packets to the video decoder 418 and the audio packets to the audio decoder 420. The transport stream demultiplexer 416 also provides timing information to the clock control unit 422. The clock control unit 422 provides timing signals to both the video decoder 418 and the audio decoder 420 based on the timing information provided by the transport stream demultiplexer 416. The video decoder 418 provides decoded video data which corresponds to the video data originally encoded. Similarly, the audio decoder 420 provides decoded audio data which corresponds to the audio data originally encoded.
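The PID-based separation performed by the transport stream demultiplexer 416 can be illustrated with the minimal sketch below. It assumes the video and audio PIDs are already known (a practical decoder would learn them from program-specific tables in the stream), and the function name, the mapping argument, and the example PID values are hypothetical.

```python
from collections import defaultdict


def demultiplex(packets, pid_to_decoder):
    """Group transport stream packets by destination decoder.

    packets: iterable of 188 byte transport stream packets
    pid_to_decoder: mapping such as {0x100: "video", 0x101: "audio"}
    """
    queues = defaultdict(list)
    for packet in packets:
        if len(packet) != 188 or packet[0] != 0x47:
            continue  # this sketch simply skips malformed packets
        pid = ((packet[1] & 0x1F) << 8) | packet[2]
        decoder = pid_to_decoder.get(pid)
        if decoder is not None:  # packets with unlisted PIDs are dropped
            queues[decoder].append(packet)
    return dict(queues)
```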
In any real-time system utilizing a digital source to derive an analog signal, the requisite number of data bits must be provided to analog generation circuitry in a timely manner so that the analog signal may be generated and transmitted to preserve the real-time characteristics of the system. In the current digital art, each video channel is derived from a single storage medium and a concomitant storage controller. In a system having mismatches in capacity between the digital source and the analog generation circuitry, such as will occur when digital data must be retrieved from local storage and transmitted as a digital stream over a transport medium, the real-time operation is aided by the use of intermediate buffer memory. The buffer memory ensures the requisite bits are available to the analog generation circuitry when needed.
It is important to provide a buffer memory of the proper size. Too little memory will cause lost analog frames, and too much memory is costly. As shown in FIG. 4, encoded data, such as encoded video data for example, may be buffered in the decoder 404 at a point "A" before the transport demultiplexer 416 and/or at a point "B" between the transport demultiplexer 416 and the video decoder 418.
Similarly, decoding encoded data involves the timely delivery of the encoded data (which comprise a video program for example) from a storage system (e.g., server 402) to the decoder 404. Of particular importance to the timely delivery of encoded data are two sometimes unavoidable characteristics of the delivery process itself, namely "drift" and "jitter".
Drift is a monotonic error in the rate of data transfer from the server 402 to the decoder 404. Drift occurs when, on average, the rate at which a decoder (e.g., video decoder 418) consumes encoded data differs from the rate at which encoded data are provided (e.g., by the transport demultiplexer 416). Drift may cause a decoder buffer to run dry or overflow, particularly during extended transfers of data. If the decoder buffer runs dry, the decoder has no data to decode. Thus, for example, a video decoder would have to display the same frame for two or more consecutive frame periods. If, on the other hand, the decoder buffer overflows, data are lost. Drift is of particular concern when relatively long transfers of encoded data are provided to a relatively small decoder buffer.
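The effect of drift on a decoder buffer can be estimated with simple arithmetic, sketched below. The function name and the numbers in the usage note are hypothetical and serve only to show the orders of magnitude involved.

```python
def time_to_buffer_failure(buffer_bits, fill_bits, supply_bps, consume_bps):
    """Return (failure_kind, seconds) for a constant rate mismatch.

    failure_kind is "overflow" when data arrive faster than they are
    consumed, "underflow" when the buffer runs dry, and None (with
    seconds None) when the two rates match exactly.
    """
    net_bps = supply_bps - consume_bps
    if net_bps > 0:
        return "overflow", (buffer_bits - fill_bits) / net_bps
    if net_bps < 0:
        return "underflow", fill_bits / -net_bps
    return None, None
```

For example, with a 1 Mbit buffer that starts half full and a 4 Mbit/s stream supplied 100 parts per million too fast (an excess of 400 bit/s), the buffer overflows after 500,000 / 400 = 1250 seconds, or roughly 21 minutes. Halving the buffer halves that time, which is why long transfers into small buffers are the critical case.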
Jitter is a random variation in the rate of data transfer from the server to the decoder. Jitter may cause variations in the level of decoder input buffers. However, since jitter tends to average out over time, it typically does not cause a decoder buffer to run dry or overflow.
In the known system 400, the MPEG digital source (i.e., the stream server 402) is the master of downstream circuitry (i.e., the decoder 404). That is, the downstream circuitry must be arranged to process the incoming data bits without the ability to control the rate at which incoming bits arrive. In operation, the MPEG stream server 402 provides the MPEG data to the MPEG decoder 404 (or video decoder 418) at a constant output rate that matches an original MPEG encode rate. This encode rate is either provided by the encoder or may be calculated from the size of the file containing the MPEG data and the number of frames contained within that data file. Thus, in the known system 400, the stream server 402 and the communications link 406 must provide the packets of encoded data at a fixed rate.
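A minimal sketch of the second option, calculating the encode rate from the file size and the frame count, follows. It assumes a constant, known frame rate, which the calculation needs in addition to the two quantities named above; the function name and the example figures are hypothetical.

```python
def encode_rate_bps(file_size_bytes, frame_count, frames_per_second):
    """Estimate the average encode rate of a stored program in bits per second."""
    if frame_count <= 0 or frames_per_second <= 0:
        raise ValueError("frame count and frame rate must be positive")
    duration_s = frame_count / frames_per_second  # playing time of the file
    return file_size_bytes * 8 / duration_s
```

For example, a 45,000,000 byte file holding 2700 frames at an assumed 30 frames per second plays for 90 seconds, giving an average rate of 4 Mbit/s.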
Small buffers located at "A" or "B" usually suffice to absorb random variations in the provision of encoded data, due to jitter for example. Unfortunately, even if the MPEG encode rate, the MPEG stream server output rate and the MPEG decoder rate are nominally equal, problems can occur in the system 400 nonetheless, due to data transmission anomalies which alter the data transmission rate. In particular, as discussed above, drift of the transmitted MPEG data may cause the decoder buffers to overflow or run dry, particularly with relatively long transfers and relatively small buffers. Such anomalies degrade the overall quality and reliability of transmitted MPEG data and, ultimately, the experience of viewers of a transmitted program.
Although the probability of queue overflow can be reduced by increasing the size of the decoder buffers, this solution increases the cost of the decoder 404. If multiple decoders 404 are needed, this increased memory cost is exacerbated.
An additional transmission-rate related problem which limits the prior art system 400 occurs when different programs are MPEG encoded at different encode rates and then transmitted as concatenated programs. To reiterate, a transport stream 112 may include one or more programs. In particular, video programs which comprise a sequence of video data are MPEG encoded and then stored as files in the storage system 408 of the stream server 402. Due to inherent differences in the video data from one program to another, not all of the programs are MPEG encoded at the same encode rate, however. Thus, during operations of the stream server 402, it may be necessary to transmit a first encoded program at a given rate (the program encode rate) and then transmit another, concatenated program at a different rate. Oftentimes, the stream server 402 cannot compensate for the variations in the transmission rates. This may lead to "garbled" or unintelligible transmissions of encoded data.
In view of the above described problems with known systems 400, a method and apparatus for preventing buffer(s) arranged between a server and a decoder from running dry or overflowing, due to drift for example, while, at the same time, compensating for variations in encoding rates in more than one program, are needed.