Various types of data streams are processed using data processing technology. Examples of these data streams include video, audio, and closed captioning. Video data is often compressed to reduce bandwidth requirements. As an example, video data is often compressed into a Moving Picture Experts Group (MPEG) format.
Video data is comprised of sequences of frames, wherein each frame constitutes an instantaneous image. As the frames are displayed in sequence, the appearance of a moving picture is formed. While each frame may be considered to contain a certain amount of information, that amount may be reduced if some of the information is redundant with information contained in other frames. Thus, the overall amount of information being transmitted may be reduced. Since the amount of information contained in different frames is reduced by different amounts, the bit lengths of the frames of data vary relative to other frames of data. Thus, the video data is represented as variable-frame-length data.
Audio data is comprised of sequences of samples, wherein each sample constitutes instantaneous information about an audio signal. Since audio data does not exhibit sample-to-sample redundancy in the same manner as video data, communication or compression of audio data generally does not involve variable-sample-length data. Thus, audio samples typically occupy a fixed length in the data stream. Typically, a compressed audio stream is comprised of a sequence of fixed size audio frames, each of which represents a fixed number of audio samples.
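The fixed-length property described above means the byte length of each compressed audio frame follows directly from the bit rate and the sample rate. The following is a minimal sketch, not part of the original disclosure; the parameter values (1152 samples per frame, 48 kHz, 192 kbit/s) are illustrative figures typical of MPEG-1 Layer II audio and are assumptions, as is the omission of per-frame padding:

```python
def audio_frame_size(bitrate_bps: int, sample_rate_hz: int,
                     samples_per_frame: int = 1152) -> int:
    """Byte length of one fixed-size compressed audio frame.

    Each frame carries a fixed number of samples, so its duration is
    samples_per_frame / sample_rate, and its byte length is that
    duration times the bit rate (padding bytes ignored in this sketch).
    """
    frame_duration_s = samples_per_frame / sample_rate_hz
    return int(bitrate_bps * frame_duration_s) // 8

# Illustrative values: 192 kbit/s stream, 48 kHz sampling,
# 1152 samples per frame -> every frame occupies the same 576 bytes.
size = audio_frame_size(192_000, 48_000)  # 576
```

Because every frame has the same length, a decoder can locate frame boundaries in the stream by simple arithmetic, unlike the variable-frame-length video case.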
It is often useful to combine various types of data streams into a single data stream. A multiplexer may be used to combine the data streams. The multiplexer combines portions of the different data streams to form a combined data stream. To obtain the portions of the different data streams, the data streams may be divided into packets, which are usually of fixed size.
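The division into fixed-size packets described above can be sketched as follows; this is an illustrative fragment, not taken from the disclosure, and the packet size is an arbitrary assumption:

```python
def packetize(stream: bytes, packet_size: int) -> list[bytes]:
    """Split a data stream into fixed-size packets.

    Packet boundaries are chosen purely by byte count, so they fall
    independently of any frame boundaries inside the stream; the final
    packet may be shorter than packet_size.
    """
    return [stream[i:i + packet_size]
            for i in range(0, len(stream), packet_size)]

# A stream of 8 bytes cut into 3-byte packets: the frame structure of
# the payload plays no role in where the cuts land.
packets = packetize(b"abcdefgh", 3)  # [b"abc", b"def", b"gh"]
```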
FIG. 1 is a block diagram illustrating a multiplexer system and a demultiplexer system for multiplexing and demultiplexing data streams as is known in the art. The multiplexer system includes multiplexer 101. Multiplexer 101 receives video stream input 108, audio stream input 109, and closed caption stream input 110. Multiplexer 101 produces a multiplexed output stream 111.
The demultiplexer system includes demultiplexer 102, Video Buffer Verifier (VBV) buffer 103, video System Target Decoder (STD) buffer 104, audio STD buffer 105, video decoder 106, and audio decoder 107. Demultiplexer 102 provides closed caption stream 114. Demultiplexer 102 also provides signal 112 to VBV buffer 103 and to video STD buffer 104. VBV buffer 103 provides a signal 115 to video decoder 106. Video STD buffer 104 provides a signal 116 to video decoder 106. Video decoder 106 provides video stream 118. VBV buffer 103 and video STD buffer 104 are typically combined into a single block.
Demultiplexer 102 also provides signal 113 to audio STD buffer 105. Audio STD buffer 105 provides a signal 117 to audio decoder 107. Audio decoder 107 provides audio stream 119.
FIG. 2 is a timing diagram illustrating a video data stream and an audio data stream as is known in the art. Video stream 201 includes several frames of video data. Each frame of video data is associated with a decoding time stamp. The first frame of video data is associated with decoding time stamp 203 (DTS V0). The second frame of video data is associated with decoding time stamp 204 (DTS V1). The third frame of video data is associated with decoding time stamp 205 (DTS V2). The fourth frame of video data is associated with decoding time stamp 206 (DTS V3). The fifth frame of video data is associated with decoding time stamp 207 (DTS V4). The sixth frame of video data is associated with the time stamp 208 (DTS V5). The seventh frame of video data is associated with decoding time stamp 209 (DTS V6). The eighth frame of video data is associated with decoding time stamp 210 (DTS V7). The ninth frame of video data is associated with decoding time stamp 211 (DTS V8). The tenth frame of video data is associated with decoding time stamp 212 (DTS V9).
To multiplex a video data stream with other data streams, the video data stream 201 is divided into several packets. These include packet A 242, packet B 243, packet C 244, packet D 245, packet E 246, and packet F 247. The video data stream may be divided into packets independently of its video frames. This is particularly well illustrated in the case of variable-frame-length data.
Audio stream 202 includes several audio frames. Each frame is associated with a decoding time stamp. The first audio frame is associated with decoding time stamp 213 (DTS A0). The second audio frame is associated with decoding time stamp 214 (DTS A1). The third audio frame is associated with decoding time stamp 215 (DTS A2). The fourth audio frame is associated with decoding time stamp 216 (DTS A3). The fifth audio frame is associated with decoding time stamp 217 (DTS A4). The sixth audio frame is associated with decoding time stamp 218 (DTS A5). The seventh audio frame is associated with decoding time stamp 219 (DTS A6). The eighth audio frame is associated with decoding time stamp 220 (DTS A7). The ninth audio frame is associated with decoding time stamp 221 (DTS A8). The tenth audio frame is associated with decoding time stamp 222 (DTS A9). The eleventh audio frame is associated with decoding time stamp 223 (DTS A10). The twelfth audio frame is associated with decoding time stamp 224 (DTS A11). The thirteenth audio frame is associated with decoding time stamp 225 (DTS A12). The fourteenth audio frame is associated with decoding time stamp 226 (DTS A13). The fifteenth audio frame is associated with decoding time stamp 227 (DTS A14). The sixteenth audio frame is associated with decoding time stamp 228 (DTS A15). The seventeenth audio frame is associated with decoding time stamp 229 (DTS A16). The eighteenth audio frame is associated with decoding time stamp 230 (DTS A17). The nineteenth audio frame is associated with decoding time stamp 231 (DTS A18). The twentieth audio frame is associated with decoding time stamp 232 (DTS A19). The twenty-first audio frame is associated with decoding time stamp 233 (DTS A20). The twenty-second audio frame is associated with decoding time stamp 234 (DTS A21).
To multiplex the audio stream with other data streams, the audio stream may be divided into several packets. These packets may include packet A′ 248, packet B′ 249, packet C′ 250, packet D′ 251, packet E′ 252, and packet F′ 253. The audio stream may be divided into packets independently of the audio frames.
Since video data typically requires higher bandwidth than audio data, the process of combining portions of a video data stream with portions of an audio data stream usually involves interleaving several packets of video data with one packet of audio data. If video data and audio data were provided at a constant bit rate, video and audio packets could simply be interleaved at a fixed ratio. However, the rates are not necessarily constant. Some data sources produce video and audio data at inaccurate rates. For example, a video source that should produce 30 frames per second of video data may instead produce 31 frames per second. Also, the frame rates change during modes such as fast forward and rewind.
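The naive fixed-ratio interleaving described above can be sketched as follows. This is an illustrative fragment, not part of the disclosure; the 4:1 ratio is an assumed default, and, as the passage notes, a fixed ratio breaks down when source rates drift:

```python
def interleave(video: list, audio: list, ratio: int = 4) -> list:
    """Fixed-ratio interleave: up to `ratio` video packets, then one
    audio packet, repeated until both packet lists are exhausted.

    Works only if both sources produce data at the constant rates the
    ratio assumes; rate drift forces the leftover packets of one stream
    to bunch up at the end.
    """
    out, vi, ai = [], 0, 0
    while vi < len(video) or ai < len(audio):
        take = min(ratio, len(video) - vi)  # next burst of video packets
        out.extend(video[vi:vi + take])
        vi += take
        if ai < len(audio):                 # one audio packet per burst
            out.append(audio[ai])
            ai += 1
    return out

muxed = interleave(["v0", "v1", "v2", "v3", "v4"], ["a0", "a1"], ratio=2)
# ["v0", "v1", "a0", "v2", "v3", "a1", "v4"]
```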
If a data stream processing apparatus were provided with a buffer of infinite length, all video and audio data would always be available, and the appropriate portions of the audio and video data could be played at the appropriate times. However, in practice, buffer sizes are limited. Limited buffer sizes place constraints on the manner in which different data streams may be multiplexed. For example, when video data and audio data are multiplexed, a portion of the video data and the corresponding portion of the audio data should be multiplexed such that both portions may be simultaneously present in a buffer to allow for simultaneous playback.
In the MPEG format, a system target decoder (STD) buffer is defined as having a fixed size, referred to as the STD buffer size. The STD buffer places constraints on the multiplexing of multiple data streams.
FIG. 3 is a diagram illustrating communication of a data stream over time relative to a system target decoder (STD) as is known in the art. In this diagram, the horizontal axis 301 represents time. The vertical axis 302 represents a number of bits. At the beginning 303 of a first clock cycle (at or near system time clock 0 (STC0)), communication of the data stream begins. During the first clock cycle, a first packet of the data stream is transmitted. This is represented by the number of bits 312 increasing during the first clock cycle. From the beginning 304 of a second clock cycle to the beginning 305 of a third clock cycle, no additional packets are transmitted. This is represented by the number of bits 312 remaining constant during the second clock cycle. This time may be used to transmit packets of other data streams being multiplexed with this data stream. Additional packets of this data stream are transmitted at varying intervals over additional clock cycles, although, during some additional clock cycles, such as that having beginning 306, no additional packets are transmitted. Thus, the number of bits 312 increases monotonically over many clock cycles.
When a first decoding time stamp 307 (DTS0) arrives in time, a first frame of the video stream is decoded or otherwise processed. Before the first decoding time stamp 307 arrives, the packets being transmitted are stored in a STD buffer that has a capacity represented by the height of the segment of STD upper limit 314 before decoding time stamp 307 (DTS0). After the first frame of the video stream is decoded or otherwise processed, the amount of the STD buffer used to store the first frame of the video stream is made available to store additional packets. This is indicated by the STD lower limit 313 and the STD upper limit 314 increasing at decoding time stamp 307 (DTS0) by an amount representative of the number of bits in the first frame of the video stream. The distance between the STD upper limit 314 and the STD lower limit 313 represents the size of the STD buffer.
When a second decoding time stamp 308 (DTS1) arrives in time, a second frame of the video stream is decoded or otherwise processed. Thus, the STD lower limit 313 and the STD upper limit 314 increase correspondingly at the second decoding time stamp 308 (DTS1). Since the second frame of the video stream is shorter (i.e., has fewer bits) than the first frame of the video stream, the amount of the increase in the STD lower limit 313 and the STD upper limit 314 is less at the second decoding time stamp 308 (DTS1) than at the first decoding time stamp 307 (DTS0).
At subsequent decoding time stamps 309 (DTS2), 310 (DTS3), and 311 (DTS4), the STD lower limit 313 and the STD upper limit 314 continue to increase over time. Likewise, the number of bits 312 increases as additional packets are transmitted. The rate at which the number of bits 312 increases is constrained by the capacity of the STD buffer. If the number of bits 312 were to exceed the STD upper limit 314, the STD buffer would overflow, resulting in lost data. If the number of bits 312 were to fall below the STD lower limit 313, the STD buffer would underflow, resulting in insufficient data available to decode or otherwise process a frame of the video stream at its corresponding decoding time stamp.
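The overflow/underflow constraint walked through in FIG. 3 can be checked with a small simulation. This is a minimal sketch, not part of the disclosure; the tick granularity, the convention that a DTS frees its frame's bits before that tick's arrivals are counted, and all numeric values are assumptions:

```python
def check_std(arrivals: list[int], decodes: dict[int, int],
              buffer_size: int) -> str:
    """Track the cumulative-bits curve (312) against the sliding STD window.

    arrivals: bits delivered into the buffer at each clock tick.
    decodes:  {tick: frame_bits} -- at a frame's DTS, its bits leave the
              buffer, raising both the lower limit (313) and the upper
              limit (314) by frame_bits.
    The lower limit is the running total of decoded bits; the upper
    limit sits buffer_size above it.
    """
    received = freed = 0
    for t, bits in enumerate(arrivals):
        freed += decodes.get(t, 0)   # DTS fires: frame exits the buffer
        received += bits             # curve 312 rises monotonically
        if received - freed > buffer_size:
            return f"overflow at t={t}"    # curve crossed upper limit 314
        if received < freed:
            return f"underflow at t={t}"   # curve fell below lower limit 313
    return "ok"

# A schedule that fits: 30 bits arrive over 4 ticks, one 15-bit frame
# is decoded at tick 3, and occupancy never exceeds the 25-bit buffer.
result = check_std([10, 0, 10, 10, 0], {3: 15}, buffer_size=25)  # "ok"
```

Delivering all 30 bits in a single tick would instead overflow the 25-bit buffer, and decoding a frame whose bits have not yet fully arrived would underflow it; these are the two failure modes the multiplexer's scheduling must avoid.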
While maintaining the number of bits 312 between the STD upper limit 314 and the STD lower limit 313 results in a data stream that is compatible with the STD buffer, the resulting timing of the data stream is not optimal and is computationally inefficient to obtain. Thus, an improved technique for multiplexing data streams is needed.