The present invention relates generally to streaming compressed MPEG content in a network. More particularly, the present invention relates to systems and methods for reformatting MPEG files to increase transmission performance in a network.
MPEG is a popular standard for converting multimedia content into high bit rate digital signals. Using coding specifications provided by the MPEG standard, audio and video information may be compressed into an MPEG stream. This MPEG stream may then be packetized by a server for transmission onto a network. The server is responsible for packetization of the MPEG stream using a network protocol. A goal of the server is to transmit the compressed stream at a low enough bit rate such that it can make economic use of available transmission bandwidth. At some other location in the network, a “viewer” receives the streaming media. The viewer machine unpacks the data from each network packet, and sends the data to an MPEG decompressor. The decompressor passes decompressed data to a renderer. For video data, each frame of video will be decompressed and then passed to a video renderer, which will display the image on a monitor. For audio data, the data will be decompressed and then passed to an audio renderer, which will drive a speaker.
FIG. 1A illustrates a conventional system 100 for packetizing MPEG files for network transmission. The system 100 includes a media file 102, such as a movie, which contains compressed synchronized audio and video streams. The audio and video streams within the media file 102 are provided in a multiplexed format. An MPEG server 103, or “streamer,” reads the file 102 from a storage device such as a hard disk and transmits the file 102 onto a network 106 in real time. The MPEG server typically includes at least two components: a packetizer 104 and a network interface 105.
Before the data in the media file 102 can be sent onto the network 106, the packetizer 104 must encapsulate the file in network packets using a network packetization protocol. The protocol, or standard, will designate a number of rules for packetization of the data for the network 106. By way of example, the rules may specify how the MPEG bitstream is to be parsed. A common network packetization protocol for elementary streams is the Real-time Transport Protocol (RTP). See for example RFC-2250 “RTP Payload Format for MPEG1/MPEG2 Video”, January 1998, and RFC-1889 “RTP: A Transport Protocol for Real-Time Applications”, January 1996. Both of these documents are incorporated herein by reference for all purposes.
The packetizer 104 may begin by demultiplexing the audio and video streams in the media file 102. The packetizer 104 then produces a series of network packets, each of which contains a portion of data from the media file 102 along with a network packet header. The network packet header includes additional information useful for transmission in the network 106. An input buffer 108 may also be included for temporarily holding the data before streaming onto the network 106. Upon request, the network interface 105 sends the RTP packets onto the network 106 in real time.
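By way of illustration only, the following Python sketch shows how a network packet header might be prepended to a portion of file data. It builds the 12-byte fixed RTP header laid out in RFC-1889 (version 2, no padding, no extension, no contributing sources). The function name `build_rtp_packet`, its parameters, and the sample values are hypothetical and are not taken from the system 100; payload type 32 is the static assignment for MPEG video. The MPEG video-specific header that RFC-2250 places after the fixed header is omitted for brevity.

```python
import struct

RTP_VERSION = 2
PT_MPEG_VIDEO = 32  # static RTP payload type for MPEG video ("MPV")


def build_rtp_packet(payload, seq, timestamp, ssrc,
                     marker=False, payload_type=PT_MPEG_VIDEO):
    """Prepend a 12-byte fixed RTP header to `payload` (hypothetical helper)."""
    # First byte: version in the top two bits; padding, extension and
    # CSRC count are all left at zero in this sketch.
    byte0 = RTP_VERSION << 6
    # Second byte: marker bit followed by the 7-bit payload type.
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack(
        "!BBHII",                 # network byte order: 1+1+2+4+4 = 12 bytes
        byte0,
        byte1,
        seq & 0xFFFF,             # 16-bit sequence number
        timestamp & 0xFFFFFFFF,   # 32-bit media timestamp
        ssrc & 0xFFFFFFFF,        # 32-bit synchronization source identifier
    )
    return header + payload
```

In this sketch, the sequence number would be incremented for each packet sent, which allows a receiver to detect lost packets.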
The media file 102 has its own packetization protocol, which is distinct from the network packetization protocol. FIG. 1B illustrates an exemplary elementary MPEG stream 120, which may be found in the MPEG file 102. The elementary MPEG stream 120 is segmented into hierarchical sections, each comprising multiple pictures or frames. At the beginning of each hierarchical block is a header sequence 124. The header sequence 124 typically includes at least one of a sequence header 126, a Group of Pictures (GOP) header 128 and a picture header 130. Each header begins with a unique start code to signal the beginning of the header. A GOP header 128 is placed at the beginning of a Group of Pictures (GOP) 132 which typically consists of a set of pictures 134a and 134b related to one another by common use of some temporally redundant information in the pictures. Each picture within set 134 includes its own picture header 136 and a frame 138. The picture header 136 precedes the frame 138. The frame 138 contains picture data, which is divided into a number of slices 142. Each slice includes a slice header 144 and slice data 146.
FIG. 1C illustrates the repacketization of the MPEG packets from the elementary stream 120 into RTP packets. As shown, an RTP stream 150 is the elementary MPEG stream 120 after it has been divided into multiple RTP packets. A first RTP packet 152 contains an RTP packet header 153a and a payload 151. In this example, the payload 151 includes the sequence header 126, the GOP header 128, the picture header 130 and some data from the first frame 140. RTP packet headers 153a, 153b and 153c specify the size of the RTP packet used in the RTP stream 150.
When converting the MPEG stream 120 into the RTP stream 150, a number of rules must be followed according to the RTP standard. One rule requires that the packetizer 104 create the RTP network packet header 153 using certain data in the MPEG header sequence 124. A second rule specifies how the bitstream is parsed relative to the start codes in the header sequence 124. Specifically, any start code must appear at the beginning of an RTP packet. For example, a sequence header (e.g., header 126), when present, will always be placed at the beginning of a new RTP packet. Similarly, a GOP header such as header 128, when present, will always be placed at the beginning of a new RTP packet or will follow a sequence header if present. Further, a picture header (e.g., picture header 130), when present, will always be placed at the beginning of a new RTP packet, or will follow a GOP header if present. Unfortunately, the packetizer 104 typically does not know where any of these elements are located and must comb through every byte in the MPEG stream 120 to find them, resulting in considerable computational effort.
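The search for start codes described above may be sketched, by way of example only, as follows. The sketch relies on the MPEG video convention that every start code consists of the three-byte prefix 0x000001 followed by a one-byte code value (0xB3 for a sequence header, 0xB8 for a GOP header, 0x00 for a picture header). The function name `find_header_start_codes` and the sample byte stream are hypothetical. Note that the search still examines the entire stream, which is the computational burden at issue.

```python
START_CODE_PREFIX = b"\x00\x00\x01"

# Code values that follow the prefix for the three header types.
HEADER_NAMES = {
    0xB3: "sequence",
    0xB8: "gop",
    0x00: "picture",
}


def find_header_start_codes(data):
    """Return (offset, name) for each sequence, GOP and picture start code."""
    found = []
    pos = data.find(START_CODE_PREFIX)
    # Stop when no prefix remains or the code byte would lie past the end.
    while pos != -1 and pos + 3 < len(data):
        name = HEADER_NAMES.get(data[pos + 3])
        if name is not None:  # slice and other start codes are skipped
            found.append((pos, name))
        # Resume at the code byte, since 0x00 may begin the next prefix.
        pos = data.find(START_CODE_PREFIX, pos + 3)
    return found
```

A packetizer following the rules above could use the returned offsets as candidate starting points for new RTP packets.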
The packetizer 104 will fragment or aggregate media packets into network packets according to their respective sizes. Media packets are generally described as constant-sized packets containing either video or audio data. Specifically, if the size of a media packet in media file 102 is larger than the optimal network packet size, the packetizer 104 will fragment the large media packet into two or more successive network packets. On the other hand, if the size of a media packet in media file 102 is smaller than the optimal network packet size, packetizer 104 may aggregate two or more media packets into a single network packet—so long as this would not place a start code at a forbidden location within the RTP packet. This may have varying effects depending on the RTP packet protocol which is being implemented on the network 106. For constant-size RTP packets as found in an ATM network, any unfilled portions of a constant-size RTP packet will be “padded” by, for example, filling the remainder of the packet with zeros. For variable-size RTP packets as found in an Ethernet network, for example, the variable-size packet is truncated such that it is shorter than a maximum size specified by the relevant network protocol.
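The two fragmentation behaviors described above can be contrasted in a minimal sketch. The function names `fragment_fixed` and `fragment_variable` and the sizes used are hypothetical; the sketch handles a single media packet and does not address aggregation or start code placement.

```python
def fragment_fixed(media_packet, payload_size):
    """Split into constant-size payloads, zero-padding the final one."""
    if payload_size <= 0:
        raise ValueError("payload_size must be positive")
    chunks = [media_packet[i:i + payload_size]
              for i in range(0, len(media_packet), payload_size)]
    if chunks and len(chunks[-1]) < payload_size:
        # Fill the unused remainder of the last packet with zeros.
        chunks[-1] = chunks[-1].ljust(payload_size, b"\x00")
    return chunks


def fragment_variable(media_packet, max_payload_size):
    """Split into payloads no larger than the maximum; the last may be short."""
    if max_payload_size <= 0:
        raise ValueError("max_payload_size must be positive")
    return [media_packet[i:i + max_payload_size]
            for i in range(0, len(media_packet), max_payload_size)]
```

For a 10-byte media packet and a 4-byte payload size, the constant-size variant yields three 4-byte payloads with two bytes of zero padding, whereas the variable-size variant yields payloads of 4, 4 and 2 bytes.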
Typically, the packetizer 104 will segment the MPEG bitstream by putting as much data into an RTP packet as possible. When the packetizer 104 runs into any of the three start codes, the portion of the bitstream that begins with that start code is placed at the beginning of a fresh RTP packet. As a result, some of the previous RTP packet may be left unfilled. By way of example, if the first RTP packet 152 is not large enough to accommodate all the data from the first frame 140, some of the first frame data will spill into a portion 154 of a second RTP packet 156. After the data from the first frame 140 has been entered into the RTP stream 150, the next picture header 136 is placed at the beginning of a new RTP packet 158, following RTP packet header 153c. In this case, all the frame data from frame 138 can fit into the third RTP packet 158. As the RTP stream 150 includes variable size RTP packets, RTP packet 156 is smaller than RTP packet 152.
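The segmentation just described may be sketched as follows, assuming variable-size packets. The function name `packetize`, the parameter `forced_starts` (byte offsets at which a header sequence begins and a fresh packet is therefore required), and the sizes in the usage note are hypothetical. The sketch ignores the constraints on slice fragmentation and omits the RTP packet headers.

```python
def packetize(data, forced_starts, max_payload):
    """Greedily fill payloads, starting a fresh one at each forced offset."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    # Segment boundaries: stream start, stream end, and every forced offset
    # that falls strictly inside the stream.
    inner = {off for off in forced_starts if 0 < off < len(data)}
    cuts = sorted(inner | {0, len(data)})
    payloads = []
    for begin, end in zip(cuts, cuts[1:]):
        # Within a segment, put as much data as possible into each payload;
        # the last payload of the segment may be left short.
        for i in range(begin, end, max_payload):
            payloads.append(data[i:min(i + max_payload, end)])
    return payloads
```

As a usage example, a 14-byte header sequence and first frame followed by a 6-byte second picture, with a 10-byte maximum payload and forced starts at offsets 0 and 14, produce payloads of 10, 4 and 6 bytes. The second payload is left partly unfilled because the next picture header must begin a new packet.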
Another common constraint imposed by the RTP protocol is on the fragmentation of slices. More specifically, the beginning of a slice must either be located at the beginning of an RTP payload (after any start code) or must follow after some integral number of slices in a packet. It must not follow a part of a slice that has been divided between two RTP packets. This requirement ensures that the beginning of the next slice after one with a missing packet can be found without requiring that the receiver scan the packet contents. The slices may be fragmented across RTP packets as long as the above rules are met. By way of example, for the frame 140, one slice within the frame 140 may be fragmented between the first RTP packet 152 and the second RTP packet 156. However, no other slices within the frame 140 may be added to RTP packet 156.
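The slice constraint may be illustrated with the following sketch, which assigns slices to packets by size alone. The function name `pack_slices` and the sizes in the usage note are hypothetical, and all slice sizes are assumed to be positive. Each packet is represented as a list of (slice index, byte count) pairs. A slice may begin after any number of whole slices and spill into later packets, but the packet holding its final fragment is closed so that no new slice begins after a partial slice.

```python
def pack_slices(slice_sizes, max_payload):
    """Assign slices to packets; each packet is a list of (index, bytes)."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    packets = []
    current, used = [], 0
    for index, size in enumerate(slice_sizes):
        remaining = size
        fragmented = False
        while remaining > 0:
            # Take as much of this slice as the current packet can hold.
            take = min(remaining, max_payload - used)
            current.append((index, take))
            used += take
            remaining -= take
            if used == max_payload:
                # Packet is full: emit it and start a fresh one.
                packets.append(current)
                current, used = [], 0
                if remaining > 0:
                    fragmented = True  # slice continues in the next packet
        if fragmented and current:
            # Close the packet holding the slice's tail, so that the next
            # slice begins at the start of a new payload.
            packets.append(current)
            current, used = [], 0
    if current:
        packets.append(current)
    return packets
```

For slices of 4, 4, 6 and 3 bytes and a 10-byte maximum payload, the third slice is split between the first and second packets, and the fourth slice is placed in a third packet even though the second packet has room for it.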
There are several problems commonly encountered when repacketizing MPEG data into RTP packets. First, the server 103 must parse the entire MPEG bitstream, bit by bit, in order to determine how it will carve the MPEG system stream. More specifically, it must parse the entire MPEG bitstream to apply the protocol rules to locate appropriate start and end points for each RTP packet. In addition, the server must gather information to create the RTP packet headers. This parsing and information gathering imposes substantial processing load on the server CPU and may limit the ability of the server 103 to deliver real-time multimedia.
The second problem arises because two copy operations are required to parse the bitstream. The first copy operation transfers the MPEG data from the file 102 into the buffer 108 where it is parsed. The second copy operation moves the data from the buffer 108 into the network packets. These two copy operations require significant CPU processing load, which again may limit the ability of the server 103 to deliver real-time multimedia.
As a result of the significant CPU processing load required to parse the entire MPEG bitstream and create RTP packets, the speed of the server 103 is limited. This problem can become so significant that the server cannot serve MPEG data fast enough to meet the requirements of real-time streaming. In view of the foregoing, improved systems and techniques for MPEG to RTP repacketization would be desirable.