1. Field of the Invention
The present invention relates generally to systems and methods for processing compressed bitstreams of data. In particular, the present invention relates to a system and a method for transcoding video bitstreams on a frame-by-frame and smaller basis. Still more particularly, the present invention relates to a system and method for transcoding multiple channels of compressed video streams using self-contained data units such as autonomous frames, wherein each such autonomous frame includes a frame header portion and a frame payload portion.
2. Description of the Related Art
There are presently a variety of different communication channels for transmitting or transporting video data. For example, communication channels such as digital subscriber loop (DSL) access networks, ATM networks, satellite, or wireless digital transmission facilities are all well known. The present invention relates to such communication channels, and for the purposes of the present application a channel is defined broadly as a connection facility to convey properly formatted digital information from one point to another. A channel includes some or all of the following elements: 1) the physical devices that generate and receive the signals (modulator/demodulator); 2) the physical medium that carries the actual signals; 3) the mathematical schemes used to encode and decode the signals; and 4) the communication protocols used to establish, maintain and manage the connection created by the channel. The concept of a channel is not limited to physical channels; it also includes logical connections established on top of different network protocols, such as xDSL, ATM, wireless, HFC, coaxial cable, etc.
The channel is used to transport a bitstream, that is, a continuous sequence of binary bits used to digitally represent compressed video, audio or data. The bit rate is the number of bits per second that the channel is able to transport. The bit error rate is the statistical ratio between the number of bits in error due to transmission and the total number of bits transmitted. The channel capacity (or channel bandwidth) is the maximum bit rate at which a given channel can convey digital information with a bit error rate no greater than a given value. A video channel or video program refers to one or more compressed bit streams that are used to represent the video signal and the associated audio signals. Also included in the video channel are the relevant timing, multiplexing and system information necessary for a decoder to decode and correctly present the decoded video and audio signals to the viewer in a time-continuous and synchronous manner. Finally, a multiplex is a scheme used to combine bit stream representations of different signals, such as audio, video, or data, into a single bit stream representation.
One problem with existing communication channels is their ability to handle the transportation of video data. Video data is much larger than many other types of data, and therefore requires much more bandwidth from the communication channels. Since transmission of digitally sampled video data over existing communication channels would require excessive amounts of time, compression is an approach that has been used to make digital video images more transportable. Digital video compression schemes allow digitized video frames to be represented digitally in a much more efficient manner. Compression of digital video makes it practical to transmit the compressed signal over digital channels at a fraction of the bandwidth required to transmit the original signal without compression. International standards have been created for video compression schemes and include MPEG-1, MPEG-2, H.261, H.262, H.263, etc. These standardized compression schemes mostly rely on several key algorithmic techniques: motion compensated transform coding (for example, DCT transforms or wavelet/sub-band transforms), quantization of the transform coefficients, and variable length coding (VLC). The motion compensated encoding removes the temporally redundant information inherent in video sequences. The transform coding enables an orthogonal spatial frequency representation of spatial domain video signals. Quantization of the transformed coefficients reduces the number of levels required to represent a given digitized video sample and is the major factor in bit usage reduction in the compression process. The other factor contributing to the compression is the use of variable length coding (VLC), so that the most frequently used symbols are represented by the shortest code words. In general, the number of bits used to represent a given image determines the quality of the decoded picture. The more bits used to represent a given image, the better the image quality.
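To make the quantization and variable length coding steps described above concrete, the following is a minimal illustrative sketch, not the method of any particular standard: coefficients are mapped to a small set of levels by a uniform quantizer, and the levels are then emitted with a toy VLC table in which the most frequent symbol (level 0) receives the shortest code. All function names, step sizes, and code words here are hypothetical.

```python
def quantize(coefficients, step):
    """Map each transform coefficient to a quantization level (lossy)."""
    return [round(c / step) for c in coefficients]

def dequantize(levels, step):
    """Reconstruct approximate coefficients from quantization levels."""
    return [lvl * step for lvl in levels]

# Toy VLC table: the most frequent symbol (level 0) gets the shortest code.
VLC_TABLE = {0: "1", 1: "01", -1: "001", 2: "0001", -2: "00001"}

def vlc_encode(levels):
    """Concatenate the variable-length code words for a run of levels."""
    return "".join(VLC_TABLE[lvl] for lvl in levels)

coeffs = [12.4, -0.6, 3.1, 0.2, -7.8]
levels = quantize(coeffs, step=8)   # a coarse step yields few levels
bits = vlc_encode(levels)           # and therefore few bits
```

A coarser quantization step collapses more coefficients to level 0, which both reduces the number of levels and lets the VLC spend fewer bits per symbol, at the cost of reconstruction accuracy.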
The system that is used to compress a digitized video sequence using the above-described schemes is called an encoder or encoding system.
In the prior art compression schemes, the quantization scheme is a lossy, or irreversible, process. Specifically, it results in a loss of video textural information that cannot be recovered by further processing at a later stage. In addition, the quantization process has a direct effect on the resulting bit usage and decoded video quality of the compressed bit stream. The scheme by which the quantization parameters are adjusted controls the resulting bit rate of the compressed bit stream. The resulting bit stream can have either a constant bit rate (CBR) or a variable bit rate (VBR). A CBR compressed bit stream can be transmitted over a channel that delivers digital information at a constant bit rate.
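The irreversibility of quantization can be seen in a few lines. The sketch below uses hypothetical values: two distinct coefficient values collapse to the same quantization level, so no later processing stage can tell them apart or recover the originals.

```python
STEP = 16  # hypothetical quantizer step size

def quantize(value, step=STEP):
    """Forward quantization: many input values map to one level."""
    return round(value / step)

def reconstruct(level, step=STEP):
    """Inverse quantization: recovers only the level's representative value."""
    return level * step

a, b = 33.0, 39.0
# Both distinct inputs collapse to level 2; reconstruction yields 32.0
# for either, so the original textural detail is permanently lost.
level_a, level_b = quantize(a), quantize(b)
recovered = reconstruct(level_a)
```

This is why adjusting the bit rate of an already-compressed stream cannot simply "undo" the earlier quantization; only the quantized levels survive.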
A compressed video bit stream generally is intended for real-time decoded playback at a different time or location. The decoded real-time playback must be done at 30 frames per second for NTSC standard video and 25 frames per second for PAL standard video. This implies that all of the information required to represent a digital picture must be delivered to the destination in time for decoding and display in a timely manner. Therefore, the channel must be capable of making such delivery. From a different perspective, the transmission channel imposes a bit rate constraint on the compressed bit stream. In general, the quantization in the encoding process is adjusted so that the resulting bit rate can be accepted by the transmission channel.
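The bit rate constraint above reduces to simple arithmetic: at a fixed channel rate, the average bit budget available per coded picture is the channel rate divided by the frame rate. The channel rate below is a hypothetical example value.

```python
def bits_per_frame(channel_bps, frames_per_second):
    """Average bit budget per coded picture for a fixed-rate channel."""
    return channel_bps / frames_per_second

# Hypothetical 4 Mbps channel: the same rate buys a larger per-picture
# budget at PAL's 25 fps than at NTSC's 30 fps.
ntsc_budget = bits_per_frame(4_000_000, 30)
pal_budget = bits_per_frame(4_000_000, 25)
```

The encoder's quantization must be steered so that the pictures it emits stay, on average, within this budget.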
Because both temporal and spatial redundancies are removed by the compression schemes, and because of variable length encoding, the resulting bit stream is much more sensitive to bit errors or bit losses in the transmission process than if the uncompressed video were transmitted. In other words, a minor bit error or loss of data in a compressed bit stream typically results in a major loss of video quality or even a complete shutdown of operation of the digital receiver/decoder.
Further, a real-time multimedia bit stream is highly sensitive to delays. A compressed video bit stream, when transmitted under excessive and jittery delays, will cause the real-time decoder buffer to underflow or overflow, causing the decoded video sequence to be jerky, or causing the audio and video signals to fall out of synchronization. Another consequence of the real-time nature of compressed video decoding is that lost compressed data will not be re-transmitted.
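The decoder buffer behavior described above can be modeled with a toy simulation, shown below under hypothetical sizes and arrival patterns: each tick the channel delivers some bits into the buffer and the decoder must remove one picture's worth; an empty buffer at decode time is an underflow (the display stalls), while exceeding the buffer capacity is an overflow.

```python
def simulate(arrivals, picture_sizes, capacity):
    """Return per-tick buffer occupancy, or the failure mode reached."""
    occupancy, history = 0, []
    for arrived, needed in zip(arrivals, picture_sizes):
        occupancy += arrived
        if occupancy > capacity:
            return "overflow"    # channel delivered faster than drained
        if occupancy < needed:
            return "underflow"   # picture not fully delivered in time
        occupancy -= needed
        history.append(occupancy)
    return history

# Steady delivery keeps the buffer healthy; a delayed second tick
# (bits arrive late, even though the total eventually catches up)
# starves the decoder at its real-time deadline.
steady = simulate([100, 100, 100], [90, 90, 90], capacity=300)
delayed = simulate([100, 10, 190], [90, 90, 90], capacity=300)
```

Because the deadline is per picture, a late delivery fails even when the long-run average rate is sufficient, which is why jitter, not just throughput, matters.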
Despite the increase in channel bandwidth, there continues to be a need for matching the number of bits used to represent a bitstream to the amount of available channel bandwidth. Another particular problem, especially when several channels are multiplexed over a single channel, is the allocation of the available bandwidth among the multiple channels. Often it is necessary to recode bitstreams to maximize the utilization of the channel bandwidth. However, the use of compression techniques also introduces significant computational complexity into both the encoding and decoding processes. Specifically, a compressed video bit stream, at any given bit rate, cannot be altered to a different bit rate without decoding and recoding. In addition, the resulting number of bits required to represent digital video pictures varies from picture to picture, and the coded pictures are highly correlated as a result of motion estimation. The problem of delivering a real-time digital video bit stream over a channel of a given bandwidth becomes even more complex because the available bandwidth must be matched to the coded video bit stream rate. When a mismatch occurs, recoding, or re-compression, must be done. A final problem is that existing recoding processes only allow recoding on a stream-by-stream basis. There are many instances when only a portion of a stream may need to be recoded to resolve a temporary shortage of channel bandwidth.
Therefore, there is a need for a system and method for transcoding a bitstream on a frame or smaller basis. Furthermore, there is a need for a system that allows transcoding of a section of the compressed video data, on an autonomous basis, anywhere within a video stream.
The present invention overcomes the deficiencies and limitations of the prior art with a system and a method for transcoding multiple channels of compressed video streams using autonomous frames. More particularly, a system according to the present invention includes an autonomous frame processing unit having an autonomous frame generator and an autonomous frame recoder. The autonomous frame generator receives video data and divides it into a series of autonomous frames. Each autonomous frame preferably comprises 1) a frame header including all header information from the original video data plus enough additional information to allow the frame to be recoded using a pre-defined autonomous frame structure, and 2) a frame payload including the original video data information. The autonomous frame generator outputs autonomous frames to the autonomous frame recoder, which in turn processes each autonomous frame, including extracting processing parameters, extracting the video data, and setting up or initializing the recoder to process the extracted video data. The autonomous frame recoder preferably further comprises a parser coupled to an initialization unit and a recoder. The autonomous frame recoder outputs a bitstream that has been adjusted in the number of bits used to represent the data. In other embodiments, the autonomous frame processing unit may include a plurality of autonomous frame generators and a plurality of autonomous frame recoders coupled in a variety of different configurations.
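The two-part autonomous frame layout described above (a self-describing header plus the original payload) can be sketched as a plain data structure. The field names and the generator function below are illustrative stand-ins, not terminology from the specification.

```python
from dataclasses import dataclass

@dataclass
class AutonomousFrame:
    # Frame header: original bitstream headers plus extra parameters that
    # let a recoder process this frame without any other stream context.
    original_headers: bytes
    recoding_params: dict          # e.g. hypothetical quantizer/target-bit settings
    # Frame payload: the original compressed video data for this frame.
    payload: bytes

def generate_frames(coded_pictures, headers, params):
    """Divide a coded video sequence into self-contained autonomous frames,
    replicating the header and recoding information into each one."""
    return [AutonomousFrame(headers, dict(params), pic) for pic in coded_pictures]

frames = generate_frames([b"pic0", b"pic1"], b"seq_hdr", {"target_bits": 5000})
```

Because every frame carries its own copy of the header and recoding information, any frame can be handed to any recoder in isolation, which is what enables transcoding on a frame or smaller basis anywhere in the stream.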
The present invention also includes a method for processing video data including the steps of: receiving a video bitstream, storing recoding information, dividing the video bitstream into a plurality of autonomous frames, each frame including a portion of the video bitstream and recoding information, outputting the plurality of autonomous frames, receiving the plurality of autonomous frames, extracting processing information from each autonomous frame, extracting video data from each autonomous frame, configuring the recoder according to the processing information, and recoding the extracted video data.
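The receiving-side steps enumerated above can be sketched as a minimal pipeline. The frame layout (a plain dictionary) and the recode step (a stand-in that merely truncates to a target size) are hypothetical placeholders for the parser, initialization unit, and recoder.

```python
def parse(frame):
    """Extract the processing information and video data from one frame."""
    return frame["recoding_params"], frame["payload"]

def recode(payload, params):
    """Stand-in recoder: pretend to reduce the data to the target bit budget.
    A real recoder would requantize; here we simply truncate for illustration."""
    return payload[: params["target_bits"] // 8]

def process(frames):
    """Receive frames, then for each: parse, configure per its params, recode."""
    out = []
    for frame in frames:
        params, payload = parse(frame)       # extract info and data
        out.append(recode(payload, params))  # recode per the extracted params
    return out

result = process([{"recoding_params": {"target_bits": 16}, "payload": b"abcdef"}])
```

Note that `process` consults only the information carried inside each frame, mirroring the claim that every autonomous frame contains everything the recoder needs.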