1. Field of the Invention
The field of art to which this invention relates is the insertion or splicing of digital video frames into an input video sequence. Specifically, this invention relates to splicing new video frames into a compressed digital video sequence while maintaining Video Buffer Verifier (VBV) integrity, as required for MPEG-2 compliant streams.
2. Description of the Related Art
Within the past decade, the advent of worldwide electronic communications systems has enhanced the way in which people can send and receive information. In particular, the capabilities of real-time video and audio systems have greatly improved in recent years. To provide services such as video-on-demand, video conferencing, and multimedia communications to subscribers, an enormous amount of network bandwidth is required. In fact, network bandwidth is often the main inhibitor of the effectiveness of such systems.
To overcome the constraints imposed by networks, compression systems have emerged. These systems reduce the amount of video and/or audio data that must be transmitted by removing redundancy in the picture sequence. At the receiving end, the picture sequence is decompressed and may be displayed in real-time.
One example of an emerging video compression standard is the Moving Picture Experts Group ("MPEG") standard. Within the MPEG standard, video compression is defined both within a given picture and between pictures. Video compression within a picture is accomplished by conversion of the digital image from the time domain to the frequency domain by a discrete cosine transform, quantization, and variable length coding, all of which are well known in the art. Video compression between pictures is accomplished via a process referred to as motion estimation and compensation, in which a motion vector is used to describe the translation of a set of picture elements from one picture to another picture. Motion compensation takes advantage of the fact that video sequences are most often highly correlated in time; each frame in any given sequence tends to be similar to the preceding and following frames. These motion estimation and compensation techniques are also well known in the art.
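The within-picture path described above can be illustrated with a minimal sketch: an 8x8 discrete cosine transform followed by uniform scalar quantization. The function names and the single quantizer step are illustrative only; MPEG-2 actually uses quantizer matrices and variable length coding on top of this.

```python
import math

def dct_8x8(block):
    """Forward 8x8 DCT-II, the transform used for intra coding.

    block: 8x8 list of pixel values; returns 8x8 list of coefficients.
    """
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0
    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            s = 0.0
            for x in range(8):
                for y in range(8):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out

def quantize(coeffs, q):
    """Uniform quantization: small coefficients round to zero, which is
    where most of the intra compression comes from (a simplification of
    the MPEG-2 quantizer matrices)."""
    return [[round(v / q) for v in row] for row in coeffs]

# A flat 8x8 block concentrates all its energy in the DC coefficient,
# so after quantization only one nonzero value remains.
flat = [[128] * 8 for _ in range(8)]
q_coeffs = quantize(dct_8x8(flat), 16)
```

After quantization, long runs of zero coefficients are what the variable length coder exploits.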
To carry out the video compression, an encoder scans subsections within each frame, called macro blocks, and identifies which ones will not change position from one frame to the next. The encoder also identifies reference macro blocks, noting their position and direction of motion, and assigns a motion vector that identifies the motion of the reference block from one frame to another. Only the motion vector between each reference macro block and the affected current macro block is transmitted to the decoder. The decoder stores the information that does not change from frame to frame in its buffer memory and uses it periodically to fill in the macro blocks of the frame that do not change. The decompressed video sequence is subsequently displayed close enough to the original video sequence to be acceptable for most viewing.
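The motion search an encoder performs can be sketched as an exhaustive block match that minimizes the sum of absolute differences (SAD). The function names, block size, and tiny search window below are illustrative assumptions; production encoders use much larger macro blocks and fast search strategies.

```python
def sad(ref, cur, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n current block at
    (bx, by) and the reference block displaced by (dx, dy)."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))

def motion_search(ref, cur, bx, by, n=4, search=2):
    """Exhaustive search over a small window; returns the motion vector
    (dx, dy) with the lowest SAD for the block at (bx, by)."""
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Only consider displacements that stay inside the reference.
            if not (0 <= by + dy and by + dy + n <= len(ref)
                    and 0 <= bx + dx and bx + dx + n <= len(ref[0])):
                continue
            cost = sad(ref, cur, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2]

# Example: the current frame is the reference shifted up-left by one
# pixel, so the block at (2, 2) matches the reference block at (3, 3),
# i.e. a motion vector of (1, 1).
ref = [[(x * 7 + y * 13) % 31 for x in range(8)] for y in range(8)]
cur = [[ref[y + 1][x + 1] if y < 7 and x < 7 else 0 for x in range(8)]
       for y in range(8)]
mv = motion_search(ref, cur, 2, 2)
```

Only the winning vector (and the residual difference) needs to be coded, which is the source of the interframe savings described above.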
The MPEG-1 standard was introduced to handle the compressed digital representation of non-video sources of multimedia. Subsequently, it was adapted for the transmission of video signals, provided the video material was first converted from interlaced to progressively scanned format. That standard was adapted to transmit the compressed data bit stream at a rate (bandwidth) of 1.5 Mbits per second, which is approximately the rate of uncompressed audio on CD and DAT.
The MPEG-2 standard was developed to produce higher quality images at bit rates of 3 to 10 Mbits per second for moving images in various applications, such as digital storage and communication. The MPEG-2 standard supports video material in both interlaced and progressively scanned formats.
MPEG utilizes interframe compression, which builds reference frames so that subsequent frames can be compared with their previous and following frames. Interframe compression thus allows greater compression ratios because only the differences between frames are stored. That is to say, the encoder can compress the video better because it does not have to store every frame of information, only what is unique to each frame.
An MPEG stream accomplishes this compression by using three types of frames: I or intra frames, P or predicted frames, and B or bidirectional frames. Intra frames are the only full frames in an MPEG stream, containing enough information to qualify them as entry points into the stream for random access. Intra frames are thus the largest of the frames. Predicted frames are based on a previous frame, either an intra or predicted frame, in turn becoming eligible for reference by following predicted frames. Since only the changes between the new frame and the reference frame need to be saved, predicted frames are usually highly compressed. Bidirectional frames refer to both a future and a previous frame, and are the most highly compressed frames in the stream. Because they contain so little information, they are never used as a reference frame for other frames.
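Because a B frame references a future anchor frame, that anchor must be transmitted and decoded before the B frames that precede it in display order. A minimal sketch of this reordering follows; the frame labels and function name are illustrative, not taken from the standard.

```python
def coded_order(display_order):
    """Reorder display-order frame types into coded (transmission) order.

    Each I or P anchor is moved ahead of the B frames that display
    before it, since the decoder needs both references available before
    it can reconstruct a bidirectional frame."""
    out, pending_b = [], []
    for frame in display_order:
        if frame.startswith("B"):
            pending_b.append(frame)   # hold Bs until their future anchor
        else:
            out.append(frame)         # anchor is transmitted first ...
            out.extend(pending_b)     # ... then the Bs that reference it
            pending_b = []
    return out + pending_b
```

For a typical group of pictures, the display sequence I0 B1 B2 P3 is therefore transmitted as I0 P3 B1 B2.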
Since an MPEG stream comprises only a few full frames, namely infrequent I frames, editing the MPEG stream with frame accuracy is difficult. Most video is edited before encoding, which means that the edited sequence must be printed out to video, losing a crucial generation, before encoding.
In video encoding, users will often need to replace, insert or splice frames into an existing encoded video stream. A problem arises in assuring that the new encoded frames will fit into the encoded stream in the space allotted without running over their given bit allocation.
Therefore, it is an object of the present invention to provide a method by which a defined number of frames may be encoded to a precise bit target given the start and end parameters of how much data is contained in the decoder buffer.
It is yet another object of the present invention to provide a method for adjusting the contents of the buffer using the rate control algorithm when the target size is overrun or underrun.
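The buffer constraint at issue can be sketched as a simple constant-bit-rate VBV occupancy simulation: spliced frames must fit their bit targets so that the decoder buffer neither underflows nor overflows. All names and numbers below are illustrative assumptions, not the MPEG-2 Annex C model itself.

```python
def simulate_vbv(frame_bits, bit_rate, frame_rate, vbv_size, initial_fill):
    """Track decoder-buffer fullness for a constant-bit-rate stream.

    During each frame interval the channel delivers bit_rate / frame_rate
    bits; at each decode instant the frame's coded bits are removed at
    once. Returns the fullness after every frame, raising on underflow
    or overflow (a simplified stand-in for the MPEG-2 VBV check)."""
    per_frame_in = bit_rate / frame_rate
    fullness = initial_fill
    trace = []
    for i, bits in enumerate(frame_bits):
        if bits > fullness:
            raise ValueError(f"VBV underflow at frame {i}")
        fullness -= bits          # instantaneous removal at decode time
        fullness += per_frame_in  # channel refills during the interval
        if fullness > vbv_size:
            raise ValueError(f"VBV overflow at frame {i}")
        trace.append(fullness)
    return trace
```

In this model, a splice is safe only if the inserted frames' sizes keep every value of the trace between zero and the buffer size, given the buffer fullness at the splice-in and splice-out points; that is precisely the start- and end-parameter constraint stated above.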