This invention generally relates to compressing digital video data, and more specifically, to methods and systems using MPEG standards for compressing such data.
Full motion video displays based upon analog video signals have long been available in the form of television. With recent increases in computer processing capabilities and affordability, full motion video displays based upon digital video signals are becoming more widely available. Digital video systems can provide significant improvements over conventional analog video systems in creating, modifying, transmitting, storing, and playing full motion video sequences.
Digital video displays include large numbers of image frames that are played or rendered successively at frequencies of between 30 and 75 Hz. Each image frame is a still image formed from an array of pixels according to the display resolution of a particular system. As examples, VHS based systems have display resolutions of 320×480 pixels, NTSC based systems have display resolutions of 720×486 pixels, and high-definition television (HDTV) systems have display resolutions of 1360×1024 pixels.
The amounts of raw digital information included in video sequences are massive. The storage and transmission of these massive amounts of video information are infeasible with conventional personal computer equipment. For instance, a two hour full length motion picture, shown in VHS image format, may contain roughly 100 gigabytes of digital information.
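The order of magnitude of that figure can be checked with a short calculation. The sketch below is illustrative only; it assumes 24 bits (3 bytes) per pixel and a frame rate of 30 frames per second, neither of which is stated for the VHS example, and the function name raw_video_bytes is hypothetical.

```python
def raw_video_bytes(width, height, fps, seconds, bytes_per_pixel=3):
    """Uncompressed size, in bytes, of a video sequence.

    bytes_per_pixel=3 (24-bit color) is an assumption made for
    illustration; it is not a value given in the text.
    """
    return width * height * bytes_per_pixel * fps * seconds


# Two hour sequence at the 320x480 VHS resolution cited above,
# assuming 30 frames per second.
size = raw_video_bytes(320, 480, 30, 2 * 60 * 60)
print(f"{size / 1e9:.1f} GB")  # prints "99.5 GB"
```

Under these assumptions the raw sequence occupies about 99.5 gigabytes, consistent with the approximate figure given above.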
In response to the limitations in storing or transmitting such massive amounts of digital video information, various video compression standards or processes have been established, including MPEG-1 and MPEG-2. These conventional video compression techniques utilize similarities between successive image frames, referred to as temporal or interframe correlation, to provide interframe compression in which pixel based representations of image frames are converted to motion representations. In addition, the conventional video compression techniques use similarities within image frames, referred to as spatial or intraframe correlation, to provide intraframe compression in which the motion representations within an image frame are further compressed. Intraframe compression is based upon conventional processes for compressing still images, such as discrete cosine transform (DCT) encoding.
The MPEG standard provides interframe and intraframe compression based upon square blocks or arrays of pixels in video images. A video image is divided into macroblocks having dimensions of 16×16 pixels. Each 16×16 macroblock is broken into four 8×8 luminance blocks and two or four 8×8 chrominance blocks. For each macroblock TN in an image frame N, a search is performed across the image of the next successive video frame N+1 or an immediately preceding image frame N−1 (i.e., bidirectionally) to identify the most similar respective macroblock TN+1 or TN−1.
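The division of a macroblock into 8×8 luminance blocks may be sketched as follows. This is a minimal illustration, not a routine taken from any MPEG implementation: the function name split_luma and the raster ordering of the four blocks are assumptions, and the chrominance counts are tabulated on the understanding that two chrominance blocks correspond to 4:2:0 sampling and four to 4:2:2 sampling.

```python
def split_luma(macroblock):
    """Split a 16x16 luminance macroblock (list of 16 rows of 16
    samples) into four 8x8 blocks: top-left, top-right,
    bottom-left, bottom-right."""
    if len(macroblock) != 16 or any(len(row) != 16 for row in macroblock):
        raise ValueError("expected a 16x16 macroblock")
    blocks = []
    for by in (0, 8):          # top half, then bottom half
        for bx in (0, 8):      # left half, then right half
            blocks.append([row[bx:bx + 8] for row in macroblock[by:by + 8]])
    return blocks


# Number of 8x8 chrominance blocks per macroblock for the two
# cases mentioned above.
CHROMA_BLOCKS = {"4:2:0": 2, "4:2:2": 4}

# Example: a macroblock whose samples are numbered 0..255 in raster order.
mb = [[y * 16 + x for x in range(16)] for y in range(16)]
luma = split_luma(mb)
print(len(luma), len(luma[0]), len(luma[0][0]))  # prints "4 8 8"
```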
In an ideal case, the pixels in macroblocks TN and TN+1 are identical, even if the macroblocks have different positions in their respective image frames. Under these circumstances, the pixel information in macroblock TN+1 is redundant to that in macroblock TN. Compression is achieved by substituting the positional translation between macroblocks TN and TN+1 for the pixel information in macroblock TN+1. In this simplified example, a single translation vector (Δx, Δy) is designated for the video information associated with the 256 pixels in macroblock TN+1.
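The search for the most similar macroblock and the resulting translation vector may be illustrated by an exhaustive block matching sketch. The similarity measure used here, the sum of absolute differences (SAD), and the limited search window are assumptions chosen for illustration; the MPEG standards do not mandate a particular search strategy, and the names sad and best_match are hypothetical.

```python
def sad(cur, cx, cy, ref, rx, ry, size):
    """Sum of absolute differences between a size x size block of
    `cur` at (cx, cy) and a block of `ref` at (rx, ry)."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
    return total


def best_match(cur, x, y, ref, size=16, search=7):
    """Exhaustively search `ref` within +/- `search` pixels of (x, y)
    for the block most similar to the block of `cur` at (x, y).
    Returns (cost, dx, dy), where (dx, dy) is the translation vector."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = x + dx, y + dy
            # Skip candidate blocks that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, x, y, ref, rx, ry, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


# Ideal case: the current frame is the reference frame translated
# by 3 pixels horizontally and 2 pixels vertically.
ref = [[(7 * x + 13 * y + x * y) % 251 for x in range(32)] for y in range(32)]
cur = [[ref[min(y + 2, 31)][min(x + 3, 31)] for x in range(32)] for y in range(32)]
print(best_match(cur, 8, 8, ref))  # prints "(0, 3, 2)"
```

A cost of zero corresponds to the ideal case described above, in which the 256 pixels of the macroblock are fully represented by the single vector (3, 2).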
With prior art MPEG compression, or encoding, routines, each macroblock of pixels, or coefficients, is encoded by a variable length encoding (VLE) unit, and then sent to a compressed output interface as part of an encoded bitstream. In constant bitrate (CBR) encoding, the average compressed output, which consists of headers plus VLE unit output, must match a user-selected bitrate. The encoding system translates the bitrate into target bits per picture and subsequently into target bits per block. The bits used by the headers are predictable, but the bits used by the VLE unit output are variable. If the VLE unit passes the first N bits of its output per block, where N is the target bits per block, then a constant bitrate can be achieved. If the number of bits being used per block is known in advance, the speed of the encoding process can be increased by eliminating the time needed to wait for the actual number of bits to be reported. A sophisticated look-ahead bit production scheme is needed to accomplish this.
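The translation from a selected bitrate to target bits per picture and then to target bits per block may be sketched as follows. This is a simplified illustration under stated assumptions: the budget is divided evenly among pictures and among blocks, the header cost per picture is a fixed hypothetical value, and the example figures (6,000,000 bits per second, 30 pictures per second, 2,000 header bits, 8,100 blocks per picture) are not taken from the text. All function names are hypothetical.

```python
def target_bits_per_picture(bitrate_bps, pictures_per_second):
    """Evenly divide the selected bitrate among pictures."""
    return bitrate_bps // pictures_per_second


def target_bits_per_block(bits_per_picture, header_bits, blocks_per_picture):
    """Subtract the predictable header bits, then divide the remainder
    evenly among the blocks of the picture."""
    return (bits_per_picture - header_bits) // blocks_per_picture


def pass_first_n_bits(vle_output, n):
    """Pass only the first n bits of a block's VLE output, given here
    as a string of '0' and '1' characters."""
    return vle_output[:n]


per_picture = target_bits_per_picture(6_000_000, 30)
per_block = target_bits_per_block(per_picture, 2_000, 8_100)
print(per_picture, per_block)  # prints "200000 24"

# A block whose VLE output exceeds the target is cut to N bits;
# a shorter output passes unchanged.
print(len(pass_first_n_bits("1" * 40, per_block)))  # prints "24"
print(len(pass_first_n_bits("1" * 10, per_block)))  # prints "10"
```

The last line shows why the VLE output is the variable part of the budget: a block may produce fewer than N bits, so the number of bits actually used is not known until it is reported, which is the delay a look-ahead scheme seeks to avoid.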