This invention relates generally to computer systems and more specifically to efficient data transmission between MPEG video compression stages.
Decompression and compression of video and audio data are used for video playback and for teleconferencing applications. Video playback and teleconferencing applications require compression methods that are capable of reducing video frame data to the smallest number of bits that can accurately represent the original signal. The main reasons for this are to enable real-time transmission of the compressed files across integrated services digital network (ISDN) lines and across standard telephone (POTS) lines, and to reduce the required amount of data storage space.
There are many types of video compression and decompression techniques provided in the art. Seven of these techniques are the MPEG, MPEG-2 and MPEG-4 standards developed by the Moving Picture Experts Group, the IPEG standard, the JPEG standard developed by the Joint Photographic Experts Group, the P×64 standards, and the H.26x video teleconferencing standards. Each standard uses a variety of encoding methods for encoding frames of sound and video data. For example, the MPEG standards use a combination of Huffman run-level encoding, quantization, discrete cosine transform (DCT), and motion compensation to compress, or encode, sound and video data. Regardless of the standard that is used, the procedures used to compress a file are simply reversed to uncompress, or decode, that file.
The MPEG procedures used during decompression of compressed data are performed in a pipelined manner as follows. First, a compressed data file is accessed by the system that is to perform the decompression. The compressed file consists of variable-length codes, referred to as Huffman run-level codes, which represent patterns of logical ones and zeroes. The Huffman run-level codes enable those patterns to be represented in a manner that occupies a significantly smaller amount of memory than the patterns otherwise would. For example, the shortest Huffman run-level codes represent patterns of logical ones and zeroes that are most frequently encountered. Likewise, the longest Huffman run-level codes represent patterns of logical ones and zeroes that are least frequently encountered. Accordingly, the most frequently encountered patterns are replaced with the shortest Huffman run-level codes, thereby producing a significant reduction in storage space.
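The relationship between pattern frequency and code length described above can be sketched with a generic Huffman construction. This is a minimal illustration only: the function names `build_code_table`, `encode` and `decode` and the sample (run, level) pairs are hypothetical, and the MPEG standards use fixed, predefined code tables rather than tables built from the data as is done here.

```python
import heapq
from collections import Counter

def build_code_table(symbols):
    """Return {symbol: bitstring}; more frequent symbols get shorter codes."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: partial_code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes.
        w0, _, t0 = heapq.heappop(heap)
        w1, _, t1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t0.items()}
        merged.update({s: "1" + c for s, c in t1.items()})
        heapq.heappush(heap, (w0 + w1, tie, merged))
        tie += 1
    return heap[0][2]

def encode(symbols, table):
    """Concatenate the variable-length code of each symbol."""
    return "".join(table[s] for s in symbols)

def decode(bits, table):
    """Parse a bit string back into symbols (codes are prefix-free)."""
    inverse = {code: sym for sym, code in table.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out

# Hypothetical (run, level) patterns with very different frequencies.
patterns = [(0, 1)] * 8 + [(1, 1)] * 4 + [(0, 2)] * 2 + [(3, -1)]
table = build_code_table(patterns)
bits = encode(patterns, table)
```

In this sketch the most frequent pattern receives a one-bit code, so the encoded stream is much shorter than a fixed-width representation of the same fifteen patterns.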
When the system accesses a compressed file, the file is parsed to extract the Huffman run-level codes. The Huffman run-level codes are then reconverted into the patterns of logical ones and zeroes that they represent. Those patterns will be referred to as coefficients. Typically the coefficients are arranged in groups of sixty-four, and further arranged in eight-by-eight matrices organized in the order in which they are translated from the run-level codes. Such a matrix consists of storage locations in a memory storage unit. Those storage locations are logically arranged in a row and column configuration and are accessed with respect to their relative position within the matrix.
It should be noted that although eight-by-eight matrices of coefficients are typically used in the art, four-by-four matrices will be used for simplicity of illustration. One of ordinary skill in the art will be able to scale the illustrations appropriately to the eight-by-eight implementation.
For illustration purposes, consider a group of sixteen coefficients (C_n), each having eight bits of data. The coefficients are arranged in the following four-by-four coefficient matrix, where C_1 is the first coefficient translated: ##EQU1##
The second stage of the decompression pipeline is the inverse quantization stage, wherein an element-wise multiplication is performed. The element-wise multiplication multiplies each of the coefficients in the four-by-four matrix by the corresponding quantization factor (QF_n) stored in a quantization matrix. The quantization matrix is the same size as the coefficient matrix, in this case 4×4. The multiplication is performed as follows: ##EQU2##
For example, Q_1 is the product of coefficient C_1 and quantization factor QF_1. Therefore, the inverse quantization operation scales each coefficient by the associated quantization factor. In this manner, coefficients can be stored using a smaller representative number of bits and, upon inverse quantization, the coefficients are returned to their original representation.
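The element-wise multiplication can be sketched as follows. The function name `inverse_quantize` and the numeric values are hypothetical and chosen only for illustration; the sketch assumes both matrices are held as lists of rows.

```python
def inverse_quantize(coeffs, qfactors):
    """Multiply each coefficient C_n by its quantization factor QF_n."""
    if len(coeffs) != len(qfactors) or any(
        len(c_row) != len(q_row) for c_row, q_row in zip(coeffs, qfactors)
    ):
        raise ValueError("coefficient and quantization matrices must match in size")
    return [
        [c * q for c, q in zip(c_row, q_row)]
        for c_row, q_row in zip(coeffs, qfactors)
    ]

# Hypothetical 4x4 coefficient and quantization matrices.
C = [[12, -3, 1, 0],
     [4, 2, 0, 0],
     [-1, 0, 0, 0],
     [0, 0, 0, 0]]
QF = [[8, 16, 19, 22],
      [16, 16, 22, 24],
      [19, 22, 26, 27],
      [22, 24, 27, 29]]
Q = inverse_quantize(C, QF)  # Q[0][0] is C_1 * QF_1
```

Each product generally needs more bits than the stored coefficient did, which is why the scaled values occupy wider words than the eight-bit inputs.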
Upon completion of the inverse quantization operation, the coefficients are each represented by a sixteen-bit word. The resulting sixteen-bit coefficients are packed into eight longwords (32-bit words) in the following arrangement: ##EQU3##
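Packing sixteen sixteen-bit coefficients into eight longwords can be sketched as below. The function names are hypothetical, and placing the first coefficient of each pair in the low half of the longword is an assumption made for this sketch; the arrangement referenced above may order the halves differently.

```python
def pack_longwords(words):
    """Pack pairs of 16-bit values into 32-bit longwords.

    Assumption: the first value of each pair occupies the low 16 bits.
    Negative values are stored in 16-bit two's complement form.
    """
    if len(words) % 2:
        raise ValueError("an even number of 16-bit words is required")
    packed = []
    for i in range(0, len(words), 2):
        low = words[i] & 0xFFFF
        high = words[i + 1] & 0xFFFF
        packed.append((high << 16) | low)
    return packed

def unpack_longwords(longwords):
    """Recover the signed 16-bit values from packed longwords."""
    def to_signed(value):
        return value - 0x10000 if value & 0x8000 else value

    words = []
    for lw in longwords:
        words.append(to_signed(lw & 0xFFFF))
        words.append(to_signed((lw >> 16) & 0xFFFF))
    return words

# Sixteen hypothetical inverse-quantized coefficients -> eight longwords.
coefficients = [96, -48, 19, 0, 64, 32, 0, 0, -19, 0, 0, 0, 0, 0, 0, 0]
longwords = pack_longwords(coefficients)
```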
The coefficients in the above-mentioned matrix are input to the third stage of the decompression pipeline, referred to as the inverse discrete cosine transform stage. When the files are compressed, a discrete cosine function is applied to each eight-by-eight block of coefficients using the following equation: ##EQU4##
To reverse the effects of the discrete cosine transform, an inverse discrete cosine function is performed, thereby restoring the original data. The inverse discrete cosine function is applied using the following equation: ##EQU5##
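For reference, the eight-by-eight discrete cosine transform and its inverse are commonly written in the textbook form below. This is the standard formulation rather than a reproduction of the equations referenced above, and the symbols f, F, C, x, y, u and v are chosen here for illustration, so the notation may differ.

```latex
% Forward 8x8 DCT: spatial samples f(x,y) -> frequency coefficients F(u,v)
F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y)\,
         \cos\frac{(2x+1)u\pi}{16}\, \cos\frac{(2y+1)v\pi}{16}

% Inverse 8x8 DCT: frequency coefficients F(u,v) -> spatial samples f(x,y)
f(x,y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u)\, C(v)\, F(u,v)\,
         \cos\frac{(2x+1)u\pi}{16}\, \cos\frac{(2y+1)v\pi}{16}

% Normalization factor
C(k) = \begin{cases} 1/\sqrt{2}, & k = 0 \\ 1, & k > 0 \end{cases}
```

The two cosine factors depend on separate index pairs, (x, u) and (y, v), which is what allows the double sum to be evaluated one dimension at a time.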
Because the two-dimensional discrete cosine transform is an orthogonal function with orthonormal basis vectors, it can be performed as a series of one-dimensional row transforms followed by a series of one-dimensional column transforms. Accordingly, the inverse discrete cosine transform operation is also performed in two one-dimensional portions, i.e., a series of row transforms followed by a series of column transforms. The row operation portion is typically performed first. The sixteen-bit data in the matrix of the inverse quantization stage is re-ordered in the following manner and input to the inverse discrete cosine transform row operation: ##EQU6##
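The row-column decomposition can be checked numerically with a small sketch. The function names are hypothetical, the orthonormal scaling is an assumption of this sketch, and floating-point arithmetic is used for clarity where a real decoder would typically use fixed-point arithmetic. The separable version transposes between the row pass and the column pass so that the same one-dimensional routine serves both.

```python
import math

def _scale(k, n):
    """Orthonormal DCT scale factor for frequency index k."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

def idct_1d(vec):
    """One-dimensional inverse DCT of a list of coefficients."""
    n = len(vec)
    return [
        sum(
            _scale(u, n) * coeff * math.cos((2 * x + 1) * u * math.pi / (2 * n))
            for u, coeff in enumerate(vec)
        )
        for x in range(n)
    ]

def transpose(matrix):
    """Exchange the rows and columns of a square matrix."""
    return [list(col) for col in zip(*matrix)]

def idct_2d_separable(block):
    """Row transforms, then column transforms (via a transpose)."""
    rows_done = [idct_1d(row) for row in block]
    cols_done = [idct_1d(col) for col in transpose(rows_done)]
    return transpose(cols_done)  # back to the original orientation

def idct_2d_direct(block):
    """Direct evaluation of the two-dimensional double sum, for comparison."""
    n = len(block)
    result = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            total = 0.0
            for u in range(n):
                for v in range(n):
                    total += (
                        _scale(u, n) * _scale(v, n) * block[u][v]
                        * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                        * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                    )
            result[x][y] = total
    return result
```

For an N-by-N block, the direct form evaluates N² terms for each of the N² outputs, whereas the separable form evaluates N terms per output in each of its two passes; the two transposes are the extra work the separable form introduces.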
Because each of the elements typically includes sixteen bits of data, each row of the matrix represents two longwords. The coefficients are output from the inverse quantization stage in the same order that the row transform operation requires. Therefore, the individual words are not re-ordered but are simply packed into the two-longword pairs.
Conversely, the column operation portion of the inverse discrete cosine transform requires a significantly different configuration of coefficients from that which is required for the row operation portion. Specifically, the rows and columns of the matrix used in the row operations need to be exchanged, or transposed. Typically a transpose operation is required to arrange the coefficients output from the inverse quantization stage into the following order: ##EQU7##
The transpose operation is performed by copying the coefficients into general purpose registers and then reordering them.
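The cost of such a transpose on packed data can be sketched as follows. The names `pack_pairs` and `transpose_packed` are hypothetical, a Python list stands in for the general purpose registers, and the low-half-first packing order is an assumption of this sketch.

```python
def pack_pairs(words):
    """Pack 16-bit values two per longword (first value in the low half)."""
    return [
        ((words[i + 1] & 0xFFFF) << 16) | (words[i] & 0xFFFF)
        for i in range(0, len(words), 2)
    ]

def transpose_packed(longwords, n=4):
    """Transpose an n-by-n matrix of 16-bit words stored as packed longwords."""
    # Step 1: copy every 16-bit half out of its longword into scratch
    # storage (a stand-in for general purpose registers).
    regs = []
    for lw in longwords:
        regs.append(lw & 0xFFFF)
        regs.append((lw >> 16) & 0xFFFF)
    if len(regs) != n * n:
        raise ValueError("expected n*n 16-bit words")
    # Step 2: reorder so that the element at (row, col) comes from (col, row).
    reordered = [regs[col * n + row] for row in range(n) for col in range(n)]
    # Step 3: repack the reordered words into longwords.
    return pack_pairs(reordered)

# Words 1..16 in row-major order; after the transpose the first row holds
# the first column of the original matrix.
packed = pack_pairs(list(range(1, 17)))
transposed = transpose_packed(packed)
```

Every coefficient is unpacked, moved and repacked individually, so the work grows with the number of coefficients in each block and is repeated for every block in the frame.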
After the inverse discrete cosine transform operation is complete, the resulting error coefficients (E_n) remain in the same order as the data input to the column operation. Accordingly, the resulting data is not ordered in raster order, i.e., the order in which the data is arranged on an output display. Therefore, the rows and columns of the matrix are again transposed before the error data is input to the next stage in the decompression pipeline, i.e., the motion compensation stage. After the rows and columns are transposed, the resulting error coefficients are arranged as follows: ##EQU8##
The motion compensation stage adds the error coefficients to an associated motion vector, generated by a motion estimation stage, to produce actual pixel data. The motion estimation stage of the decompression pipeline compares the value of each pixel in the matrix to the value of each surrounding pixel in a consecutive frame. Based on those values, the operation determines which direction the pixels are moving and then determines a local gradient, i.e., the direction of greatest change. The local gradient is represented as a vector (m_X, m_Y) which, when added to a pixel's position in the prior frame, gives that pixel's position in the current frame. That vector-adding computation is referred to as motion compensation and requires the pixel data to be in raster order. The data should also be in raster order so that the uncompressed pixel data can easily be displayed on an output device as it is output from the decompression pipeline. When data is arranged in raster order, it is arranged in the order in which it is to be displayed on the output device. Accordingly, the pixel that is to be displayed at the top left corner of the output device is the first pixel in the matrix. The other pixels in the matrix are those which follow from left to right and from top to bottom, with respect to the output display device.
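The vector-adding computation can be sketched for a single small frame as below. This is a simplified illustration under several assumptions that do not come from the text: one motion vector applies to every pixel, source positions that fall outside the frame are clamped to its edge, and results are clamped to the eight-bit range 0 to 255. The function name `motion_compensate` is hypothetical.

```python
def motion_compensate(prior, error, motion_vector):
    """Rebuild the current frame from the prior frame plus error data.

    A pixel at (x, y) in the current frame is taken from position
    (x - m_x, y - m_y) in the prior frame, since adding the vector to the
    prior position gives the current position. Pixels are produced in
    raster order: left to right, then top to bottom.
    """
    m_x, m_y = motion_vector
    height, width = len(prior), len(prior[0])
    current = []
    for y in range(height):              # top to bottom
        row = []
        for x in range(width):           # left to right
            # Assumption: clamp out-of-frame source positions to the edge.
            src_x = min(max(x - m_x, 0), width - 1)
            src_y = min(max(y - m_y, 0), height - 1)
            value = prior[src_y][src_x] + error[y][x]
            # Assumption: pixel values are limited to 0..255.
            row.append(min(max(value, 0), 255))
        current.append(row)
    return current

# Hypothetical 4x4 prior frame, a one-pixel shift to the right, and a
# single non-zero error term.
prior = [[10 * y + x for x in range(4)] for y in range(4)]
error = [[0] * 4 for _ in range(4)]
error[0][1] = 5
current = motion_compensate(prior, error, (1, 0))
```

Because the loop visits pixels in raster order, the error data must be indexed in that same order, which is why a transposed error matrix cannot be consumed directly.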
Such transpose operations are performed thousands of times for each frame of data that is decompressed, which increases the duration of a decompression operation. Accordingly, it is desirable to minimize or eliminate the transpose operations. Further, in order to decompress and display video and audio data in real time, the data must be communicated through the operational stages in a highly efficient manner. The current manner of using transpose operations does not lend itself to such efficient operation.