1. Field of the Invention
This invention relates generally to encoding/decoding data processing systems. More particularly, this invention relates to the architecture of a data encoding/decoding system wherein the variable-length decoder, run-length decoder, and data buffer are arranged in a particular order to achieve a high process through-put rate.
2. Description of the Prior Art
The speed of a real-time data process such as a variable-length decoding process may often limit the through-put of the entire data handling system. In the past decade, the advances made in electronics, computers and data communication have generated a tremendous need to transmit a large amount of data at very high speed and to store them in memory storage apparatus. In order to satisfy these needs, a data item which is represented by a bit-stream is first compressed before transmission and storage. A code-word with a smaller number of bits is used to encode the original bit-stream, whereby the data transmission time can be shortened and the encoded data can be stored in less memory space than that required by the data as represented by the bit-stream in its original form.
Among several data compression techniques, variable-length encoding, wherein the code words are allowed to have a variable number of bits, is frequently chosen because it generally achieves higher encoding efficiency than fixed-length encoding methods. On the other hand, a data receiver, upon receiving these encoded data items, must first perform a decoding process in order to properly recognize and then use these data. In performing a decoding operation on a plurality of encoded data items each having a variable length, the decoding operation is in most cases more time consuming than a decoding operation applied to fixed-length encoded data, since there is no prior knowledge for determining how many of the encoded bits must be processed in order to generate a decoded bit-stream.
Another commonly used technique to increase the data processing speed is to use a plurality of processors and to configure these processors such that the data processing operations can be independently and simultaneously performed in parallel. For fixed-length decoding, a bit-stream can be easily segmented into a plurality of sub-streams each with a predefined length; each sub-stream can then be processed in parallel by a multi-processor decoding system. The parallel processing technique is, however, not applicable to variable-length decoding because the number of bits in the incoming bit-stream which must be processed in order to decode a word is unknown; each incoming bit must be serially processed until a code-word is found. Therefore, if the speed of a variable-length decoding process limits the over-all system through-put, the technique of parallel processing is not available to overcome this system limitation.
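The serial constraint described above can be sketched as follows. The prefix code table and symbols here are hypothetical, chosen only to illustrate why the decoder cannot know where the next code word begins until the current one has been fully decoded:

```python
# Hypothetical prefix (variable-length) code table -- for illustration only;
# a real system derives its Huffman table from symbol statistics.
CODE_TABLE = {"0": "A", "10": "B", "110": "C", "111": "D"}

def variable_length_decode(bits):
    """Decode a bit-stream one bit at a time.

    Each output symbol consumes an unknown number of bits, so the
    position of the next code word is not known until the current
    one is complete -- this is what defeats naive parallel decoding.
    """
    symbols = []
    current = ""
    for bit in bits:
        current += bit
        if current in CODE_TABLE:          # a complete code word found
            symbols.append(CODE_TABLE[current])
            current = ""                   # begin the next code word
    return symbols

print(variable_length_decode("011010"))    # ['A', 'C', 'B']
```

By contrast, a fixed-length stream could be cut into equal segments at known offsets and each segment decoded on a separate processor.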
One specific example wherein variable-length decoding may limit the system through-put is a digital HDTV. For the purpose of explanation, the compression process performed by a digital video encoder 10 as shown in FIG. 1 is first described. The analog red, green and blue (R, G, B) inputs are processed by three low pass filters (LPFs) 12-1, 12-2 and 12-3 before they are digitized. The low pass filters 12-1, 12-2 and 12-3 are used to provide adequate rejection of aliasing components and other spurious signals. The analog to digital converters (A/Ds) 14-1, 14-2, and 14-3 then digitize the signals before the RGB signals are digitally converted to YUV color space by utilizing an RGB to YUV matrix 16 which conforms to the SMPTE 240M colorimetry.
The resolution of chrominance data can be reduced relative to the luminance resolution without seriously affecting the perceived image quality. The U and V chrominance components are decimated horizontally by a factor of four and vertically by a factor of two by two decimators 18-1 and 18-2. The luminance signal (Y) bypasses the chrominance preprocessor whereby full resolution is maintained. The chrominance components are then multiplexed with the luminance component, one block at a time, in a multiplexer 20.
Each block of pixels, which generally comprises eight pixels horizontally and eight pixels vertically, is then transformed into a new block of Discrete Cosine Transform (DCT) coefficients by a DCT transformer 22. The block size of eight-by-eight is chosen because the efficiency of transformation is not substantially improved with increased block size, while a block size larger than eight-by-eight often requires a greater degree of circuit complexity. The transformation is applied to each block until the entire image is transformed.
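The eight-by-eight transformation may be sketched as a naive two-dimensional DCT-II; the implementation below is illustrative only, and a hardware DCT transformer such as the transformer 22 would use a fast factorization rather than these direct nested loops:

```python
import math

N = 8  # block size used by the encoder described above

def dct_2d(block):
    """Naive 8x8 two-dimensional DCT-II (illustrative, not optimized)."""
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            cu = math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)
            cv = math.sqrt(1 / N) if v == 0 else math.sqrt(2 / N)
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = cu * cv * s
    return out

# A flat block concentrates all of its energy in the DC coefficient
# out[0][0]; the remaining sixty-three AC coefficients are near zero.
flat = [[100.0] * N for _ in range(N)]
coeffs = dct_2d(flat)
```

This energy compaction is what makes the subsequent quantization and run-length coding effective: for typical imagery most AC coefficients quantize to zero.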
In order to improve the coding efficiency, a small adjustment is made to each of the image data by first weighting each of the DCT coefficients and then selecting an eight-bit weighting factor for transmission to the decoder. Once selected, the weighting factors remain unchanged. This task is performed by a coefficient quantizer 24 which utilizes an eight-by-eight weighting matrix wherein each matrix element is a scaling factor. The compressibility of the image data is improved through the quantization process wherein the amplitudes of the transform coefficients are reduced. A statistical encoding technique is then applied to compress these image data. A Huffman coding is used by a variable-length encoder 28. In order to apply the Huffman coding, the eight-by-eight DCT coefficients are serialized into a sequence of sixty-four, and an "amplitude/run-length" code is applied. Scanning the sequence of sixty-four, an event is defined to occur each time a coefficient is encountered with an amplitude not equal to zero. A code word is then assigned indicating the amplitude of the coefficient and the number of zeros, i.e., the run-length, preceding it.
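The event-scanning step described above can be sketched as follows; the function name and the sample coefficient sequence are illustrative assumptions, not part of the encoder 28 itself:

```python
def run_length_events(coeffs):
    """Scan a serialized sequence of DCT coefficients and emit one
    (run_of_zeros, amplitude) event per nonzero coefficient, as
    described for the amplitude/run-length coding step above.
    """
    events = []
    run = 0
    for c in coeffs:
        if c == 0:
            run += 1           # extend the run of zeros
        else:
            events.append((run, c))
            run = 0            # reset after each event
    return events

# e.g. [5, 0, 0, -3, 0, 2, 0, 0] -> [(0, 5), (2, -3), (1, 2)]
```

Each event would then be mapped to a Huffman code word, so frequent events (short runs, small amplitudes) receive the shortest codes.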
The aforementioned compression steps constitute spatial processing. In addition to the spatial correlations, the image data also have interframe temporal correlations which can be utilized to further compress the video signals. A high degree of temporal correlation exists whenever there is little movement from one frame to the next. Even if there is movement, high temporal correlation may still exist depending on the spatial characteristics and the changes of the images from one frame to the next. In order to quantify the interframe temporal correlation, the quantized data from the quantizer 24 are normalized in a normalization processor 32 and then inversely transformed by an inverse DCT transformer 34. The frame delay is taken into account by a frame delay processor 36. A prediction of how the next frame will appear is made by a motion estimator 38, and a motion compensation component is computed by the motion compensator 40. A temporal differential encoding (DPCM) is used to generate a motion vector of a superblock, which has a horizontal dimension of four DCT blocks and a vertical dimension of two DCT blocks. The sizing is compatible with the four-times horizontal sub-sampling and two-times vertical sub-sampling of the chrominance components, thus allowing the same motion vector to be used to displace a single chrominance DCT block.
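The dimensional compatibility noted above can be verified with a short calculation (the variable names are illustrative):

```python
# A superblock spans 4x2 DCT blocks of 8x8 pixels each, i.e. 32x16
# luminance pixels. After the 4:1 horizontal and 2:1 vertical
# chrominance decimation, that same area maps onto exactly one 8x8
# chrominance block, so a single motion vector serves both.
block = 8                 # DCT block dimension in pixels
sb_width = 4 * block      # 32 luminance pixels wide
sb_height = 2 * block     # 16 luminance pixels tall

assert sb_width // 4 == block     # chroma width after 4:1 decimation
assert sb_height // 2 == block    # chroma height after 2:1 decimation
```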
The motion compensation scheme is therefore integrated into the overall system design as shown in FIG. 1, wherein an estimate of the image is first generated using the motion compensation. The difference between this estimate and the actual image is then transform-coded, and the transform coefficients are then normalized and statistically coded by applying the Huffman coding process. A first-in-first-out (FIFO) buffer 44 is implemented as a rate buffer which matches the variable rate of the Huffman-coded data to a fixed output rate to maintain constant channel transmission. The FIFO buffer has sufficient storage space to accommodate input data rate variations. The status of the FIFO buffer is continuously monitored and controlled to prevent the buffer from overflowing or underflowing.
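The rate-buffering behavior described above can be sketched as a toy occupancy model; the class name and the simple read/write policy shown are assumptions for illustration, not details of the FIFO buffer 44:

```python
class RateBuffer:
    """Toy model of a FIFO rate buffer: variable-rate input from the
    Huffman coder, fixed-rate output to the channel, with fullness
    monitoring so a controller can act before overflow or underflow."""

    def __init__(self, capacity_bits):
        self.capacity = capacity_bits
        self.fill = 0

    def write(self, nbits):
        """Accept a variable-size burst of coded bits."""
        if self.fill + nbits > self.capacity:
            raise OverflowError("rate buffer overflow")
        self.fill += nbits

    def read(self, nbits):
        """Drain bits at the fixed channel rate; on underflow the
        channel would carry stuffing bits instead of coded data."""
        nbits = min(nbits, self.fill)
        self.fill -= nbits
        return nbits

    def fullness(self):
        """Fraction occupied; a controller would coarsen quantization
        as this rises and relax it as this falls."""
        return self.fill / self.capacity
```

In the real encoder this fullness measurement is the feedback signal that keeps the variable Huffman output matched to the constant channel rate.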
FIG. 2 shows the block diagram of a digital video decoder 50 for a digital HDTV. The sequence of operations of the decoder 50 is exactly the reverse of that of the encoder 10 as shown in FIG. 1. The video data are first received by a FIFO rate buffer 52 and then decoded by a variable-length decoder 54. The decoded data are then inversely normalized (step 56) and inversely DCT transformed (step 58) before adjustments for motion compensation components are subtracted (steps 60 and 62). The video signals are then de-multiplexed by the demultiplexer 64 into YUV components, where the U and V components are interpolated by two interpolators 66 and 68 before the YUV signals are converted to RGB space through a YUV to RGB matrix 70. A digital to analog conversion is performed by DAC 72, and the analog RGB signals are then further processed by three low pass filters, i.e., LPFs 74, 76 and 78, before they are displayed.
Because the video data received by the FIFO buffer 52 are compressed data and since the video data are transmitted via constant bit rate channels, the FIFO buffer 52, as implemented in the decoding system 50, has the advantage that the memory space of the buffer 52 can be maintained at a relatively low level. Additionally, under the current JPEG, MPEG, or H.261 standards, the luminance and chrominance pixel rates are less than 10 MHz. Such a pixel rate can be satisfied by the decoder system 50 since a variable-length decoder with a decoding rate of twenty to twenty-five MHz is commercially available on a VLD chip. The through-put of the decoding system 50 is sufficient to support the current digital image compression systems.
In a digital HDTV system, much higher pixel rates are required. A standard resolution of 960 vertical pixels by 1440 horizontal pixels is displayed at a rate of thirty frames per second. The chrominance pixel rate is approximately half the luminance pixel rate. A pixel rate of more than sixty million pixels per second is therefore required. The system clock rate has to be about seventy MHz (70 MHz) in order to handle the pixel rate and to manage additional overhead in decoding and transmitting the appropriate control signals. A decoding system such as the decoder system 50 is not able to meet this requirement since the highest processing rate achievable by the current VLD is still below thirty MHz. The processing speed of the variable-length decoder 54 thus limits the through-put of the decoding system 50.
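The pixel-rate figures quoted above can be checked with a short calculation:

```python
# Worked check of the HDTV pixel-rate figures quoted above.
lines, pixels_per_line, frames_per_sec = 960, 1440, 30

luma_rate = lines * pixels_per_line * frames_per_sec  # 41,472,000 pixels/s
chroma_rate = luma_rate // 2                          # ~half the luminance rate
total_rate = luma_rate + chroma_rate                  # ~62.2 million pixels/s

# Exceeds sixty million pixels per second, as stated above.
assert total_rate > 60_000_000
```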
FIG. 3 illustrates the bit-rate requirements of the digital HDTV system. The incoming video data are received by the data buffer at a constant rate of 20M bits/second. The video data are then processed by the VLD 54 and a run-length decoder (RLD) 55. The speed of the VLD process is driven by the image pixel requirements, which average 5 MHz over long periods of time; however, bursts of 70 MHz occur in order to satisfy the pixel rate for image display. The speed of the VLD 54 thus becomes a bottleneck limiting the performance of the entire decoding system.
Therefore, a need still exists in the art to overcome the through-put limitation when a slower, variable-length decoding process is involved whereby the performance of the entire encoding/decoding system can be improved. Specifically, in the art of digital HDTV, the difficulty caused by the speed of the variable-length decoding has to be resolved in order for the high quality HDTV images to be displayed at the designed rates and resolutions.