1. Field of the Invention
The present invention relates to an apparatus and method for selecting the length of a variable length coding bitstream by using a neural network.
2. Description of the Related Art
FIG. 1 is a diagram of a structure of a conventional video data compression apparatus. The conventional video data compression apparatus comprises a discrete cosine transform (DCT) unit 11, a quantization unit 12, and a variable length coding unit 13.
Digital data compression, and the compression of video data in particular, is an essential element of a variety of multimedia application environments. However, because a vast amount of information must be processed during video data compression, there are many restrictions on efficiently transmitting, storing, or processing video data. In order to solve these problems, international standards, such as Moving Picture Experts Group (MPEG)-2, MPEG-4, H.263, and H.26L, define compression stream syntaxes and decoding processes.
Generally, compression methods are classified into lossless compression methods and lossy compression methods. If a lossless compression method is used for characters, diagrams, and ordinary data, a complete reconstruction thereof is possible, but the compression ratio is about 2 to 1 on average. Meanwhile, if data such as video and audio are compressed while allowing a small loss, to the extent that a user cannot perceive the loss, a compression ratio of 10 to 1 or higher can be easily obtained. Among lossy encoding techniques for efficiently compressing video data, the transform encoding technique is the most widely used. In the basic framework of this method, data which are arranged with high spatial correlations are transformed by an orthogonal transformation into frequency components, ranging from low frequency components to high frequency components. Quantization is performed for each frequency component. At this time, the correlation between the frequency components almost disappears and the energy of the signal is concentrated in the low frequency part. Among the frequency domain data obtained by the orthogonal transformation, more bits are allocated to a frequency component in which more energy is concentrated (i.e., whose variance is higher) so that the frequency components can be represented uniformly. Whenever the variance increases by four times (that is, the amplitude increases by two times), one more bit is allocated so that all frequency components have identical quantization error characteristics. Among a variety of orthogonal transformations, the Karhunen-Loeve transformation (KLT) has the highest energy concentration characteristic and theoretically provides the most efficient compression. However, because the transformation must be defined separately for each picture in this method, the KLT cannot be used practically.
A transformation which has a performance close to that of the KLT and can be used practically is the discrete cosine transformation (DCT).
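The bit allocation rule described above (one more bit for each fourfold increase in variance) can be illustrated with a minimal sketch. The function name `extra_bits` and the choice of a reference variance of 1.0 are assumptions made for illustration only; they do not appear in any standard.

```python
import math

def extra_bits(variance, reference_variance=1.0):
    """Bits allocated to a frequency component relative to a reference one.

    Hypothetical helper: each fourfold increase in variance (a doubling of
    amplitude) adds one bit, i.e. 0.5 * log2(variance ratio).
    """
    return 0.5 * math.log2(variance / reference_variance)

# A component with 4x the reference variance gets one more bit,
# and a component with 16x the reference variance gets two more bits.
one_more = extra_bits(4.0)
two_more = extra_bits(16.0)
```

Under this rule the quantization step size scales with the amplitude of each component, which is why the quantization error characteristics come out the same for all components.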
The DCT unit 11 transforms video data into DCT coefficient blocks. In the DCT transformation, each 8×8 array of picture elements is grouped into a block and the transformation is performed in units of blocks. The compression ratio increases with increasing block size, but the implementation of the transformation becomes much more difficult. Based on experiments, an 8×8 array of picture elements has been selected as a compromise between performance and ease of implementation. Generally, in prior art compression techniques, the DCT transformation is used to remove spatial redundancy when pictures are compressed, and motion estimation (ME) and motion compensation (MC) are used to remove temporal redundancy.
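As an illustration of the block transform described above, the following is a minimal sketch of a direct (unoptimized) orthonormal two-dimensional DCT-II on one 8×8 block. The function name `dct_2d` and the flat test block are assumptions for illustration; practical encoders use fast factorized forms rather than this four-level loop.

```python
import math

N = 8  # block size discussed in the text

def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct, unoptimized form)."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out

# A perfectly flat block has all of its energy in the DC coefficient.
flat = [[100] * N for _ in range(N)]
coeffs = dct_2d(flat)
```

For the flat block, only `coeffs[0][0]` is non-zero (up to floating-point rounding), which is the extreme case of the energy concentration in the low frequency part.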
The quantization unit 12 quantizes each of the coefficient values of the DCT coefficient blocks by using a Q value, that is, a quantization parameter. At this time, small values become 0, so that a small amount of information is lost.
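The quantization step can be sketched as follows, assuming a single uniform step size for every coefficient. This is a simplification: the standards named earlier also use per-frequency weighting matrices and specific rounding rules, which are omitted here. The function name `quantize` is hypothetical.

```python
def quantize(coeffs, q):
    """Divide every coefficient by the Q value and round to an integer.

    Simplified uniform quantizer (an assumption for illustration); small
    coefficients round to 0, which is where the information loss occurs.
    """
    return [[int(round(c / q)) for c in row] for row in coeffs]

# With Q = 16, the small coefficients 3.0 and 7.9 are quantized to 0.
levels = quantize([[800.0, 3.0], [-20.0, 7.9]], 16)
```

A larger Q value zeroes more coefficients, which shortens the bitstream produced by the following coding step at the cost of a larger loss.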
The variable length coding unit 13 performs variable length coding (VLC) on the data that have passed the DCT step and the quantization step. This is the final step of the compression process, in which the DCT-transformed, quantized coefficients are losslessly compressed. That is, the coefficients are read out by zigzag scanning, each run of continuously repeated characters (here, 0's) is replaced by an integer indicating the number of repeated characters, and the generated integer string is transformed into binary numbers. At this time, a VLC table is used so that a short code is allocated to a character which has a high probability of occurrence and a long code is allocated to a character which has a low probability of occurrence. This is the reason why the coding is referred to as variable length coding (VLC).
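The steps above (zigzag scanning, replacing runs of 0's by a count, and a table lookup) can be sketched as follows. The code table `TOY_VLC`, the escape format, and all function names are made up for illustration; they are not the tables of JPEG, MPEG, or H.263, and the separate treatment of the DC coefficient in those standards is ignored.

```python
def zigzag_order(n=8):
    """Positions of an n x n block in zigzag order, one anti-diagonal at a time."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))

def run_length(seq):
    """Replace each run of zeros by its length, paired with the next non-zero level."""
    pairs, run = [], 0
    for v in seq:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    return pairs  # trailing zeros are implied by the end-of-block code

# Hypothetical prefix-free table: frequent (run, level) pairs get short codes.
TOY_VLC = {(0, 1): "110", (0, -1): "111", (1, 1): "0110", (1, -1): "0111"}
EOB, ESCAPE = "10", "0000"

def vlc_encode(pairs):
    """Look up every pair; rare pairs use ESCAPE + 6-bit run + 8-bit level."""
    bits = []
    for run, level in pairs:
        code = TOY_VLC.get((run, level))
        if code is None:
            code = ESCAPE + format(run, "06b") + format(level & 0xFF, "08b")
        bits.append(code)
    return "".join(bits) + EOB

block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[1][1] = 5, 1, -1, 1
scanned = [block[r][c] for r, c in zigzag_order()]
pairs = run_length(scanned)
bitstream = vlc_encode(pairs)
```

The point relevant to the discussion that follows is that the bitstream length is known only after every pair has been looked up in the table.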
After the entire compression process described above, the 8×8 matrix is reduced to a combination of 0's and 1's. If the compression process is performed in reverse, the compressed video data are decompressed. The video data compression apparatus of FIG. 1 is used for compressing a still picture into a compression file of the Joint Photographic Experts Group (JPEG) type. When moving pictures are compressed into an MPEG format compression file, an apparatus which performs differential pulse code modulation (DPCM) should be added to the video data compression apparatus. In DPCM, a signal value to be transmitted is estimated based on a signal which has already been transmitted, and the difference between the estimated value and the actual value is encoded and then transmitted. For example, if the signal value of point A is used as the estimated value for point B, the difference between that value and the actual signal value of point B is the estimation error. Instead of the signal of point B, this estimation error is encoded and transmitted.
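The DPCM principle described above can be sketched with the simplest possible predictor, in which the previously transmitted sample serves as the estimate of the next one. The function names and the initial prediction of 0 are assumptions for illustration; in video coding the prediction typically comes from a previously coded picture.

```python
def dpcm_encode(samples):
    """Transmit the estimation error: each sample minus the previous sample."""
    prev, errors = 0, []
    for s in samples:
        errors.append(s - prev)
        prev = s
    return errors

def dpcm_decode(errors):
    """Rebuild the samples by adding each error to the previous reconstruction."""
    prev, out = 0, []
    for e in errors:
        prev += e
        out.append(prev)
    return out

signal = [100, 102, 101, 105]
errors = dpcm_encode(signal)      # [100, 2, -1, 4]
restored = dpcm_decode(errors)
```

After the first sample the errors are small numbers, which the variable length coder can represent with short codes.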
In the prior art, in order to control a bit rate, a quantization parameter, i.e., a Q value, must be adjusted. In order to check whether or not the bit rate desired by a user is output when a given Q value is used, the VLC must be performed repeatedly, with different Q values, until the desired bit rate is output. However, in the VLC, a VLC table must be consulted every time a character is mapped to a code. This causes heavy computational loads, and accordingly, whenever a DCT coefficient block which was quantized by a Q value other than the desired Q value is VLC coded, system resources are wasted. Likewise, in other cases in which processing related to the length of a bitstream should be performed, the VLC must be performed many times until the desired result is obtained, and the waste of system resources increases greatly. In such cases, the processing is usually performed by parallel hardware, and because each hardware unit must have its own VLC tables, the waste of system resources increases further.
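The trial-and-error procedure described above can be sketched as a loop that quantizes and codes the same block once per candidate Q value. Everything here is a stand-in chosen for illustration: `vlc_length` uses a made-up bit cost model in place of a real VLC table, and the coefficient values and the target of 30 bits are arbitrary.

```python
def quantize(coeffs, q):
    """Uniform quantization of an already zigzag-scanned coefficient list."""
    return [int(round(c / q)) for c in coeffs]

def vlc_length(levels):
    """Stand-in for a full VLC pass: bit count under a made-up cost model."""
    bits, run = 0, 0
    for v in levels:
        if v == 0:
            run += 1
        else:
            bits += 2 + run.bit_length() + 2 * abs(v).bit_length()
            run = 0
    return bits + 2  # end-of-block marker

def pick_q_by_trial(coeffs, target_bits, q_values=range(1, 32)):
    """Try Q values in order; every candidate costs one full coding pass."""
    passes = 0
    for q in q_values:
        passes += 1
        length = vlc_length(quantize(coeffs, q))
        if length <= target_bits:
            return q, length, passes
    return None, None, passes

coeffs = [800, 60, -45, 0, 20, 0, 0, -8] + [0] * 56
q, length, passes = pick_q_by_trial(coeffs, target_bits=30)
```

Every pass except the last produces a bitstream that is discarded, which is the waste of resources referred to above.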
In addition to controlling a bit rate, in some cases a compressed video packet must not exceed a predetermined length. In this case, the length of the bitstream that the next macroblock will produce cannot be known without actually VLC coding that macroblock. Accordingly, every block must be VLC coded, and the waste of system resources increases greatly.
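The packet length constraint can be sketched as a greedy packing loop in which the length of each macroblock is learned only by coding it. The function `packetize` and its `coded_length` callback are hypothetical; in the example, each macroblock is represented simply by the number of bits it would occupy after VLC coding.

```python
def packetize(macroblocks, max_bits, coded_length):
    """Group macroblocks into packets of at most max_bits each.

    coded_length(mb) stands in for a full VLC pass over the macroblock.
    A single macroblock longer than max_bits is placed alone in a packet.
    """
    packets, current, used, vlc_calls = [], [], 0, 0
    for mb in macroblocks:
        n = coded_length(mb)  # must code the macroblock just to learn its length
        vlc_calls += 1
        if current and used + n > max_bits:
            packets.append(current)
            current, used = [], 0
        current.append(mb)
        used += n
    if current:
        packets.append(current)
    return packets, vlc_calls

# Macroblocks stand in for their own coded lengths in bits.
packets, vlc_calls = packetize([40, 30, 50, 20], max_bits=80,
                               coded_length=lambda mb: mb)
```

The decision to close a packet depends on a length that is available only after the coding work has already been done.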