The present invention relates to a technique for controlling the bit rate when compressing data such as pictures and audio.
Picture data and audio data stored in storage devices such as hard disks and CD-ROMs are enormous in volume. It is therefore necessary to compress the data so that the amount of data per frame stays within a level (i.e. a data transfer rate) that corresponds to the specified capacity of the CD-ROM or to the capability of the transmission path. JPEG and MPEG are known compression systems for this purpose. In these methods, the data is compressed by combining an orthogonal transformation with variable length coding (reference: ISO/IEC 11172-2). An overview of the methods is given below with reference to FIG. 10.
First, when data to be compressed, for example picture data, is input, a frame is divided into a certain number of blocks (101). Since, according to the standard, the number of effective pixels per line is 720 and the number of lines per frame is 480, the number of pixels per frame is 345,600. One block usually consists of 8 by 8 pixels, so one frame is divided into 5,400 blocks.
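The block count above can be checked with a short sketch. The function name `split_into_blocks` and the list-of-rows frame layout are illustrative assumptions, not part of the described apparatus.

```python
FRAME_WIDTH = 720   # effective pixels per line (from the text)
FRAME_HEIGHT = 480  # lines per frame (from the text)
BLOCK_SIZE = 8      # one block is 8 by 8 pixels (from the text)


def split_into_blocks(frame, block_size=BLOCK_SIZE):
    """Split a frame (a list of pixel rows) into square blocks in raster order."""
    height = len(frame)
    width = len(frame[0])
    blocks = []
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            # Take block_size rows, and block_size pixels from each row.
            block = [row[left:left + block_size]
                     for row in frame[top:top + block_size]]
            blocks.append(block)
    return blocks


# Hypothetical all-zero frame, used only to count pixels and blocks.
frame = [[0] * FRAME_WIDTH for _ in range(FRAME_HEIGHT)]
blocks = split_into_blocks(frame)
pixels_per_frame = FRAME_WIDTH * FRAME_HEIGHT

print(pixels_per_frame)  # 345600
print(len(blocks))       # 5400
```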
Next, orthogonal transformation coding is performed on each of the blocks, thereby reducing the redundancy contained in the picture data (102). Although the DCT (discrete cosine transform) is the transformation most widely employed in MPEG, other orthogonal transformations exist, such as the Karhunen-Loeve transformation and the Fourier transformation. In the DCT, the picture data of each block is transformed into a DC frequency component and AC frequency components. When one block consists of 8 by 8 pixels, a two-dimensional DCT is performed, and the transformed data is stored in an 8 by 8 matrix in order of increasing frequency starting from the origin. In MPEG, the DCT is executed not only on intraframe picture data but also on an interframe motion-compensated prediction error signal. Consequently, for a picture with little or no motion, such as a still picture, the transformed data becomes zero, and the block becomes an ineffective block.
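As a minimal sketch of the transformation step, the following computes an orthonormal two-dimensional DCT-II of one 8 by 8 block directly from its definition. The function name `dct_2d` and the test block are assumptions for illustration; practical encoders use fast factorizations rather than this direct form.

```python
import math

N = 8  # block size used in the text


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block, computed directly (O(N^4))."""
    def scale(k):
        # Normalization factor: the DC term is scaled differently from AC terms.
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


# A block in which every pixel has the same value: all energy lands in the
# DC coefficient at the origin, and every AC coefficient is (numerically) zero.
flat_block = [[100] * N for _ in range(N)]
coeffs = dct_2d(flat_block)
dc = coeffs[0][0]
max_ac = max(abs(coeffs[u][v])
             for u in range(N) for v in range(N) if (u, v) != (0, 0))
print(round(dc, 6))  # 800.0
```

For a prediction error signal that is zero everywhere, as in a still picture, the same function returns zero for every coefficient, which corresponds to the ineffective block mentioned above.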
In order to further reduce the data transfer rate of the signal whose redundancy has been reduced by the orthogonal transformation, the signal is quantized (103). In this processing, the transformed data is divided by a quantizing coefficient that depends on the frequency region. Usually, taking advantage of the fact that human visual perception is not sensitive to the high frequency region, the high frequency components are divided by a large quantizing coefficient. In this way, the high frequency region is coarsely quantized and the data is concentrated in the low frequency region, thereby reducing the total amount of data.
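The quantization step can be sketched as follows. The quantizing table here is a hypothetical one whose entries simply grow with frequency; it is not the default matrix of any standard, and the names `make_quant_table` and `quantize` are assumptions.

```python
N = 8  # block size used in the text


def make_quant_table(base=8, step=4):
    """Hypothetical table: the quantizing coefficient grows with frequency (u + v)."""
    return [[base + step * (u + v) for v in range(N)] for u in range(N)]


def quantize(coeffs, table):
    """Divide each transformed coefficient by its quantizing coefficient and round."""
    return [[int(round(coeffs[u][v] / table[u][v])) for v in range(N)]
            for u in range(N)]


table = make_quant_table()
# Illustrative input: every frequency component has the same magnitude.
coeffs = [[30.0] * N for _ in range(N)]
levels = quantize(coeffs, table)

print(levels[0][0])  # 4  (lowest frequency, divided by 8)
print(levels[7][7])  # 0  (highest frequency, divided by 64)
```

With equal input magnitudes, the low frequency component survives while the highest frequency component is quantized to zero, which is the concentration of data in the low frequency region described above.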
The next step is to perform variable length coding, using Huffman coding or the like, on the frequency components quantized in step 103 (104). For the DC components, the difference from the DC component of a neighboring block is Huffman coded. For the AC components, encoding proceeds as follows. First, a scan called a zigzag scan is performed from the low frequency components to the high frequency components. Then, two-dimensional Huffman coding is performed on the basis of the number of successive ineffective components (i.e. components whose value is zero) and the value of the effective component that follows them.
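The preparation for the Huffman stage can be sketched as below: the DC difference is taken against a neighboring block, and the AC components are zigzag scanned into (run of zeros, value) pairs. The Huffman code tables themselves are omitted, and the function names and the sample block are assumptions for illustration.

```python
N = 8  # block size used in the text


def zigzag_order(n=N):
    """Return (row, col) positions in zigzag order, low to high frequency."""
    order = []
    for s in range(2 * n - 1):
        # All positions on the anti-diagonal where row + col == s.
        diagonal = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            diagonal.reverse()  # even diagonals run from bottom-left to top-right
        order.extend(diagonal)
    return order


def run_level_pairs(block, prev_dc=0):
    """Return the DC difference and the (zero run, value) pairs of the AC terms."""
    dc_diff = block[0][0] - prev_dc
    pairs = []
    run = 0
    for row, col in zigzag_order()[1:]:  # skip the DC position
        value = block[row][col]
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    # Trailing zeros are not emitted; a real coder signals them with an
    # end-of-block code.
    return dc_diff, pairs


# Hypothetical quantized block with three non-zero components.
block = [[0] * N for _ in range(N)]
block[0][0] = 12
block[0][1] = 3
block[2][0] = -1
dc_diff, pairs = run_level_pairs(block, prev_dc=10)

print(dc_diff)  # 2
print(pairs)    # [(0, 3), (1, -1)]
```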
This compression technique has the problem that the data transfer rate is not constant. That is, because the quantized data is encoded by Huffman coding, which is a variable length code, the generated bit rate differs from picture to picture. The data transfer rate, which is obtained by multiplying the generated bit rate for one frame by the number of frames reproduced per second, is therefore not constant either. For example, suppose a still picture is inserted for a few seconds at a break between scenes of a moving picture in order to signal a scene change. If a monotonous still picture (for example, a picture in which all pixels have the same color) continues to be inserted, the compression ratio rises, and only the portion containing the monotonous still picture exhibits a lower data transfer rate than the other portions. When that portion is reproduced, the actual data transfer rate is lower than the data transfer rate stored in the header, so that phenomena such as frame skipping occur. This degrades the reproduction quality (accuracy) of the moving picture.
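The relation between the generated bit rate per frame and the data transfer rate can be illustrated with a few lines of arithmetic. The frame rate and the per-frame bit counts below are hypothetical values chosen only to show the effect.

```python
def data_transfer_rate(bits_per_frame: int, frames_per_second: int) -> int:
    """Data transfer rate = generated bits for one frame x frames per second."""
    return bits_per_frame * frames_per_second


FPS = 30                      # assumed reproduction frame rate
DETAILED_FRAME_BITS = 40_000  # hypothetical frame from an ordinary scene
FLAT_FRAME_BITS = 4_000       # hypothetical single-color still picture

detailed_rate = data_transfer_rate(DETAILED_FRAME_BITS, FPS)
flat_rate = data_transfer_rate(FLAT_FRAME_BITS, FPS)

print(detailed_rate)  # 1200000
print(flat_rate)      # 120000
```

With these assumed numbers, the portion made of flat frames runs at one tenth of the rate of the surrounding scenes, which is the kind of mismatch with the header value described above.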
JP-A-8-46964 discloses a technique for solving the above-mentioned problem in which, for data that has been Huffman coded, the generated bit rate is adjusted in units each consisting of a plurality of blocks. In this technique, the quantizing coefficient is set so that, for the first frame of the input data, the generated bit rate for the entire frame is smaller than but very close to a planned bit rate. After the first frame, the generated bit rate is adjusted in macroblock units, each macroblock being formed by grouping a plurality of blocks. Encoding in macroblock units proceeds as follows. First, each block is quantized and the resulting data is stored in a memory. Next, the data is read out from the memory and encoded while the bit rate generated by each block is monitored in macroblock units. When the encoding of one macroblock is finished, the generated bit rate of the macroblock is compared with an allotted bit rate. When the generated bit rate is smaller, a pseudo data bit string is inserted. The generated bit rate of each macroblock is adjusted in this manner.
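The padding step of this prior art can be sketched roughly as follows. The representation of the coded data as a string of '0' and '1' characters, the filler value, and the function name `pad_macroblock` are all assumptions made for illustration; the cited publication is not reproduced here.

```python
def pad_macroblock(encoded_bits: str, allotted_bits: int, filler: str = "0") -> str:
    """Append pseudo data bits when a macroblock falls short of its allotment."""
    shortfall = allotted_bits - len(encoded_bits)
    if shortfall > 0:
        # Generated bit count is smaller than allotted: insert pseudo data.
        return encoded_bits + filler * shortfall
    # Generated bit count already meets or exceeds the allotment: leave as is.
    return encoded_bits


# Hypothetical coded macroblock of 10 bits with an allotment of 16 bits.
coded = "1011001110"
padded = pad_macroblock(coded, allotted_bits=16)

print(len(coded))   # 10
print(len(padded))  # 16
print(padded)       # 1011001110000000
```

The six trailing bits carry no display data, so the expansion side has to recognize and discard them.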
In the above-described prior art, the generated bit rate of the compressed picture data is judged for each macroblock, and the necessary amount of pseudo data bit string is then added so as to adjust the transfer bit rate. This makes it unavoidable to handle data other than display data at the time of expansion processing, which complicates the algorithm for expansion processing.
It is an object of the present invention to provide a method and an apparatus which make it possible to adjust the generated bit rate easily using existing compression processing, without exerting any influence upon expansion processing, as well as a storage medium which stores a program readable by information processing devices, and an information processing device for executing the program stored in the storage medium.
According to the present invention, in a method of controlling a bit rate at the time of compressing picture data, the generated bit rate is compared with a planned bit rate for each frame of the compressed input data. When the generated bit rate is found to be smaller, pixels forming a regular (or an irregular) pattern are mixed into the picture to be compressed. In this way, in the present invention, a pattern consisting of pixels is added directly over the picture to be compressed, thereby making up for the insufficient bit rate. This makes it possible to control the bit rate after the processing without modifying the algorithm for expansion processing of the compressed data. Moreover, there is no need to remove the added pattern at the time of expansion processing. Also, the pattern to be added may be formed using finer pixels, or its color tone may be harmonized with that of the original picture. These measures further reduce the visual influence exerted upon the picture after compression.
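The control loop described above can be sketched under several loudly stated assumptions. `zlib` is used purely as a stand-in for the JPEG or MPEG encoder, the frame is a flat sequence of 8-bit pixels, and the density schedule, the seed, the small offsets of at most 2 levels, and the stopping rule are illustrative choices rather than details taken from the text.

```python
import random
import zlib


def generated_bits(frame: bytes) -> int:
    """Stand-in for the real encoder: size of the compressed frame in bits."""
    return len(zlib.compress(frame)) * 8


def add_pattern(frame: bytes, density: float, seed: int = 0) -> bytes:
    """Overlay an irregular pattern by nudging a fraction of the pixels slightly."""
    rng = random.Random(seed)
    out = bytearray(frame)
    for i in range(len(out)):
        if rng.random() < density:
            # Small offsets keep the pattern close to the original tone.
            out[i] = max(0, min(255, out[i] + rng.choice((-2, -1, 1, 2))))
    return bytes(out)


def fit_to_planned_rate(frame: bytes, planned_bits: int, steps: int = 16):
    """Raise the pattern density until the generated bits reach the planned bits."""
    if generated_bits(frame) >= planned_bits:
        return frame, generated_bits(frame)
    candidate = frame
    for step in range(1, steps + 1):
        candidate = add_pattern(frame, step / steps)
        if generated_bits(candidate) >= planned_bits:
            break
    return candidate, generated_bits(candidate)


# Hypothetical monotonous frame: 64 x 64 pixels, all the same gray level.
flat_frame = bytes([128]) * (64 * 64)
planned = 3200  # assumed planned bit count for one frame
before = generated_bits(flat_frame)
adjusted, after = fit_to_planned_rate(flat_frame, planned)

print(before < planned)  # True
print(after >= planned)  # True
```

Because the pattern is part of the picture itself, `adjusted` is an ordinary frame: the expansion side decodes it with the unmodified algorithm and has nothing to strip out, in contrast to the pseudo data bit string of the prior art.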