1. Field of the Invention
The present invention generally relates to compression techniques and more particularly to a method of determining an optimum quantization number for the compression of video data.
2. Description of the Related Art
There are many standards for the compression and decompression of video data. Some of the standards are the DV-IEC 61834 (“IEC-61834”), DVCPRO 25 and DVCPRO 50 standards. Each of these standards specifies a fixed amount of data for one video frame. Under these standards, as can be seen in FIG. 1, the video frame consists of five layers of structure: (1) micro blocks, (2) macro blocks, (3) DIF blocks, (4) video segments, and (5) DIF sequences.
For the DV-IEC 61834 and the DVCPRO 25 standards, there are a total of ten digital interface format (DIF) sequences for NTSC and 12 DIF sequences for PAL in one video frame. For the DVCPRO 50 standard, there are twice as many DIF sequences (e.g., 20 DIF sequences for NTSC and 24 DIF sequences for PAL). Referring to FIG. 2, there are at least 135 DIF blocks for video data in each DIF sequence. Each video segment of the DIF sequence contains five DIF blocks. Under the DV-IEC 61834, DVCPRO 25 and DVCPRO 50 standards, each DIF block in the video segment has 77 bytes allocated for one macro block. Referring to FIG. 3B, for the DV-IEC 61834 and the DVCPRO 25 standard, each macro block contains six micro blocks—four blocks for Y data, one block for U data, and one block for V data. As can be seen in FIG. 3A, for the DVCPRO 50 standard, each macro block contains four micro blocks—two blocks for Y data, one block for U data, and one block for V data.
Referring to FIG. 4, the structure of the video segment for all the standards is shown. Each of the standards specifies that a video segment has a fixed space of 77×5=385 bytes for compressed video data. An end-of-block (EOB) code is defined for a macro block that occupies fewer than 77 bytes in its DIF block. The remaining space can then be used by any of the other four macro blocks in the same video segment.
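The layout figures given above can be checked with a short calculation. The following is a minimal sketch in Python; the function and table names are illustrative assumptions, and it assumes exactly 135 video DIF blocks per DIF sequence (the text states "at least 135").

```python
# Layout constants taken from the description above.
BYTES_PER_DIF_BLOCK = 77             # bytes allocated to one macro block
DIF_BLOCKS_PER_SEGMENT = 5           # DIF blocks in one video segment
VIDEO_DIF_BLOCKS_PER_SEQUENCE = 135  # assumption: exactly 135 video DIF blocks

# DIF sequences per video frame, keyed by (standard, television system).
DIF_SEQUENCES = {
    ("DV-IEC 61834", "NTSC"): 10,
    ("DV-IEC 61834", "PAL"): 12,
    ("DVCPRO 25", "NTSC"): 10,
    ("DVCPRO 25", "PAL"): 12,
    ("DVCPRO 50", "NTSC"): 20,
    ("DVCPRO 50", "PAL"): 24,
}


def segment_capacity_bytes():
    """Fixed space available for compressed data in one video segment."""
    return BYTES_PER_DIF_BLOCK * DIF_BLOCKS_PER_SEGMENT


def segments_per_frame(standard, system):
    """Number of video segments in one video frame."""
    segments_per_sequence = VIDEO_DIF_BLOCKS_PER_SEQUENCE // DIF_BLOCKS_PER_SEGMENT
    return DIF_SEQUENCES[(standard, system)] * segments_per_sequence


if __name__ == "__main__":
    print(segment_capacity_bytes())
    for key in DIF_SEQUENCES:
        print(key, segments_per_frame(*key))
```

Under these assumptions each DIF sequence holds 27 video segments, so the number of video segments per frame follows directly from the number of DIF sequences.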
For each of the standards, every macro block in the video segment can be assigned an individual quantization step, which governs how much data is preserved for that particular macro block. The quantization step is determined by three factors: (1) the class number; (2) the area number; and (3) the quantization number (QNO). The class number and the area number are defined in the DV-IEC 61834, DVCPRO 25, and DVCPRO 50 specifications. The QNO, however, can vary depending on the amount of compression desired. The QNO is chosen from 16 values for each of the five macro blocks. The larger the value of the QNO, the finer the quantization step is and hence the more bits are needed for the encoding. It is desired to choose a value of the QNO for each of the five macro blocks such that the maximum amount of information is preserved without exceeding the 385 byte capacity limit of the video segment. By choosing the optimum value of the QNO, both the best video frame rate and the best image quality are achieved.
For the DV-IEC 61834 and DVCPRO 25 standards, the video frame consists of 270 video segments for NTSC and 324 video segments for PAL. For the DVCPRO 50 standard, the video frame consists of 540 video segments for NTSC and 648 video segments for PAL. The video frame rate is 30 frames per second for NTSC and 25 frames per second for PAL. Accordingly, for the DV-IEC 61834 and DVCPRO 25 standards, the processing time available for computing the DCT (Discrete Cosine Transform) and searching the QNO for a video segment is approximately 123.4 μsec. For the DVCPRO 50 standard, parallel computation in two groups of 10 DIF sequences for NTSC and 12 DIF sequences for PAL is used to maintain the same 123.4 μsec processing time for a video segment. The DCT computation is well known for compressing the video data and usually only takes about ten percent of the 123.4 μsec computing time (i.e., about 12.34 μsec). The remaining time (i.e., 101 μsec) can be allocated for searching the QNO.
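The per-segment processing time quoted above follows from the frame rate and the number of video segments per frame. A minimal sketch of that arithmetic is given below; the function name is an illustrative assumption, and the nominal frame rates of 30 and 25 frames per second are taken from the text.

```python
def segment_time_budget_us(frames_per_second, segments_per_frame):
    """Time available to process one video segment, in microseconds."""
    return 1e6 / (frames_per_second * segments_per_frame)


if __name__ == "__main__":
    # DV-IEC 61834 and DVCPRO 25
    print(segment_time_budget_us(30, 270))       # NTSC
    print(segment_time_budget_us(25, 324))       # PAL
    # DVCPRO 50: two parallel groups, each handling half of the segments
    print(segment_time_budget_us(30, 540 // 2))  # NTSC
    print(segment_time_budget_us(25, 648 // 2))  # PAL
```

NTSC and PAL yield the same budget because 30×270 and 25×324 both equal 8,100 video segments per second.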
One method of determining the optimum QNO for each of the five macro blocks in the video segment is to do a full search of all the combinations that are possible. Referring to FIG. 5, in step 501, all of the combinations of the QNO are generated. It will be recognized that for five QNO's each having 16 values, the total possible number of combinations for the QNO's is 16^5=1,048,576 combinations. In this regard, 1,048,576 combinations must be examined in order to preserve the maximum video data while the capacity (i.e., 385 bytes) is not exceeded.
In step 502, each combination of QNO's is compared to the current best combination to determine if the resulting length of the video segment is closer to the maximum value of 385 bytes without exceeding it. If the combination is better than the current best combination, then in step 504, the best combination is replaced with the newly discovered combination. However, if the combination is not better than the best combination, then that combination is discarded in step 506.
If not all of the combinations have been tried, then in step 508, the process returns to step 501, where another of the generated combinations is compared to the best combination. However, if all of the combinations have been tried, then the process is finished because the best combination for the QNO's has been found.
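The full search of FIG. 5 can be sketched as follows. This is only an illustration under stated assumptions: the encoded length of a macro block at a given QNO depends on the actual coefficient coding, which is not described here, so the sketch takes a hypothetical lookup table `size_table[block][qno]` of byte counts as input. The names `full_search` and `size_table` are not taken from any of the standards.

```python
from itertools import product

SEGMENT_CAPACITY = 385  # bytes per video segment, from the description


def full_search(size_table, capacity=SEGMENT_CAPACITY):
    """Try every QNO combination and keep the one whose total length
    is largest without exceeding the capacity.

    size_table[b][q] is a hypothetical byte count for macro block b
    encoded with QNO q; a real encoder would measure this instead.
    """
    best_combo, best_length = None, -1
    qno_ranges = [range(len(sizes)) for sizes in size_table]
    for combo in product(*qno_ranges):                      # step 501
        length = sum(size_table[b][q] for b, q in enumerate(combo))
        if length <= capacity and length > best_length:     # step 502
            best_combo, best_length = combo, length         # step 504
        # otherwise the combination is discarded (step 506)
    return best_combo, best_length                          # all tried (step 508)


if __name__ == "__main__":
    # Toy table with only 4 QNO values per block to keep the run short.
    toy_table = [[40, 60, 80, 100] for _ in range(5)]
    print(full_search(toy_table))
    print(16 ** 5)  # combinations with the real 16 QNO values
```

With 16 QNO values per macro block the same loop visits 1,048,576 combinations, which is the cost discussed in the text.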
A drawback to the method shown in FIG. 5 is that the time to search all 1,048,576 combinations is long. Only a supercomputer would be able to search all combinations in the 101 μsec allocated for QNO searching. It would not be possible for a desktop computer or other consumer product to search all of the QNO combinations in the allocated time.
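To put the drawback in numbers, the throughput required of a full search can be estimated from the figures given above. The sketch below is a back-of-the-envelope calculation only; the function name is an illustrative assumption.

```python
QNO_VALUES = 16               # QNO choices per macro block
MACRO_BLOCKS_PER_SEGMENT = 5  # macro blocks per video segment
SEARCH_BUDGET_US = 101.0      # time allocated for QNO searching, from the text


def required_rate_per_second(combinations, budget_us):
    """Combinations per second that must be evaluated to meet the budget."""
    return combinations / (budget_us * 1e-6)


if __name__ == "__main__":
    combos = QNO_VALUES ** MACRO_BLOCKS_PER_SEGMENT
    print(combos)
    print(required_rate_per_second(combos, SEARCH_BUDGET_US))
```

On these figures a full search would have to evaluate on the order of ten billion combinations per second, i.e., roughly a tenth of a nanosecond per combination.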
Another method of determining the QNO's is to search only pre-selected combinations. Referring to FIG. 6, in step 601, only five pre-selected QNO's out of sixteen are generated. Accordingly, only 5^5=3125 combinations need to be examined.
In step 602, the combination is compared to the current best combination to determine if the resulting length of the video segment is closer to 385 bytes without exceeding it. If the combination is not better than the current best, then the combination is discarded in step 606. However, if the combination is better than the current best, the best combination is replaced with this combination in step 604. If not all of the combinations have been tried, then in step 608 the process returns to step 601, whereby another combination will be compared. However, if all of the 3125 combinations have been tried, then the process is finished. The average search time available per combination of QNO's is about 32 ns, which is within the capability of the processor inside a desktop computer.
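The pre-selected search of FIG. 6 can be sketched in the same manner. The text does not state which five QNO's are pre-selected, so the tuple `PRESELECTED_QNOS` below is purely hypothetical, as is the lookup table `size_table[block][qno]` of encoded byte counts. The helper `time_per_combination_ns` checks the quoted search time per combination against the 101 μsec budget.

```python
from itertools import product

SEGMENT_CAPACITY = 385                 # bytes per video segment
PRESELECTED_QNOS = (0, 4, 8, 12, 15)   # hypothetical choice of five QNO's


def preselected_search(size_table, candidates=PRESELECTED_QNOS,
                       capacity=SEGMENT_CAPACITY):
    """Search only combinations built from the pre-selected QNO's.

    size_table[b][q] is a hypothetical byte count for macro block b
    encoded with QNO q.
    """
    best_combo, best_length = None, -1
    for combo in product(candidates, repeat=len(size_table)):  # step 601
        length = sum(size_table[b][q] for b, q in enumerate(combo))
        if length <= capacity and length > best_length:        # step 602
            best_combo, best_length = combo, length            # step 604
        # otherwise the combination is discarded (step 606)
    return best_combo, best_length


def time_per_combination_ns(budget_us, combinations):
    """Average time available per combination, in nanoseconds."""
    return budget_us * 1000.0 / combinations


if __name__ == "__main__":
    # Toy model: every macro block needs 30 + 4*QNO bytes.
    toy_table = [[30 + 4 * q for q in range(16)] for _ in range(5)]
    print(preselected_search(toy_table))
    print(time_per_combination_ns(101.0, 5 ** 5))
```

Because only five of the sixteen QNO values are ever tried for each macro block, any combination that uses one of the other eleven values is never considered.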
The QNO search methods described above have several problems. First, they do not consider the differences in complexity between the macro blocks of the video segment. The QNO's are chosen solely to yield the greatest total length without exceeding the prescribed limit. As a result, it is possible that macro blocks with large AC coefficients from the DCT process are given little space while most of the space is reserved for macro blocks with small AC coefficients. Accordingly, the quality of the compressed image deteriorates.
The second problem is the generation of mosquito noise. When a frozen pattern of video is encoded using the DV-IEC 61834, DVCPRO 25 or DVCPRO 50 standard, the digitized video data will have a small variation from frame to frame due to the noise and uncertainty introduced by the analog-to-digital (A/D) converter. Ideally, the same five QNO's should be chosen for a given video segment in every video frame. However, because the video data varies slightly from frame to frame, a different combination of the five QNO's will be assigned using the above-mentioned search techniques. Accordingly, the positions of the more heavily and less heavily compressed macro blocks, which are usually along sharp edges, will change between video frames, thereby causing noticeable distortion of the video image (i.e., mosquito noise).
In addition to the foregoing, the method using five pre-selected QNO's shown in FIG. 6 has a further drawback. Specifically, because the five pre-selected QNO's are a subset of the sixteen QNO's, the flexibility in choosing the best combination is limited. As such, the video encoded by this method will have lower image quality than the video encoded by the full search method.
The present invention addresses the above-mentioned deficiencies in the prior art methods of determining the quantization number for compression by providing a method whereby the optimum quantization number is chosen. In this regard, the present invention provides an adaptive bit rate allocation that only needs to search a limited number of combinations, but can allocate space according to the complexity of each macro block. Accordingly, this method utilizes all 16 QNO's without compromise to deliver the best quality available.