1. Field of the Invention
This invention relates to a coding apparatus and coding method for efficiently compressing and coding image data, especially dynamic image data.
2. Description of the Related Art
When an image signal picked up by a solid-state image pickup device such as a CCD is recorded as digital data in a storage medium such as a memory card, magnetic disk, or magnetic tape, the amount of data becomes enormous. Therefore, to record many frame images within a limited recording capacity, the obtained image signal data must be compressed in some way.
For example, a digital electronic still camera stores photographed images as digital data in a data storage medium such as a memory card or magnetic disk instead of on silver-halide film. Therefore, the number of image frames that can be recorded on one memory card or magnetic disk is specified, and recording of that number of frames must be assured.
Also, a digital VTR (video tape recorder) must be able to record the specified number of frames independently of the amount of image data per frame. That is, it is necessary to reliably record still or dynamic images equivalent to the required number of frames.
Moreover, the time required to record and reproduce data should be short and constant. For dynamic images in particular, this is also important in order to prevent the next frame from being dropped.
A coding method that combines orthogonal transform coding with entropy coding is widely used as an image data compression method meeting the above conditions.
The following outlines the method studied in the international standardization of still image coding, which is a typical example of such a method.
In this method, image data for one frame is first divided into blocks of a specified size, and a two-dimensional DCT (discrete cosine transform) is applied to each block as the orthogonal transform. Linear quantization is then executed according to each frequency component, and Huffman coding is applied to the quantized values as the entropy (information content per message) coding. For the direct-current component, the differential value from the direct-current component of the adjoining block is Huffman-coded.
For the alternating-current components, the coefficients are scanned from low to high frequency, which is called zigzag scan, and Huffman coding is applied to each combination of the number of consecutive invalid components (value of 0) and the value of the valid component that follows. This is the basic part of this system.
With this basic part alone, however, the number of codes is not constant from frame to frame, because Huffman coding, which is an entropy coding, is used.
Therefore, the following method has been proposed to control the number of codes. In this method, said basic part is first processed and the total number of codes generated for the entire screen is obtained. The optimum quantization width for the DCT coefficients, that is, the width that brings the number of codes close to the target number of codes, is then estimated from this total number of codes and the target number of codes. Next, the processing of said basic part is repeated, beginning with quantization, using the estimated quantization width.
The optimum quantization width is then estimated again from the total number of codes generated this time, the total number of codes generated previously, and the target number of codes. If the estimated quantization width coincides with the previous quantization width and the total number of codes generated this time is smaller than the target number of codes, processing ends and the codes are output. If not, processing is repeated using the new quantization width.
The above operation is described concretely below with reference to FIG. 12. Image data for one frame (the image for one frame proposed in the international standardization plan consists of 720 × 576 pixels) is divided into blocks of a specified size (e.g. blocks A, B, C, etc., each consisting of 8×8 pixels) as shown in (a), and a two-dimensional DCT (discrete cosine transform) is applied to each divided block as the orthogonal transform as shown in (b), the results being stored sequentially in an 8×8 matrix memory. When image data is viewed as a two-dimensional plane, it has spatial frequency, which is frequency information based on the distribution of its density information.
Therefore, image data is converted into a direct-current component DC and alternating-current components AC by executing said DCT. Data showing the value of the direct-current component DC is stored at the origin, or position (0, 0), of the 8×8 matrix memory; data showing the maximum frequency of the alternating-current component AC in the horizontal-axis direction is stored at position (0, 7); data showing the maximum frequency of the alternating-current component AC in the vertical-axis direction is stored at position (7, 0); and data showing the maximum frequency of the alternating-current component AC in the diagonal direction is stored at position (7, 7). At the intermediate positions, frequency data in the direction given by each coordinate position is stored so that the data appears in order from the lowest frequency, starting at the origin side.
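As an illustration only, the coefficient layout described above can be sketched in Python as follows. The function name dct_2d and the orthonormal scaling are assumptions made for this sketch and are not taken from the standardization plan. The indexing follows the text: position (row, column) corresponds to (vertical frequency, horizontal frequency), so (0, 0) holds the DC component.

```python
import math

N = 8  # block size used in the text


def dct_2d(block):
    """Return the 2-D DCT-II of an N x N block (orthonormal scaling, assumed)."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):        # row index: vertical frequency
        for v in range(N):    # column index: horizontal frequency
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out


# A flat block has no spatial variation, so only the DC position (0, 0)
# receives a non-zero value; every AC position is zero up to rounding error.
flat = [[100] * N for _ in range(N)]
coef = dct_2d(flat)
```

With this scaling, the DC value of a flat block equals eight times the pixel value. A block that varies only in the horizontal direction would place all of its energy in row 0 of the matrix.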
Linear quantization is then executed (c) according to each frequency component: the data stored at each coordinate position in this matrix is divided by the quantization width for that frequency component, which is obtained by multiplying the specified quantization matrix by the quantization width coefficient α. Huffman coding is applied to the quantized values as the entropy coding. In this case, for the direct-current component DC, the differential value from the direct-current component of the adjoining block is expressed by a group number (the number of additional bits) and additional bits; the group number is Huffman-coded, and the obtained code word and the additional bits are combined to generate coded data (d1, d2, e1, e2).
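The quantization step and the group-number representation of the DC differential can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the initial DC predictor of 0 is assumed, and the convention used for the additional bits of negative values (adding 2^g - 1) is the one commonly used in JPEG-style coders, which the text above does not specify.

```python
def quantize(coef, qmatrix, alpha):
    """Divide each coefficient by (qmatrix entry * alpha) and round.

    Python's round() resolves exact ties to the even integer.
    """
    n = len(coef)
    return [[int(round(coef[i][j] / (qmatrix[i][j] * alpha)))
             for j in range(n)] for i in range(n)]


def group_number(value):
    """Number of additional bits: 0 for 0, 1 for +-1, 2 for +-2..3, and so on."""
    return abs(value).bit_length()


def additional_bits(value):
    """Additional bits as a bit string (negative-value convention is assumed)."""
    g = group_number(value)
    if g == 0:
        return ""
    if value < 0:
        value += (1 << g) - 1
    return format(value, "0{}b".format(g))


def dc_differences(dc_values):
    """Differential DC values between adjoining blocks (first predictor assumed 0)."""
    prev = 0
    diffs = []
    for dc in dc_values:
        diffs.append(dc - prev)
        prev = dc
    return diffs


# Hypothetical quantized DC values of three adjoining blocks.
for d in dc_differences([50, 52, 47]):
    print(d, group_number(d), additional_bits(d))
```

Only the group number would be passed to the Huffman coder; the additional bits are appended to the resulting code word unchanged. A larger α enlarges every quantization width, so more coefficients round to 0.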
For the alternating-current component AC as well, each valid coefficient (a value other than 0) is expressed by a group number and additional bits.
Accordingly, for the alternating-current component AC, the coefficients are scanned from low to high frequency, which is called zigzag scan, and Huffman coding is applied to each combination of the number of consecutive invalid components (the run length of zeros) and the group number of the valid component that follows. Coded data is generated by combining the obtained code word with the additional bits.
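The zigzag scan and the pairing of zero runs with group numbers can be sketched as follows. The function names are hypothetical. The "EOB" marker for trailing zeros is an assumption borrowed from JPEG-style coders, and no limit on the run length is applied here.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block from low to high frequency."""
    order = []
    for s in range(2 * n - 1):   # s = row + col selects one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        diag = [(r, s - r) for r in rows]
        # Even diagonals are walked from bottom-left to top-right,
        # odd diagonals from top-right to bottom-left.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order


def run_group_pairs(qblock):
    """Return (zero_run, group_number, value) for each valid AC coefficient."""
    pairs = []
    run = 0
    for r, c in zigzag_order(len(qblock))[1:]:   # skip the DC position (0, 0)
        v = qblock[r][c]
        if v == 0:
            run += 1
        else:
            pairs.append((run, abs(v).bit_length(), v))
            run = 0
    if run > 0:
        pairs.append("EOB")   # trailing zeros collapsed into one marker (assumed)
    return pairs


# Hypothetical quantized block: DC = 5 and two valid AC coefficients.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[2][0] = 5, -3, 2
print(run_group_pairs(block))
```

Each (run, group number) combination would form one symbol for the Huffman coder, and the additional bits of the value follow the code word.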
In Huffman coding, bits are assigned to the data for said direct-current component DC and alternating-current component AC of each frame image so that the fewest bits are given to the data generated most frequently in the data distribution and the most bits are given to the data generated most rarely.
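The bit-assignment principle can be illustrated with a generic Huffman construction. This is a minimal sketch and not the code table of the standardization plan; the function name and the sample symbol list are hypothetical.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code of the input."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): 1}
    # Heap entries are (weight, tie_breaker, {symbol: depth}); the unique
    # tie_breaker keeps Python from ever comparing the dictionaries.
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)   # the two least frequent subtrees
        w2, _, b = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in a.items()}
        merged.update({s: d + 1 for s, d in b.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


# Hypothetical symbol stream: 'a' occurs most often, 'd' least often.
lengths = huffman_code_lengths("aaaaaaaabbbbccd")
print(lengths)
```

Because the code lengths follow the symbol statistics, the total number of bits depends on the image content, which is the reason the number of codes per frame is not constant.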
This is the basic part of this system.
With this basic part alone, however, the number of codes is not constant from image to image, because Huffman coding, which is an entropy coding, is used. Therefore, the following processing is used to control the number of codes.
First, said basic part is processed using a temporary quantization width coefficient α and, at the same time, the total number of codes (total number of bits) generated for the entire screen of one frame is obtained (g).
The optimum quantization width coefficient α for the DCT coefficients, that is, the coefficient that brings the number of codes close to the target number of codes, is estimated from said total number of codes, the target number of codes, and the temporary quantization width coefficient by means of Newton-Raphson iteration (h).
Secondly, the processing of said basic part is repeated, beginning with quantization, using the estimated quantization width coefficient α (i).
Thirdly, the optimum quantization width coefficient α is estimated again from the total number of codes generated this time, the total number of codes generated previously, the target number of codes, the quantization width coefficient α used this time, and the quantization width coefficient α used previously. If the estimated quantization width coefficient α coincides with the previous quantization width coefficient α and the total number of codes generated this time is smaller than the target number of codes, processing ends and the coded data generated this time is output and stored in a memory card (f). If not, the quantization width coefficient α is renewed and processing is repeated using the new quantization width coefficient α.
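The repeated-pass control of the number of codes can be sketched as follows. This is a sketch under stated assumptions and not the procedure of the standardization plan: the text names Newton-Raphson iteration but gives no formula, so a secant step through the current and previous (α, number of codes) pairs is used as a stand-in. The first estimate assumes that the number of codes is inversely proportional to α. The callable encode_bits, the tolerance tol, the pass limit max_passes, and the toy model at the end are all hypothetical. The sketch accepts a pass whose number of codes does not exceed the target (purposed) number of codes.

```python
def control_code_amount(encode_bits, target_bits, alpha0=1.0,
                        tol=1e-3, max_passes=20):
    """Repeat the basic part until the estimate of alpha settles.

    encode_bits(alpha) is a hypothetical callable that runs quantization and
    entropy coding for one frame and returns the total number of bits.
    Returns (alpha, bits, passes).
    """
    alpha_prev, bits_prev = alpha0, encode_bits(alpha0)
    best = (alpha_prev, bits_prev) if bits_prev <= target_bits else None
    # First estimate: bits assumed inversely proportional to alpha.
    alpha = alpha_prev * bits_prev / target_bits
    for passes in range(2, max_passes + 1):
        bits = encode_bits(alpha)
        if bits <= target_bits and (best is None or bits > best[1]):
            best = (alpha, bits)
        # Re-estimate from this pass and the previous pass (secant step).
        if bits != bits_prev and alpha != alpha_prev:
            slope = (bits - bits_prev) / (alpha - alpha_prev)
            estimate = alpha + (target_bits - bits) / slope
        else:
            estimate = alpha
        estimate = max(estimate, 0.5 * alpha)   # keep alpha positive
        if abs(estimate - alpha) <= tol * alpha:
            if bits <= target_bits:
                return alpha, bits, passes
            # Estimate settled but the frame is still too large: widen slightly.
            estimate = max(estimate, alpha) * (1.0 + tol / 2)
        alpha_prev, bits_prev, alpha = alpha, bits, estimate
    if best is not None:
        return best[0], best[1], max_passes
    return alpha_prev, bits_prev, max_passes


# Toy stand-in for one coding pass; a real pass would quantize and code a frame.
def toy_encode_bits(alpha):
    return int(400000 / alpha ** 1.5)


alpha, bits, passes = control_code_amount(toy_encode_bits, 100000)
print(alpha, bits, passes)
```

The number of passes returned by such a loop depends on how the number of codes responds to α for the particular image, so the processing time cannot be known in advance.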
As mentioned above, for a digital electronic still camera, for example, the number of images that can be recorded on a memory card, magnetic disk, or magnetic tape must be assured. Therefore, image data is compressed before it is recorded. In view of operability, however, the processing time should be as short as possible and constant.
It is also desired that image data be compressed efficiently. These requirements apply not only to digital electronic still cameras but also to other applications.
The system of said international standardization plan is one of the compression methods meeting the above requirements. In this system, image data can be compressed efficiently by combining orthogonal transform coding with entropy coding for each block, as shown in the example of said basic part. However, because entropy coding is used, there is the disadvantage that the number of codes depends on the image, so that the number of images that can be recorded on a memory card or magnetic disk is indeterminate.
The method of controlling the number of codes shown in the prior-art example also has a disadvantage: because the number of repeated passes of the basic part depends on the image, the processing time is not only indeterminate but also generally longer. This disadvantage is especially fatal for a system handling dynamic images.