1. Field of the Invention
The present invention relates to a coding apparatus and a coding method of executing high-efficiency coding to compress image data.
2. Description of the Related Art
Image signals picked up by a solid-state image pickup apparatus, typified by a CCD (Charge Coupled Device), involve a vast amount of data. To store them as digital data in a memory device such as a memory card or a magnetic disk, the obtained image data must therefore be subjected to some sort of compression so that many frame images fit within a limited recording capacity. For example, since a digital electronic still camera stores picked-up images as digital data in a data storage medium such as a memory card or a magnetic disk in place of a silver salt film, the number of images recordable on a single memory card or magnetic disk drive must be assured.
Similarly, a digital VTR (video tape recorder), for example, is required to record a predetermined number of frames without being influenced by a data amount of images per frame. That is, image data of a necessary number of frames must be recorded regardless of whether the image is a still image or a motion image.
As an image data compressing method which satisfies these conditions, a coding method which is a combination of orthogonal transform coding and entropy coding is well known.
As a typical method of this type, a system currently being studied in the international standardization of a still image will be briefly described below.
In this system, image data is divided into blocks having a predetermined size, and two-dimensional DCT (Discrete Cosine Transform) is executed as orthogonal transform for each of the divided blocks. Subsequently, linear quantization corresponding to each frequency component is executed, and Huffman coding is executed as entropy (information amount per unit message) coding for the quantized value. At this time, a differential value between a DC component of one block and that of a nearby block is Huffman-coded. An AC component is subjected to so-called zigzag scanning from a low to high frequency component, and two-dimensional Huffman coding is executed in accordance with the number of consecutive invalid components (values being zero) and the value of the subsequent valid component. The processing described above is a basic portion of this system.
With this basic portion alone, however, no constant amount of codes can be obtained for each image because the Huffman coding as entropy coding is used.
The following system has therefore been proposed as a method of controlling the amount of codes. First, the processing of the above basic portion is executed and at the same time a total amount of codes generated on the entire screen is obtained. On the basis of this total amount of codes and a target amount of codes, a quantization width for the DCT coefficients that is optimal for making the amount of codes approach the target amount of codes is predicted. The processing following the quantization of the basic portion is repeated by using the predicted quantization width. Subsequently, on the basis of a total amount of presently generated codes, the total amount of previously generated codes, and the target amount of codes, the optimal quantization width for making the amount of codes approach the target amount of codes is predicted again. If the predicted quantization width coincides with the previous quantization width and the total amount of presently generated codes is smaller than the target amount of codes, the processing is ended and codes are output. Otherwise, the above processing is repeated by using a new quantization width.
The above operation will be described in more detail below with reference to FIG. 1. First, as indicated by (a) in FIG. 1, one frame of image data (one frame of an image specified by the international standardization proposal is 720 × 576 pixels) is divided into blocks having a predetermined size (e.g., blocks A, B, C, . . . each consisting of 8 × 8 pixels). Subsequently, as indicated by (b) in FIG. 1, two-dimensional DCT (Discrete Cosine Transform) is executed as orthogonal transform for each of the divided blocks, and the resultant data is sequentially stored in an 8 × 8 matrix memory. Viewed two-dimensionally, the image data has a spatial frequency, i.e., frequency information based on a distribution of density data. When the DCT is executed as described above, therefore, the image data is transformed into a DC component DC and an AC component AC, and data indicating the DC component DC, a maximum frequency value of the AC component AC in the horizontal axial direction, a maximum frequency value of the AC component AC in the vertical axial direction, and a maximum frequency value of the AC component AC in the oblique direction are stored in positions of the origin (0, 0), (0, 7), (7, 0), and (7, 7), respectively. At intermediate positions, frequency data in the direction corresponding to each coordinate point is stored such that data with lower frequency appears closer to the origin.
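The block division and two-dimensional DCT described above can be illustrated with a minimal sketch. It assumes an orthonormal DCT-II, an image stored as a list of rows, and image dimensions that are multiples of the block size; the function names `split_into_blocks` and `dct_2d` are hypothetical and are not taken from the standardization proposal.

```python
import math

N = 8  # block size taken from the text (8 x 8 pixels)

def split_into_blocks(image):
    """Yield N x N blocks in raster order.

    Assumption: the image is a list of rows whose height and width
    are multiples of N.
    """
    height, width = len(image), len(image[0])
    for top in range(0, height, N):
        for left in range(0, width, N):
            yield [row[left:left + N] for row in image[top:top + N]]

def dct_2d(block):
    """Orthonormal two-dimensional DCT-II of one N x N block.

    out[0][0] is the DC component; higher indices hold AC components
    of increasing spatial frequency.
    """
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            acc = 0.0
            for x in range(N):
                for y in range(N):
                    acc += (block[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * acc
    return out
```

With this normalization, a block of uniform density yields a single non-zero DC component and AC components that are zero up to rounding error.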
Subsequently, the data stored at each coordinate point is divided by a quantization width for the corresponding frequency component, obtained by multiplying a predetermined quantization matrix by a quantization width coefficient α, thereby performing linear quantization (c). This quantized value is subjected to the Huffman coding as entropy coding. In this coding, a differential value between the DC component DC of one block and that of a nearby block is expressed by a group number (the number of added bits) and added bits, and the group number is Huffman-coded. The obtained coded words in combination with the added bits are taken as coding data (d1, d2, e1, and e2).
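As an illustration of the linear quantization and of the group number and added bits of a DC differential value, the following sketch may help. The function names are hypothetical. The treatment of negative values in the added bits follows the common baseline JPEG convention, which is an assumption here because the text does not specify it, and rounding uses Python's built-in `round`.

```python
def quantize(coeffs, qmatrix, alpha):
    """Divide each coefficient by (matrix entry * alpha) and round."""
    return [[round(c / (q * alpha)) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, qmatrix)]

def dc_differences(dc_values):
    """Difference between each block's DC value and the previous block's.

    Assumption: the predictor for the first block is 0.
    """
    diffs, previous = [], 0
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs

def group_and_bits(value):
    """Return (group number, added bits) for a quantized value.

    The group number equals the number of added bits. Negative values
    are offset by 2**group - 1 (assumed convention, see lead-in).
    """
    group = abs(value).bit_length()
    if group == 0:
        return 0, ""
    bits = value if value > 0 else value + (1 << group) - 1
    return group, format(bits, "0{}b".format(group))
```

Only the group number would be Huffman-coded; the added bits are appended to the coded word unchanged.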
Valid coefficients (values being not "0") of the AC component AC are also expressed by a group number and added bits.
The AC component AC, therefore, is subjected to the so-called zigzag scanning for scanning data from a lower to higher frequency component, the two-dimensional Huffman coding is executed on the basis of the number of consecutive invalid components (values being "0"), i.e., the number of zero runs and the group number of the subsequent valid component, and the obtained coded words and added bits are taken as coding data.
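The zigzag scan and the pairing of zero runs with the group number of the next valid coefficient can be sketched as follows. This is a minimal illustration with hypothetical function names. Trailing zeros after the last valid coefficient are simply not emitted here, whereas an actual coder would have to signal the end of the block in some way.

```python
def zigzag_order(n=8):
    """Coordinates (row, col) of an n x n block in zigzag order,
    running from the lowest to the highest frequency component."""
    def key(pos):
        row, col = pos
        diagonal = row + col
        # Odd diagonals are walked with increasing row, even ones with
        # decreasing row.
        return (diagonal, row if diagonal % 2 else -row)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)

def run_length_ac(block):
    """Return (zero run, group number, value) for each valid AC coefficient."""
    order = zigzag_order(len(block))
    symbols, run = [], 0
    for row, col in order[1:]:  # skip the DC component at (0, 0)
        value = block[row][col]
        if value == 0:
            run += 1
        else:
            symbols.append((run, abs(value).bit_length(), value))
            run = 0
    return symbols
```

Each (zero run, group number) pair would then be looked up in a two-dimensional Huffman table, with the added bits of the value appended.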
The Huffman coding is executed such that a peak frequency of occurrence in a data distribution of each of the DC and AC components DC and AC per frame image is taken as the center, and coded words are obtained by coding data in accordance with bit assignment in which the closer the data to the center, the fewer the number of bits assigned thereto, and the farther the data from the center, the greater the number of bits assigned thereto.
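The bit assignment described above, in which frequently occurring values receive short coded words, can be illustrated by computing Huffman code lengths from symbol frequencies. This is a generic textbook construction using Python's `heapq` and is only a sketch; it is not the specific table design of the system described here.

```python
import heapq
from collections import Counter

def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {sym: 1 for sym in freq}
    # Heap entries: (count, unique tie-breaker, symbols in this subtree)
    heap = [(count, i, [sym]) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freq, 0)
    next_id = len(heap)
    while len(heap) > 1:
        count1, _, syms1 = heapq.heappop(heap)
        count2, _, syms2 = heapq.heappop(heap)
        # Every symbol below the merged node gets one more bit.
        for sym in syms1 + syms2:
            lengths[sym] += 1
        heapq.heappush(heap, (count1 + count2, next_id, syms1 + syms2))
        next_id += 1
    return lengths
```

For a distribution peaked at one value, the lengths grow as the values become rarer, which corresponds to the "closer to the center, fewer bits" behavior described in the text.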
The above processing is a basic portion of this system.
With this basic portion alone, however, since no constant amount of codes can be obtained for each image due to the use of the Huffman coding as entropy coding, the following processing, for example, is performed as a method of controlling an amount of codes.
First, the processing of the above basic portion is executed by using a temporary quantization width coefficient α and at the same time a total amount of codes (a total number of bits) generated on the entire screen is obtained (g). On the basis of this total amount of codes, a target amount of codes, and the temporary quantization width coefficient α, a quantization width coefficient α for the DCT coefficients that is optimal for making the amount of codes approach the target amount of codes is predicted by using, e.g., linear prediction (h).
The processing following the quantization of the above basic portion is repeated by using the predicted quantization width coefficient α (i). Subsequently, on the basis of a total amount of presently generated codes, a total amount of previously generated codes, the target amount of codes, the presently used quantization width coefficient α, and the previously used quantization width coefficient α, the optimal quantization width coefficient α for making the amount of codes approach the target amount of codes is predicted again. If the predicted quantization width coefficient α coincides with the previous quantization width coefficient α and the total amount of presently generated codes is smaller than the target amount of codes, the processing is ended, and presently generated coded data is output and stored in a memory card (f). Otherwise, the quantization width coefficient α is altered, and the processing is repeated by using this new quantization width coefficient α.
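The iterative control of the amount of codes can be sketched as a loop around a coding pass. Everything specific in this sketch is an assumption made for illustration: `total_bits_for` is a hypothetical callable that runs one full coding pass for a given α and returns the total number of bits, α is restricted to a grid of spacing `step` so that "coincides" is well defined, the first prediction assumes the amount of codes is inversely proportional to α, later predictions draw a straight line through the present and previous passes, and termination accepts a total that does not exceed the target.

```python
import math

def control_code_amount(total_bits_for, target_bits, alpha0=1.0,
                        step=0.01, max_passes=20):
    """Search for a quantization width coefficient meeting a bit budget.

    Returns (alpha, total bits). If max_passes is exhausted, the last
    candidate is returned even though it may exceed the target.
    """
    def to_index(alpha):
        # Round up to the grid so a prediction errs toward fewer bits.
        return max(1, math.ceil(alpha / step - 1e-9))

    k = to_index(alpha0)
    prev_alpha = prev_bits = None
    for _ in range(max_passes):
        alpha = k * step
        bits = total_bits_for(alpha)
        if prev_bits is None or bits == prev_bits:
            # One usable data point: assume bits ~ 1 / alpha.
            predicted = alpha * bits / target_bits
        else:
            # Two data points: linear prediction through both passes.
            slope = (bits - prev_bits) / (alpha - prev_alpha)
            predicted = alpha + (target_bits - bits) / slope
        k_new = to_index(predicted)
        if k_new == k:
            if bits <= target_bits:
                return alpha, bits
            k_new = k + 1  # same alpha predicted but still over budget
        prev_alpha, prev_bits = alpha, bits
        k = k_new
    alpha = k * step
    return alpha, total_bits_for(alpha)
```

Each call to `total_bits_for` stands for one complete pass of quantization and entropy coding over the whole frame, so the number of loop iterations directly determines the processing time. This is the cost that the remainder of the discussion is concerned with.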
As described above, in a digital electronic still camera, for example, since the number of images recordable in a single memory card, magnetic disk drive, or magnetic tape must be secured, image data is compressed before recording. The processing time required for the compression must be as short as possible and constant from the viewpoint of operability. In addition, the image data compression is preferably executed at a high efficiency. These conditions apply, to a greater or lesser degree, not only to digital electronic still cameras but also to other applications.
The above-described system according to the proposal of the international standardization is a compressing method which satisfies the above conditions. In this system, image data is divided into blocks and coded by executing orthogonal transform such as discrete cosine transform, or compression as preprocessing is executed by image information compression represented by predictive coding (DPCM), the compression result is quantized, and the quantized output is coded by variable-length coding represented by the Huffman coding.
Although the image data compressing system using the variable-length coding has a high efficiency, the amount of codes is not known until coding is actually finished, because variable-length coding is used. Therefore, it is difficult to control the amount of codes.
As a method of solving the above problem, the present inventors have proposed the following systems.
In one system, in order to control the amount of generated codes in a compression system using a combination of the DPCM and the variable-length coding, an image signal is sampled and stored in an image memory, and a difference between the sampled signal and a value predicted on the basis of a reference pixel signal which is already coded is calculated to form a differential signal. The differential signal is quantized by a temporary quantization width, and the amount of generated codes is integrated, thereby calculating a total amount of generated codes of an image of one screen. Subsequently, a new quantization width is predicted on the basis of the temporary quantization width, the total amount of generated codes, and a target total amount of codes. The predicted new quantization width is used to execute the DPCM, the quantization, and the variable-length coding, thereby obtaining a total amount of codes. This processing is repeatedly executed to make the total amount of codes approach the target amount of codes, thereby controlling the amount of codes.
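The DPCM stage of this system can be illustrated with a one-dimensional sketch. It assumes the simplest possible predictor, namely the previously reconstructed sample, and an initial prediction of 0; the text only states that the prediction is based on an already coded reference pixel, so both choices and the function names are hypothetical.

```python
def dpcm_quantize(samples, width):
    """Quantize prediction differences with the given quantization width."""
    levels, prediction = [], 0
    for sample in samples:
        level = round((sample - prediction) / width)
        levels.append(level)
        # Track the value the decoder will reconstruct, so that the
        # encoder and decoder predictions stay identical.
        prediction += level * width
    return levels

def dpcm_reconstruct(levels, width):
    """Rebuild the samples from the quantized differences."""
    samples, prediction = [], 0
    for level in levels:
        prediction += level * width
        samples.append(prediction)
    return samples
```

A larger `width` produces smaller quantized differences and hence fewer bits after variable-length coding, at the cost of a larger reconstruction error. This is the quantity that the control loop adjusts.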
In the other method, in order to control the amount of generated codes in a compressing system using a combination of the orthogonal transform and the variable-length coding, a sampled image signal stored in an image memory is divided into blocks, orthogonal transform is executed for each of the divided blocks, and the transformed output is quantized by a temporary quantization width. Thereafter, the quantized output is subjected to variable-length coding, and the amount of generated codes of each block and a total amount of generated codes of the entire image are calculated. Subsequently, a new quantization width is predicted on the basis of the temporary quantization width, the total amount of generated codes, and a target total amount of codes. In addition, an amount of codes assigned to each block is calculated on the basis of the amount of generated codes of each block, the total amount of generated codes, and the target amount of generated codes. The new quantization width is used to repeat the block division of an image signal stored in the image memory, the orthogonal transform, the quantization, and the variable-length coding. If the amount of generated codes exceeds the assigned amount of codes of each block, the variable-length coding is temporarily stopped, and processing of the next block is started. As a result, the amount of codes is controlled such that the total amount of generated codes of the entire image does not exceed the target total amount of generated codes.
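The per-block assignment and the stopping of variable-length coding in this second system can be sketched as follows. The proportional allocation rule is an assumption, since the text only lists the quantities on which the assignment is based, and the function names are hypothetical.

```python
def assign_block_budgets(block_bits, target_total):
    """Share the target amount among blocks in proportion to the bits
    each block generated in the first pass (assumed rule).

    Floor division keeps the sum of the budgets within target_total.
    """
    total = sum(block_bits)
    if total == 0:
        return [0] * len(block_bits)
    return [bits * target_total // total for bits in block_bits]

def code_within_budget(codeword_lengths, budget):
    """Emit coded words in scan order until the next one would exceed
    the block's budget; return (words kept, bits used)."""
    kept, used = 0, 0
    for length in codeword_lengths:
        if used + length > budget:
            break  # stop this block and move on to the next one
        used += length
        kept += 1
    return kept, used
```

Because the coded words of a block are produced in zigzag order, stopping early discards the highest-frequency components first.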
That is, in this system, image data is divided into blocks and compression as preprocessing is executed by an image information compressing method such as coding by orthogonal transform represented by the discrete cosine transform (DCT), or the predictive coding (DPCM), the compression result is quantized, and the quantized output is coded by variable-length coding represented by the Huffman coding. In this system, however, since a total amount of codes cannot be obtained until the coding is finished, it is difficult to compress an amount of codes into an optimal value within a short time period. That is, since trial and error must be repeated before an optimal quantization width coefficient α is determined, the optimal quantization width coefficient α cannot be rapidly determined.
In particular, in order to realize selection of various types of image quality modes such as low- and high-image quality modes in a single coding apparatus, an optimal compression ratio must be obtained for each of the selected image quality modes. In this case, i.e., in a coding apparatus capable of executing processing in correspondence with a plurality of target amounts of codes, a plurality of quantization width coefficients α must be determined. In a case wherein a target amount of codes is variable as described above, therefore, the above system cannot meet the requirement of always determining an optimal quantization width coefficient α within a short time period.
In recent years, users have had various types of needs, e.g., some of them prefer high image quality while others want to record a large number of images even at the sacrifice of the resolution of an image. In order to satisfy these users' needs, therefore, high- and low-image quality modes must be selectively designated. In this case, a target amount of codes per image (per frame) naturally changes, and compression coding must be performed accordingly. The distribution of a spatial frequency of an image, however, varies in accordance with the contents of an image, and a value of the optimal quantization width coefficient α falling within the range of a target amount of codes must be found within a short time period to execute coding. Therefore, the conventional systems, in which trial and error must be repeated, are unsatisfactory in this respect.
In order to increase the number of images recordable on a recording medium having a limited capacity, a method of changing the compression ratio of data is proposed. For example, Published Unexamined Japanese Patent Application No. 63-286078 proposes a method of selectively using a mode of directly recording data and a mode of compressing and recording data, and Published Unexamined Japanese Patent Application No. 1-292987 proposes a method capable of selecting one of a plurality of image quality modes by switching a degree of compression. Image quality is generally degraded when a compression ratio is increased. In this method, therefore, a mode (low-image quality mode) with an emphasis on the number of recordable images and a high-image quality mode with an emphasis on image quality can be selectively set in accordance with a demand of a user or an application.
In these prior arts, however, since compression circuits having compression ratios corresponding to a plurality of image quality modes must be independently provided and selectively used in accordance with an image quality mode, the hardware arrangement becomes complicated, which increases the size and the manufacturing cost of a camera. In addition, in the above prior arts, non-compression and compression modes are switched or one of a plurality of types of fixed compression ratios is selected. Therefore, a compression ratio cannot be set at an arbitrary value, and the number of images recordable on a recording medium having a predetermined capacity cannot be freely set in accordance with a demand of a user. Furthermore, although the high- and low-image quality modes can be selectively set, a target amount of codes per image (per frame) naturally changes in accordance with the selected mode, and compression coding must be executed in accordance with the changed amount. Since, however, the distribution of a spatial frequency of an image varies in accordance with the contents of an image, a data capacity obtained after compression varies in accordance with the spatial frequency distribution if the selected compression ratio is fixed. Therefore, the number of images recordable on a recording medium having a predetermined capacity is indeterminate and cannot be known unless recording is actually executed, resulting in significant inconvenience in operability.