Field of the Invention
The present invention relates to a technique for encoding image data.
Description of the Related Art
Digital devices for recording moving images, such as digital video cameras, are now in widespread use, and in recent years RAW image recording, which was previously applied only to still images, has also been applied to moving image recording. The amount of data required to record RAW images is huge, but advanced users of image capturing apparatuses prefer RAW images because correction and deterioration of the original image can be minimized and the degree of freedom for image editing after shooting is high.
In order to record a RAW moving image to a recording medium such as a memory card, it is necessary to compress and encode the RAW moving image at a certain compression rate, because the recordable duration is limited by the capacity of the recording medium.
Generally, image sensors employ a Bayer array. In the Bayer array, different color components are arranged alternately, and thus correlation between adjacent pixels is low, so the compression efficiency is low if the data is encoded directly. In view of this, the encoding efficiency has been improved by separating image data in the Bayer array into an image plane constituted only by R components, an image plane constituted only by G1 components, an image plane constituted only by G2 components, and an image plane constituted only by B components, and performing encoding for each plane.
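The plane separation described above can be illustrated with a minimal Python sketch. It assumes an RGGB arrangement (even rows R, G1, R, G1, ...; odd rows G2, B, G2, B, ...); the function name `split_bayer_rggb` and the sample values are hypothetical and only serve the illustration.

```python
def split_bayer_rggb(raw):
    """Split a Bayer mosaic (a list of rows) into R, G1, G2 and B planes.

    Assumption: RGGB layout, i.e. even rows hold R and G1 samples and
    odd rows hold G2 and B samples. Other layouts need different offsets.
    """
    r = [row[0::2] for row in raw[0::2]]   # even rows, even columns
    g1 = [row[1::2] for row in raw[0::2]]  # even rows, odd columns
    g2 = [row[0::2] for row in raw[1::2]]  # odd rows, even columns
    b = [row[1::2] for row in raw[1::2]]   # odd rows, odd columns
    return r, g1, g2, b


# Hypothetical 4x4 mosaic: neighbouring samples differ strongly,
# while samples within one plane are close to each other.
raw = [
    [10, 200, 12, 202],
    [198, 50, 201, 52],
    [11, 199, 13, 203],
    [197, 51, 200, 53],
]
r, g1, g2, b = split_bayer_rggb(raw)
print(r)   # [[10, 12], [11, 13]]
print(b)   # [[50, 52], [51, 53]]
```

Each resulting plane holds samples of a single color component, so adjacent values within a plane are more strongly correlated than adjacent values in the mosaic.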
Also, H.264 (H.264/MPEG-4 Part 10: Advanced Video Coding) is known as a conventional representative compression encoding system. In this compression encoding system, a data amount is compressed using time redundancy and space redundancy of the moving image, for each block constituted by a predetermined number of pixels within one frame.
H.264 combines motion detection and motion compensation, which exploit time redundancy, and the Discrete Cosine Transform (DCT), a frequency conversion that exploits space redundancy, with techniques such as quantization and entropy encoding. However, when the compression rate is raised beyond a certain degree, block distortion unique to DCT becomes marked, and image deterioration becomes subjectively conspicuous.
In view of this, JPEG 2000 adopts, as frequency conversion, a technique of decomposing an image into frequency bands called subbands by performing low pass filtering and high pass filtering in the horizontal direction and the vertical direction. In JPEG 2000, the Discrete Wavelet Transform (hereinafter, DWT) is used to perform this frequency conversion. Subband encoding is characterized in that block distortion is unlikely to occur and compression characteristics at the time of high compression are favorable, compared to an encoding technique that uses DCT.
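The subband decomposition can be sketched in Python as follows. Note that this sketch uses the Haar filter pair (pairwise average and difference) purely for brevity; it is not the filter bank of JPEG 2000, and the function names are hypothetical. Here HL denotes the subband that is high pass in the horizontal direction and low pass in the vertical direction, and LH the opposite.

```python
def haar_1d(seq):
    """One-level Haar analysis of an even-length sequence."""
    low = [(seq[i] + seq[i + 1]) / 2 for i in range(0, len(seq), 2)]
    high = [(seq[i] - seq[i + 1]) / 2 for i in range(0, len(seq), 2)]
    return low, high


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def filter_vertically(half):
    """Apply haar_1d along every column; return (low rows, high rows)."""
    low_cols, high_cols = [], []
    for col in transpose(half):
        low, high = haar_1d(col)
        low_cols.append(low)
        high_cols.append(high)
    return transpose(low_cols), transpose(high_cols)


def haar_2d(img):
    """One-level 2D decomposition into LL, HL, LH and HH subbands."""
    h_low, h_high = [], []
    for row in img:                      # horizontal filtering first
        low, high = haar_1d(row)
        h_low.append(low)
        h_high.append(high)
    ll, lh = filter_vertically(h_low)    # horizontally low pass half
    hl, hh = filter_vertically(h_high)   # horizontally high pass half
    return ll, hl, lh, hh


# Hypothetical 4x4 image consisting of vertical stripes.
stripes = [[0, 8, 0, 8] for _ in range(4)]
ll, hl, lh, hh = haar_2d(stripes)
print(hl)  # [[-4.0, -4.0], [-4.0, -4.0]]
print(lh)  # [[0.0, 0.0], [0.0, 0.0]]
```

For the striped input, all of the detail ends up in the horizontal frequency subband HL, while LH and HH are zero.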
In general code amount control, a target code amount of a frame to be encoded next is determined based on information regarding a frame that has been encoded. Code amount control is then performed by performing quantization control in which a quantization parameter Qp used for quantization is changed for each predetermined region of the image in order to converge the amount of generated code to the target code amount per frame. Note that the greater the value of Qp, the more the code amount can be reduced, but also the more the image quality deteriorates; Qp is therefore desirably as small as possible and constant within the screen.
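The quantization control described above can be sketched in Python under loudly stated assumptions: the code amount of a region is modeled as complexity divided by Qp, and Qp is moved by one step per region depending on whether the running code amount is above or below the running target. The model, the update rule, the Qp bounds and all names are hypothetical and are not taken from any standard or from the document.

```python
def control_frame(complexities, frame_target, initial_qp, qp_min=1, qp_max=51):
    """Toy quantization control over the regions of one frame.

    Assumptions (illustrative only): a region with complexity c encoded
    at quantization parameter qp generates c / qp units of code, and the
    frame target is spread evenly over the regions.
    """
    per_region_target = frame_target / len(complexities)
    qp = initial_qp
    generated = 0.0
    qp_trace = []
    for index, complexity in enumerate(complexities, start=1):
        qp_trace.append(qp)               # Qp used for this region
        generated += complexity / qp      # toy code amount model
        expected = per_region_target * index
        if generated > expected:          # over budget: quantize more coarsely
            qp = min(qp + 1, qp_max)
        elif generated < expected:        # under budget: quantize more finely
            qp = max(qp - 1, qp_min)
    return qp_trace, generated


# Hypothetical frame whose last two regions are four times as complex.
qp_trace, generated = control_frame([100, 100, 400, 400], 80, initial_qp=5)
print(qp_trace)  # [5, 5, 5, 6]
```

In this toy run Qp stays constant while the generated code tracks the target and rises once the complex regions push the running total over budget, which is the kind of variation within the screen that the text describes as undesirable.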
In addition, the encoding efficiency can be improved by giving the Qps of the subbands a predetermined ratio; for example, in JPEG 2000, a relational expression in which a quantization parameter is set higher in a higher subband is defined as suggestive quantization. In particular, in subband encoding, the target code amount of the image is distributed to the subbands as subband target code amounts, and quantization control is performed for each of the subbands, which makes it possible to compress the image data to a desired code amount.
However, if a target code amount is set for each subband and Qp is changed for each predetermined region within the subband as described above, an inappropriately set subband target code amount can cause the ratio of the Qps of the subbands to deviate from the predetermined relationship partway through the screen, even if the initial Qps set at the start of encoding of the subbands satisfy that relationship.
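As a rough illustration of these two ideas, the following Python sketch derives per-subband Qps from a base Qp and a fixed ratio, and splits a frame target code amount among the subbands by weight. The ratio values, the weights and the function names are hypothetical placeholders; they are not the relational expression defined in JPEG 2000.

```python
def subband_qps(base_qp, ratios):
    """Initial Qp per subband, keeping a fixed ratio to the base Qp."""
    return {name: base_qp * ratio for name, ratio in ratios.items()}


def distribute_target(frame_target, weights):
    """Split the frame target code amount among subbands by weight."""
    total = sum(weights.values())
    return {name: frame_target * w / total for name, w in weights.items()}


# Hypothetical values: higher subbands get a larger Qp and a smaller target.
ratios = {"LL": 1, "HL": 2, "LH": 2, "HH": 4}
weights = {"LL": 5, "HL": 2, "LH": 2, "HH": 1}

qps = subband_qps(4, ratios)
targets = distribute_target(1000, weights)
print(qps)      # {'LL': 4, 'HL': 8, 'LH': 8, 'HH': 16}
print(targets)  # {'LL': 500.0, 'HL': 200.0, 'LH': 200.0, 'HH': 100.0}
```

The ratio holds only for the initial Qps; once each subband is controlled independently against its own target, nothing in this sketch keeps the ratio fixed.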
As an example, change of Qp in the case where all the target code amounts of the subbands are set equally will be described. If the input image is an image having a large number of vertical lines, the target code amount of a horizontal frequency component needs to be set larger than the target code amount of a vertical frequency component such that the vertical lines do not deteriorate. However, if the target code amounts of the horizontal frequency component and the vertical frequency component are equal, control is performed such that Qp increases so as to suppress the code amount in the horizontal frequency component in which the amount of generated code is large, and thus deterioration occurs such as the vertical lines being blurred. Also in the setting of a target code amount of each plane after plane conversion, if the target code amount is not set appropriately, a problem similar to the above problem can occur.
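The vertical-line example can be reproduced with a small Python simulation. It is only a sketch under assumed conditions: code amount is modeled as complexity divided by Qp, Qp moves by one step per region, and the complexity values and names are hypothetical.

```python
def control_subband(complexities, target, initial_qp):
    """Toy per-region quantization control for one subband.

    Returns the Qp used for each region. Assumes (for illustration only)
    that a region of complexity c generates c / qp units of code.
    """
    per_region = target / len(complexities)
    qp, generated, trace = initial_qp, 0.0, []
    for index, complexity in enumerate(complexities, start=1):
        trace.append(qp)
        generated += complexity / qp
        if generated > per_region * index:
            qp += 1                      # over budget: coarser quantization
        elif generated < per_region * index and qp > 1:
            qp -= 1                      # under budget: finer quantization
    return trace


# Image with many vertical lines: the horizontal frequency subband is
# busy, the vertical frequency subband is almost flat (hypothetical values).
horizontal = [400, 400, 400, 400]
vertical = [40, 40, 40, 40]

# Equal target code amounts and equal initial Qps for both subbands.
h_trace = control_subband(horizontal, 200, initial_qp=4)
v_trace = control_subband(vertical, 200, initial_qp=4)
print(h_trace)  # [4, 5, 6, 7]
print(v_trace)  # [4, 3, 2, 1]
```

Both subbands start at the same Qp, yet by the last region the horizontal frequency subband is quantized seven times as coarsely as the vertical one, so the detail that carries the vertical lines is the detail that is discarded.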
In view of this, a technique for improving code amount controllability by changing the setting of a quantization matrix that is applied when performing encoding in order to appropriately distribute code amounts of luminance and color difference components is described in the document Japanese Patent Laid-Open No. 2010-183402. According to this document, it is possible to appropriately distribute a picture target code amount to a luminance code amount and a color difference code amount by scaling up the quantization value of luminance signals if the luminance code amount is greater than a predetermined value, and scaling up the quantization value of color difference signals if the color difference code amount is greater than a predetermined value, based on the ratio of the luminance code amount to the color difference code amount.
However, in the technique described in this document, the quantization values of the luminance signals and the color difference signals are independently changed, and thus there are cases in which optimal image quality cannot be acquired. If this is applied to the above-described setting of the target code amount for each subband and the target code amount for each plane, a problem also occurs. For example, consider a case in which the conventional technique is adopted in the relationship of quantization among subbands that underwent frequency conversion. In an image having a large number of vertical lines, if a quantization parameter is similarly set large only because the code amount of the subband corresponding to the horizontal component is large, the information rate of the horizontal component after encoding is reduced from the original image, the vertical lines are blurred, and the image quality deteriorates greatly.
In addition, the same applies to plane conversion from a RAW image into R, G1, G2 and B. For example, assume that the code amount of the plane B at the time of decomposition into the planes R, G1, G2 and B is extremely large. In this case, if the quantization parameter of the plane B is set larger than those of the other planes, information regarding the blue component after encoding is greatly reduced from the original image, and the image quality deteriorates.