1. Field of the Invention
The present invention relates to a technique for improving the encoding efficiency of an image encoding device by using camera status information.
2. Description of the Related Art
As a high-efficiency image encoding technique, an encoding scheme such as MPEG2 (Moving Picture Experts Group Phase 2) has been established. Manufacturers are developing and commercializing DVD recorders and image capturing apparatuses, such as digital cameras and digital video cameras, which are adapted to record images using the MPEG2 encoding scheme. Under these circumstances, users can readily view images using these apparatuses, personal computers, or DVD players.
Various proposals are being made to improve the encoding efficiency of the MPEG2 encoding scheme. Of these proposals, for image capturing apparatuses, there is known a technique for improving the encoding efficiency by setting, in accordance with camera status information obtained by control data of a camera unit, an encoding parameter such as the target code amount used for code amount control in an encoding process (see Japanese Patent Laid-Open No. 2002-369142).
Code amount control using the MPEG2 encoding scheme will be exemplified here by TM5, a reference software encoder. TM5 is described in Test Model Editing Committee: “Test Model 5”, ISO/IEC, JTC/SC29/WG11/n0400 (April 1993).
Code amount control using TM5 can be divided into the following three steps.
In the process of step 1, code amounts are assigned to the pictures in a GOP on the basis of the total code amount which can be assigned to the pictures in the GOP which have not been encoded yet, including the picture currently targeted for assignment. This assignment process is repeated in encoding order for the pictures in the GOP.
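As an illustration of step 1, the following Python sketch computes a target code amount for the next picture. The text above does not give the formulas. The expressions, the constants K_p = 1.0 and K_b = 1.4, and the lower bound bit_rate / (8 × picture_rate) follow the target-setting rule as it is commonly described for TM5, and should be read as assumptions here. The function and parameter names are hypothetical.

```python
def tm5_picture_target(picture_type, remaining_bits, n_p, n_b,
                       x_i, x_p, x_b, bit_rate, picture_rate,
                       k_p=1.0, k_b=1.4):
    """Target code amount (bits) for the next picture of the given type.

    remaining_bits: bits still assignable to the not-yet-encoded pictures
                    of the GOP, including the picture being assigned.
    n_p, n_b:       numbers of P and B pictures remaining in the GOP.
    x_i, x_p, x_b:  complexity measures of the most recent I, P, B pictures
                    (assumed here to be supplied by the caller).
    """
    # Lower bound applied to every target (assumption taken from TM5).
    floor = bit_rate / (8.0 * picture_rate)

    if picture_type == "I":
        target = remaining_bits / (
            1.0 + n_p * x_p / (x_i * k_p) + n_b * x_b / (x_i * k_b))
    elif picture_type == "P":
        target = remaining_bits / (n_p + n_b * k_p * x_b / (k_b * x_p))
    elif picture_type == "B":
        target = remaining_bits / (n_b + n_p * k_b * x_p / (k_p * x_b))
    else:
        raise ValueError("picture_type must be 'I', 'P', or 'B'")

    return max(target, floor)
```

After each picture is encoded, the caller would subtract the actually generated code amount from remaining_bits and decrement the count of remaining pictures of that type, which corresponds to repeating the assignment in encoding order.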
General code amount control will be described here. There are two encoding schemes for data compression based on code amount control. One is a constant bit rate (CBR) encoding scheme which keeps the generated code amount almost constant. The other is a variable bit rate (VBR) encoding scheme which performs optimal code amount distribution in accordance with the complexity and motion magnitude of a moving image in each frame while approximating the average value of encoding bit rates to a long-term target convergence bit rate.
Ideal code amount distribution in VBR encoding requires a 2-pass arrangement, i.e., estimation of the generated code amount and actual encoding for all moving images to be encoded, so it has conventionally been implemented as an offline process using software. In recent years, hardware which executes real-time VBR encoding has been developed, and recorders adapted to execute real-time VBR encoding with superior image quality have come into widespread use. Conventional encoding by software based on prior code amount distribution for all moving images is called 2-pass encoding. In contrast, the real-time bit rate control technique used for recorders and the like is called 1-pass encoding.
The VBR bit rate control operation in 1-pass encoding aims to achieve a code amount distribution close to the ideal one obtained by 2-pass encoding while being free from any influence of local characteristics such as the complexity and motion magnitude of a moving image in each frame. For this purpose, it is a common practice to average the actual encoding bit rates obtained as a result of encoding at a short-term target encoding bit rate, and to gradually control the code amount so as to adjust the resultant average value to a long-term target encoding bit rate within a predetermined period. The convergence time during which that average value is controlled to converge to the target convergence bit rate is determined in accordance with the gradient of the control which successively approximates the average value of actual encoding bit rates to the long-term target encoding bit rate.
FIG. 13 is a flowchart showing the schematic flow of bit rate control. In step S701, a maximum bit rate is assured by restricting the current quantization scale so as not to exceed its upper limit such as the maximum transfer speed, which is associated with recording and independent of the target convergence bit rate. In step S702, a virtual buffer called VBV in MPEG2 is assured by controlling the encoding bit rate so as to prevent its overflow and underflow, thus avoiding decoding failures. The encoding control processes in steps S701 and S702 are common practices, and a detailed description thereof will be omitted. In step S703, control to converge the encoding bit rate to the target convergence bit rate is done on the basis of a predetermined convergence time. The generated code amount is controlled by changing the quantization scale in accordance with a short-term target bit rate determined by the above-described bit rate control processes.
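The convergence control of step S703, together with the maximum bit rate restriction of step S701, can be sketched as follows. This is a minimal illustration under stated assumptions and not the control law of any particular recorder: the surplus or deficit of generated code relative to the long-term target is assumed to be spread evenly over the convergence time, and all names are hypothetical. The VBV control of step S702 is omitted.

```python
def short_term_target_bitrate(long_term_target_bps, bits_generated,
                              elapsed_s, convergence_time_s, max_bitrate_bps):
    """Short-term target bit rate (bits per second).

    long_term_target_bps: target convergence bit rate.
    bits_generated:       total code amount generated so far.
    elapsed_s:            encoding time elapsed so far, in seconds.
    convergence_time_s:   predetermined convergence time, in seconds.
    max_bitrate_bps:      upper limit, e.g. the maximum transfer speed.
    """
    if convergence_time_s <= 0:
        raise ValueError("convergence_time_s must be positive")

    # Positive when more code than planned has been generated so far.
    surplus_bits = bits_generated - long_term_target_bps * elapsed_s

    # Assumption: cancel the surplus (or deficit) linearly over the
    # convergence time. A longer time gives a gentler control gradient.
    target = long_term_target_bps - surplus_bits / convergence_time_s

    # Never exceed the maximum bit rate, and never go negative.
    return max(0.0, min(target, float(max_bitrate_bps)))
```

With a long convergence time the short-term target deviates only slightly from the long-term target, so local scene characteristics can be absorbed. With a short convergence time the control approaches CBR behavior.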
In the process of step 2, to match the assigned code amount to each picture calculated in step 1 with an actually generated code amount, the following procedure is executed. That is, a quantization scale code is calculated by feedback control for each macroblock on the basis of the capacities of three types of virtual buffers set independently for respective pictures I, P, and B.
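The feedback control of step 2 can be sketched as follows. The text above does not state the formulas. The virtual buffer update d_j = d_0 + B_(j-1) - T × (j-1) / MB_cnt, the mapping Q_j = d_j × 31 / r, and the reaction parameter r = 2 × bit_rate / picture_rate are taken from the way TM5 is commonly described and are assumptions here, as are the function names and the clamping to the range 1 to 31.

```python
def reaction_parameter(bit_rate, picture_rate):
    """Reaction parameter r (assumed TM5 definition)."""
    return 2.0 * bit_rate / picture_rate


def macroblock_qscale(d0, bits_so_far, target_bits, mb_index, mb_count, r):
    """Quantization scale code for one macroblock.

    d0:          initial fullness of the virtual buffer for this picture
                 type (one buffer each is kept for I, P, and B pictures).
    bits_so_far: code amount generated by the macroblocks already encoded
                 in the current picture.
    target_bits: code amount assigned to the current picture in step 1.
    mb_index:    zero-based index of the macroblock, i.e. j - 1.
    mb_count:    number of macroblocks in the picture.
    """
    # The buffer fills when more code than the proportional share of the
    # target has been generated, and drains when less has been generated.
    fullness = d0 + bits_so_far - target_bits * mb_index / mb_count

    # Map the fullness to a scale code and keep it within 1..31.
    q = round(fullness * 31.0 / r)
    return max(1, min(31, q))
```

When the generated code amount runs ahead of the assigned amount, the buffer fullness and hence the quantization scale code rise, which reduces the code amount of the following macroblocks.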
In the process of step 3, the quantization scale code calculated in step 2 is adjusted so that a flat portion, where a deterioration is visually conspicuous, is finely quantized, and a portion with a complicated pattern, where a deterioration is relatively inconspicuous, is coarsely quantized. For this purpose, the quantization scale is changed and determined in accordance with a variable called an activity for each macroblock of 16×16 pixels.
The activity calculation method and quantization scale determination method in step 3 will be described in more detail below.
An activity representing the pattern complexity is calculated as follows. A macroblock of 16×16 pixels is divided into a total of eight blocks, i.e., four 8×8 pixel blocks in a field discrete cosine transform mode and four 8×8 pixel blocks in a frame discrete cosine transform mode. An activity is then calculated on the basis of a variance var_{sblk} of luminance signal pixel values Pj of an original picture in each block. The variance of each 8×8 pixel block is calculated by:
\[
\mathit{var}_{sblk} = \sum_{j=1}^{64} \left( P_j - \overline{P} \right)^2, \qquad \overline{P} = \frac{1}{64} \sum_{j=1}^{64} P_j \tag{1}
\]
where \(\overline{P}\) is the average of the luminance signal pixel values \(P_j\) of the 8×8 pixel block.
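The variance calculation for the eight blocks can be sketched in Python as follows. The sketch follows equation (1) as written here, i.e., a sum of squared deviations over the 64 pixels. The layout of the field-mode blocks (even lines and odd lines, each split into a left and a right half) is an assumption about how the four field discrete cosine transform mode blocks are formed, and the function names are hypothetical.

```python
def block_variance(pixels):
    """Equation (1): sum of squared deviations of 64 luminance values."""
    if len(pixels) != 64:
        raise ValueError("an 8x8 block must contain 64 pixel values")
    mean = sum(pixels) / 64.0
    return sum((p - mean) ** 2 for p in pixels)


def eight_block_variances(macroblock):
    """Variances of the four frame-mode and four field-mode 8x8 blocks.

    macroblock: 16 rows, each a sequence of 16 luminance values.
    """
    blocks = []

    # Frame mode: the four 8x8 quadrants of the macroblock.
    for row0 in (0, 8):
        for col0 in (0, 8):
            blocks.append([macroblock[r][c]
                           for r in range(row0, row0 + 8)
                           for c in range(col0, col0 + 8)])

    # Field mode (assumed layout): even lines, then odd lines,
    # each split into a left half and a right half.
    for parity in (0, 1):
        for col0 in (0, 8):
            blocks.append([macroblock[r][c]
                           for r in range(parity, 16, 2)
                           for c in range(col0, col0 + 8)])

    return [block_variance(b) for b in blocks]
```

For a macroblock whose even lines are all 0 and whose odd lines are all 10, the four frame-mode blocks each give a variance of 1600 while the four field-mode blocks give 0, which shows why both block organizations are examined.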
Of the total of eight variances calculated by equation (1), an activity corresponding to the minimum variance is calculated in accordance with equation (2). The minimum value of the variances is used in equation (2) so that a flat portion in a macroblock, if any, is finely quantized irrespective of its extent.
\[
act = 1 + \min\left( \mathit{var}_{sblk} \right) \tag{2}
\]
The activity value calculated by equation (2) becomes large if the image of interest has a complicated pattern, i.e., exhibits a large variance of luminance signal pixel values, and it becomes small if the image of interest is flat, i.e., exhibits a small variance of luminance signal pixel values. By equation (3), a normalized activity N_act is calculated such that the activity value falls within 0.5 to 2.0.
\[
\mathit{N\_act} = \frac{2 \times act + \mathit{avg\_act}}{act + 2 \times \mathit{avg\_act}} \tag{3}
\]
avg_act is the average activity, obtained by averaging the activities act of the frame encoded immediately before the current frame. To encode the first frame, the initial value of the average activity avg_act is set to 400 in the TM5 scheme.
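Equations (2) and (3) can be sketched in Python as follows. The function names and the constant name are hypothetical. The default value of 400 is the TM5 initial value stated above.

```python
TM5_INITIAL_AVG_ACT = 400.0  # initial avg_act used for the first frame


def activity(variances):
    """Equation (2): act = 1 + minimum of the eight block variances."""
    return 1.0 + min(variances)


def normalized_activity(act, avg_act=TM5_INITIAL_AVG_ACT):
    """Equation (3): N_act = (2*act + avg_act) / (act + 2*avg_act)."""
    return (2.0 * act + avg_act) / (act + 2.0 * avg_act)
```

When act equals avg_act, the normalized activity is exactly 1. As act becomes very small relative to avg_act the value approaches 0.5, and as act becomes very large it approaches 2.0, which is how the range of 0.5 to 2.0 arises.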
On the basis of the normalized activity N_act calculated by equation (3) and the quantization scale code Qsc obtained in step 2, a quantization scale code MQUANT in consideration of the visual characteristic depending on an activity is given by:
\[
\mathit{MQUANT} = \mathit{Qsc} \times \mathit{N\_act} \tag{4}
\]
That is, since the quantization scale code MQUANT of a flat image which exhibits a small activity value becomes small, it is finely quantized. Conversely, since the quantization scale code MQUANT of an image with a complicated pattern which exhibits a large activity value becomes large, it is coarsely quantized.
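The effect of equation (4) can be illustrated with the following Python sketch. Rounding the product and clamping it to the range 1 to 31 are assumptions made so that the result is a valid 5-bit quantization scale code. The function name is hypothetical.

```python
def mquant(qsc, act, avg_act):
    """Equation (4): MQUANT = Qsc * N_act, rounded and clamped to 1..31.

    qsc:     quantization scale code obtained in step 2.
    act:     activity of the macroblock from equation (2).
    avg_act: average activity of the previously encoded frame
             (400 for the first frame in the TM5 scheme).
    """
    # Equation (3): normalized activity.
    n_act = (2.0 * act + avg_act) / (act + 2.0 * avg_act)

    # Equation (4), followed by the assumed rounding and clamping.
    return max(1, min(31, round(qsc * n_act)))
```

For example, with Qsc = 10 and avg_act = 400, a flat macroblock with act = 1 yields a smaller code than 10 and is therefore finely quantized, whereas a macroblock with a very large activity yields a code close to 20 and is coarsely quantized.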
Code amount control using the TM5 scheme is performed by the above-described processes.
In a method of setting an encoding parameter such as the target code amount using camera status information in the prior art, such as Japanese Patent Laid-Open No. 2002-369142, an encoding unit sets an encoding parameter to be used in encoding by using camera status information during the encoding process execution period. Therefore, to determine the initial value of an encoding parameter to be used in the main operation of the encoding unit, the encoding unit must be activated to execute the encoding process before its main operation for actually recording encoded data. This amounts to activating the encoding unit more than necessary, resulting in an increase in power consumption.
In the TM5 scheme, to calculate, in accordance with equation (3), a normalized activity N_act in the first frame immediately after the start of the encoding process, the average activity avg_act is set to a constant value of 400. Therefore, an encoding parameter is automatically set to a constant value without activating an encoding unit before its main operation, thus solving the problem of the technique of Japanese Patent Laid-Open No. 2002-369142. However, when the encoding parameter for encoding the first frame is set to a constant value, appropriate encoding may be hindered if the condition of the image to be captured is poor. Still worse, power consumption may increase.