Field of the Invention
The present invention relates to an image processing apparatus that encodes a moving image, and to an encoding method.
Description of the Related Art
Encoding methods such as H.264 and High Efficiency Video Coding (HEVC), recommended by the International Telecommunication Union Telecommunication Standardization Sector (ITU-T), divide one frame of a moving image into a plurality of blocks and perform encoding in block units. Compression encoding methods include an encoding method using temporal correlation (i.e., inter-encoding) and an encoding method using spatial correlation (i.e., intra-encoding). The encoding method using temporal correlation employs a previously encoded image as a reference image and searches for a motion vector between a block in the reference image and the block in the image to be encoded. A prediction residual between the corresponding blocks is then encoded. The encoding method using spatial correlation performs prediction from pixels in the blocks surrounding the block to be encoded, and encodes the prediction residual of the block.
Data which has been encoded in block units mainly includes the motion vector of the block, a header component such as an encoding type of the block, and a prediction residual component, i.e., a difference between pixel values of the predictive image and the image to be encoded.
It is effective to use a skip block in H.264 (or a skip mode in HEVC) to reduce the data amount of the header component. When the skip block is used, the header component and the prediction residual component are not included in the block, so that encoding can be performed with a small data amount (i.e., if the skip block is used in H.264, a block can be encoded with a data amount of 1 bit or less). If a block is to be encoded using the skip block in H.264, it is necessary to align the motion vector of the block with the motion vectors of the surrounding blocks.
Further, obtaining the motion vector by full search is computationally expensive. For example, if a block of 16×16 pixels is compared at all 64×64 positions of a search area in the reference frame, it is necessary to perform an 8-bit comparison operation 1048576 times (i.e., 16×16×64×64=1048576). The calculation cost thus becomes huge.
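The full search described above can be sketched as follows. This is a minimal illustration, assuming the sum of absolute differences (SAD) as the matching criterion, which the text does not specify. The function names `full_search` and `full_search_cost` and all parameters are hypothetical.

```python
def full_search_cost(block_size, positions_per_axis):
    """Number of per-pixel comparisons for an exhaustive search."""
    return block_size * block_size * positions_per_axis * positions_per_axis


def full_search(cur, ref, bx, by, block_size, search_range):
    """Exhaustive block matching using SAD (an assumed criterion).

    cur, ref: 2-D lists of pixel values (current and reference frames).
    (bx, by): top-left corner of the block to be processed in cur.
    Candidate displacements dx, dy run over range(-search_range, search_range),
    i.e. 2 * search_range positions per axis.
    Returns (motion_vector, sad, pixel_comparisons).
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_sad, comparisons = None, None, 0
    for dy in range(-search_range, search_range):
        for dx in range(-search_range, search_range):
            x0, y0 = bx + dx, by + dy
            # Skip candidates that fall outside the reference frame.
            if x0 < 0 or y0 < 0 or x0 + block_size > width or y0 + block_size > height:
                continue
            sad = 0
            for j in range(block_size):
                for i in range(block_size):
                    sad += abs(cur[by + j][bx + i] - ref[y0 + j][x0 + i])
                    comparisons += 1
            # Keep the first candidate with the smallest SAD.
            if best_sad is None or sad < best_sad:
                best_mv, best_sad = (dx, dy), sad
    return best_mv, best_sad, comparisons
```

With `block_size=16` and `search_range=32` (64 positions per axis), the inner comparison runs 16×16×64×64 times when every candidate lies inside the frame, which is the count given above.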
Japanese Patent Application Laid-Open No. 5-49017 discusses reducing the calculation cost in obtaining the motion vector by performing a motion search method using a binary image. If the binary image is employed as an input image, the calculation cost per pixel can be reduced.
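One way to see why a binary image lowers the per-pixel cost is the sketch below. It assumes binarization with a fixed threshold and a match criterion that counts differing pixels with XOR. Neither detail is taken from the cited publication. The names `binarize`, `pack_block`, and `binary_motion_search` are hypothetical.

```python
def binarize(img, threshold):
    """Reduce a multi-value image to 0/1 pixels (fixed threshold is an assumption)."""
    return [[1 if p >= threshold else 0 for p in row] for row in img]


def pack_block(bimg, x0, y0, block_size):
    """Pack a block of 0/1 pixels into one integer, one bit per pixel."""
    bits = 0
    for j in range(block_size):
        for i in range(block_size):
            bits = (bits << 1) | bimg[y0 + j][x0 + i]
    return bits


def binary_motion_search(cur_b, ref_b, bx, by, block_size, search_range):
    """Exhaustive search on binary images; returns (motion_vector, mismatches)."""
    height, width = len(ref_b), len(ref_b[0])
    target = pack_block(cur_b, bx, by, block_size)
    best_mv, best_mismatch = None, None
    for dy in range(-search_range, search_range):
        for dx in range(-search_range, search_range):
            x0, y0 = bx + dx, by + dy
            if x0 < 0 or y0 < 0 or x0 + block_size > width or y0 + block_size > height:
                continue
            candidate = pack_block(ref_b, x0, y0, block_size)
            # One XOR compares every pixel of the block at once;
            # the number of set bits is the number of differing pixels.
            mismatch = bin(target ^ candidate).count("1")
            if best_mismatch is None or mismatch < best_mismatch:
                best_mv, best_mismatch = (dx, dy), mismatch
    return best_mv, best_mismatch
```

Each pixel contributes a single bit instead of an 8-bit value, so a whole block can be compared with one XOR and a bit count rather than one subtraction per pixel.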
However, there are cases where the skip block cannot be used, depending on the search result of the motion vector. To use the skip block, it is necessary to align the motion vector of the block to be processed with the motion vectors of the surrounding blocks. If the motion vector search is performed using the binary image, the motion vectors may not be easily aligned. Therefore, there are cases where the skip block cannot be used when the motion vector search is executed using the binary image, even though the skip block could be used if the motion vector search were executed directly on the input image.
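The alignment requirement can be illustrated with a simplified sketch. It assumes that "aligned" means the block's motion vector equals a predictor formed as the component-wise median of three neighboring motion vectors (left, above, above-right), and that the prediction residual is zero. The special cases defined in H.264 are omitted, and the names `median_mv` and `can_skip` are hypothetical.

```python
def median_mv(left, above, above_right):
    """Component-wise median of three neighboring motion vectors."""
    neighbors = (left, above, above_right)
    xs = sorted(v[0] for v in neighbors)
    ys = sorted(v[1] for v in neighbors)
    return (xs[1], ys[1])


def can_skip(mv, left, above, above_right, residual_is_zero):
    """Simplified skip check: the vector matches the predictor and nothing remains to encode."""
    return residual_is_zero and mv == median_mv(left, above, above_right)
```

Under this simplification, a search result that deviates from its neighbors by even one pixel makes the block ineligible, which is why an inaccurate search reduces the number of skip blocks.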
The motion vector search using the binary image will be described below with reference to FIG. 2. Referring to FIG. 2, there is a current frame multi-value image 201 (i.e., the input image) and a reference frame multi-value image 202 (e.g., the input image one frame before). A current frame binary image 203 is the image obtained by binarizing the current frame multi-value image 201 and a reference frame binary image 204 is the image obtained by binarizing the reference frame multi-value image 202. In the figure, one square indicates one pixel. The pixel value of a gray area is different from the pixel value of a white area.
The current frame multi-value image 201 and the reference frame multi-value image 202 include the gray area. However, since the difference between the pixel values of the gray area and the white area is small, the gray area and the white area are binarized to the same value, and there is no change in the pixel values within the current frame binary image 203 or the reference frame binary image 204. In other words, the current frame multi-value image 201 and the reference frame multi-value image 202 are images in which the change in the pixel values is small, so that no change in the pixel values remains when these images are binarized.
For ease of description, a case will be considered in which one block consists of 2×2 pixels as illustrated in FIG. 2 and a search area of 6×6 pixels is searched. A block 205 of 2×2 pixels bordered by a thick line in the current frame multi-value image 201 is the block to be processed, and the full search is performed on the reference frame multi-value image 202. In such a case, a block 206 bordered by the thick line in the reference frame multi-value image 202 matches the block 205. On the other hand, if the full search is performed on the reference frame binary image 204 using, as a target block, a block 207 in the current frame binary image 203 whose position corresponds to that of the block 205, a block 208 bordered by the thick line in the reference frame binary image 204 matches the block 207.
As described above, the search results of the motion vector using the multi-value image and the binary image may be different in a flat image (i.e., an image in which the change in the pixel values is small) or a flat area. In other words, search accuracy of the search method using the binary image is lower than the search accuracy of the search method using the multi-value image, so that there is such a difference in the search results.
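The effect can be reproduced with a small example. The pixel values and positions below are hypothetical and are not the contents of FIG. 2. The sketch assumes SAD matching and a fixed binarization threshold, and `best_matches` is a hypothetical helper that returns every candidate position sharing the minimum score.

```python
def binarize(img, threshold):
    """Reduce a multi-value image to 0/1 pixels (fixed threshold is an assumption)."""
    return [[1 if p >= threshold else 0 for p in row] for row in img]


def best_matches(cur, ref, bx, by, block_size):
    """Return all top-left positions in ref whose SAD against the block is minimal."""
    height, width = len(ref), len(ref[0])
    scores = {}
    for y0 in range(height - block_size + 1):
        for x0 in range(width - block_size + 1):
            scores[(x0, y0)] = sum(
                abs(cur[by + j][bx + i] - ref[y0 + j][x0 + i])
                for j in range(block_size)
                for i in range(block_size)
            )
    best = min(scores.values())
    return [pos for pos, score in scores.items() if score == best]


WHITE, GRAY = 200, 190  # hypothetical values with a small difference

# 6x6 frames that are flat except for a single gray pixel.
cur = [[WHITE] * 6 for _ in range(6)]
ref = [[WHITE] * 6 for _ in range(6)]
cur[2][2] = GRAY  # gray pixel at the top-left of the 2x2 block at (2, 2)
ref[3][4] = GRAY  # the same pixel appears at x=4, y=3 in the reference frame

multi_value = best_matches(cur, ref, 2, 2, 2)
binary = best_matches(binarize(cur, 128), binarize(ref, 128), 2, 2, 2)
```

Here `multi_value` contains a single position, while `binary` contains every candidate position, because binarization removes the only feature that distinguished them. Whichever candidate the binary search reports is therefore arbitrary.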
In particular, if the binary image is generated from the input image and the motion vector search is performed on it, it becomes difficult to align the motion vectors of the surrounding blocks, so that encoding using the skip block cannot be easily executed. As a result, if conventional encoding is performed using the motion vector search method employing the binary image, encoding efficiency is likely to decrease. The above-described issue is not limited to the case where the binary image is used, and a similar issue occurs when the motion vector search is performed using an N-value image obtained by lowering the gradation of the input image to N values.