Field of the Invention
The present invention relates to an encoding apparatus and a method of controlling the same.
Description of the Related Art
Conventionally, a technique for encoding a depth image obtained from a distance measuring sensor, a stereo camera, or the like, is known. Using a depth image makes possible, for example, free-viewpoint image synthesis in which video of a viewpoint that was not captured is synthesized, improved precision in human body detection, and measurement of a three-dimensional distance between two points.
3D Video Coding (hereinafter referred to as 3DV) is established as a standard technology related to encoding of a depth image. In 3DV, a depth image is generated in order to perform free-viewpoint video synthesis at high image quality. However, because a frequency transformation is performed in the same way as in encoding techniques for RGB images such as H.264, large degradation easily occurs in the proximity of an edge of the depth image. A compression scheme in which large degradation occurs in a portion of the pixels in this way may become a serious problem depending on how the depth image is used. For example, in a case where a three-dimensional measurement of a distance between two points is performed using a depth image, the result of the measurement changes greatly when a pixel value whose distance information has been greatly degraded by the compression is used.
Meanwhile, a technique by which the maximum distortion of pixel values of an image due to compression is suppressed to a particular value is known. For example, a near-lossless (quasi-lossless) mode is defined in “The LOCO-I Lossless Image Compression Algorithm: Principles and Standardization into JPEG-LS” (IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 9, NO. 8, AUGUST 2000) (hereinafter referred to as document 1). If the JPEG-LS near-lossless mode is applied to a depth image, the maximum value of the error that occurs in each pixel of the depth image due to the compression is suppressed, and as a consequence it is also possible to control the maximum value of the error that the compression introduces into a measurement of a distance between two points.
However, in the near-lossless mode of JPEG-LS, it can only be guaranteed that the error that occurs due to the compression stays within a fixed value irrespective of the pixel value (hereinafter referred to as constant precision guarantee). Accordingly, encoding that guarantees that the maximum distortion of pixel values due to compression falls within an allowable error that differs in accordance with the pixel value (hereinafter referred to as guarantee of precision in accordance with the pixel value) cannot be performed, and there are cases in which the amount of encoded data increases because a precision that is higher than necessary is maintained.
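As an illustration of the constant precision guarantee, the following is a minimal sketch of the uniform quantization of a prediction error used in near-lossless coding, with a quantization step of 2*NEAR + 1. It is a simplified illustration only: prediction, context modeling, and entropy coding are omitted, and the function names are hypothetical and do not appear in document 1.

```python
def quantize_residual(err: int, near: int) -> int:
    """Quantize a prediction error with a uniform step of 2*near + 1."""
    step = 2 * near + 1
    if err > 0:
        return (near + err) // step
    return -((near - err) // step)


def dequantize_residual(q: int, near: int) -> int:
    """Reconstruct the prediction error from its quantized index."""
    return q * (2 * near + 1)


def max_reconstruction_error(near: int, lo: int = -255, hi: int = 255) -> int:
    """Largest absolute error over all prediction errors in [lo, hi]."""
    return max(
        abs(e - dequantize_residual(quantize_residual(e, near), near))
        for e in range(lo, hi + 1)
    )
```

In this sketch the bound on the reconstruction error is determined by the single parameter `near` alone and does not depend on the pixel value being encoded, which is the limitation discussed above.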
Here, an example of encoding of distance information is described. In encoding of distance information, it is common to encode the distance information as a disparity image, in which each pixel value holds a disparity (that is, a shift in a corresponding point between differing images) obtained as a result of performing stereo matching on images from two viewpoints. A disparity expression is an expression in which the resolution of the distance information is higher the shorter the distance is, and is an expression suited to the principle of stereo matching. In a case where encoding that guarantees a fixed error independent of the pixel value is applied to a disparity image, degradation of the distance information due to the compression is lower the shorter the distance is, and higher the longer the distance is. Meanwhile, depending on the purpose, there may be a demand to reduce the amount of data of the distance information by allowing a certain amount of error even at a short distance. For example, in a case where the height of a human body is measured by using distance information obtained from a disparity image, an error of about 1 cm may be acceptable even at a short distance. In a disparity expression, allowing an error of 1 cm corresponds to accepting a large shift in the pixel value (for example, 3) at a short distance and a small shift in the pixel value (for example, 1) at a long distance. However, constant precision guaranteed encoding cannot guarantee precision in accordance with the pixel value, for example by allowing a ±3 error at a short distance and a ±1 error at a long distance, and so it is difficult to respond to such a demand.
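The relationship between a fixed distance tolerance and the allowable shift in a disparity value can be sketched as follows. This is an illustration under an assumed ideal stereo model in which distance = focal length × baseline / disparity; that model, the parameter names, and the numeric values in the note below are assumptions introduced here and are not taken from the description above.

```python
def distance_from_disparity(disparity: int, focal_px: float, baseline_m: float) -> float:
    """Distance in metres under the assumed model Z = f * B / d."""
    return focal_px * baseline_m / disparity


def worst_case_distance_error(disparity: int, delta: int,
                              focal_px: float, baseline_m: float) -> float:
    """Largest distance error when the disparity is off by up to +/-delta."""
    z = distance_from_disparity(disparity, focal_px, baseline_m)
    farther = distance_from_disparity(disparity - delta, focal_px, baseline_m) - z
    nearer = z - distance_from_disparity(disparity + delta, focal_px, baseline_m)
    return max(farther, nearer)


def allowable_disparity_error(disparity: int, tolerance_m: float,
                              focal_px: float, baseline_m: float) -> int:
    """Largest integer shift in disparity that keeps the distance error
    within tolerance_m (0 if even a shift of 1 exceeds the tolerance)."""
    delta = 0
    while (disparity - (delta + 1) > 0 and
           worst_case_distance_error(disparity, delta + 1,
                                     focal_px, baseline_m) <= tolerance_m):
        delta += 1
    return delta
```

With the hypothetical values of a 1000-pixel focal length, a 0.1 m baseline, and a 0.01 m tolerance, `allowable_disparity_error` returns 3 for a disparity of 200 (a distance of 0.5 m) and 1 for a disparity of 110 (a distance of about 0.91 m), so the allowable shift in the pixel value grows as the distance becomes shorter.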