Field of the Invention
The present invention relates to an image processing technique that estimates a change in the position and attitude relationship between an imaging apparatus and an object to be captured, using a captured image and a depth image captured in synchronization with the captured image.
Description of the Related Art
There are techniques for estimating, based on an image captured by an imaging apparatus, the position and attitude of an object to be captured, the relative position or attitude between the imaging apparatus and the object to be captured, or the change over time thereof. Such estimation is typically implemented using motion information, such as motion vectors or corresponding points between images. In recent years, with advances in depth data acquisition techniques, methods have been proposed that build a three-dimensional model of the object to be captured from depth data and collate it with a previously prepared three-dimensional model so as to estimate a positional relationship. Methods that use information obtained from an image together with the depth data have also been proposed.
Japanese Patent Laid-Open No. 2011-27623 and Japanese Patent Laid-Open No. 2012-123781 each disclose a method that uses depth data together with features of an image. In the method disclosed in Japanese Patent Laid-Open No. 2011-27623, alignment between a previously prepared three-dimensional shape model and depth data is used together with alignment between two-dimensional features extracted from an image and the projected features obtained when the three-dimensional shape model is projected onto a two-dimensional image at a certain position and attitude. This allows the position and attitude of an object to be estimated. In the method disclosed in Japanese Patent Laid-Open No. 2012-123781, depth data at the positions of feature points detected from an image is associated with a previously prepared three-dimensional shape model, so that the position and attitude of an object can be estimated while coping with errors caused by noise in the depth data.
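The alignment between two-dimensional image features and projected model features described above can be illustrated with a minimal sketch. It is not taken from either cited publication. It assumes a pinhole camera model, already known correspondences between model points and image features, and hypothetical names (`project_point`, `mean_reprojection_error`, `focal`, `cx`, `cy`). A candidate position and attitude would be judged better the smaller the returned error is.

```python
from typing import List, Sequence, Tuple

Point3 = Tuple[float, float, float]
Point2 = Tuple[float, float]
Matrix3 = List[List[float]]


def project_point(p: Point3, rotation: Matrix3, translation: Point3,
                  focal: float, cx: float, cy: float) -> Point2:
    """Move a model point into the camera frame, then project it (pinhole model)."""
    x, y, z = (
        sum(rotation[r][c] * p[c] for c in range(3)) + translation[r]
        for r in range(3)
    )
    if z <= 0.0:
        raise ValueError("point is behind the camera")
    return (focal * x / z + cx, focal * y / z + cy)


def mean_reprojection_error(model_points: Sequence[Point3],
                            image_features: Sequence[Point2],
                            rotation: Matrix3, translation: Point3,
                            focal: float, cx: float, cy: float) -> float:
    """Average pixel distance between projected model points and image features.

    Assumes model_points[i] corresponds to image_features[i].
    """
    if len(model_points) != len(image_features) or not model_points:
        raise ValueError("need equally sized, non-empty correspondence lists")
    total = 0.0
    for p, (u, v) in zip(model_points, image_features):
        pu, pv = project_point(p, rotation, translation, focal, cx, cy)
        total += ((pu - u) ** 2 + (pv - v) ** 2) ** 0.5
    return total / len(model_points)
```

In a complete estimator this error would be minimized over the rotation and translation, typically together with a depth-based alignment term; that optimization is omitted here.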
A background region is often calculated as the region of interest used for calculating a change in position and attitude. Conventional background region extraction typically specifies the background and moving objects using the difference between consecutive frames. In the method disclosed in Japanese Patent Laid-Open No. H11-112871, differences between the images constituting one scene of a moving image are compared so that a foreground region and a background region are specified and separated from each other, and the separated regions are then used for image processing.
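The frame-difference approach described above can be sketched as follows. This is only an illustrative example, not the method of the cited publication. It assumes a stationary imaging apparatus, grayscale frames stored as nested lists of 0 to 255 intensities, and a hypothetical function name `foreground_mask` with an arbitrarily chosen `threshold`.

```python
from typing import List

Frame = List[List[int]]  # grayscale image: rows of 0-255 intensities


def foreground_mask(prev: Frame, curr: Frame, threshold: int = 20) -> List[List[bool]]:
    """Mark pixels whose intensity changed by more than `threshold` between frames.

    True  = candidate foreground (moving object) pixel
    False = candidate background (static) pixel
    """
    if len(prev) != len(curr) or any(len(a) != len(b) for a, b in zip(prev, curr)):
        raise ValueError("frames must have identical dimensions")
    return [
        [abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


# Usage: one pixel changes strongly between two 2x3 frames.
previous = [[10, 10, 10], [10, 10, 10]]
current = [[10, 200, 10], [12, 10, 10]]
mask = foreground_mask(previous, current)
```

Because the sketch assumes a stationary imaging apparatus, any motion of the apparatus itself would also change background pixels between frames and cause them to be marked as foreground.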
When the camera work of an imaging apparatus is estimated based on an image captured by the imaging apparatus, two methods are available: a method for estimating the position and attitude of an object based on a motion vector, and a method for estimating the position and attitude of an object based on depth data. When each of these methods is used on its own, a captured image in which a dynamic region and a static region are mixed may make it difficult to estimate the change over time of position and attitude, and therefore the camera work, which is represented by the position and attitude obtained by integrating those changes.
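The relationship between per-frame changes and camera work can be made concrete with a minimal sketch. It assumes that each change in position and attitude is expressed as a 4x4 homogeneous transform in the previous camera frame, so that integration is a running matrix product. The function names (`compose`, `planar_motion`, `integrate`) and the restriction to yaw plus planar translation are hypothetical simplifications for illustration.

```python
import math
from typing import List

Matrix = List[List[float]]  # 4x4 homogeneous transform


def identity() -> Matrix:
    return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]


def compose(a: Matrix, b: Matrix) -> Matrix:
    """Matrix product a @ b: apply change b in the frame reached by a."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def planar_motion(yaw_rad: float, tx: float, tz: float) -> Matrix:
    """Per-frame change: rotation about the vertical axis plus planar translation."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return [[c, 0.0, s, tx],
            [0.0, 1.0, 0.0, 0.0],
            [-s, 0.0, c, tz],
            [0.0, 0.0, 0.0, 1.0]]


def integrate(changes: List[Matrix]) -> List[Matrix]:
    """Accumulate per-frame changes into a trajectory of absolute poses."""
    pose = identity()
    trajectory = [pose]
    for delta in changes:
        pose = compose(pose, delta)
        trajectory.append(pose)
    return trajectory


# Usage: four "move forward 1 unit, turn 90 degrees" steps trace a closed square.
steps = [planar_motion(math.pi / 2, 0.0, 1.0) for _ in range(4)]
camera_work = integrate(steps)
```

Since every pose is the product of all preceding changes, an error in any single change, for example one estimated from a dynamic region, carries over into every later pose in the trajectory.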