AR image processing apparatuses are already used in many fields. Using augmented reality (AR) techniques, such an apparatus composites a CG object in real time onto a target object image, such as an AR marker image, captured by a camera, that is, an image capturing device such as a web camera or a digital video camera.
A marker-based AR technique involves: registering in advance feature points that form a group having a certain shape in a digital image; detecting the registered feature points in a digital image captured by the image capturing device, by using homography or the like; estimating the position, the posture, and the like of the group; and compositing and displaying a CG object at the position of the AR marker image corresponding to the position, the posture, and the like of the group.
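The homography step mentioned above can be illustrated with a minimal sketch. It assumes exactly four point correspondences between the registered marker and the captured image, and solves the standard eight-unknown linear system with the last homography element fixed to 1. The function names (`solve_linear`, `estimate_homography`, `apply_homography`) and the sample coordinates are hypothetical and chosen only for illustration; a practical detector would typically use more correspondences and a robust estimator.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def estimate_homography(src, dst):
    """Return the 3x3 homography (h33 = 1) mapping four src points to four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), and likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map a 2D point through the homography h."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Registered marker corners (unit square) and their detected image positions.
marker_corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
image_corners = [(100, 120), (220, 110), (240, 230), (90, 210)]
H = estimate_homography(marker_corners, image_corners)
# Where the marker center lands in the image; a CG object could be anchored here.
center_in_image = apply_homography(H, (0.5, 0.5))
```

Once the homography is known, any point defined in the marker's own coordinates can be mapped into the captured image, which is what allows a CG object to be placed at the marker position.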
In this AR technique, the feature points registered in advance and having the certain shape are referred to as an AR marker (or simply “marker”). By including, when the marker is registered, additional information indicating the size and posture of the marker in the real world, the size of and the distance to the AR marker in a digital image obtained from the image capturing device can be estimated with reasonable accuracy. Meanwhile, when no recognizable feature points exist in the digital image, the position and posture of the marker naturally cannot be estimated.
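As a rough illustration of why a registered real-world size makes distance estimation possible, the following sketch applies the pinhole camera relation (distance = focal length × real width / apparent width). It assumes the marker faces the camera roughly head-on and that the focal length is known in pixels; the function name `estimate_distance` and the numbers used are hypothetical.

```python
def estimate_distance(focal_length_px, marker_width_m, marker_width_px):
    """Estimate the camera-to-marker distance in meters with the pinhole model.

    Assumes the marker is roughly parallel to the image plane.
    """
    if marker_width_px <= 0:
        raise ValueError("marker width in pixels must be positive")
    return focal_length_px * marker_width_m / marker_width_px


# A 0.10 m wide marker that appears 40 pixels wide through a camera whose
# focal length is 800 pixels.
distance_m = estimate_distance(800.0, 0.10, 40.0)
```

With these sample values the estimate is about 2 m. Without the registered width, the same 40-pixel image could equally come from a smaller marker that is nearer or a larger marker that is farther away.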
A natural feature tracking based AR technique, as typified by PTAM (“Parallel Tracking and Mapping for Small AR Workspaces”, Oxford University), is an excellent method that requires no prior registration of feature points in the digital image and allows the image capturing device to be moved in any direction and to any position, as long as the feature points can still be tracked while the image capturing device is continuously moved.
However, a base position needs to be designated first. To do so, the image capturing device needs to be moved in a special way so that the base position can be determined from the amounts of movement of the feature points across multiple images captured as the camera moves, and position and posture information needs to be additionally provided. In this process, a base plane cannot be accurately determined unless the image capturing device is moved correctly. Moreover, because the natural feature tracking based AR technique by its nature generally performs no prior registration of feature points, the distances among the feature points in a captured digital image, and their sizes, cannot be accurately known. Hence, a method of manually setting the size, direction, and position of the CG object with respect to the base plane is generally used.
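The reason distances and sizes cannot be recovered from tracked feature points alone can be sketched with a simplified pinhole model. The example below assumes a camera that only translates (no rotation) and looks along the +Z axis; the names `project` and `scale` and all coordinates are hypothetical. It shows that scaling the whole scene and the camera movement by the same factor produces identical image coordinates.

```python
def project(point, camera_position, focal_length_px=800.0):
    """Pinhole projection for a non-rotating camera looking along +Z."""
    x = point[0] - camera_position[0]
    y = point[1] - camera_position[1]
    z = point[2] - camera_position[2]
    return (focal_length_px * x / z, focal_length_px * y / z)


def scale(vector, factor):
    """Scale a 3D vector by a constant factor."""
    return tuple(factor * c for c in vector)


# Feature points of a small scene and two camera positions (in meters).
points = [(0.1, 0.2, 2.0), (-0.3, 0.1, 3.0), (0.4, -0.2, 2.5)]
cameras = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.1)]

# The same scene made ten times larger, filmed with ten times the movement.
k = 10.0
for cam in cameras:
    for p in points:
        small = project(p, cam)
        large = project(scale(p, k), scale(cam, k))
        # Both scenes yield the same pixel coordinates (up to rounding).
        assert abs(small[0] - large[0]) < 1e-9
        assert abs(small[1] - large[1]) < 1e-9
```

Because the images are identical in both cases, the absolute scale has to come from outside the images, which is consistent with the manual setting of the CG object's size described above.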