Pose Estimation
Three-dimensional (3D) pose estimation determines the location and angular orientation of an object. Typically, pose estimation methods rely on several cues, such as 2D texture images and 3D range images. Methods based on texture images assume that the texture is invariant to variations of the environment. However, this assumption does not hold if there are illumination changes or shadows. In general, most of these methods cannot handle objects that are specular.
Methods based on range images can overcome some of these difficulties because they exploit 3D information that is independent of the appearance of objects. However, range acquisition equipment is more expensive than simple cameras.
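By way of illustration only, a pose (location plus angular orientation) is commonly written as a rotation matrix R and a translation vector t, so that an object-frame point p maps to R p + t. The sketch below assumes a roll-pitch-yaw angle convention and uses hypothetical helper names (`rotation_from_euler`, `apply_pose`); neither the convention nor the names come from the text above.

```python
import numpy as np

def rotation_from_euler(roll, pitch, yaw):
    """Rotation matrix R = Rz(yaw) @ Ry(pitch) @ Rx(roll), angles in radians.

    The roll-pitch-yaw ordering is an assumption made for this sketch.
    """
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    rx = np.array([[1.0, 0.0, 0.0], [0.0, cr, -sr], [0.0, sr, cr]])
    ry = np.array([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    rz = np.array([[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]])
    return rz @ ry @ rx

def apply_pose(points, rotation, translation):
    """Map N x 3 object-frame points into the sensor frame: p' = R p + t."""
    return points @ rotation.T + translation

# Two object-frame points, posed by a quarter turn about z and a shift.
model_points = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
R = rotation_from_euler(0.0, 0.0, np.pi / 2)
t = np.array([0.5, 0.0, 2.0])
posed = apply_pose(model_points, R, t)
```

Estimating a pose then amounts to recovering the six parameters (three angles, three translation components) that best explain the observed cues.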
Specular Objects
For some objects, it is very difficult to reconstruct the 3D shape. For example, recovering the 3D shape of highly specular objects, such as mirror-like or shiny metallic objects, is known to be difficult and unreliable.
Reflection cues are more sensitive to pose changes than texture or range cues. Therefore, exploiting the reflection cues enables pose parameters to be estimated very accurately. However, it is not clear whether the reflection cues are applicable to global pose estimation, i.e., object detection, object segmentation, and rough object pose estimation, rather than just pose refinement.
Prior art methods are generally based on appearance, which is affected by illumination, shadows, and scale. Therefore, it is difficult for those methods to overcome related problems such as partial occlusions, cluttered scenes, and large pose variations. To handle these difficulties, those methods use illumination-invariant features, such as points, lines, and silhouettes, or illumination-insensitive cost functions, such as normalized cross-correlation (NCC). However, the object needs to be sufficiently textured for these methods to be successful. Severe illumination changes remain a problem, especially for specular objects.
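To make the notion of an illumination-insensitive cost function concrete, the following is a minimal sketch of zero-mean NCC between two equally sized patches. The function name `ncc` and the convention of returning 0 for a flat patch are assumptions of this sketch, not part of any method described above. Because both patches are mean-subtracted and normalized, a positive gain and an offset in brightness leave the score unchanged, whereas a textureless patch has no variation to correlate.

```python
import numpy as np

def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equally sized patches."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    a = a - a.mean()  # removes a constant brightness offset
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)  # removes a brightness gain
    if denom == 0.0:
        # A flat (textureless) patch: the correlation is undefined.
        # Returning 0 is a convention chosen for this sketch.
        return 0.0
    return float(a @ b / denom)

rng = np.random.default_rng(0)
patch = rng.random((8, 8))
brighter = 2.5 * patch + 10.0                # same texture, changed gain and offset
score_same = ncc(patch, brighter)            # stays at the maximum score
score_flat = ncc(patch, np.ones((8, 8)))     # no texture to match against
```

The flat-patch case reflects the limitation noted above: without sufficient texture, the score carries no information about the match.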
A wide range of methods derive sparse local shape information from the identification and tracking of distorted reflections of light sources, and special known features. Dense measurements can also be obtained using a general framework of light-path triangulation. However, those methods usually need to perform accurate calibration and control the environment surrounding the object, and sometimes require many input images.
Some methods for specular object reconstruction do not require environment calibration. Those methods assume small environmental motion, which induces specular flow on the image plane. In those methods, the specular flow is exploited to simplify the inference of specular shapes in unknown complex lighting. However, a pair of linear partial differential equations has to be solved, which generally requires an initial condition that is not easily estimated in real-world applications.
One method for estimating the pose based on specular reflection uses a short image sequence and initial pose estimates computed by the standard template matching procedure. Lambertian and specular components are separated for each frame and environment maps are derived from the estimated specular images. Then, the environment maps and the image textures are concurrently aligned to increase the accuracy of the pose estimation process.
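As a rough illustration of how template matching can supply an initial pose estimate, the sketch below compares an observed image against a small bank of templates, one per candidate pose, and keeps the best-scoring candidate. Everything here is an assumption made for illustration: the helper name `best_template`, the sum-of-squared-differences score, and the restriction to four in-plane rotations of a synthetic 5 x 5 image. It does not reproduce the procedure of the method described above.

```python
import numpy as np

def best_template(image, templates):
    """Return (key, scores), where key labels the template with the lowest
    sum of squared differences against the image."""
    scores = {
        key: float(np.sum((image - tmpl) ** 2))
        for key, tmpl in templates.items()
    }
    return min(scores, key=scores.get), scores

# Synthetic template: a short bar in one corner, so that the four
# 90-degree rotations are all distinguishable from one another.
base = np.zeros((5, 5))
base[0, :3] = 1.0

# One template per candidate in-plane rotation (degrees, counterclockwise).
templates = {90 * k: np.rot90(base, k) for k in range(4)}

# Observed image: the object rotated by 180 degrees, with mild sensor noise.
rng = np.random.default_rng(1)
observed = np.rot90(base, 2) + 0.05 * rng.standard_normal(base.shape)

estimate, scores = best_template(observed, templates)
```

A coarse estimate of this kind is only as fine as the sampling of candidate poses, which is why it would typically be followed by a refinement step.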