Panoramic image sensors are becoming increasingly popular because they capture large portions of the visual field in a single image. These cameras are particularly effective for capturing and navigating through large, complex three-dimensional (3D) environments. Most existing vision-based camera pose algorithms are derived for standard field-of-view cameras, and few algorithms have been proposed to take advantage of the larger field-of-view of panoramic cameras. Furthermore, while existing camera pose estimation algorithms work well in small spaces, they do not scale well to large, complex 3D environments consisting of a number of interconnected spaces.
Accurate and robust estimation of the position and orientation of image sensors has been a recurring problem in computer vision, computer graphics, and robot navigation. Stereo reconstruction methods use camera pose for extracting depth information to reconstruct a 3D environment. Image-based rendering techniques require camera position and orientation to recreate novel views of an environment from a large number of images. Augmented reality systems use camera pose information to align virtual objects with real objects, and robot navigation and localization methods must be able to obtain the robot's current location in order to maneuver through a (captured) space.
Existing vision-based camera pose approaches may be divided into passive methods and active methods. Passive methods derive camera pose without altering the environment but depend on its geometry for accurate results. For example, techniques may rely upon matching environment features (e.g., edges) to an existing geometric model or visual map. To obtain robust and accurate pose estimates, the model or map must contain sufficient detail to ensure correspondences at all times. Another class of passive methods, self-tracking methods, uses optical flow to calculate changes in position and orientation. However, self-tracking approaches are prone to cumulative errors, making them particularly unsuited to large environments.
Active methods utilize fiducials, or landmarks, to reduce the dependency on the environment geometry. Although fiducial methods are potentially more robust, the number and locations of the fiducials can significantly affect accuracy. Existing techniques often focus on deriving pose estimates from a relatively small number of (noisy) measurements. For large, arbitrarily shaped environments, no method exists for determining the optimal number of fiducials or their optimal placement in order to achieve a desired camera pose accuracy.
Thus, there exists a need for techniques that overcome the above-mentioned drawbacks by determining the optimal number of fiducials and their optimal placement in order to achieve a desired camera pose accuracy.