The present invention, in some embodiments thereof, relates to computer vision, and more specifically, but not exclusively, to estimating a pose of an image capturing device.
There exist image capturing systems comprising an image capture device, for example a camera, in which there is a need to estimate a pose of the camera in a coordinate system. Examples of coordinate systems are a world coordinate system and a coordinate system calibrated with the camera pose of a camera when capturing a first image. A camera pose is a combination of a position and an orientation of a camera relative to a coordinate system. For example, a camera pose x may be expressed as a pair (R, t), where R is a rotation matrix representing the camera's orientation with respect to the coordinate system, and t is a translation vector representing the camera's position with respect to the coordinate system. Other possible representations of orientation are double-angle representations and tensors. Examples of such image capturing systems are: medical systems comprising a camera inserted into a patient's body, for example by swallowing the camera; systems comprising autonomously moving devices, for example vehicles and robots; navigation applications; and augmented reality systems. When a camera operates in an unknown environment, without further information or sensors, estimation of a camera pose may involve three-dimensional (3D) reconstruction of a scene. This problem is known as “structure from motion” (SfM) in the computer vision community and as “simultaneous localization and mapping” (SLAM) in the robotics community.
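As an illustration of the pair (R, t) described above, the following is a minimal sketch and not a definitive implementation. It assumes that R maps directions from the camera frame to the coordinate system and that t is the camera's position in that coordinate system, so a point p in the coordinate system has camera-frame coordinates R^T (p - t). The function names rotation_about_z, world_to_camera and camera_to_world are hypothetical and are introduced only for this example.

```python
import math


def rotation_about_z(angle_rad):
    """Return a 3x3 rotation matrix for a rotation about the z axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]


def world_to_camera(R, t, p_world):
    """Express a world point in the camera frame: p_cam = R^T (p_world - t).

    Assumption: R is the camera's orientation in the world frame and
    t is the camera's position in the world frame.
    """
    d = [p_world[i] - t[i] for i in range(3)]
    # Multiplying by R^T means summing over the rows of R.
    return [sum(R[k][i] * d[k] for k in range(3)) for i in range(3)]


def camera_to_world(R, t, p_cam):
    """Inverse mapping: p_world = R p_cam + t."""
    return [sum(R[i][k] * p_cam[k] for k in range(3)) + t[i] for i in range(3)]


# A pose x = (R, t): camera rotated 90 degrees about z, located at (1, 2, 3).
pose = (rotation_about_z(math.pi / 2), [1.0, 2.0, 3.0])
point_in_camera = world_to_camera(pose[0], pose[1], [1.0, 3.0, 3.0])
```

Under this assumed convention the camera's own position always maps to the origin of the camera frame. Other conventions, for example storing the world-to-camera transform directly, are equally possible and would change the formulas accordingly.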
A scene has image features, also known as landmarks. An image of the scene captured by a camera, sometimes referred to as a camera view, has observed image features representing the scene's landmarks. Typically, the camera pose estimation problem is solved using bundle adjustment (BA) optimization. A bundle adjustment optimization calculation aims to minimize re-projection errors across all available images and for all image features identified in the available images. A re-projection error for a landmark in an image is the difference between the location in the image of the landmark's observed image feature and the location in the image at which the landmark is predicted to appear for a certain camera pose estimate.
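The re-projection error described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the method of any particular system: it assumes an ideal pinhole camera with focal length f and principal point (cx, cy), no lens distortion, a pose (R, t) in which t is the camera's position in the coordinate system, and a sum of squared errors as the quantity to be minimized. All function and parameter names are hypothetical. The code only evaluates the cost; it does not perform the optimization itself.

```python
def project(R, t, landmark, f, cx, cy):
    """Predict the pixel location of a 3D landmark for a pose estimate (R, t).

    Assumptions: pinhole model, no distortion, R is camera-to-world
    orientation and t is the camera position in the world frame.
    """
    d = [landmark[i] - t[i] for i in range(3)]
    # Camera-frame coordinates: R^T (landmark - t).
    x, y, z = (sum(R[k][i] * d[k] for k in range(3)) for i in range(3))
    if z <= 0:
        raise ValueError("landmark is not in front of the camera")
    return (f * x / z + cx, f * y / z + cy)


def reprojection_error(R, t, landmark, observed, f, cx, cy):
    """Observed pixel location minus predicted pixel location."""
    u, v = project(R, t, landmark, f, cx, cy)
    return (observed[0] - u, observed[1] - v)


def total_squared_error(poses, landmarks, observations, f, cx, cy):
    """Sum of squared re-projection errors over all images and features.

    observations maps (image_index, landmark_index) to an observed (u, v).
    """
    total = 0.0
    for (i, j), observed in observations.items():
        R, t = poses[i]
        du, dv = reprojection_error(R, t, landmarks[j], observed, f, cx, cy)
        total += du * du + dv * dv
    return total


# One camera at the origin with identity orientation, one landmark.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
cost = total_squared_error(
    poses=[(identity, [0.0, 0.0, 0.0])],
    landmarks=[[1.0, 2.0, 4.0]],
    observations={(0, 0): (76.0, 88.0)},
    f=100.0, cx=50.0, cy=40.0,
)
```

A bundle adjustment solver would vary the camera poses and landmark positions to reduce a cost of this kind, typically with a nonlinear least-squares method.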