In a number of applications it is desirable to determine the absolute or relative position of objects in images. Alternatively, it may be desirable to determine the absolute or relative position of the camera based on points with known locations in the images. For instance, when pictures of the same object are taken from two different locations, the rotation and translation of the camera between the two pictures can be calculated using different methods.
A camera can be said to have a transfer function mapping a point in 3D space to an image point in the camera image according to
\[
\begin{pmatrix} x_0 \\ y_0 \end{pmatrix} = L(\tilde{r})\, K \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix} \tag{A1}
\]
where $x_0$ and $y_0$ are the coordinates of the point in the first image, $L(\tilde{r})$ is the distortion generated by the lens of the camera, $K$ represents the intrinsic parameters of the camera and $X$, $Y$ and $Z$ are the coordinates of the point in 3D space. An image of the same object taken from a different position and rotation is calculated by
\[
\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = L(\tilde{r})\, K \left( R_1 \mid t_1 \right) \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix} \tag{A2}
\]
where $x_1$ and $y_1$ are the coordinates of the point in the second image, $R_1$ describes the rotation of the camera and $t_1$ describes the translation of the camera. If the same camera is used for both images, $L(\tilde{r})$ and $K$ are the same, and $R_1$ and $t_1$ are the desired unknowns.
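As an illustration only, the mapping in equations (A1) and (A2) can be sketched in a few lines of Python. The sketch rests on several assumptions that are not stated in the text above: $K$ is taken to be a 3x3 intrinsic matrix, the first image is taken to correspond to the pose $(I \mid 0)$, the homogeneous result is divided by its third component to obtain pixel coordinates, and $L(\tilde{r})$ is modelled as a one-parameter radial scaling about the principal point. The function name `project`, the coefficient `k1` and all numeric values are hypothetical.

```python
import math

def project(point, K, R=None, t=None, k1=0.0):
    """Map a 3D point to pixel coordinates following the form of (A1)/(A2)."""
    # Default pose is (I | 0), assumed here to be the first-image case (A1).
    R = R or [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    t = t or [0.0, 0.0, 0.0]
    # Apply (R | t) to the homogeneous point (X, Y, Z, 1).
    cam = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    # Apply the intrinsic matrix K (assumed 3x3), then dehomogenize.
    p = [sum(K[i][j] * cam[j] for j in range(3)) for i in range(3)]
    x, y = p[0] / p[2], p[1] / p[2]
    # Assumed lens model: scale the offset from the principal point by
    # (1 + k1 * r^2), with r the radial distance in focal-length units.
    cx, cy, f = K[0][2], K[1][2], K[0][0]
    r = math.hypot(x - cx, y - cy) / f
    s = 1.0 + k1 * r * r
    return (cx + (x - cx) * s, cy + (y - cy) * s)

# Hypothetical intrinsics: focal length 800 px, principal point (320, 240).
K = [[800.0, 0.0, 320.0],
     [0.0, 800.0, 240.0],
     [0.0, 0.0, 1.0]]
point = (0.1, -0.05, 2.0)

first = project(point, K)                        # form of (A1)
second = project(point, K, t=[0.1, 0.0, 0.0])    # form of (A2), R1 = I
print(first, second)
```

In the forward direction shown here, $R_1$ and $t_1$ are inputs. The problem addressed in this document is the inverse one: recovering $R_1$ and $t_1$ from corresponding image points.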
Today, various methods exist for solving for position and orientation from correlated point features in images. Usually, these methods solve for rotation and translation in two separate steps, using for instance iterative methods such as Levenberg-Marquardt or Newton-Raphson. Simply setting up the equations and applying these methods yields unstable and uncertain solutions that do not necessarily converge quickly, or to a correct solution.
The problem is thus to find an improved method and apparatus that rapidly obtains correct and stable convergence when solving for the rotation and translation of a camera movement, in order to find the actual position and attitude of the camera.