1. Technical Field
The present invention relates to a method for determining the pose of a camera with respect to at least one real object. A camera is operated for capturing a 2-dimensional image including at least a part of a real object, and in the determination process, the pose of the camera with respect to the real object is determined using correspondences between 3-dimensional points associated with the real object and corresponding 2-dimensional points of the real object in the 2-dimensional image.
2. Background Information
Augmented reality systems permit the superposition of computer-generated virtual information with visual impressions of the real environment. To this end, the visual impressions of the real world are mixed with virtual information, e.g. by means of semi-transmissive data glasses or a head-mounted display worn by a user. The blending-in of virtual information or objects can be effected in a context-dependent manner, i.e. matched to and derived from the respective environment viewed. In principle, any type of data, such as text or images, can be used as virtual information. The real environment is detected, e.g., with the aid of a camera carried on the head of the user.
When the person using an augmented reality system turns his or her head, the virtual objects must be tracked with respect to the changing field of view. The real environment may be a complex apparatus, and the object detected can be a significant part of the apparatus. During a so-called tracking process, the real object detected during initialization (which may be an object to be observed such as an apparatus, an object provided with a marker to be observed, or a marker placed in the real world for tracking purposes) serves as a reference for computing the position at which the virtual information is to be displayed or blended into an image captured by the camera. For this purpose, it is necessary to determine the pose of the camera with respect to the real object. Because the user (and consequently the camera, when it is carried by the user) may change his or her position and orientation, the real object has to be tracked continuously so that the virtual information is displayed at the correct position in the display device even when the pose (position and orientation) of the camera changes. The effect achieved thereby is that the information is displayed in the display device in a context-correct manner with respect to reality, irrespective of the position and/or orientation of the user.
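By way of illustration only, the role of the camera pose in placing virtual information can be sketched with a simple pinhole camera model. The sketch below is not taken from the present disclosure: the function name `project_point`, the rotation `R`, the translation `t`, and the intrinsic parameters `fx`, `fy`, `cx`, `cy` are assumptions introduced for the example, and lens distortion is ignored. It shows that the same 3-dimensional anchor point on the real object lands at a different pixel when the pose changes, which is why the pose has to be re-determined continuously.

```python
from typing import Sequence, Tuple


def project_point(
    R: Sequence[Sequence[float]],   # 3x3 rotation, object frame -> camera frame (assumed)
    t: Sequence[float],             # translation, object frame -> camera frame (assumed)
    point_obj: Sequence[float],     # 3D point in the object coordinate system
    fx: float, fy: float,           # focal lengths in pixels (assumed intrinsics)
    cx: float, cy: float,           # principal point in pixels (assumed intrinsics)
) -> Tuple[float, float]:
    """Project a 3D object point into the image with an ideal pinhole model."""
    # Transform the point from object coordinates into camera coordinates.
    xc, yc, zc = (
        sum(R[row][i] * point_obj[i] for i in range(3)) + t[row]
        for row in range(3)
    )
    if zc <= 0.0:
        raise ValueError("point lies behind the camera and cannot be projected")
    # Perspective division followed by the intrinsic mapping to pixels.
    return (fx * xc / zc + cx, fy * yc / zc + cy)


# Hypothetical anchor point for a virtual label, 2 units in front of the camera.
anchor = (0.1, -0.2, 0.0)
t = (0.0, 0.0, 2.0)

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
rot_z_90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

pixel_before = project_point(identity, t, anchor, 800.0, 800.0, 320.0, 240.0)
pixel_after = project_point(rot_z_90, t, anchor, 800.0, 800.0, 320.0, 240.0)
print(pixel_before, pixel_after)
```

In this sketch the overlay position follows directly from the pose, so any error in the estimated pose appears as a misplacement of the virtual information in the image.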
One of the problems in the field of augmented reality is the determination of the head position and the head orientation of the user by means of a camera that is associated in some way with the user's head. Another problem may be determining the position and orientation of the camera inside a mobile phone in order to overlay information on the camera image and show the combination of both on the phone's display. To this end, in some applications the pose of the camera with respect to at least one real object of the captured real environment is estimated using the video stream or image stream of the camera as a source of information.
Pose estimation is one of the most basic and most important tasks in computer vision and in augmented reality. In most applications, it has to be solved in real time. However, because it involves a non-linear minimization problem, it is computationally expensive.
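For orientation, one common way of writing such a non-linear minimization is as a reprojection-error problem over 2D-3D correspondences. The formulation below is a generic sketch and is not asserted to be the formulation used in the present disclosure; the symbols $P_i$ (3-dimensional points associated with the real object), $p_i$ (corresponding 2-dimensional image points), $K$ (camera intrinsic matrix), $R$, $t$ (rotation and translation making up the pose) and $\pi$ (perspective division) are assumptions introduced for illustration.

```latex
(\hat{R}, \hat{t}) \;=\; \operatorname*{arg\,min}_{R \in SO(3),\; t \in \mathbb{R}^{3}}
\sum_{i=1}^{N} \bigl\lVert\, p_i - \pi\bigl(K\,(R\,P_i + t)\bigr) \bigr\rVert^{2},
\qquad
\pi\bigl((x, y, z)^{\top}\bigr) = \Bigl(\tfrac{x}{z},\, \tfrac{y}{z}\Bigr)^{\top}
```

The division by $z$ in $\pi$ and the constraint that $R$ be a rotation make the cost non-linear in the pose parameters, so it is typically minimized iteratively, and each iteration re-evaluates all $N$ correspondences.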