Especially in the latter application, creating augmented reality by the spatially precise insertion of virtual objects or information is of the greatest importance, because the spatial position itself contains the essential information. Such objects or information include a robot path, a reference system of coordinates, a virtual environment or an environment recorded in advance at another location, virtual tools or the like.
The continuous, exact determination of the position and orientation (pose) of a corresponding image recording means is therefore necessary within the framework of tracking. The image recording means may be a camera or an optical recording means, such as a semitransparent mirror or the eyeglasses of a viewer, into which the virtual information is inserted.
To determine the pose of a camera, marks that can be detected and recognized by the camera or by a connected image processing system are now frequently arranged at spatially exactly defined positions in the real environment. The pose of the camera in space is then inferred from the known positions of the marks and from their appearance in the camera image. However, the accuracy that can be achieved in the determination of the pose is limited by the optical quality of the camera, the quality of the camera calibration, the accuracy with which the positions of the marks are determined, the precision with which the marks are arranged at known sites in the world or environment, and the quality of the image processing. Arranging a sufficient number of optical marks in real manufacturing environments and measuring them manually before robot programming begins is very complicated.
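The dependence of the pose on the surveyed mark positions can be illustrated with a minimal sketch. It is deliberately simplified and is not the procedure of any particular system: it assumes a planar (2D) world, and it assumes that the image processing already delivers each mark's position in the camera frame. The function names `observe` and `estimate_pose_2d` are hypothetical and chosen only for this illustration. The pose is obtained by a least-squares rigid alignment of the observed marks to their surveyed positions.

```python
import math


def observe(world_marks, pose):
    """Simulate camera-frame measurements of marks for a camera at `pose`.

    pose is (x, y, heading) of the camera in the world frame (assumed 2D).
    """
    tx, ty, theta = pose
    c, s = math.cos(theta), math.sin(theta)
    # Rotate each world-frame offset by -theta to express it in the camera frame.
    return [(c * (x - tx) + s * (y - ty), -s * (x - tx) + c * (y - ty))
            for x, y in world_marks]


def estimate_pose_2d(world_marks, observed_marks):
    """Least-squares planar camera pose from matched marks.

    world_marks:    surveyed (x, y) positions of the marks in the world frame
    observed_marks: the same marks as (x, y) in the camera frame
    Returns (x, y, heading) of the camera in the world frame.
    """
    n = len(world_marks)
    if n < 2 or n != len(observed_marks):
        raise ValueError("need at least two matched marks")

    # Centroids of both point sets.
    wcx = sum(x for x, _ in world_marks) / n
    wcy = sum(y for _, y in world_marks) / n
    ocx = sum(x for x, _ in observed_marks) / n
    ocy = sum(y for _, y in observed_marks) / n

    # Heading: the rotation that best maps centered observations onto
    # centered world positions (closed-form 2D solution).
    cross = dot = 0.0
    for (wx, wy), (ox, oy) in zip(world_marks, observed_marks):
        wx, wy, ox, oy = wx - wcx, wy - wcy, ox - ocx, oy - ocy
        cross += ox * wy - oy * wx
        dot += ox * wx + oy * wy
    theta = math.atan2(cross, dot)

    # Position: world centroid minus the rotated camera-frame centroid.
    c, s = math.cos(theta), math.sin(theta)
    tx = wcx - (c * ocx - s * ocy)
    ty = wcy - (s * ocx + c * ocy)
    return tx, ty, theta


if __name__ == "__main__":
    surveyed = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
    seen = observe(surveyed, (1.0, 2.0, 0.3))
    print("exact survey:   ", estimate_pose_2d(surveyed, seen))
    # Same observations, but one mark was recorded 5 cm away from its real site.
    mis_surveyed = [(0.05, 0.0)] + surveyed[1:]
    print("5 cm survey error:", estimate_pose_2d(mis_surveyed, seen))
```

With an exact survey the simulated pose is recovered. With a single mark recorded a few centimetres from its real site, the estimated pose shifts, which is the sensitivity to mark placement and measurement described above. A real camera would additionally require a projection model and calibration, which this sketch omits.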
If a model of the robot environment exists, it is possible in certain cases to do without the arrangement of marks, because the site of the image reception means, the camera, is then determined (the so-called tracking) by comparing environmental features recognized in the camera image with features stored in an environment model. On the one hand, such a procedure has not so far been implemented in a version fit for practical use; on the other hand, it may not achieve high accuracy.
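The comparison of image features with stored model features can be sketched as a nearest-neighbour search. This is only an illustration under stated assumptions: features are assumed to be represented as numeric descriptor tuples, the function name `match_features` and the feature names are hypothetical, and the ratio test used to reject ambiguous matches is a common heuristic that the text itself does not prescribe.

```python
import math


def match_features(image_features, model_features, max_ratio=0.8):
    """Match image descriptors to named model descriptors.

    image_features: list of descriptor tuples extracted from the camera image
    model_features: dict mapping a feature name to its stored descriptor
    max_ratio:      a match is kept only if the best distance is clearly
                    smaller than the second-best (assumed heuristic)
    Returns a list of (image_index, model_feature_name) pairs.
    """
    matches = []
    for index, descriptor in enumerate(image_features):
        # Rank all model features by Euclidean distance to this descriptor.
        ranked = sorted(
            (math.dist(descriptor, stored), name)
            for name, stored in model_features.items()
        )
        if not ranked:
            continue
        best_distance, best_name = ranked[0]
        # Accept only unambiguous matches; a sole candidate is accepted as is.
        if len(ranked) == 1 or best_distance < max_ratio * ranked[1][0]:
            matches.append((index, best_name))
    return matches


if __name__ == "__main__":
    model = {
        "corner_a": (0.0, 1.0),
        "corner_b": (1.0, 0.0),
        "edge_c": (1.0, 1.0),
    }
    image = [(0.05, 0.95), (0.5, 0.5)]
    print(match_features(image, model))
```

In this example the first image feature lies close to one stored feature and is matched, while the second is equally far from all stored features and is discarded as ambiguous. Discarded or wrong matches reduce the data available for determining the site of the camera, which is one reason such a procedure may fall short in accuracy.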
Furthermore, it is well known that the pose of the image reception means can be determined by means of additional sensor systems with mechanical, optical, acoustic, magnetic, inertial and/or laser-based sensors. Such a procedure is associated with high costs. In addition, mechanical sensor systems, such as cross-wire systems, greatly limit the range of action of the viewer. Optical or acoustic systems require a direct line of sight between the transmitter and the receiver, which can hardly be guaranteed in real manufacturing environments. Moreover, the additional sensor systems require time-consuming calibration before programming begins, which is hardly practicable.