EP 1 720 131 B1 shows an augmented reality system with real marker object identification. FIG. 4 schematically shows in a simplified illustration an augmented reality system 100 thereof. The system 100 comprises a video camera 110 for gathering image data from a real environment 120. The real environment 120 represents any appropriate area, such as a room of a house, a portion of a specific landscape, or any other scene of interest. In FIG. 4, the real environment 120 represents a living room comprising a plurality of real objects 121 . . . 124, for instance in the form of walls 124, and furniture 121, 122 and 123. Moreover, the real environment 120 comprises further real objects that are considered as marker objects 125, 126 which have any appropriate configuration so as to be readily identified by automated image processing algorithms. For instance, the marker objects 125, 126 have formed thereon significant patterns that may easily be identified, wherein the shape of the marker objects 125, 126 may be designed so as to allow identification thereof from a plurality of different viewing angles. The marker objects 125, 126 also represent substantially two-dimensional configurations having formed thereon respective identification patterns.
The system 100 further comprises a means 130 for identifying the marker objects 125, 126 on the basis of image data provided by the camera 110. The identifying means 130 may comprise well-known pattern recognition algorithms for comparing image data with predefined templates representing the marker objects 125, 126. For example, the identifying means 130 may have implemented therein an algorithm for converting an image obtained by the camera 110 into a black and white image on the basis of predefined illumination threshold values. The algorithm is further configured to divide the image into predefined segments, such as squares, and to search for pre-trained pattern templates in each of the segments, where the templates represent significant portions of the marker objects 125, 126.
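The thresholding and segment-wise template search described above can be illustrated with a minimal sketch. It is not the implementation of the identifying means 130: the function names, the fixed threshold of 128, the agreement-ratio score, and the non-overlapping scan are all assumptions made for illustration, and images are plain lists of grayscale rows.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # rows of pixel values (grayscale 0..255, or binary 0/1)


def binarize(gray: Image, threshold: int = 128) -> Image:
    """Map each pixel to 1 (bright) or 0 (dark) using a fixed threshold (assumed value)."""
    return [[1 if px >= threshold else 0 for px in row] for row in gray]


def match_score(segment: Image, template: Image) -> float:
    """Return the fraction of pixels on which a binary segment and template agree."""
    total = len(template) * len(template[0])
    agree = sum(
        1
        for seg_row, tpl_row in zip(segment, template)
        for s, t in zip(seg_row, tpl_row)
        if s == t
    )
    return agree / total


def find_marker(
    binary: Image, template: Image, min_score: float = 0.9
) -> Optional[Tuple[int, int]]:
    """Scan non-overlapping square segments of the template's size.

    Returns the top-left (row, col) of the best segment whose score reaches
    min_score, or None if no segment qualifies.
    """
    size = len(template)  # the template is assumed to be square
    best: Optional[Tuple[int, int]] = None
    best_score = min_score
    for r in range(0, len(binary) - size + 1, size):
        for c in range(0, len(binary[0]) - size + 1, size):
            segment = [row[c:c + size] for row in binary[r:r + size]]
            score = match_score(segment, template)
            if score >= best_score:
                best, best_score = (r, c), score
    return best
```

A practical system would additionally have to handle perspective distortion and pattern rotation, which this sketch ignores.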
First, the live video image is turned into a black and white image based on a lighting threshold value. This image is then searched for square regions. The software finds all the squares in the binary image, many of which are not the tracking markers, such as the objects 125, 126. For each square, the pattern inside the square is matched against some pre-trained pattern templates. If there is a match, then the software has found one of the tracking markers, such as the objects 125, 126. The software then uses the known square size and pattern orientation to calculate the position of the real video camera relative to the physical marker such as the objects 125, 126. Then, a 3×4 matrix is filled with the video camera's real world coordinates relative to the identified marker. This matrix is then used to set the position of the virtual camera coordinates. Since the virtual and real camera coordinates are the same, the computer graphics that are drawn precisely superimpose the real marker object at the specified position. Thereafter, a rendering engine is used for setting the virtual camera coordinates and drawing the virtual images.
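The role of the 3×4 matrix can be sketched as follows. This is an illustration under stated assumptions, not the method of the cited system: the matrix is taken to have the form [R | t] mapping marker-frame coordinates into the camera frame, the rotation is restricted to a single yaw angle for brevity, and the virtual camera is modelled as an ideal pinhole with a hypothetical focal length and image center. All function names are invented for the example.

```python
import math
from typing import List, Sequence, Tuple

Matrix3x4 = List[List[float]]


def pose_matrix(yaw_rad: float, translation: Sequence[float]) -> Matrix3x4:
    """Build [R | t] for a rotation about the y axis plus a translation."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    tx, ty, tz = translation
    return [
        [c, 0.0, s, tx],
        [0.0, 1.0, 0.0, ty],
        [-s, 0.0, c, tz],
    ]


def to_camera(pose: Matrix3x4, point: Sequence[float]) -> Tuple[float, float, float]:
    """Transform a marker-frame point into camera coordinates."""
    x, y, z = point
    xc, yc, zc = (row[0] * x + row[1] * y + row[2] * z + row[3] for row in pose)
    return xc, yc, zc


def project(
    pose: Matrix3x4,
    point: Sequence[float],
    focal_px: float,
    center: Tuple[float, float],
) -> Tuple[float, float]:
    """Project a marker-frame point to pixel coordinates with a pinhole model."""
    xc, yc, zc = to_camera(pose, point)
    if zc <= 0:
        raise ValueError("point lies behind the camera")
    return focal_px * xc / zc + center[0], focal_px * yc / zc + center[1]
```

Because the virtual camera uses the same pose as the real camera, a virtual point placed at the marker origin projects onto the pixel where the marker itself appears, which is what makes the overlay line up.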
The system 100 of FIG. 4 further comprises means 140 for combining the image data received from the camera 110 with object data obtained from an object data generator 150. The combining means 140 comprises a tracking system, a distance measurement system and a rendering system. Generally, the combining means 140 is configured to incorporate image data obtained from the generator 150 for a correspondingly identified marker object so as to create virtual image data representing a three-dimensional image of the environment 120 with additional virtual objects corresponding to the marker objects 125, 126. Hereby, the combining means 140 is configured to determine the respective positions of the marker objects 125, 126 within the real environment 120 and also to track a relative motion between the marker objects 125, 126 with respect to any static objects in the environment 120 and with respect to a point of view defined by the camera 110.
The system 100 of FIG. 4 further comprises an output means 160 configured to provide the virtual image data, including the virtual objects generated by the generator 150, where, in preferred embodiments, the output means 160 is also configured to provide, in addition to image data, other types of data, such as audio data, olfactory data, tactile data, and the like. In operation, the camera 110 creates image data of the environment 120, where the image data corresponds to a dynamic state of the environment 120 which is represented by merely moving the camera 110 with respect to the environment 120, or by providing moveable objects within the environment, for instance where the marker objects 125, 126 or one or more of the objects 121 . . . 123 are moveable. The point of view of the environment 120 is changed by moving the camera 110 around within the environment 120, thereby allowing the marker objects 125, 126 in particular to be observed from different perspectives so as to enable the assessment of virtual objects created by the generator 150 from different points of view.
The image data provided by the camera 110, which is continuously updated, is received by the identifying means 130, which recognizes the marker objects 125, 126 and enables the tracking of the marker objects 125, 126 once they are identified, even if pattern recognition is hampered by a continuously changing point of view caused by, for instance, moving the camera 110 or the marker objects 125, 126. After identifying a predefined pattern associated with the marker objects 125, 126 within the image data, the identifying means 130 informs the combining means 140 about the presence of the marker object within a specified image data area. Based on this information, the means 140 then continuously tracks the corresponding object represented by the image data used for identifying the marker objects 125, 126, assuming that the marker objects 125, 126 will not vanish over time. The process of identifying the marker objects 125, 126 is performed substantially continuously or is repeated on a regular basis so as to confirm the presence of the marker objects 125, 126 and also to verify or enhance the tracking accuracy of the combining means 140. The combining means 140 creates the three-dimensional image data and superimposes corresponding three-dimensional image data received from the object generator 150, wherein the three-dimensional object data are continuously updated on the basis of the tracking operation of the means 140.
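The interplay between tracking and regularly repeated identification can be sketched as a simple loop. This is only an illustration: the `identify` and `track` callables, the `Region` type, and the fixed re-identification interval are hypothetical stand-ins for the identifying means 130 and the tracking performed by the combining means 140, whose actual scheduling is not specified here.

```python
from typing import Any, Callable, Iterable, List, Optional, Tuple

Region = Tuple[int, int]  # hypothetical: top-left (row, col) of the marker in a frame


def track_with_reidentification(
    frames: Iterable[Any],
    identify: Callable[[Any], Optional[Region]],
    track: Callable[[Any, Region], Optional[Region]],
    reidentify_every: int = 10,
) -> List[Optional[Region]]:
    """Follow a marker across frames, re-running identification at a fixed interval.

    identify() stands for full pattern recognition on a frame; track() stands
    for a cheaper update that starts from the previously known region.
    """
    regions: List[Optional[Region]] = []
    last: Optional[Region] = None
    for index, frame in enumerate(frames):
        if last is None or index % reidentify_every == 0:
            # Confirm the marker's presence and reset accumulated tracking drift.
            last = identify(frame)
        else:
            last = track(frame, last)
        regions.append(last)
    return regions
```

Falling back to identification whenever no region is known also covers the case where tracking loses the marker between scheduled checks.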
For instance, the means 140 may, based on the information of the identifying means 130, calculate the position of the camera 110 with respect to the marker objects 125, 126 and use this coordinate information for determining the coordinates of a virtual camera, thereby allowing a precise “overlay” of the object data delivered by the generator 150 with the image data of the marker objects 125, 126. The coordinate information also includes data on the relative orientation of the marker objects 125, 126 with respect to the camera 110, thereby enabling the combining means 140 to correctly adapt the orientation of the virtual object. Finally, the combined three-dimensional virtual image data is presented by the output means 160 in any appropriate form. For example, the output means 160 may comprise appropriate display means so as to visualize the environment 120 including virtual objects associated with the marker objects 125, 126. When operating the system 100, it is advantageous to pre-install recognition criteria for at least one marker object 125, 126 so as to allow a substantially reliable real-time image processing. Moreover, the correlation between a respective marker object and one or more virtual objects may be established prior to the operation of the system 100, or the system 100 may be designed so as to allow an interactive definition of an assignment of virtual objects to marker objects. For example, upon user request, virtual objects initially assigned to the marker object 125 are assigned to the marker object 126 and vice versa. Moreover, a plurality of virtual objects may be assigned to a single marker object, and a respective one of the plurality of virtual objects is selected by the user or by a software application.
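The assignment of virtual objects to marker objects, including the exchange between two markers and the selection of one object out of several, can be sketched as a small lookup table. The class name, its methods, and the string identifiers are hypothetical and serve only to illustrate the bookkeeping involved.

```python
from typing import Dict, List


class MarkerAssignments:
    """Hypothetical table mapping marker identifiers to candidate virtual objects."""

    def __init__(self) -> None:
        self._objects: Dict[str, List[str]] = {}
        self._selected: Dict[str, int] = {}

    def assign(self, marker_id: str, virtual_objects: List[str]) -> None:
        """Assign one or more virtual objects to a marker; the first is selected."""
        if not virtual_objects:
            raise ValueError("at least one virtual object is required")
        self._objects[marker_id] = list(virtual_objects)
        self._selected[marker_id] = 0

    def swap(self, marker_a: str, marker_b: str) -> None:
        """Exchange the assignments (and current selections) of two markers."""
        self._objects[marker_a], self._objects[marker_b] = (
            self._objects[marker_b],
            self._objects[marker_a],
        )
        self._selected[marker_a], self._selected[marker_b] = (
            self._selected[marker_b],
            self._selected[marker_a],
        )

    def select(self, marker_id: str, index: int) -> None:
        """Choose which of the marker's virtual objects is displayed."""
        if not 0 <= index < len(self._objects[marker_id]):
            raise IndexError("no such virtual object for this marker")
        self._selected[marker_id] = index

    def current(self, marker_id: str) -> str:
        """Return the virtual object currently selected for a marker."""
        return self._objects[marker_id][self._selected[marker_id]]
```

For example, after assigning two furniture models to marker "125" and one to marker "126", calling `swap("125", "126")` makes each marker display what the other displayed before, and `select` may be driven either by user input or by application logic.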