A mixed reality (MR) system, which blends a physical space and a virtual space and presents the result, requires measurement of the position and orientation of the image capturing device (hereinafter referred to as a camera as needed) used to capture the physical space.
As a method of measuring the position and orientation of a camera in the physical space, a method of attaching a position and orientation sensor such as a magnetic sensor or the like to the camera is available (to be referred to as method 1 hereinafter).
In the MR technique, it is desirable that no deviation be allowed to exist between the position of an object (physical object) which exists in the physical space and that of an object (virtual object) rendered by computer graphics or the like. Japanese Patent Laid-Open No. 11-084307 (D1) and Japanese Patent Laid-Open No. 2000-041173 (D2) disclose a technique for correcting the measurement errors of the position and orientation sensor used in method 1 using a captured image of the physical space.
The methods disclosed in D1 and D2 differ in their calculation principles, means, and processes, but share a common approach: markers whose positions are known are laid out in the physical space, and sensor errors are corrected using information from the markers included in the image captured by the camera. More specifically, the position and orientation of the camera are calculated based on information obtained from a six degrees of freedom (6DOF) position and orientation sensor used to measure the position and orientation of the camera, information from markers laid out in the physical space, and information obtained by capturing these markers using the camera (to be referred to as method 2 hereinafter).
As disclosed in
W. A. Hoff and K. Nguyen, “Computer vision-based registration techniques for augmented reality”, Proc. SPIE, Vol. 2904, pp. 538-548, November 1996 (D3),
U. Neumann and Y. Cho, “A self-tracking augmented reality system”, Proc. VRST '96, pp. 109-115, July 1996 (D4),
Junichi Rekimoto, “Constructing augmented reality system using the 2D matrix code”, Interactive system and software IV, Kindai kagaku sha, pp. 199-208, December 1996 (D5), and the like,
many methods for calculating the position and orientation of a camera based only on information obtained by using the camera to capture markers present in the physical space have been implemented. In order to calculate the position and orientation of the camera, three or more marker points that are not located on an identical line are required, as described in D2. Methods of capturing three or more marker points which are not located on an identical line by a camera, and calculating the position and orientation of the camera based on the coordinates of detected markers in the captured image will be collectively referred to as method 3.
Method 3 is advantageous in terms of cost, since it uses only the camera without using any expensive 6DOF position and orientation sensor. However, in order to measure the position and orientation of the camera, three or more marker points that are not located on an identical line must be captured.
Japanese Patent Laid-Open No. 2005-33319 (D6, U.S. Patent Publication No. 2005/0008256) discloses a method of estimating the position and orientation of a camera using an orientation detection sensor and marker information detected in an image. With the method described in D6, when three or more marker points are detected, the position and orientation of the camera can be accurately calculated by repetitive operations. When only one or two marker points are detected, the translation or rotation components of the position and orientation are corrected using the previous position and orientation estimation results. In this way, even if the number of detected markers is less than three, accurate position and orientation measurement can be obtained by effectively using information from the orientation detection sensor.
In recent years, gyro or acceleration sensors have improved in performance, and can accurately detect an orientation. Therefore, using these sensors, a tilt in the gravitational direction and an azimuth in the direction of earth axis can be accurately obtained. Since method 3 measures the position and orientation using only the camera, three or more points which are not laid out on an identical line on the physical space must be detected in a captured image. However, when the orientation of the camera is detected by the sensor and only the position is calculated from markers, only two marker points need be detected. Also, the two points can be two ends of a line segment. In order to easily detect correspondence between the captured markers and actual markers, it is a common practice to use different colors or patterns in markers.
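The two-point case described above can be illustrated with a minimal sketch. It is not the method of D6 or of any cited document. It assumes an ideal pinhole camera with normalized image coordinates (focal length 1), an error-free camera-to-world rotation supplied by the orientation sensor, and already-established marker correspondences. The function names `camera_position`, `solve3`, and `det3` are hypothetical. Each detected marker defines a ray through the camera center, and the camera position is taken as the least-squares intersection of those rays.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve the 3x3 linear system a x = b with Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("degenerate configuration (e.g. parallel rays)")
    solution = []
    for k in range(3):
        # Replace column k of a with b.
        a_k = [[b[i] if j == k else a[i][j] for j in range(3)] for i in range(3)]
        solution.append(det3(a_k) / d)
    return solution


def camera_position(rotation, markers_world, image_points):
    """Estimate the camera center from two or more markers, orientation known.

    rotation      -- 3x3 camera-to-world rotation (assumed given by the sensor)
    markers_world -- known 3D marker positions in world coordinates
    image_points  -- matching normalized image coordinates (x, y)
    """
    a = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for marker, (u, v) in zip(markers_world, image_points):
        d_cam = (u, v, 1.0)  # viewing ray in the camera frame
        ray = [sum(rotation[i][j] * d_cam[j] for j in range(3)) for i in range(3)]
        norm2 = sum(c * c for c in ray)
        for i in range(3):
            for j in range(3):
                # Projector onto the plane perpendicular to the ray.
                p = (1.0 if i == j else 0.0) - ray[i] * ray[j] / norm2
                a[i][j] += p
                b[i] += p * marker[j]
    return solve3(a, b)


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
markers = [(1.0, 2.0, 5.0), (3.0, 2.0, 4.0)]
observed = [(0.0, 0.0), (0.5, 0.0)]  # as seen from a camera at (1, 2, 0)
position = camera_position(identity, markers, observed)
```

A single marker leaves the system singular, because the camera may lie anywhere along that marker's ray. Two distinct markers seen along non-parallel rays are enough once the rotation is fixed.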
In order to detect correspondence between markers, a plurality of pieces of information required to identify individual markers can be introduced by encoding information in an element which forms each marker. As such encoded information, a barcode, QR code, and the like are generally known. As a method that can continuously form such information linearly, a method using a pseudo random number sequence (PN sequence) generated based on a Maximum Length Sequence (MLS) is known. Furthermore, many position detection methods that obtain a relative position using the MLS have been proposed. With the MLS, the number of times the pseudo random number generator has been executed can be calculated from the point at which the correlation with a sequence generated using the pseudo random number generator signals becomes greatest. Hence, if a sequence greater than or equal to the standard length required to generate the MLS can be obtained, the relative position can be calculated.
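The position lookup described above can be sketched as follows. This is an illustration only, not an implementation taken from any cited document. It assumes a 4-bit linear feedback shift register with the recurrence a[k+4] = a[k] XOR a[k+1] (polynomial x^4 + x + 1, which gives the maximum period of 15), and the function names `mls` and `locate` are hypothetical. An observed run of bits is compared against every cyclic shift of the sequence, and the shift with the greatest agreement is taken as the relative position.

```python
def mls(n_bits=4, taps=(0, 1), seed=1):
    """Return one period (2**n_bits - 1 bits) of an LFSR output sequence.

    The defaults implement a[k+4] = a[k] XOR a[k+1], an assumed example
    whose polynomial x^4 + x + 1 is primitive, so the period is 15.
    """
    state = seed  # bit i of state holds a[k+i]; the seed must be nonzero
    bits = []
    for _ in range(2 ** n_bits - 1):
        bits.append(state & 1)
        feedback = 0
        for t in taps:
            feedback ^= (state >> t) & 1
        state = (state >> 1) | (feedback << (n_bits - 1))
    return bits


def locate(sequence, window):
    """Return the cyclic shift at which window agrees best with sequence."""
    period = len(sequence)
    best_shift, best_score = 0, -1
    for shift in range(period):
        # Correlation score: number of positions where the bits agree.
        score = sum(1 for i, bit in enumerate(window)
                    if sequence[(shift + i) % period] == bit)
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift


sequence = mls()
observed = sequence[5:9]           # four consecutive bits read somewhere
offset = locate(sequence, observed)
```

With these defaults, any four consecutive error-free bits occur exactly once per period, so `locate` recovers the offset unambiguously. Shorter windows can match at several shifts, in which case the first best match is returned.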
When the MR technique is applied to indoor navigation or the like, it normally superimposes the direction in which to proceed next on the image of the physical space. In this case, if the orientation measurement precision of the camera is low, the superimposed image may not indicate the correct direction, so improving the orientation measurement precision is important.
Furthermore, since GPS is not available underground or inside buildings, it is difficult to directly apply car navigation mechanisms to navigation over a broad range underground or inside buildings.
It is difficult for 6DOF sensors, which detect position and orientation by detecting magnetism, to obtain sufficiently high precision over a broad range due to the influence of obstacles such as metal and the like, and various means for improving their precision have been proposed. Of such means, method 2 has been proposed as a means for minimizing errors by minimizing the positional deviation from an actual captured image. Since 6DOF sensors normally have a limited measurement range, they do not support cases wherein an object to be measured moves over a wide range. Since the sensors themselves are expensive, the price of 6DOF sensors must be reduced in order for both methods 1 and 2 to become popular.
Since method 3 does not use a 6DOF sensor, it can measure the position and orientation of the camera within a range where markers are present. Since method 3 uses only the camera, it can be implemented at a lower cost than a measurement apparatus which uses a 6DOF sensor if a general-purpose CCD camera is used. However, in order to obtain the position and orientation of the camera, markers whose positions on the physical space are given need be detected in the captured image.
In order to accurately obtain the orientation using method 3, marker images appearing in the captured image must be of a large size. By also laying out a plurality of markers in the physical space so as to ensure that those captured are of a large size, stable position and orientation measurement can be made over a broad range. Each marker must not be affected by the illumination state of the physical space so that it can be accurately detected by image processing. For this purpose, each marker is normally configured to have black and white regions.
As described above, each marker used in method 3 tends to have a large size and a black-and-white pattern. For this reason, when a plurality of large markers for position and orientation measurement (e.g., square, black-and-white markers described in D5) are laid out on the physical space, these markers may often be eyesores for people who do not use the position and orientation measurement of the camera. Also, people who do not know the significance of markers may feel they spoil the beauty of the physical space.
On a wall surface to which large, two-dimensional markers are adhered, not only are the material and structure of the wall surface hidden, but the wall surface also hardly accords with the existing design. Furthermore, it is difficult, for example in public places, to put large black-and-white markers for position and orientation measurement on wall surfaces and structures. In order to accurately measure the position and orientation of the camera in a place that requires a long moving distance, such as a corridor or passageway, a large number of two-dimensional markers must be set continuously within the range in which the camera can capture them. For example, setting a large number of markers in a public space such as an underground passage requires social consensus, and is not easy to implement in practice. That is, not everybody welcomes large, rectangular markers adhered here and there on the wall surfaces of a public space.
When a method of estimating the orientation of the camera using markers together with an orientation sensor that uses gyro and acceleration sensors, and a method of obtaining the camera position from two marker points, are adopted, the use of large markers is not required. In this case, markers need only be set so that two points can be identified. Conventionally, it is a common practice to identify markers based on colors. However, since colors change under the influence of scene lighting, stable identification is difficult to attain. For this reason, the number of colors that can be used is limited, and it is difficult to distinguish a plurality of markers by color when moving the camera over a broad range. As information used to easily identify the detected marker, differences between the shapes of markers may be used. However, if the number of types of shapes increases, the differences between the shapes become small, and markers are indistinguishable unless they are captured in a large size. This consequently poses the same problem as with the markers used in method 3.
As methods of providing information to markers, a method of embedding information into a single marker and a method of providing information through a combination of a plurality of markers are available. As representatives of the latter method, there are many techniques for obtaining the relative position using signals based on the Maximum Length Sequence (MLS). In the MLS, the information carried by each marker is often binary, i.e., “0” or “1”, and a simple sensor is normally used to read the binary data. In order to cover a broad range, the standard length required to generate the sequence becomes large. The region that can be captured is often relatively narrow, and if the position and orientation cannot be measured unless many markers are captured at one time, the range of use is narrowed.
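The trade-off between coverage and the number of markers that must be captured at once can be made concrete with a short calculation. It rests on two assumptions: that the "standard length" corresponds to the register length n of the generator, and that one period of 2^n - 1 binary markers is laid out so that each run of n consecutive markers identifies one position. The function name `min_window_length` is hypothetical.

```python
def min_window_length(marker_count):
    """Smallest n such that an MLS period of 2**n - 1 covers marker_count."""
    if marker_count < 1:
        raise ValueError("marker_count must be positive")
    n = 1
    while 2 ** n - 1 < marker_count:
        n += 1
    return n


# Markers that must be visible at once for layouts of various sizes.
required = {count: min_window_length(count) for count in (15, 16, 1000)}
```

Under these assumptions, 15 markers need 4 consecutive markers in view, 16 markers need 5, and 1000 markers need 10. The required run grows only logarithmically with the layout size, but every one of those markers must be captured in a single image.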
As described above, for an application in which the camera whose position and orientation are to be detected moves over a broad range, no marker has so far been available that enables precise measurement of the position and orientation without largely spoiling the appearance of the space through which the camera moves.