In recent years, mixed reality (MR), which aims at seamlessly merging physical and virtual spaces, has been studied extensively. An image display apparatus which presents mixed reality is implemented by an apparatus which superimposes an image of a virtual space (e.g., a virtual object, text information, and the like rendered by computer graphics) onto an image of a physical space captured by an image sensing device such as a video camera or the like.
As applications of such image display apparatus, navigation that superimposes the names and information of famous buildings and the like as virtual space images in an image of a physical space obtained by capturing an urban area, a landscape simulation that superimposes a computer graphics image of a building which is planned to be constructed onto an image obtained by capturing a planned construction site of that building, and the like are expected.
A common requirement for these applications involves the precision level of registration between the physical and virtual spaces, and many efforts have been conventionally made in this respect. In order to attain accurate registration between the physical and virtual spaces, camera parameters (intrinsic and extrinsic parameters) required to generate an image on the virtual space must always be matched with those of an image sensing device. If intrinsic parameters of the image sensing device are known, the problem of registration in mixed reality reduces to a problem of calculating extrinsic parameters of the image sensing device, i.e., the position and orientation of the image sensing device on a reference coordinate system set on the physical space.
As a method of calculating the position and orientation of an image sensing device on the reference coordinate system set on the physical space, for example, T. Höllerer, S. Feiner, and J. Pavlik, "Situated documentaries: embedding multimedia presentations in the real world," Proc. International Symposium on Wearable Computers '99, pp. 79-86, 1999, proposes a technique that acquires the position and orientation of an image sensing device by combining orientation measurement using an orientation sensor with position measurement using a global positioning system or the like.
As typical orientation sensors used in such method, TISS-5-40 (TOKIMEC INC.) and InertiaCube2 (InterSense Inc.) are available. Each of these orientation sensors mainly comprises gyro sensors for detecting angular velocities in triaxial directions, and acceleration sensors for detecting accelerations in the triaxial directions, and measures the triaxial orientation values (azimuth angle, pitch angle, roll angle) as a combination of these measurement values. In general, angle information obtained by the gyro sensor alone is only a relative change in orientation with respect to an orientation at a given time. However, these orientation sensors are characterized in that the gravitational direction of the earth is measured using the acceleration sensors to obtain the absolute angles with reference to the gravitational direction as tilt angles (i.e., pitch and roll angles).
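The way tilt angles can be recovered from a measured gravitational direction may be sketched as follows. This is only an illustration under stated assumptions, not the internal processing of TISS-5-40 or InertiaCube2: it assumes a sensor body frame with X forward and Z down, an azimuth-pitch-roll (Z-Y-X) rotation order, and an input (gx, gy, gz) that is the unit gravity direction expressed in the sensor body frame. The function name tilt_from_gravity is hypothetical.

```python
import math

def tilt_from_gravity(gx, gy, gz):
    """Return (pitch, roll) in radians from the gravity direction seen
    in the sensor body frame.

    Assumptions (not from the source): X forward, Z down, and a
    Z-Y-X (azimuth, pitch, roll) rotation order.
    """
    pitch = math.atan2(-gx, math.hypot(gy, gz))
    roll = math.atan2(gy, gz)
    return pitch, roll

# A level sensor sees gravity along its own Z-axis.
level_pitch, level_roll = tilt_from_gravity(0.0, 0.0, 1.0)

# A sensor pitched up by 30 degrees under the assumed convention.
angle = math.radians(30.0)
pitch, roll = tilt_from_gravity(-math.sin(angle), 0.0, math.cos(angle))
```

Note that the azimuth angle does not appear: the gravitational direction is unchanged by a rotation about itself, so only the pitch and roll angles can be obtained as absolute angles in this way.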
Orientation measurement values output from the orientation sensor represent the orientation of the sensor itself on a sensor coordinate system defined by the sensor itself irrespective of a reference coordinate system. In case of, e.g., TISS-5-40 above, the sensor coordinate system is defined to have the gravitational direction (down direction) as a Z-axis, and, as an X-axis, the direction in front of the sensor upon initializing the sensor, on the X-Y plane specified by this Z-axis. In case of InertiaCube2, the sensor coordinate system is defined to have the gravitational direction (down direction) as a Z-axis, and, as an X-axis, the north direction indicated by a built-in geomagnetic sensor upon initializing the sensor, on the X-Y plane specified by this Z-axis. In this way, the orientation measurement values of the orientation sensor do not normally indicate the orientation itself of an object to be measured (an image sensing device in case of an image display apparatus that presents mixed reality) on the reference coordinate system as information to be acquired.
That is, the orientation measurement values of the orientation sensor cannot be directly used as the orientation of the object to be measured on the reference coordinate system, and must undergo some kind of coordinate conversion. More specifically, coordinate conversion that converts the orientation of the sensor itself into that of the object to be measured, and coordinate conversion that converts the orientation of the object to be measured on the sensor coordinate system into that on the reference coordinate system are required.
In this specification, data required to perform coordinate conversion that converts the orientation of the sensor itself into that of the object to be measured will be referred to as offset data hereinafter. Also, data required to perform coordinate conversion that converts the orientation of the object to be measured on the sensor coordinate system into that on the reference coordinate system will be referred to as alignment data hereinafter.
The prior art of an orientation measurement method that measures the orientation of an object to be measured using an orientation sensor which can measure tilt angles as the absolute angles with reference to the gravitational direction will be explained below taking a general image display apparatus that presents mixed reality as an example. In particular, the conventional method of setting and using alignment data will be explained.
FIG. 1 is a block diagram showing the arrangement of a general image display apparatus which presents mixed reality.
A camera 110, display unit 120, and orientation sensor 130 are fixed to a head-mount unit 100.
The orientation sensor 130 measures the orientation of the orientation sensor 130 itself on the sensor coordinate system, and outputs orientation measurement values of three degrees of freedom. The orientation sensor 130 comprises, e.g., TISS-5-40 or InertiaCube2.
An orientation calculation unit 140 receives the orientation measurement values from the orientation sensor 130, applies coordinate conversion to the orientation measurement values in accordance with alignment data and offset data held by an internal memory (not shown) to calculate the orientation of the camera 110 on the reference coordinate system, and outputs it to an image generation unit 150 as orientation information.
The image generation unit 150 generates a virtual image corresponding to the position and orientation of the camera 110 in accordance with the orientation information input from the orientation calculation unit 140 and position information of the camera 110 on the reference coordinate system, which is input from a position calculation unit (not shown: e.g., a receiver of a global positioning system), and outputs that image by superimposing it onto an actually captured image input from the camera 110. The display unit 120 receives the image output from the image generation unit 150, and displays it.
With the above arrangement, an observer (not shown: i.e., a person who wears the head-mount unit 100) observes a composite image of the actually captured image (an image of the physical space) and virtual image (an image of the virtual space), which is displayed on the display unit 120 arranged in front of his or her eyes.
The method of calculating the orientation of the camera 110 on the reference coordinate system by the orientation calculation unit 140 will be described below using FIG. 2.
Initially, variables in FIG. 2 will be explained.
RWV: the orientation of the camera 110 on a reference coordinate system 200 (a coordinate system fixed to the physical space)
RWT: the orientation of a sensor coordinate system 210 on the reference coordinate system 200
RTS: the orientation of the orientation sensor 130 on the sensor coordinate system 210
RSV: the orientation of the camera 110 from the perspective of the orientation sensor 130
In this specification, the orientation of an object B on a coordinate system A is described by a 3×3 matrix RAB, where RAB is a coordinate conversion matrix from a coordinate system B defined by the object B into the coordinate system A, and defines a conversion formula PA=RAB·PB that converts coordinates PB=(XB, YB, ZB)^T on the coordinate system B into coordinates PA=(XA, YA, ZA)^T on the coordinate system A. That is, the orientation RWV of the camera 110 on the reference coordinate system 200 can be reworded as the coordinate conversion matrix (PW=RWV·PV) for converting coordinates PV=(XV, YV, ZV)^T on the camera coordinate system into coordinates PW=(XW, YW, ZW)^T on the reference coordinate system 200.
At this time, the relationship among RWT, RTS, RSV, and RWV can be described by:

RWV = RWT·RTS·RSV  (A)
In equation (A), RTS corresponds to the input data from the orientation sensor 130 to the orientation calculation unit 140, RWV corresponds to the output data from the orientation calculation unit 140, RSV corresponds to the offset data, and RWT corresponds to the alignment data. The orientation calculation unit 140 calculates RWV based on equation (A) using RTS input from the orientation sensor 130, and RSV and RWT held by the internal memory, and outputs it to the image generation unit 150.
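The calculation of equation (A) can be illustrated by the following minimal sketch. It is not the implementation of the orientation calculation unit 140; the function names and the numerical values are hypothetical, and all three matrices are chosen as rotations about the Z-axis purely so that the result is easy to check by adding the angles.

```python
import math

def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rot_z(angle):
    """Rotation by `angle` radians about the Z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def camera_orientation(r_wt, r_ts, r_sv):
    """Equation (A): RWV = RWT . RTS . RSV."""
    return matmul(matmul(r_wt, r_ts), r_sv)

# Illustrative values only (assumptions, not taken from the source).
r_wt = rot_z(math.radians(30.0))   # alignment data held in memory
r_ts = rot_z(math.radians(45.0))   # measurement from the orientation sensor
r_sv = rot_z(math.radians(-5.0))   # offset data held in memory

r_wv = camera_orientation(r_wt, r_ts, r_sv)  # orientation passed on for rendering
```

With these values, r_wv equals a rotation of 70 degrees about the Z-axis. In general the three matrices do not share a rotation axis, and the order of multiplication in equation (A) matters.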
Therefore, in order to attain accurate registration between the physical space and virtual space, accurate RSV and RWT must be set in the internal memory of the orientation calculation unit 140 by some means.
The value of the offset data RSV is always constant as long as the relative orientation relationship between the orientation sensor 130 and camera 110 remains the same. In case of the image display apparatus shown in FIG. 1, since both the orientation sensor 130 and camera 110 are fixed to the head-mount unit 100, the offset data need be derived only when the orientation sensor 130 and camera 110 are set on the head-mount unit 100. In general, since an object whose orientation is to be measured and an orientation sensor used to measure it are rigidly fixed to each other, the offset data need be derived only when the orientation sensor is set on the object to be measured.
Likewise, the value of the alignment data RWT is always constant as long as the relative orientation relationship between the reference coordinate system 200 and sensor coordinate system 210 remains the same, and the alignment data need be derived only when the reference coordinate system 200 is defined. However, in practice, in case of, e.g., TISS-5-40, since the sensor coordinate system 210 is determined depending on the orientation of the orientation sensor 130 upon initializing the sensor, as described above, if the orientation of the orientation sensor 130 upon initializing the sensor differs, the sensor coordinate system 210 differs. In case of InertiaCube2, since the sensor coordinate system 210 is determined depending on the north direction indicated by the geomagnetic sensor upon initializing the sensor, the sensor coordinate system 210 may differ depending on a change in magnetic environment upon initializing the sensor.
For this reason, alignment data that has once been derived can be reused without change only under the restriction that the orientation sensor 130 is always initialized in the same orientation or magnetic environment.
As one method free from such restriction, a method described in Japanese Patent Laid-Open No. 2003-132374 (U.S. Pat. Pub. No. 2003/080976 A1) previously filed by the present applicant is known. This conventional method will be explained below.
If the above restriction is not placed, the definition of the sensor coordinate system 210 changes depending on the orientation, magnetic environment, and the like upon initializing the orientation sensor 130. Therefore, an appropriate value of the alignment data is not constant, and must be derived again every time the orientation sensor 130 is initialized. Since this conventional method facilitates derivation of the alignment data, the orientation can be accurately measured even when the sensor coordinate system 210 has changed.
In general, an orientation sensor which can measure tilt angles as the absolute angles with reference to the gravitational direction has a feature that one of the axes of the sensor coordinate system 210 is set to agree with the gravitational direction (or its inverse direction). For example, in TISS-5-40 or InertiaCube2, the Z-axis of the sensor coordinate system 210 is set to agree with the gravitational direction, as shown in FIG. 3. In the following description, assume that the Z-axis of the sensor coordinate system 210 indicates the gravitational direction.
This conventional method limits the degrees of freedom in design of the reference coordinate system 200 so as to facilitate derivation of alignment data. More specifically, as shown in FIG. 3, only the orientation that can be expressed by rotating the sensor coordinate system 210 in the azimuth direction can be defined as that of the reference coordinate system 200. In other words, the condition that the gravitational direction must always be defined as the Z-axis of the reference coordinate system 200 is placed as a restriction upon designing the reference coordinate system 200.
If the reference coordinate system 200 is designed under such restriction, the Z-axis direction of the reference coordinate system 200 agrees with that of the sensor coordinate system 210. For this reason, the alignment data RWT as data required to convert the orientation on the sensor coordinate system 210 into that on the reference coordinate system 200 can be expressed by a rotation matrix that expresses a rotation in the azimuth direction. In other words, alignment data RWT can be expressed by only one scalar quantity φWT that expresses the rotation angle about the Z-axis. That is, RWT is given by:
$$R_{WT} = \begin{bmatrix} \cos\phi_{WT} & -\sin\phi_{WT} & 0 \\ \sin\phi_{WT} & \cos\phi_{WT} & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad \text{(B)}$$
As described above, the appropriate value of the alignment data is not constant, and must be re-derived every time the orientation sensor 130 is initialized. With this conventional method, since the alignment data is defined by only φWT, the value which must be re-derived every time the orientation sensor 130 is initialized is only one scalar quantity φWT.
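How the alignment data follows from the single scalar φWT in equation (B) can be sketched as below. The function names and the 90-degree value are hypothetical and serve only as an illustration: a rotation about the Z-axis moves the X-axis within the X-Y plane and leaves the Z-axis (the gravitational direction) unchanged.

```python
import math

def alignment_from_azimuth(phi_wt):
    """Build RWT of equation (B) from the azimuth angle phi_wt (radians)."""
    c, s = math.cos(phi_wt), math.sin(phi_wt)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def transform(r, p):
    """Apply the conversion PA = RAB . PB to a 3-element coordinate list."""
    return [sum(r[i][k] * p[k] for k in range(3)) for i in range(3)]

# Illustrative azimuth of 90 degrees (an assumption, not from the source).
r_wt = alignment_from_azimuth(math.radians(90.0))

# The X-axis of the sensor coordinate system maps onto the reference Y-axis,
# while the Z-axis is the same on both coordinate systems.
x_axis_in_reference = transform(r_wt, [1.0, 0.0, 0.0])
z_axis_in_reference = transform(r_wt, [0.0, 0.0, 1.0])
```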
For example, after the orientation sensor 130 is initialized, the value φWT held by the orientation calculation unit 140 need only be interactively increased/decreased via an input device (not shown) such as a joystick or the like to attain accurate registration between an actually captured image and virtual image while observing an image (formed by superposing the virtual image on the actually captured image) displayed on the display unit 120. This process can be done very easily since only one variable need be changed, i.e., the number of degrees of freedom is 1.
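The interactive adjustment described above may be sketched as follows. This is a hypothetical illustration only: the step size of 0.5 degrees, the representation of joystick input as a sequence of +1/-1 nudges, and the function names are all assumptions, and the visual check of registration by the observer is not modeled.

```python
import math

def wrap_angle(phi):
    """Wrap an angle in radians into the interval (-pi, pi]."""
    return math.atan2(math.sin(phi), math.cos(phi))

def adjust_azimuth(phi_wt, nudges, step=math.radians(0.5)):
    """Apply a sequence of +1/-1 nudges (hypothetical joystick input)
    to the single scalar phi_wt and return the updated value."""
    for n in nudges:
        phi_wt = wrap_angle(phi_wt + n * step)
    return phi_wt

# Two increases followed by one decrease leave a net change of one step.
phi_wt = adjust_azimuth(0.0, [+1, +1, -1])
```

After each nudge, the alignment data would be rebuilt from the updated φWT and the composite image redrawn, so the observer can stop as soon as the virtual image lines up with the actually captured image.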
In this way, the conventional method allows accurate orientation measurement without placing any restriction that the orientation sensor 130 must always be initialized in the same orientation or magnetic environment, since re-derivation of alignment data is facilitated by limiting the degree of freedom in design of the reference coordinate system 200.
However, this method suffers from a problem that the reference coordinate system cannot be freely designed. For this reason, the degree of freedom in design of an application is limited. Also, it is difficult to apply this method to an existing application which has a unique reference coordinate system.