1. Field of the Invention
The present invention relates to an image display apparatus and method in which an image in a virtual space fused with a real space is displayed based on output values from a posture sensor. More particularly, the invention relates to an image display apparatus and method in which an image in a virtual space drawn in accordance with a posture of a viewpoint of an image pickup apparatus is superposed on an image in a real space photographed by the image pickup apparatus, and the superposed image is displayed on a display picture surface, an image display apparatus and method in which an image in a virtual space drawn in accordance with a posture of a viewpoint of a user is displayed on an optical see-through-type display picture surface through which a real space is transparently observable with the virtual image, and a storage medium.
2. Description of the Related Art
Recently, mixed reality (hereinafter abbreviated as “MR”), which aims at seamlessly connecting a real space and a virtual space, has been studied intensively. MR is provided either by a video see-through method, in which an image in a virtual space (such as a virtual object drawn by computer graphics (hereinafter abbreviated as “CG”), character information or the like) is superposed on an image in a real space photographed by an image pickup apparatus, such as a video camera, and the resultant image is displayed, or by an optical see-through method, in which an image in a virtual space is displayed on an optical see-through-type display picture surface of a display apparatus through which a real space is transparently observable together with the virtual image.
MR is expected to be applied to new fields qualitatively different from conventional virtual reality, such as navigation in which, for example, the name of and guide information for a building are superposed and displayed on the building in a real town, and scenery simulation in which a CG image of a building to be built on a planned site is displayed in a superposed state. A common requirement of such applications is exact alignment between the real space and the virtual space, and many attempts have been made to meet this requirement.
The problem of alignment in MR according to the video see-through method relates to the problem of obtaining the three-dimensional position/posture of a viewpoint of an image pickup apparatus in a world coordinate system set in a real space (hereinafter simply termed a “world coordinate system”). The problem of alignment in MR according to the optical see-through method relates to the problem of obtaining the three-dimensional position/posture of a viewpoint of a user in the world coordinate system.
In order to solve such problems, acquisition of the three-dimensional position/posture of a viewpoint of an image pickup apparatus or a user (hereinafter simply termed a “viewpoint”) in the world coordinate system by utilizing a three-dimensional position/posture sensor, such as a magnetic sensor, an ultrasonic sensor or the like, has been generally practiced.
In situations in which fixed values may be used for the position of a viewpoint, for example, in a case in which the distance to an object is sufficiently large outdoors, acquisition of the three-dimensional position/posture of a viewpoint by obtaining the three-dimensional posture of the viewpoint using a three-dimensional posture sensor including a gyro-sensor and an accelerometer has been generally practiced.
For example, when a posture sensor TISS-5-40 made by Tokimec Kabushiki Kaisha is used as a three-dimensional posture sensor, output values from the sensor represent the sensor's three-dimensional posture in a sensor coordinate system in which a direction opposite to the direction of gravity is defined as the y axis, and the forward direction of the sensor on the x-z plane defined by this y axis at the time the sensor is started is defined as the -z axis. Accordingly, output values from a three-dimensional posture sensor do not generally represent the three-dimensional posture of a viewpoint in the world coordinate system to be measured. That is, output values from the sensor cannot be used directly as the three-dimensional posture of the viewpoint in the world coordinate system, and it is necessary to perform some coordinate transformation. More specifically, two coordinate transformations are required: one in which the posture of the sensor is transformed into the posture of the viewpoint, and one in which a posture in the sensor coordinate system is transformed into a posture in the world coordinate system. In this specification, data for performing coordinate transformation between output values from a sensor and the three-dimensional posture of a viewpoint in the world coordinate system is termed “correction information”.
FIG. 1 illustrates the configuration of an ordinary image display apparatus for providing MR according to the optical see-through method.
An optical see-through-type display picture surface 110 is fixed on a head mounting unit 100 together with a posture sensor 120. When a user (not shown) mounts the head mounting unit 100 so that the display picture surface 110 is positioned in front of the user's eyes, the user can observe a real space in front of the display picture surface 110 via an optical system (not shown in FIG. 1) of the display picture surface 110. The posture sensor 120 measures its posture in the sensor coordinate system, and outputs measured values of the posture having three degrees of freedom. The posture sensor 120 incorporates a tiltmeter (not shown) that can measure the direction of gravity of the earth. As described above, one axis (the y axis in this case) of the sensor coordinate system is set to a direction opposite to the direction of gravity.
A posture-information output unit 130 calculates the posture of the user's viewpoint in the world coordinate system by transforming measured values input from the posture sensor 120 in accordance with correction information stored in a memory 140, and outputs the calculated values as posture information. An image generation unit 150 generates and outputs a virtual image corresponding to the user's posture in accordance with the posture information input from the posture-information output unit 130. The display picture surface 110 inputs the virtual image from the image generation unit 150 and displays the input virtual image. According to the above-described configuration, the user sees the virtual image displayed on the display picture surface 110 fused with a view of the real space seen via the display picture surface 110.
Next, a description will be provided of a method for calculating the posture of the user's viewpoint in the world coordinate system by the posture-information output unit 130, with reference to FIG. 2.
In FIG. 2, the posture of a sensor coordinate system 210 in a world coordinate system 200 is represented by RTW, the posture of the posture sensor 120 in the sensor coordinate system 210 is represented by RST, the relative posture of a viewpoint 220 of the user as seen from the posture sensor 120 is represented by RVS, and the posture of the user's viewpoint 220 in the world coordinate system 200 is represented by RVW.
R is a 4×4 matrix, and RBA represents the posture of an object B in a coordinate system A. In other words, RBA is a coordinate transformation matrix from the coordinate system A into a coordinate system B defined by the object B, and defines a transformation formula PB=RBAPA to transform coordinates PA=(XA,YA,ZA,1)T in the coordinate system A into coordinates PB=(XB,YB,ZB,1)T in the coordinate system B. That is, the posture RVW of the user's viewpoint 220 in the world coordinate system 200 can also be represented by a coordinate transformation matrix (PV=RVWPW) for transforming coordinates PW=(XW,YW,ZW,1)T in the world coordinate system 200 into coordinates PV=(XV,YV,ZV,1)T in a user's viewpoint coordinate system 230.
The matrix R is the product of a rotation matrix Rx defined by an angle of rotation θ around the x axis, a rotation matrix Ry defined by an angle of rotation (azimuth) φ around the y axis, and a rotation matrix Rz defined by an angle of rotation ψ around the z axis, and a relationship of R=RzRxRy holds. These matrices are expressed as follows:
\[
R_x(\theta) =
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & \cos\theta & \sin\theta & 0 \\
0 & -\sin\theta & \cos\theta & 0 \\
0 & 0 & 0 & 1
\end{bmatrix},
\quad
R_y(\phi) =
\begin{bmatrix}
\cos\phi & 0 & -\sin\phi & 0 \\
0 & 1 & 0 & 0 \\
\sin\phi & 0 & \cos\phi & 0 \\
0 & 0 & 0 & 1
\end{bmatrix},
\quad
R_z(\psi) =
\begin{bmatrix}
\cos\psi & \sin\psi & 0 & 0 \\
-\sin\psi & \cos\psi & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\]
RVW can be expressed by RVW=RVS·RST·RTW (equation A).
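The three rotation matrices and the composition R=RzRxRy can be sketched in plain Python as follows. This is only an illustrative sketch: the function names (rot_x, rot_y, rot_z, matmul, posture, transform) are hypothetical and are not part of the described apparatus, and all angles are assumed to be given in radians.

```python
import math

def rot_x(theta):
    """Rx(theta): rotation around the x axis, in the sign convention of the text."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0, 0],
            [0, c, s, 0],
            [0, -s, c, 0],
            [0, 0, 0, 1]]

def rot_y(phi):
    """Ry(phi): rotation (azimuth) around the y axis."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, 0, -s, 0],
            [0, 1, 0, 0],
            [s, 0, c, 0],
            [0, 0, 0, 1]]

def rot_z(psi):
    """Rz(psi): rotation around the z axis."""
    c, s = math.cos(psi), math.sin(psi)
    return [[c, s, 0, 0],
            [-s, c, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def posture(theta, phi, psi):
    """R = Rz(psi) * Rx(theta) * Ry(phi)."""
    return matmul(rot_z(psi), matmul(rot_x(theta), rot_y(phi)))

def transform(r, p):
    """Apply PB = RBA * PA to homogeneous coordinates p = [X, Y, Z, 1]."""
    return [sum(r[i][k] * p[k] for k in range(4)) for i in range(4)]
```

For example, transform(posture(theta, phi, psi), [x, y, z, 1.0]) converts a point from coordinate system A into coordinate system B when posture(...) plays the role of RBA.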
Since the y axis of the sensor coordinate system 210 is set to a direction opposite to the direction of gravity, by defining the y axis of the world coordinate system 200 to be perpendicular to the ground level, the direction of the y axis of the world coordinate system 200 can coincide with the y axis of the sensor coordinate system 210. At that time, each of components RxTW and RzTW around the x axis and the z axis, respectively, of RTW is a unit matrix, and RTW is equivalent to a rotation matrix RyTW defined by an angle of rotation φTW around the y axis. Accordingly, the above-described (equation A) is transformed into RVW=RVS·RST·RyTW (equation B).
In the above-described equations, RST is an input from the posture sensor 120 to the posture-information output unit 130, RVW is an output from the posture-information output unit 130 to the image generation unit 150, and RVS and RyTW (in other words, angles of rotation θVS, φVS and ψVS around three axes defining RVS, and an angle of rotation φTW defining RyTW) constitute correction information necessary for transforming RST into RVW. The posture-information output unit 130 calculates RVW based on (equation B) using RST input from the posture sensor 120, and RVS and RyTW stored in the memory 140, and outputs the calculated RVW to the image generation unit 150.
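The calculation performed by the posture-information output unit 130 according to (equation B) can be sketched as follows. This is a minimal sketch under stated assumptions: the names rot_y, matmul and viewpoint_posture are hypothetical, matrices are 4×4 nested lists, and angles are in radians.

```python
import math

def rot_y(phi):
    """Ry(phi): rotation around the y axis, in the sign convention of the text."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, 0, -s, 0],
            [0, 1, 0, 0],
            [s, 0, c, 0],
            [0, 0, 0, 1]]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def viewpoint_posture(r_vs, r_st, phi_tw):
    """RVW = RVS * RST * RyTW (equation B).

    r_vs   : correction information RVS (viewpoint as seen from the sensor)
    r_st   : measured value RST from the posture sensor
    phi_tw : correction information, the angle defining RyTW
    """
    return matmul(r_vs, matmul(r_st, rot_y(phi_tw)))
```

As a simple check of the sketch, if RVS is the unit matrix and the sensor reports a pure azimuth rotation Ry(0.4), then with φTW=0.1 the result equals Ry(0.5), because rotations around the same axis add.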
In order to perform exact alignment between the real space and the virtual space, exact correction information must be set in the memory 140 by some means. Only when exact correction information is provided, display of the virtual image exactly aligned with the real space is realized.
In one known method for setting correction information, the user or the operator interactively increases or decreases the values of θVS, φVS, ψVS and φTW stored in the memory 140 via input means (not shown), and adjustment of the respective values is performed in a trial-and-error approach until exact alignment is achieved.
In this approach, since the four parameters must be simultaneously adjusted, troublesome operations are required and a large amount of time is required for adjustment.
One method for mitigating the above-described trouble has been proposed in Japanese Patent Application Laid-Open (Kokai) No. 2001-050990 (2001) filed by the assignee of the present application. This method for setting correction information will now be described.
FIG. 3 is a block diagram illustrating the configuration of an image display apparatus in which this method for setting correction information is incorporated in the image display apparatus shown in FIG. 1. As shown in FIG. 3, in this configuration, a correction-information calculation unit 310, an instruction-information input unit 320 and a switching unit 330 are added to the configuration shown in FIG. 1. The functions of components of this image display apparatus corresponding to the posture-information output unit 130 and the memory 140 shown in FIG. 1 differ from the functions of the corresponding components shown in FIG. 1. Hence, in the image display apparatus shown in FIG. 3, these components are indicated by a posture-information output unit 130′ and a memory 140′.
Correction information is calculated by moving the posture of the user's viewpoint 220 to a predetermined posture Ry0VW having a visual line horizontal to the ground level in the world coordinate system, and acquiring an output R0ST of the posture sensor 120 at that time. The memory 140′ stores the posture Ry0VW of the predetermined viewpoint (or an angle of rotation φ0VW around the y axis defining the posture), in addition to the correction information.
The switching unit 330 sets the mode of the posture-information output unit 130′ to an ordinary mode or a correction-information calculation mode by receiving an input from the user or the operator (both not shown in FIG. 3).
When the mode is the ordinary mode, the posture-information output unit 130′ calculates RVW from RST input from the posture sensor 120 using correction information, as the posture-information output unit 130 described with reference to FIG. 1, and outputs the calculated RVW to the image generation unit 150 as posture information.
When the mode is the correction-information calculation mode, Ry0VW is input from the memory 140′, and is output to the image generation unit 150 as posture information. The instruction-information input unit 320 receives an input from the user or the operator, and transmits an instruction to execute correction-information calculation processing to the correction-information calculation unit 310. More specifically, in the correction-information calculation mode, the user or the operator adjusts the posture of the viewpoint 220 so that a virtual image displayed on the display picture surface 110 is correctly aligned with the real space seen via the display picture surface 110. When it is determined that they are sufficiently aligned (i.e., when it is determined that the viewpoint 220 is positioned at the posture Ry0VW), input to the instruction-information input unit 320 is performed, for example, by depressing a specific key.
The correction-information calculation unit 310 inputs an instruction to execute correction-information calculation processing from the instruction-information input unit 320, inputs the output R0ST from the posture sensor 120 at that time (i.e., when the user or the operator determines that the viewpoint 220 is positioned at the posture Ry0VW), and calculates correction information based on the posture Ry0VW and the posture R0ST.
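The two modes of the posture-information output unit 130′ can be sketched as follows. This is an illustrative sketch only: the class name, the Mode enumeration and the method names are hypothetical, and the switching unit 330 is reduced to a single set_mode call.

```python
from enum import Enum

class Mode(Enum):
    ORDINARY = 1      # output RVW computed from the sensor value
    CORRECTION = 2    # output the predetermined posture Ry0VW

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

class PostureInformationOutputUnit:
    """Hypothetical model of unit 130'; the arguments stand in for memory 140'."""

    def __init__(self, r_vs, ry_tw, ry0_vw):
        self.r_vs = r_vs        # correction information RVS
        self.ry_tw = ry_tw      # correction information RyTW
        self.ry0_vw = ry0_vw    # predetermined viewpoint posture Ry0VW
        self.mode = Mode.ORDINARY

    def set_mode(self, mode):
        """Stands in for the switching unit 330."""
        self.mode = mode

    def output(self, r_st):
        """Return the posture information passed to the image generation unit."""
        if self.mode is Mode.CORRECTION:
            # The sensor value is ignored; the virtual image is drawn
            # for the predetermined posture Ry0VW.
            return self.ry0_vw
        # Ordinary mode: RVW = RVS * RST * RyTW (equation B).
        return matmul(self.r_vs, matmul(r_st, self.ry_tw))
```

In the correction mode the displayed virtual image stays fixed at Ry0VW, so the user moves the head until that image lines up with the real scene.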
In the above-described method, it is necessary that a rotation component RySV (or an angle of rotation φSV around the y axis defining the rotation component) around the y axis of an inverse matrix RSV (representing a relative posture of the posture sensor as seen from the user's viewpoint) of the posture RVS, serving as one component of correction information, is known by some means, and is already stored in the memory 140′.
At that time, according to (equation B), a relationship of Ry0VW=RVS·R0ST·RyTW (equation C) holds between data processed at the correction-information calculation unit 310. By modifying (equation C), the following equation is obtained:
Ry0VW=(RzSV·RxSV·RySV)−1·R0ST·RyTW (equation D).
The correction-information calculation unit 310 inputs Ry0VW and RySV from the memory 140′ and R0ST from the sensor 120, and calculates unknown components RzSV, RxSV and RyTW of correction information based on (equation D), according to the following procedure.
By further modifying (equation D), the following equation is obtained:
RzSV·RxSV·RySV·Ry0VW=Rz0ST·Rx0ST·Ry0ST·RyTW (equation E).
Since each of the left and right sides of (equation E) is the product of rotation components around the z, x and y axes, an identity holds for each rotation component around each of the z, x and y axes.
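The step from (equation D) to (equation E) can be written out as follows. The typeset notation is an assumed reading of the flattened symbols in the text (for example, $R^{z}_{SV}$ for RzSV, $R^{y0}_{VW}$ for Ry0VW and $R^{0}_{ST}$ for R0ST).

```latex
\begin{align*}
R^{y0}_{VW}
  &= \left(R^{z}_{SV} R^{x}_{SV} R^{y}_{SV}\right)^{-1} R^{0}_{ST}\, R^{y}_{TW}
  && \text{(equation D)} \\
R^{z}_{SV} R^{x}_{SV} R^{y}_{SV}\, R^{y0}_{VW}
  &= R^{0}_{ST}\, R^{y}_{TW}
  && \text{(left-multiply by } R^{z}_{SV} R^{x}_{SV} R^{y}_{SV}\text{)} \\
R^{z}_{SV} R^{x}_{SV} \left(R^{y}_{SV} R^{y0}_{VW}\right)
  &= R^{z0}_{ST} R^{x0}_{ST} \left(R^{y0}_{ST} R^{y}_{TW}\right)
  && \text{(expand } R^{0}_{ST} = R^{z0}_{ST} R^{x0}_{ST} R^{y0}_{ST}\text{)}
\end{align*}
```

The product of two rotations around the y axis is again a rotation around the y axis, so each side has the form RzRxRy. Assuming the angle of rotation around the x axis stays strictly between −90° and 90°, this factorization is unique, which is why the z, x and y components can be equated separately.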
First, an identity of a rotation component around each of the z and x axes is as follows. That is,
RzSV=Rz0ST (equation F)
RxSV=Rx0ST (equation G).
Thus, it is possible to obtain RzSV and RxSV.
An identity of a rotation component around the y axis is as follows. That is,
RySV·Ry0VW=Ry0ST·RyTW.
From this equation, the following equation is obtained:
RyTW=RySV·Ry0VW·Ry0ST−1 (equation H).
Thus, it is possible to obtain RyTW.
The correction-information calculation unit 310 calculates RzSV, RxSV and RyTW according to the above-described processing, further calculates RVS (=(RzSVRxSVRySV)−1) from these values, and outputs RVS and RyTW (or angles of rotation θSV, φSV and φTW defining these values) to the memory 140′.
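The procedure of (equation F), (equation G) and (equation H) can be sketched as follows. This is a sketch under stated assumptions, not the implementation of the correction-information calculation unit 310: the function names are hypothetical, angles are in radians, the decomposition assumes the x-axis angle of R0ST is strictly between −90° and 90°, and the inverse of a pure rotation matrix is taken as its transpose.

```python
import math

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1]]

def rot_y(p):
    c, s = math.cos(p), math.sin(p)
    return [[c, 0, -s, 0], [0, 1, 0, 0], [s, 0, c, 0], [0, 0, 0, 1]]

def rot_z(q):
    c, s = math.cos(q), math.sin(q)
    return [[c, s, 0, 0], [-s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]

def decompose_zxy(r):
    """Return (theta, phi, psi) with r = Rz(psi) * Rx(theta) * Ry(phi).

    With the matrices of the text, r[2][1] = -sin(theta),
    r[2][0] = cos(theta)sin(phi), r[2][2] = cos(theta)cos(phi),
    r[0][1] = sin(psi)cos(theta), r[1][1] = cos(psi)cos(theta).
    """
    theta = math.asin(max(-1.0, min(1.0, -r[2][1])))
    phi = math.atan2(r[2][0], r[2][2])
    psi = math.atan2(r[0][1], r[1][1])
    return theta, phi, psi

def calc_correction(r0_st, phi_sv, phi0_vw):
    """Compute (RVS, RyTW) from the sensor output R0ST.

    phi_sv  : known angle defining RySV
    phi0_vw : angle defining the predetermined posture Ry0VW
    """
    theta0, phi0, psi0 = decompose_zxy(r0_st)
    rz_sv = rot_z(psi0)                  # equation F: RzSV = Rz0ST
    rx_sv = rot_x(theta0)                # equation G: RxSV = Rx0ST
    # Equation H: rotations around the y axis commute, so the angles add.
    phi_tw = phi_sv + phi0_vw - phi0
    r_sv = matmul(rz_sv, matmul(rx_sv, rot_y(phi_sv)))
    r_vs = transpose(r_sv)               # RVS = (RzSV RxSV RySV)^-1
    return r_vs, rot_y(phi_tw)
```

Substituting the returned RVS and RyTW, together with the same R0ST, into (equation B) should reproduce Ry0VW, which gives a quick consistency check of the sketch.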
RySV may be obtained from measured values by a protractor or the like in a trial-and-error approach, or by using any other appropriate measuring means.
As described above, by obtaining only RySV in correction information, it is possible to easily obtain other unknown components of correction information, and realize exact alignment.
However, it is generally difficult to exactly measure the correction information RySV in advance. Accordingly, in practice it is necessary to repeat a series of operations (rough setting of RySV, processing to obtain other components of correction information, fine adjustment of RySV, processing to obtain other components of correction information, fine adjustment of RySV, and so on) until exact alignment is realized, and therefore troublesome operations and a large amount of time are required for acquiring correction information.