The technology disclosed in this specification relates to an image display device that a user wears on the face or the head in order to view an image, an image display method thereof, and a computer program. In particular, the technology relates to an image display device in which a virtual display image is viewed overlapped on real-world scenery, an image display method thereof, and a computer program.
A head-mounted image display device that is worn on the head in order to view an image, that is, a head-mounted display, is known. The head-mounted image display device includes, for example, an image display unit for each of the left and right eyes, and is configured to control both sight and hearing when used together with headphones. In addition, the head-mounted image display device is capable of projecting different images to the left and right eyes, and can provide a 3D image by displaying images with parallax between the left and right eyes.
Head-mounted image display devices can also be classified into a light blocking type and a transmission type. A light blocking-type head-mounted image display device is configured to directly cover the user's eyes when mounted on the head, which increases the user's level of concentration when viewing and listening to an image. On the other hand, with a transmission-type head-mounted image display device, the user is able to see the outside view beyond the image (that is, see through) while the device is mounted on the head and the image is being displayed. Accordingly, it is possible to display a virtual display image overlapped on real-world scenery. Naturally, a light blocking-type head-mounted image display device can also present a similar image to the user by performing a composition process in which a virtual display image is overlapped on an image from a camera that photographs the outside scenery.
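The composition process mentioned above for the light blocking type can be illustrated with a minimal sketch. It assumes simple per-pixel alpha blending, which the specification does not state; the function name `composite`, the list-of-rows image layout, and the `alpha` opacity map are all hypothetical and are introduced only for illustration.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each component in 0..255
Image = List[List[Pixel]]      # rows of pixels


def composite(camera: Image, virtual: Image, alpha: List[List[float]]) -> Image:
    """Overlap a virtual display image on a camera image of the outside scenery.

    alpha[y][x] is the assumed opacity of the virtual image at that pixel:
    0.0 keeps the camera pixel, 1.0 keeps the virtual pixel.
    All three inputs are assumed to have the same dimensions.
    """
    out: Image = []
    for cam_row, virt_row, a_row in zip(camera, virtual, alpha):
        out_row = []
        for cam_px, virt_px, a in zip(cam_row, virt_row, a_row):
            # Blend each color channel: a * virtual + (1 - a) * camera.
            blended = tuple(
                round(a * v + (1.0 - a) * c) for c, v in zip(cam_px, virt_px)
            )
            out_row.append(blended)
        out.append(out_row)
    return out


# One-row example: the first pixel shows only the camera image,
# the second shows only the virtual image.
camera_image = [[(100, 100, 100), (0, 0, 0)]]
virtual_image = [[(200, 0, 0), (255, 255, 255)]]
opacity = [[0.0, 1.0]]
result = composite(camera_image, virtual_image, opacity)
```

In this sketch, pixels where the opacity is zero pass the camera image through unchanged, so the user still sees the outside scenery wherever no virtual content is drawn.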
For example, a head-mounted display device has been proposed in which character data is extracted from a region that the user designates by moving a cursor on a real image using the line of sight, and characters generated based on a translation result of the character data are shown at a designated portion of a virtual screen, such as the upper right corner or the lower left corner (for example, refer to Japanese Unexamined Patent Application Publication No. 2001-56446). However, in this head-mounted display device, the user must personally perform input operations to specify the processes applied to the designated region, such as the target language of the character data, the character type, and the character style. In addition, since the translated characters that are the processing result are arranged at a place other than the original designated region, the user cannot view the processing result without taking his or her eyes off the designated region. Furthermore, when a translated character string is overlapped on a place that differs from the designated region of the real image, the sense of reality is considered to decrease.
In addition, a glasses-type display device has been proposed in which the sound signal of a speaker is identified and extracted based on face image data of the speaker in the imaged field of vision, face characteristic data, and a sound signal of ambient sound, and text data of the extracted sound signal is translated into another language and displayed overlapped in the field of vision (for example, refer to Japanese Unexamined Patent Application Publication No. 2012-59121).
In addition, a telescope-type mixed reality presentation device has been proposed in which virtual space image data, simulating a scene in which a virtual structure is built in the scenery in front of the eyes, is composited with a real space image without deviation (for example, refer to Japanese Unexamined Patent Application Publication No. 2003-215494).
As described above, the head-mounted image display device in the related art is capable of presenting virtual information to a user, such as translated sentences of text data extracted from a real image or from the speech of a speaker, and a virtual space image simulated by computer graphics. However, the information that a user wearing the device requests, or that is necessary for the user, varies widely. The user does not necessarily keep a close eye on an object in the field of view that presents information necessary for the user. In addition, the information to be presented for an object, and the method of presenting it, differ from user to user, and may change according to the surrounding environment or the user's state, even for the same user.