1. Technical Field
The present invention relates to a user interface device that estimates a degree of interest of a user in each of a plurality of objects displayed on a screen, in order to execute an input process on the basis of the degree of interest of the user, and to an input method.
2. Background Art
Currently available information systems are generally designed to interact with a user, for example by presenting information, in response to an “express request” of the user (for example, inputting a character through a keyboard, pressing a button on a remote controller, or designating an object with a pointing device). Such a conventional interaction mode is, however, insufficient for achieving smooth communication between the user and the system, because manipulation is difficult and troublesome and because expressing the user's intention is complex.
Accordingly, systems have recently been proposed that estimate an “implied request” of the user (for example, whether the user is interested, or the degree of the user's interest) utilizing a multimodal sensor group including a camera or a microphone. For example, NPL 1 proposes a system that captures images of a user viewing video content and estimates the degree of interest on the basis of the user's facial expression, thereby adding a tag such as “Neutral”, “Positive”, or “Negative” to the video content and providing information useful for recommending a program. Also, PTL 1 proposes an image reproduction system that sequentially reproduces and displays a plurality of different object images, and that dynamically determines the display time of each object image on the basis of peripheral sound (such as a cheer from the viewer) and the viewer's action (such as a change in facial expression). These techniques are basically employed for determining the degree of interest in a single piece of content displayed on the screen.
Meanwhile, gaze direction is one of the typical physical reactions that can be used for estimating the user's interest, attention, or intention with respect to a plurality of contents displayed on a screen. Although vision plays a predominant role in acquiring information, the area of central vision and the effective visual field are limited. Accordingly, the user needs to move the gaze point to an object in order to acquire information from it. As a result, the gaze direction concentrates on the object in which the user is interested. The gaze direction can, therefore, be construed as generally representing the user's interest, attention, or intention.
Here, PTL 2 discloses a device that determines the object on which the user's gaze remains for a long time to be the object desired by the user. The device displays a plurality of images on a screen for the user to choose from, and selects the image the user desires by detecting the user's gaze direction toward the images with a gaze angle detector, measuring the gaze duration on each image, and comparing the durations.
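The duration-comparison approach described above can be illustrated with a minimal Python sketch. It is not the implementation disclosed in PTL 2. It assumes that gaze points have already been mapped to screen coordinates and timestamped, and that each image occupies an axis-aligned rectangle. The names `Rect`, `GazeSample`, `hit_test`, `dwell_times`, and `select_by_dwell` are hypothetical and introduced only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Assumed layout format: (left, top, right, bottom) in screen pixels.
Rect = Tuple[float, float, float, float]
# Assumed gaze sample format: (timestamp in seconds, x, y).
GazeSample = Tuple[float, float, float]


def hit_test(rects: Dict[str, Rect], x: float, y: float) -> Optional[str]:
    """Return the name of the first image whose rectangle contains (x, y)."""
    for name, (left, top, right, bottom) in rects.items():
        if left <= x < right and top <= y < bottom:
            return name
    return None


def dwell_times(rects: Dict[str, Rect],
                samples: List[GazeSample]) -> Dict[str, float]:
    """Accumulate gaze duration per image.

    Each sample is assumed to hold until the next sample arrives, so the
    last sample contributes no duration.
    """
    totals = {name: 0.0 for name in rects}
    for (t0, x, y), (t1, _, _) in zip(samples, samples[1:]):
        name = hit_test(rects, x, y)
        if name is not None:
            totals[name] += t1 - t0
    return totals


def select_by_dwell(rects: Dict[str, Rect],
                    samples: List[GazeSample]) -> Optional[str]:
    """Pick the image with the longest total gaze duration, if any."""
    totals = dwell_times(rects, samples)
    if not totals:
        return None
    best = max(totals, key=lambda name: totals[name])
    return best if totals[best] > 0.0 else None


# Illustrative data only: two side-by-side images and four gaze samples.
example_rects = {"A": (0, 0, 100, 100), "B": (100, 0, 200, 100)}
example_samples = [(0.0, 10, 10), (0.5, 150, 50), (2.0, 20, 20), (2.2, 300, 300)]
example_choice = select_by_dwell(example_rects, example_samples)
```

In this example the gaze rests on image A for about 0.7 seconds in total and on image B for 1.5 seconds, so `example_choice` is `"B"`. The sketch considers only total gaze duration, which is the criterion attributed to PTL 2 above.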