An eye gaze interface technology is available in which the gaze of a user is detected and the detection result is used in operating a computer. By looking at a position on the screen of an eye gaze interface, the user can perform operations such as zooming in on the observed position or selecting the object at the observed position without having to operate a mouse or a keyboard.
An eye gaze interface detects the gaze by means of a corneal reflex method, in which a reflection on the cornea (the corneal reflex) is produced using a near-infrared light source, and the center of the corneal reflex and the center of the corresponding pupil are obtained by means of image processing. The gaze of the user is then detected from the positional relationship between the center of the corneal reflex and the center of the pupil.
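The positional relationship used by the corneal reflex method can be illustrated with a minimal sketch. The function name `gaze_offset` and the coordinate convention (image pixels, x to the right, y downward) are assumptions made for illustration only. Mapping the resulting offset to a position on the screen requires a calibration step that is not described here and is therefore omitted.

```python
def gaze_offset(pupil_center, reflex_center):
    """Return the vector from the corneal reflex center to the pupil center.

    Both arguments are (x, y) positions in image pixels (assumed convention).
    The offset changes as the eye rotates, while the corneal reflex stays
    comparatively fixed, which is why the method relies on the pair of centers.
    """
    px, py = pupil_center
    rx, ry = reflex_center
    return (px - rx, py - ry)


if __name__ == "__main__":
    # Hypothetical centers obtained by image processing.
    print(gaze_offset((12.0, 8.0), (10.0, 9.0)))
```

Because the offset depends directly on the pupil center, an error in pupil detection carries over into the detected gaze.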
The corneal reflex method presupposes that the pupil is detected accurately by means of image processing. For that reason, the position of the pupil needs to be detected even when, for example, the pupil moves under the eyelid. Given below is an explanation of exemplary conventional technologies for detecting a pupil.
According to a first conventional technology, a near-circular portion of an image that captures an eye is inferred to be the pupil and is extrapolated to a circular shape. Pupil detection is then done by performing template matching between a predefined template and the image in which the portion has been extrapolated to a circular shape.
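The template matching step can be sketched as follows. This is only an illustration under stated assumptions: the image and the template are small grayscale grids given as lists of rows, the matching score is the sum of squared differences (the score used by the first conventional technology is not specified here), and the extrapolation to a circular shape is not shown. The name `match_template` is hypothetical.

```python
def match_template(image, template):
    """Slide the template over the image and return (row, col, score)
    of the position with the smallest sum of squared differences."""
    image_h, image_w = len(image), len(image[0])
    tmpl_h, tmpl_w = len(template), len(template[0])
    best = None
    for r in range(image_h - tmpl_h + 1):
        for c in range(image_w - tmpl_w + 1):
            ssd = sum(
                (image[r + dr][c + dc] - template[dr][dc]) ** 2
                for dr in range(tmpl_h)
                for dc in range(tmpl_w)
            )
            # Keep the first position with the lowest score seen so far.
            if best is None or ssd < best[2]:
                best = (r, c, ssd)
    return best


if __name__ == "__main__":
    # A bright 4x4 image with a dark 2x2 patch standing in for the pupil.
    image = [
        [9, 9, 9, 9],
        [9, 9, 0, 0],
        [9, 9, 0, 0],
        [9, 9, 9, 9],
    ]
    template = [[0, 0], [0, 0]]
    print(match_template(image, template))
```

Any dark region with a shape similar to the template yields a comparably low score, so the score alone does not indicate whether the matched region is actually a pupil.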
A second conventional technology builds on the template matching implemented in the first conventional technology described above. According to the second conventional technology, while a dynamic image is being processed, if the target object is not detected in a particular frame image, then the extrapolation is done using the detection result of the previous frame image. For example, in the second conventional technology, the following rule is applied: if the target object is detected in an area A of a first frame image, then, in a second frame image too, the target object is assumed to be present in the vicinity of an area that is identical to the area A.
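The rule of the second conventional technology can be sketched as follows, assuming that each frame image yields either a detected position `(x, y)` or `None` when detection fails. The function name `fill_from_previous` and this data representation are assumptions made for illustration. The sketch simply reuses the previous position and does not search its vicinity.

```python
def fill_from_previous(detections):
    """Replace each missed detection with the most recent detected position.

    detections: per-frame results, each an (x, y) tuple or None.
    Frames before the first successful detection remain None.
    """
    last_known = None
    filled = []
    for detection in detections:
        if detection is not None:
            last_known = detection
        filled.append(last_known)
    return filled


if __name__ == "__main__":
    frames = [(5, 5), None, (7, 6), None, None]
    print(fill_from_previous(frames))
```

The reused position is only as good as the assumption that the target has stayed near the area in which it was last detected.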
According to a third conventional technology, if a plurality of candidate pupils is detected, then a single candidate pupil is selected from among all candidate pupils by referring to the sizes of the candidate pupils or the bounding rectangle area ratios of the candidate pupils. For example, according to the third conventional technology, as far as the size is concerned, if the height and the width of a candidate pupil are each within the range of 3 pixels to 30 pixels, then that candidate pupil is treated as a pupil. Meanwhile, in the third conventional technology, the user sets in advance the range of sizes that a pupil can have. For examples of the conventional technologies, see Japanese Laid-open Patent Publication No. 06-274269, Japanese Laid-open Patent Publication No. 2009-254691, “Human Detection Method for Autonomous Mobile Robots”, Matsushita Electric Works Technical Report, Vol. 53, No. 2, and “AdaBoost-based traffic flow measurement using Haar-like features”, ViEw2009 collection of papers, p. 104.
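The size condition of the third conventional technology can be sketched as follows. The representation of a candidate pupil as a tuple `(x, y, width, height)`, the name `plausible_pupils`, and the treatment of the range boundaries as inclusive are assumptions made for illustration. The bounding rectangle area ratio is not shown.

```python
MIN_SIZE_PX = 3   # lower limit of the size range set in advance by the user
MAX_SIZE_PX = 30  # upper limit of the size range set in advance by the user


def plausible_pupils(candidates, lo=MIN_SIZE_PX, hi=MAX_SIZE_PX):
    """Keep candidates whose width and height both lie within [lo, hi] pixels.

    candidates: list of (x, y, width, height) tuples (assumed representation).
    """
    return [
        c for c in candidates
        if lo <= c[2] <= hi and lo <= c[3] <= hi
    ]


if __name__ == "__main__":
    candidates = [
        (40, 52, 12, 11),  # passes the size condition
        (90, 50, 13, 12),  # similar size, also passes
        (10, 10, 2, 2),    # too small
        (0, 0, 64, 20),    # too wide
    ]
    print(plausible_pupils(candidates))
```

In this example two candidates of similar size both satisfy the condition, so the size condition by itself does not choose between them.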
However, in the conventional technologies, it is not possible to detect a pupil with accuracy.
For example, in the first conventional technology, when the shadow of the nose or the like falls on the area of an eye, it is not possible to distinguish between the shadow and the pupil of that eye. Moreover, since the position of a pupil can instantaneously change by a large amount, there are times when the rule of the second conventional technology cannot be applied. Furthermore, in the third conventional technology, the size refers only to the broad range of dimensions that a pupil can have. For that reason, if a plurality of candidate pupils having similar sizes is present, then it is difficult to select the correct pupil.