This invention relates to the determination of a user's eye gaze vector and point of regard by analysis of images taken of a user's eye. The invention relates more specifically to eye gaze tracking without the need to calibrate for specific users' eye geometries and to subsequently recalibrate for user head position.
Eye gaze tracking technology has proven to be useful in many different fields, including human-computer interfaces that help disabled people interact with a computer. The eye gaze tracker can be used as a mouse emulator for a personal computer, for example, helping disabled people to move a cursor on a display screen to control their environment and communicate messages. Gaze tracking can also be used for industrial control, aviation, and emergency room situations where both hands are needed for tasks other than operation of a computer but where an available computer is useful. There is also significant research interest in eye gaze tracking for babies and animals to better understand such subjects' behavior and visual processes. Commercial eye gaze tracking systems are made by ISCAN Incorporated (Burlington Mass.), LC Technologies (Fairfax Va.), and Applied Science Laboratories (Bedford Mass.).
There are many different schemes for detecting both the direction in which a user is looking and the point upon which the user's vision is fixated. Any particular eye gaze tracking technology should be relatively inexpensive, reliable, unobtrusive, easily learned and used, and generally operator-friendly to be widely accepted. The corneal reflection method of eye gaze tracking is increasing in popularity, and is well-described in the following U.S. patents, which are hereby incorporated by reference: 4,595,990, 4,836,670, 4,950,069, 4,973,149, 5,016,282, 5,231,674, 5,471,542, 5,861,940, 6,204,828. These two articles also describe corneal reflection eye gaze tracking and are also hereby incorporated by reference: "Spatially Dynamic Calibration of an Eye-Tracking System", K. White, Jr. et al., IEEE Transactions on Systems, Man, and Cybernetics, vol. 23, no. 4, July/August 1993, p. 1162-1168, referred to hereafter as White, and "Effectiveness of Pupil Area Detection Technique", Y. Ebisawa et al., Proceedings of the 15th Annual International Conference of IEEE Engineering in Medicine and Biology Society, vol. 15, October 1993, p. 1268-1269.
Corneal reflection eye gaze tracking systems project light toward the eye and monitor the angular difference between pupil position and the reflection of the light beam. Near-infrared light is often employed, as users cannot see this light and are therefore not distracted by it. Usually only one eye is monitored, and it isn't critical which eye is monitored. The light reflected from the eye has two major components. The first component is a 'glint', which is a very small and very bright virtual image of the light source reflected from the front surface of the corneal bulge of the eye. The glint position remains relatively fixed in an observer's image field as long as the user's head remains stationary and the corneal sphere rotates around a fixed point. The second component is light that has entered the eye and has been reflected back out from the retina. This light serves to illuminate the pupil of the eye from behind, causing the pupil to appear as a bright disk against a darker background. This retroreflection, or "bright eye" effect familiar to flash photographers, provides a very high contrast image. Unlike the glint, the pupil center's position in the image field moves significantly as the eye rotates. An oculometer determines the center of the pupil and the glint, and the change in the distance and direction between the two as the eye is rotated. The orientation of the eyeball can be inferred from the differential motion of the pupil center relative to the glint. The eye is often modeled as a sphere of about 13.3 mm radius having a spherical corneal bulge of about 8 mm radius; the eyes of different users will have variations from these typical values, but individual dimensional values do not generally vary significantly in the short term.
As shown in prior art FIG. 1, the main components of a corneal reflection eye gaze tracking system include a video camera sensitive to near-infrared light, a near-infrared light source (often a light-emitting diode) typically mounted to shine along the optical axis of the camera, and a computer system for analyzing images captured by the camera. The on-axis light source is positioned at or near the focal center of the camera. Image processing techniques such as intensity thresholding and edge detection identify the glint and the pupil from the image captured by the camera using on-axis light, and locate the pupil center in the camera's field of view as shown in prior art FIG. 2.
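By way of illustration only, the intensity thresholding mentioned above may be sketched as follows. The function name, the threshold values, and the use of simple centroids are assumptions made for this sketch and are not taken from the cited references; a practical system would likely add edge detection and blob filtering.

```python
import numpy as np

def locate_glint_and_pupil(image, pupil_thresh=100, glint_thresh=240):
    """Return ((glint_x, glint_y), (pupil_x, pupil_y)) as pixel centroids.

    Assumes an on-axis ("bright eye") image in which the glint is the
    brightest region and the pupil is a bright disk on a darker background.
    Both threshold values are hypothetical and depend on the camera.
    """
    img = np.asarray(image, dtype=float)
    glint_mask = img >= glint_thresh
    # The glint is assumed to lie on or near the pupil disk and to be tiny,
    # so its pixels are left in the pupil mask to keep the disk whole.
    pupil_mask = img >= pupil_thresh

    def centroid(mask):
        ys, xs = np.nonzero(mask)
        if xs.size == 0:
            raise ValueError("no pixels at or above threshold")
        return float(xs.mean()), float(ys.mean())

    return centroid(glint_mask), centroid(pupil_mask)
```

The difference between the two returned centroids is the pupil-glint vector from which the orientation of the eye is inferred.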
Human eyes do not have equal resolution over the entire field of view, nor is the portion of the retina providing the most distinct vision located precisely on the optical axis. The eye directs its gaze with great accuracy because the photoreceptors of the human retina are not uniformly distributed but instead show a pronounced density peak in a small region known as the fovea centralis. In this region, which subtends a visual angle of about one degree, the receptor density increases to about ten times the average density. The nervous system thus attempts to keep the image of the region of current interest centered accurately on the fovea as this gives the highest visual acuity. A distinction is made between the optical axis of the user's eye versus the foveal axis along which the most acute vision is achieved. As shown in prior art FIG. 3, the optical axis is a line going from the center of the spherical corneal bulge through the center of the pupil. The optical axis and foveal axis are offset in each eye by an inward horizontal angle of about five degrees, with a variation of about one and one half degrees in the population. The offsets of the foveal axes with respect to the optical axes of a user's eyes enable better stereoscopic vision of nearby objects. The offsets vary from one individual to the next, but individual offsets do not vary significantly in the short term. For this application, the gaze vector is defined as the optical axis of the eye. The gaze position or point of regard is defined as the intersection point of the gaze vector with the object being viewed (e.g. a point on a display screen some distance from the eye). Adjustments for the foveal axis offsets are typically made after determination of the gaze vector; a default offset angle value may be used unless values from a one-time measurement of a particular user's offset angles are available.
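The definition of the point of regard as the intersection of the gaze vector with the viewed object can be illustrated, for the case of a planar display, by a ray-plane intersection. The following is a minimal sketch; the function name, argument names, and coordinate frame are assumptions for illustration, and no foveal offset correction is applied.

```python
import numpy as np

def point_of_regard(eye_point, gaze_dir, plane_point, plane_normal):
    """Intersect the gaze ray with a planar display.

    eye_point    : 3-D point on the optical axis (e.g. the corneal center)
    gaze_dir     : 3-D direction of the gaze vector (need not be unit length)
    plane_point  : any 3-D point lying on the display plane
    plane_normal : 3-D normal of the display plane
    All quantities are assumed to share one coordinate frame and unit.
    """
    p0 = np.asarray(eye_point, dtype=float)
    d = np.asarray(gaze_dir, dtype=float)
    q = np.asarray(plane_point, dtype=float)
    n = np.asarray(plane_normal, dtype=float)

    denom = float(n @ d)
    if abs(denom) < 1e-12:
        raise ValueError("gaze vector is parallel to the display plane")
    t = float(n @ (q - p0)) / denom
    if t < 0:
        raise ValueError("display plane lies behind the eye")
    return p0 + t * d
```

For example, an eye 600 mm in front of a display lying in the plane z = 0, with gaze direction (0.1, 0, -1), yields a point of regard 60 mm to one side of the point directly facing the eye.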
Unfortunately, calibration is required for all existing eye gaze tracking systems to establish the parameters describing the mapping of camera image coordinates to display screen coordinates. Different calibration and gaze direction calculation methods may be categorized by the actual physical measures they require. Some eye gaze tracking systems use implicit models that map directly from pupil and glint positions in the camera's image plane to the point of regard in screen coordinates. Other systems use physically-based explicit models that take into account eyeball radius, radius of curvature of the cornea, offset angle between the optical axis and the foveal axis, head and eye position in space, and distance between the center of the eyeball and the center of the pupil as measured for a particular user. During calibration, the user may be asked to fix his or her gaze upon certain "known" points in a display. At each coordinate location, a sample of corresponding gaze vectors is computed and used to adapt the system to the specific properties of the user's eye, reducing the error in the estimate of the gaze vector to an acceptable level for subsequent operation. The user may also be asked to click a mouse button after visually fixating on a target, but this approach may add synchronization problems, i.e. the user could look away from the target and then click the mouse button. Also, with this approach the system would get only one mouse click for each target, so there would be no chance to average out involuntary eye movements. Alternately, during calibration, the user may visually track a moving calibration icon on a display that traverses a discrete set of known screen coordinates. Calibration may need to be performed on a per-user or per-tracking-session basis, depending on the precision and repeatability of the tracking system.
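To make the notion of an implicit model concrete, the calibration described above may be sketched as a least-squares fit from pupil-glint vectors to screen coordinates. The affine form of the mapping and the function names below are assumptions chosen for brevity; implicit models in actual prior art systems may use higher-order terms.

```python
import numpy as np

def fit_implicit_map(pg_vectors, screen_points):
    """Fit screen = [dx, dy, 1] @ coeffs by least squares.

    pg_vectors    : N x 2 pupil-center-minus-glint vectors (image pixels)
    screen_points : N x 2 known calibration targets (screen coordinates)
    Returns a 3 x 2 coefficient matrix. At least three non-collinear
    calibration samples are needed for the affine model assumed here.
    """
    pg = np.asarray(pg_vectors, dtype=float)
    targets = np.asarray(screen_points, dtype=float)
    design = np.column_stack([pg, np.ones(len(pg))])
    coeffs, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return coeffs

def apply_implicit_map(coeffs, pg_vector):
    """Map one pupil-glint vector to an estimated screen position."""
    dx, dy = pg_vector
    return np.array([dx, dy, 1.0]) @ coeffs
```

Because several samples per target enter the fit, involuntary eye movements tend to average out, which is the advantage noted above over a single mouse click per target. Such a mapping is valid only for the head position at which it was fitted.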
Prior art eye gaze tracking systems also require subsequent recalibration to accurately adjust for head motion. U.S. Pat. No. 5,016,282 teaches the use of three reference points on calibration glasses to create a model of the head and determine the position and orientation of the head for the eye gaze tracking system. However, it is not likely that users will generally be willing to wear special glasses merely to enable the system to account for head motion in everyday use. Other commercial eye gaze tracking systems are head mounted, and therefore have no relative head motion difficulties to resolve. However, these systems are mainly designed for military or virtual reality applications wherein the user typically also wears a head mounted display device coupled to the eye gaze tracking device. Head mounted displays are inconvenient and not generally suitable for long periods of computer work in office and home environments. Details of camera calibration and conversion of measured two-dimensional points in the image plane to three-dimensional coordinates in real space are described in "A Flexible New Technique for Camera Calibration", Z. Zhang, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(11): 1330-1334, 2000 (also available as Technical Report MSR-TR-98-71 at http://research.microsoft.com/~zhang/Papers/TR98-71.pdf), which is hereby incorporated by reference.
White offers an improvement in remote eye gaze tracking in the presence of lateral head translations (e.g. parallel to a display screen) of up to 20 cm. White uses a second light source to passively recalibrate the system. The second light source creates a second glint. White claims that a single initial static (no head motion) calibration can be dynamically adjusted as the head moves, leading to improved accuracy under an expanded range of head motions without a significantly increased system cost. Unfortunately, White's system compensates only for lateral head displacements, i.e. not for motion to/from the gaze position, and not for rotation. Rotation of a user's head is particularly troublesome for prior art gaze tracking systems as it changes the distance from the eye to both the object under observation and to the camera generating images of the eye.
While the aforementioned prior art methods are useful advances in the field of eye gaze tracking, systems that do not require calibration would increase user convenience and broaden the acceptance of eye gaze tracking technology. A system for providing eye gaze tracking requiring little or no knowledge of individual users' eye geometries, and requiring no subsequent calibration for head movement, is therefore needed.
It is accordingly an object of this invention to devise a system and method for eye gaze tracking wherein calibration for individual users' eye geometries is not required.
It is a related object of the invention to devise a system and method for eye gaze tracking wherein subsequent recalibration for head movement is not required.
It is a related object of the invention to determine a gaze vector and to compute a point of regard as the intersection of the gaze vector and an observed object.
It is a related object of the preferred embodiment of the invention that two cameras each having a co-located and co-oriented light source are used to capture images of a user's eye. It is a related object of the preferred embodiment of the invention to capture images of a user's eye such that the pupil center in each image and glints generated by each light source may be readily identified and located in the image plane of each camera.
It is a related object of the preferred embodiment of the invention to compute a first angle between three points in the image plane of the first camera, specifically the angle between the pupil center, the first glint (generated by the first camera's light source) and the second glint (generated by the second camera's light source). Similarly, it is a related object of the preferred embodiment of the invention to compute a second angle between three points in the image plane of the second camera, specifically the angle between the pupil center, the second glint and the first glint.
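One way such an angle between three image-plane points might be computed is sketched below. The sketch assumes that the vertex of the angle is the middle point named (the camera's own glint) and that the angle is signed; the function name, the sign convention, and the order of the two rays are illustrative assumptions rather than requirements of the invention.

```python
import math

def angle_at_vertex(vertex, a, b):
    """Signed angle in radians from ray vertex->a to ray vertex->b.

    Points are (x, y) pairs in one camera's image plane. The result lies in
    (-pi, pi] and is positive for a counter-clockwise turn in a frame whose
    y axis points up; in pixel coordinates with y pointing down the same
    value corresponds to a clockwise turn on the displayed image.
    """
    ax, ay = a[0] - vertex[0], a[1] - vertex[1]
    bx, by = b[0] - vertex[0], b[1] - vertex[1]
    if (ax == 0 and ay == 0) or (bx == 0 and by == 0):
        raise ValueError("angle undefined: a point coincides with the vertex")
    cross = ax * by - ay * bx   # proportional to the sine of the angle
    dot = ax * bx + ay * by     # proportional to the cosine of the angle
    return math.atan2(cross, dot)
```

Under these assumptions the first angle would be obtained as `angle_at_vertex(first_glint, second_glint, pupil_center)` in the first camera's image, and the second angle analogously in the second camera's image with the roles of the glints exchanged.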
It is a related object of the preferred embodiment to define a base plane spanning the first camera's focal center, the second camera's focal center, and the common point in space (on the eye) at which light from one camera's light source reflects to the other camera. It is a related object of the preferred embodiment of the invention to define a first plane by rotating the base plane by the first angle about a line through the focal center of the first camera and the first glint in the first camera's image plane. The intersection of the first plane with the display screen plane defines a first line containing the point of regard. Similarly, it is a related object of the preferred embodiment of the invention to define a second plane by rotating the base plane by the second angle about a line through the focal center of the second camera and the second glint in the second camera's image plane. The intersection of the second plane with the display screen plane defines a second line containing the point of regard.
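The rotation of a plane about a line may be illustrated with Rodrigues' rotation formula applied to the plane's normal. This is only a sketch: the point-and-normal representation of a plane, the function name, and the right-hand sign convention for the angle are assumptions, and the rotation axis is taken to pass through a point of the plane (here, the camera's focal center).

```python
import numpy as np

def rotate_plane_about_line(normal, axis_point, axis_dir, angle):
    """Rotate a plane about a line and return (point_on_plane, unit_normal).

    normal     : normal of the plane before rotation (e.g. the base plane)
    axis_point : a point on the rotation axis, assumed to lie in the plane
    axis_dir   : direction of the axis (focal center toward the glint)
    angle      : rotation angle in radians, right-hand rule about axis_dir
    """
    n = np.asarray(normal, dtype=float)
    k = np.asarray(axis_dir, dtype=float)
    n = n / np.linalg.norm(n)
    k = k / np.linalg.norm(k)
    c, s = np.cos(angle), np.sin(angle)
    # Rodrigues' formula: rotate vector n about unit axis k by the angle.
    n_rot = n * c + np.cross(k, n) * s + k * float(k @ n) * (1.0 - c)
    # The axis point is fixed by the rotation, so it stays on the plane.
    return np.asarray(axis_point, dtype=float), n_rot
```

The returned point and normal describe the rotated plane, which can then be intersected with the display screen plane to obtain the line containing the point of regard.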
It is a related object of the preferred embodiment of the invention to compute the gaze vector as a line defined by the intersection between the first plane and the second plane and extending from the user's eye toward an observed object. The point of regard is computed from the intersection of the gaze vector with the observed object, which corresponds to the intersection of the first line and the second line when the observed object is planar. Correction for foveal axis offsets may be added.
It is a related object of the second embodiment that each of the two cameras requires only light originally emitted by its own on-axis light source. It is a related object of the second embodiment of the invention to compute a first plane including a first glint position in the first camera's image plane, a pupil center position in the first camera's image plane, and the focal center of the first camera. Similarly, it is a related object of the second embodiment of the invention to compute a second plane including a second glint position in the second camera's image plane, a pupil center in the second camera's image plane, and the focal center of the second camera. The intersection of the first plane with the display screen plane defines a first line containing the point of regard. The intersection of the second plane with the display screen plane defines a second line containing the point of regard. The gaze vector is a line defined by the intersection between the first plane and the second plane and extending from the user's eye toward an observed object. The point of regard is computed from the intersection of the gaze vector with the observed object, which corresponds to the intersection of the first line and the second line when the observed object is planar.
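The plane constructions recited above may be sketched as follows, assuming that the glint position, the pupil center, and the focal center of each camera have already been expressed as three-dimensional points in a common coordinate frame (for example by a camera calibration). The function names and the point-and-normal plane representation are assumptions for illustration.

```python
import numpy as np

def plane_from_points(p0, p1, p2):
    """Plane through three points, returned as (point, unit_normal)."""
    p0, p1, p2 = (np.asarray(p, dtype=float) for p in (p0, p1, p2))
    n = np.cross(p1 - p0, p2 - p0)
    length = np.linalg.norm(n)
    if length < 1e-12:
        raise ValueError("points are collinear; plane is undefined")
    return p0, n / length

def intersect_planes(plane_a, plane_b):
    """Line common to two planes, returned as (point, unit_direction)."""
    (pa, na), (pb, nb) = plane_a, plane_b
    d = np.cross(na, nb)
    length = np.linalg.norm(d)
    if length < 1e-12:
        raise ValueError("planes are parallel; no unique line")
    # Solve for the point satisfying both plane equations that is also
    # closest to the origin along the line (third row: d . x = 0).
    A = np.vstack([na, nb, d])
    b = np.array([na @ pa, nb @ pb, 0.0])
    return np.linalg.solve(A, b), d / length
```

With one plane built per camera from its glint position, pupil center, and focal center, `intersect_planes` yields a line that serves as the gaze vector. The sign of the returned direction is arbitrary and would be chosen so that the line extends from the eye toward the observed object.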
It is a related object of the third embodiment of the invention to use a single camera having a co-located and co-oriented light source to capture images of a user's eye including glints and a pupil center. It is a related object of the third embodiment of the invention to determine the distance in the camera's image plane between the pupil center and the glint. Using an estimated distance between the user's eye and an observed object, and a one-time measurement of the user's corneal curvature, the gaze vector and point of regard are determined.
The foregoing objects are believed to be satisfied by the embodiments of the present invention as described below.