The invention relates to methods for locating image features.
In eyegaze technology it has long been known that the angular orientation of the optical axis of the eye can be measured remotely by the corneal reflection method. The method takes advantage of the eye's properties that the surface of the cornea is very nearly spherical over about an 11-to-15-degree cone around the eye's optic axis, and the relative locations of the pupil and a reflection of light from the cornea, e.g., the first Purkinje image, change in proportion to eye rotation. The corneal reflection method for determining the orientation of the eye is described in U.S. Pat. No. 3,864,030 to Cornsweet; U.S. Pat. No. 3,869,694 to Merchant et al.; U.S. Pat. Nos. 4,287,410 and 4,373,787 to Crane et al.; U.S. Pat. No. 4,648,052 to Friedman et al.; and U.S. Pat. Nos. 4,755,045 and 4,789,235 to Borah et al.; A. R. Downing, "Eye Controlled and Other Fast Communicators for Speech Impaired and Physically Handicapped Persons," Australasian Phys. Eng. Scis. in Med. vol. 8, no. 1, pp. 17-21 (1985); J. L. Levine, "Performance of an Eyetracker for Office Use," Comput. Biol. Med., vol. 14, no. 1, pp. 77-89 (1984); and J. Merchant et al., "A Remote Oculometer Permitting Head Movement," Rpt. No. AMRL-TR-73-69, Contr. No. F33615-72-C-1038, U.S. Air Force Systems Command (1973), as well as U.S. Pat. No. 4,034,401 to Mann; and U.S. Pat. No. 4,595,990 to Garwin et al., and certain U.S. Pat. Applications of Thomas E. Hutchinson, Ser. No. 07/086,809 now U.S. Pat. No. 4,836,670; Ser. No. 07/326,787 now U.S. Pat. No. 4,973,149; and Ser. No. 07/267,266 now U.S. Pat. No. 4,950,069.
A typical equipment configuration for an eye orientation monitor 10 is shown in FIG. 1. The hardware generally includes a video camera 12 and lens 14 to observe eye 16, a light source 18 such as a near-infrared-emitting diode near or on the lens to illuminate the eye, a digital frame grabber 20 to capture the video image from camera 12 and put it into a computer readable form, and a general purpose digital computer 22 to perform image processing and mathematical computations. The outputs of the camera 12 and/or computer 22 may also be displayed on a suitable monitor 24.
Fundamentally, the corneal reflection method comprises the following steps: processing the image of the eye to detect and locate the center of the corneal reflection or first Purkinje image; processing the image to detect and locate the center of the pupil; computing the 2-dimensional vector between the center of the pupil and the center of the corneal reflection; and computing the 2-dimensional angular orientation of the eye with respect to the camera axis from the pupil-center/corneal-reflection vector. Naturally, the accuracy with which the eye's angular orientation and gaze point can be computed is heavily influenced by the accuracies with which the centers of the pupil and corneal reflection are computed.
Typically the image of the corneal reflection formed by camera 12 is a small cluster of high intensity picture elements (pixels). A simple method for finding the center of the corneal reflection is peak detection, where the position of the peak image intensity is taken to be the location of the corneal reflection. In practice, however, noise and pixel amplitude clipping often render the peak detection method unreliable. Camera noise can cause the peak amplitude pixels to be far from the actual center of the corneal reflection. Also, the amplitude of the camera's image of the corneal reflection is often clipped because its intensity exceeds the linear range of the camera sensor; thus, several neighboring pixels will have equal, i.e., the maximum, intensities.
One method for reducing the effects of noise and clipping is to set a corneal reflection detection threshold value T.sub.cr and compute the simple average or centroid of the coordinates of all pixels whose intensities exceed the threshold. The pixel coordinates x.sub.cr, y.sub.cr of the corneal reflection are thus given by: ##EQU1## where N is the total number of pixels that exceed the corneal reflection threshold, n is an index of those pixels, and x and y are the coordinates of those pixels in the camera image. This method can be referred to as the "equal-weighting" method because any pixel that exceeds the threshold is given an equal weight in estimating the corneal reflection position.
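The equal-weighting method can be sketched in a few lines. The following is an illustrative sketch only, not code from the cited references: it assumes the image is held as a list of rows of intensities, with x as the column index and y as the row index, and the function name `equal_weight_centroid` is hypothetical. It computes the simple averages x.sub.cr = (1/N) sum of x.sub.n and y.sub.cr = (1/N) sum of y.sub.n over the N pixels that exceed the threshold.

```python
def equal_weight_centroid(image, threshold):
    """Return (x_cr, y_cr), the mean coordinates of pixels brighter than threshold.

    `image` is a list of rows of intensities (an assumed layout); x is the
    column index and y is the row index. Returns None if no pixel qualifies.
    """
    sum_x = sum_y = count = 0
    for y, row in enumerate(image):
        for x, intensity in enumerate(row):
            if intensity > threshold:
                # Every qualifying pixel gets the same weight, regardless of
                # how far above the threshold it is.
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return sum_x / count, sum_y / count


# Synthetic example: a clipped reflection of four saturated pixels (255)
# on a dim background, so peak detection alone would be ambiguous.
image = [
    [10, 12, 11, 10, 10],
    [11, 255, 255, 12, 10],
    [10, 255, 255, 11, 10],
    [10, 11, 10, 12, 10],
]
center = equal_weight_centroid(image, threshold=200)
```

With the four clipped pixels at equal intensity, the centroid falls midway between them, which is the behavior the equal-weighting method is meant to provide.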
The center of the pupil is typically found by locating several points on the edge of the pupil, then computing the pupil center coordinates from the edge coordinates. Using an amplitude threshold crossing technique as just described, the edge coordinates are located where the image intensity crosses a pupil detection threshold value T.sub.p set somewhere between the average intensity of the pupil and the average intensity of the iris. The horizontal coordinate of the pupil center can be estimated, for example, by averaging the left and right edge coordinates of one or more horizontal "cuts" through the pupil, and the vertical coordinate of the pupil center can be similarly estimated by averaging the top and bottom edge coordinates of one or more vertical cuts through the pupil. Another way to find the pupil center is to fit a circle or ellipse to the detected edge coordinates and mathematically compute the center of the fitted circle or ellipse.
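The cut-averaging approach can be illustrated as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: it assumes a bright-pupil image (pupil pixels exceed T.sub.p), takes each edge at whole-pixel resolution with no sub-pixel interpolation, and ignores any corneal reflection near the pupil edge. The function names `cut_edges` and `pupil_center_from_cuts` are hypothetical.

```python
def cut_edges(samples, threshold):
    """Return (first, last) indices where samples exceed threshold, or None."""
    inside = [i for i, value in enumerate(samples) if value > threshold]
    if not inside:
        return None
    return inside[0], inside[-1]


def pupil_center_from_cuts(image, threshold, rows, cols):
    """Estimate (x_p, y_p) by averaging edge midpoints over the chosen cuts.

    `rows` lists the horizontal cuts (row indices) and `cols` the vertical
    cuts (column indices). Returns None if no cut crosses the pupil.
    """
    x_mids = []
    for r in rows:
        edges = cut_edges(image[r], threshold)
        if edges:
            # Midpoint of the left and right edge coordinates.
            x_mids.append((edges[0] + edges[1]) / 2)
    y_mids = []
    for c in cols:
        column = [row[c] for row in image]
        edges = cut_edges(column, threshold)
        if edges:
            # Midpoint of the top and bottom edge coordinates.
            y_mids.append((edges[0] + edges[1]) / 2)
    if not x_mids or not y_mids:
        return None
    return sum(x_mids) / len(x_mids), sum(y_mids) / len(y_mids)


# Synthetic bright pupil: a disc of radius 3 centered at x=6, y=5.
image = [
    [200 if (x - 6) ** 2 + (y - 5) ** 2 <= 9 else 50 for x in range(12)]
    for y in range(10)
]
pupil_center = pupil_center_from_cuts(
    image, threshold=125, rows=[4, 5, 6], cols=[5, 6, 7]
)
```

Averaging over several cuts, rather than one, reduces the influence of noise on any single edge crossing.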
In certain orientations of the eye, the corneal reflection lies on or near the edge of the pupil and can disrupt the pupil edge detection procedure. The pupil location procedure should use information about the position of the corneal reflection to avoid attempting pupil edge detection in the region of the corneal reflection.
As described above, prior image processing methods locate the pupil within the camera image by detecting the difference in intensities between the pupil and the surrounding iris. In the usual image of the eye, the pupil appears to be darker than the surrounding iris since more light reflects from the iris than enters the pupil and reflects from the inner eye. This is commonly known as the "dark-pupil" or "dark-eye" effect. Typically, the relative intensities of the iris and pupil are not sufficiently different to make the image processing easy. Alternative methods were sought to increase the contrast ratio between the pupil and its surrounding iris.
The retinas of most eyes are highly reflective. As illustrated in FIG. 2a, the light source 18 is directed at eye 16, and the eye's lens 16-1 causes the light that enters pupil 16-2 to converge to a point on retina 16-3. As illustrated in FIG. 2b, some of the light that reflects from the retina passes back out through the pupil and is partially refocused by lens 16-1 in such a way that it is directed primarily back toward the light source. In flash photography, if the flash lamp is too close to the lens, a significant portion of this reflected light enters the camera lens aperture, producing pictures of people with bright pupils. The phenomenon was therefore named the "bright-eye" or "bright-pupil" effect. The bright-eye effect is avoided in flash photography by moving the flash unit away from the camera lens axis to minimize the amount of reflected light entering the camera lens.
While the bright-eye effect is generally undesirable in most photography, it is useful in optical eye monitoring applications. If, as illustrated in FIGS. 3a and 3b, the light source 18 used to generate the corneal reflection is either mounted coaxially with the camera lens (see FIG. 3a) or disposed with respect to a beam-splitter 26 so as to appear coaxial with the lens (see FIG. 3b), the bright-pupil effect is maximized. If the light source is bright enough, e.g., by use of a lens 18-1 shown in FIG. 3b, the contrast ratio between the iris and the bright pupil in the camera image can be made significantly greater than the contrast ratio between the iris and a dark pupil. With the improved contrast ratio, image processing routines can locate the pupil edges and center more reliably and accurately.
Locating the pupil center and corneal reflection is important because once the coordinates of the centers of the pupil and corneal reflections have been determined, the 2-dimensional vector between the two points can be computed as the difference between the two points: EQU dx=x.sub.p -x.sub.cr EQU dy=y.sub.p -y.sub.cr
where x.sub.p and y.sub.p are the pupil center coordinates within the camera frame of reference, x.sub.cr and y.sub.cr are the corneal reflection center coordinates within the camera frame of reference, and dx and dy are the components of the 2-dimensional pupil-center/corneal-reflection vector.
For example (see FIG. 4), let the pitch and yaw angles .theta. and .psi., respectively, of the eye's orientation be defined with respect to the camera axis. (It will be understood that if it is desired to express the eye's orientation with respect to other coordinate frames, the desired angles may be computed by straightforward mathematical transformations relating the camera coordinate frame to the desired coordinate frame.) Further, let .theta. and .psi. be defined with respect to the eye's optical axis defined as passing through the center of the pupil and being normal to the optical plane of the eye's lens. The optical axis of the eye is different from its visual axis which also passes through the pupil center, but is further defined as passing through the center of the foveola, i.e., that part of the retina where a person focuses his visual concentration. Physiologically, the foveola generally lies somewhere to the person's lateral side of the point where the optical axis intercepts the retina; thus, the visual axis is rotated from the optical axis by an angle .epsilon. of about 2 degrees to 8 degrees horizontally (yaw) and an angle .gamma. of about .+-.3 degrees vertically (pitch). Since the camera's view of the corneal reflection is based on the geometry of the eye's lens and not on the position of the foveola, it is convenient in the image processing domain to work in terms of the eye's optical axis rather than its visual axis. If it is desired to know the orientation of the eye's visual axis, the eye's optical axis pitch and yaw angles .theta. and .psi. can be adjusted by the angular difference between the eye's optical and visual axes.
FIG. 4 shows a side schematic view of the camera looking at eye 16. Within the camera, the camera's sensor plane 12-1 is a distance s from lens plane 14-1. In the following description, the light source that illuminates the eye is assumed to be located at the center of the camera lens. The location of eye 16 in the camera coordinate frame can be defined in terms of a range R along the camera axis from the camera lens to the center of the pupil and a corneal offset angle .alpha..sub.cr between the camera axis and the center of the corneal reflection. Similarly, an angle .alpha..sub.p is the angle between the camera axis and the pupil center. A line from the center of the eye's pupil through the center of the camera lens is called the pupil ray. An extension of the pupil ray through the camera lens intercepts the camera sensor plane at the point y.sub.p. By the law of optical reflection, the corneal reflection lies at the point on the cornea where the surface normal passes through the center of the camera lens, because the light source is located there. The line from the corneal reflection through the center of the camera lens is called the corneal reflection ray. An extension of the corneal reflection ray through the camera lens intercepts the camera sensor plane at the point y.sub.cr. It can also be noted that an extension of the corneal reflection ray in the opposite direction passes through the cornea's center of curvature.
A procedure for computing the eye's orientation or pitch angle .theta. is as follows. An angle .eta. between the corneal reflection ray and the eye's optical axis is the sum of the corneal offset angle .alpha..sub.cr and the eye's optical axis orientation angle .theta.: EQU .eta.=.alpha..sub.cr +.theta.
The corneal offset angle .alpha..sub.cr can be calculated from the corneal reflection position y.sub.cr on the sensor plane 12-1 and the distance s: EQU .alpha..sub.cr =arctan (y.sub.cr /s)
As defined above, the distance dy on the camera sensor plane is the difference between the measured pupil center y.sub.p and the measured corneal reflection position y.sub.cr : EQU dy=y.sub.p -y.sub.cr
By reason of similar triangles, a distance dy', measured parallel to the camera's lens plane, between the actual pupil center and the corneal reflection ray is: EQU dy'=dy R/s
A distance dy", measured normal to the corneal reflection ray, between the actual pupil center and corneal reflection ray is: EQU dy"=dy' cos(.alpha..sub.cr)
Defining the distance from the corneal center of curvature to the pupil center to be .rho., the angle .eta. is found from dy": EQU .eta.=arcsin (dy"/.rho.)
Finally, the eye's optical axis pitch angle .theta. is: EQU .theta.=.eta.-.alpha..sub.cr
Using small-angle approximations, the above equations combine and reduce to: EQU .theta..perspectiveto. (R/.rho.s) {y.sub.p (1-.rho./R) -y.sub.cr }
and since the distance .rho. between the corneal center of curvature and the pupil center is small with respect to the range R from the camera lens to the eye, the pitch angle approximation can be further simplified to: EQU .theta..perspectiveto. (R/.rho.s) (y.sub.p -y.sub.cr)=(R/.rho.s)dy
By similar derivation, the eye's optical axis yaw angle .psi. is approximated by: EQU .psi..perspectiveto. (R/.rho.s) (x.sub.p -x.sub.cr)=(R/.rho.s)dx
The last two approximations relate the measured vector components dy and dx to the eye's optical axis orientation angles .theta. and .psi. with the factor (R/.rho.s) as a constant of proportionality.
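To see how closely the linear form tracks the step-by-step procedure, the two can be compared numerically. The sketch below is illustrative only: the function names are hypothetical, and the numeric values of s, R, .rho., y.sub.p and y.sub.cr are assumed for the example rather than taken from any measurement. All lengths are in the same unit and angles are in radians.

```python
import math


def pitch_exact(y_p, y_cr, s, R, rho):
    """Pitch angle theta from the step-by-step procedure."""
    alpha_cr = math.atan(y_cr / s)             # corneal offset angle
    dy = y_p - y_cr                            # vector component on the sensor
    dy_prime = dy * R / s                      # scaled to the eye by similar triangles
    dy_dprime = dy_prime * math.cos(alpha_cr)  # normal to the corneal reflection ray
    eta = math.asin(dy_dprime / rho)           # angle between ray and optical axis
    return eta - alpha_cr


def pitch_approx(y_p, y_cr, s, R, rho):
    """Simplified small-angle form: theta ~ (R / (rho * s)) * dy."""
    return (R / (rho * s)) * (y_p - y_cr)


# Assumed illustrative geometry (millimeters): lens-to-sensor distance,
# camera-to-eye range, and corneal-center-to-pupil-center distance.
s, R, rho = 50.0, 500.0, 4.5
y_p, y_cr = 0.55, 0.50

theta_exact = pitch_exact(y_p, y_cr, s, R, rho)
theta_approx = pitch_approx(y_p, y_cr, s, R, rho)
difference = abs(theta_exact - theta_approx)
```

In this example the two values differ mainly by the corneal offset angle, which the simplified form drops. That residual is one reason the proportionality factor is better treated as a fitted constant than as a computed one.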
In some eyegaze tracking applications it is desired to determine the gaze point (x.sub.g, y.sub.g) on a predetermined display plane at which the person is looking. For example, it may be desired to determine where a person is looking on a computer monitor screen or on a printed page. In these applications, a projection is made from the pupil of the eye, along the eye's visual axis, to the intercept point on the display plane. If the display plane is approximately parallel to the camera lens plane, and if the x and y axes of the display plane are approximately parallel to the x and y axes of the camera lens plane, the following derivation shows how to approximate the gaze point from the pupil-center/corneal-reflection vector.
FIG. 5 shows a side schematic view of camera 12, lens 14 and lens plane 14-1, eye 16 and display plane 28. A distance y.sub.o represents the offset of the origin of the display plane with respect to the camera axis, and D is a distance between lens plane 14-1 and display plane 28. From the geometry in FIG. 5, it can be determined that the vertical component y.sub.g of the gaze point is given by: EQU y.sub.g =R sin (.alpha..sub.p)+(R+D) sin (.theta.+.gamma.)-y.sub.o
where .gamma. is the angle between the eye's optical and visual axes in the pitch plane. Using small-angle approximations and the above approximation of .theta., and noting .alpha..sub.p .perspectiveto.y.sub.p /s yields: ##EQU2## Since {1-(R+D)/R} is much smaller than (R+D) /.rho., y.sub.g can be further approximated as: EQU y.sub.g .perspectiveto. {(R+D)R/.rho.s}dy+{(R+D).gamma.-y.sub.o }
Similarly, the horizontal component of the gaze point may be approximated: EQU x.sub.g .perspectiveto. {(R+D)R/.rho.s}dx+{(R+D).epsilon.-x.sub.o }
where .epsilon. is the angle between the eye's optical and visual axes in the yaw plane, and x.sub.o is the horizontal offset of the origin of the display plane with respect to the camera axis.
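The step from the geometric expression for y.sub.g to its linear approximation can be written out explicitly. The following is a reconstruction from the surrounding definitions, offered as a worked check rather than as the original intermediate equation. It substitutes the small-angle forms sin(.alpha..sub.p) ≈ y.sub.p /s and sin(.theta.+.gamma.) ≈ .theta.+.gamma., together with the first approximation of .theta. given above.

```latex
\begin{align*}
y_g &\approx R\,\frac{y_p}{s} + (R+D)(\theta+\gamma) - y_o \\
    &\approx \frac{R\,y_p}{s}
       + (R+D)\,\frac{R}{\rho s}\left\{ y_p\left(1-\frac{\rho}{R}\right) - y_{cr} \right\}
       + (R+D)\gamma - y_o \\
    &= \frac{(R+D)R}{\rho s}\,dy
       + \frac{R}{s}\left\{ 1-\frac{R+D}{R} \right\} y_p
       + \left\{ (R+D)\gamma - y_o \right\}
\end{align*}
```

Dropping the middle term, whose coefficient carries the factor {1-(R+D)/R}, leaves the linear form in dy plus a constant offset. The horizontal component follows in the same way with x, dx, .epsilon. and x.sub.o in place of y, dy, .gamma. and y.sub.o.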
In practice, the linearization assumptions introduce errors into the gaze point projection equations; the values of R, D, s, .rho., .gamma., .epsilon., x.sub.o and y.sub.o are difficult to determine accurately; and the display screen is rarely parallel to the camera lens plane. Therefore, explicit calculation of the above gaze point equations is impractical. The form of the equations is highly useful, however, in that they show how the pupil-center/corneal-reflection vector varies with respect to gaze point motions. Keeping the form of the approximations but generalizing the coefficients yields: EQU x.sub.g .perspectiveto.a.sub.o +a.sub.x dx EQU y.sub.g .perspectiveto.b.sub.o +b.sub.y dy
where the a's and b's are generalized, lumped coefficients.
To make it more comfortable for the user to view, it is often desirable to tilt the display plane 28, such as a computer monitor, forward with respect to the camera lens so that the user's view is approximately normal to the plane and the camera "looks up" at the user's eye. Tilt and roll of the display plane can largely be accommodated if the following more general approximations are used: EQU x.sub.g .perspectiveto.a.sub.o +a.sub.x dx+a.sub.y dy+a.sub.xy dx dy EQU y.sub.g .perspectiveto.b.sub.o +b.sub.x dx+b.sub.y dy +b.sub.xy dx dy
A geometrical analysis of the camera/eye/display configuration shows that the terms a.sub.y dy and b.sub.x dx generally compensate for roll of the display screen about the camera's optical axis, and the cross-product terms a.sub.xy dx dy and b.sub.xy dx dy compensate for tilt of the display plane about the camera's pitch and yaw axes respectively.
Values for the generalized coefficients are typically generated by the following calibration procedure: the user sequentially looks at a series of predetermined locations on the display plane 28, the computer measures and records the pupil-center/corneal-reflection vectors for each of those calibration points, and curve fitting methods such as linear regression are used to determine the coefficient values that allow the gaze point equations to generate those calibration points best.
The locations of the predetermined calibration points should be distributed fairly evenly in both dimensions over the display plane 28 where accurate eye-orientation or gaze-point projection is desired. Generally, to get sufficiently accurate coefficient values to achieve adequate gaze-point accuracies, the number of calibration points should be at least twice the number of coefficients used in the prediction equations.
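The calibration fit can be sketched as an ordinary least-squares problem for one gaze coordinate. The example below is an illustration under stated assumptions, not a prescribed implementation: it solves the normal equations by Gaussian elimination, uses synthetic noise-free calibration data on a 3-by-3 grid (nine points for four coefficients), and the function names and coefficient values are hypothetical. The same fit would be run a second time with the y.sub.g targets to obtain the b coefficients.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_coefficients(vectors, targets):
    """Least-squares fit of g ~ c0 + c1*dx + c2*dy + c3*dx*dy."""
    rows = [[1.0, dx, dy, dx * dy] for dx, dy in vectors]
    k = 4
    # Normal equations: (A^T A) c = A^T g
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Atg = [sum(r[i] * g for r, g in zip(rows, targets)) for i in range(k)]
    return solve_linear(AtA, Atg)


def predict(coeffs, dx, dy):
    """Evaluate the generalized gaze-point approximation."""
    c0, c1, c2, c3 = coeffs
    return c0 + c1 * dx + c2 * dy + c3 * dx * dy


# Synthetic calibration: vectors measured at nine evenly spread points,
# with targets generated from assumed "true" coefficients.
true_a = [100.0, 40.0, 3.0, 0.5]
vectors = [(dx, dy) for dx in (-1.0, 0.0, 1.0) for dy in (-1.0, 0.0, 1.0)]
x_targets = [predict(true_a, dx, dy) for dx, dy in vectors]

a_fit = fit_coefficients(vectors, x_targets)
x_gaze = predict(a_fit, 0.5, -0.5)
```

With real measurements the targets are noisy, so the fitted coefficients only approximate the underlying relationship. Using more calibration points than coefficients is what lets the fit average that noise out.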
A central goal in most eye monitoring applications is to measure the orientation of the eye with as much accuracy as possible. As described above, the accuracy of the orientation calculations is directly dependent on the accuracy of the measurements of the centers of the pupil and the corneal reflection. In turn, key factors limiting the coordinate measurements of the pupil and corneal reflection centers are the spatial resolution of the digitized image of the eye and amplitude noise on the light intensity samples. It is an objective of the present invention to maximize the accuracy of the pupil center and corneal reflection location measurements in light of the resolution and noise constraints of digital image acquisition, and to carry out those measurements rapidly. It will be appreciated that the advantages of the present invention can be realized in a wide variety of applications, and are obtained in eyegaze tracking using either the dark-eye or bright-eye effect.