Optical motion capture systems are typically based on high-contrast video imaging of retro-reflective markers, which are attached to strategic locations on an object. The markers are usually spherical objects or disks ranging from 1 to 5 cm in diameter, depending on the motion capture hardware and the size of the active area, and are sometimes covered by reflective materials. Some markers may incorporate infra-red (IR) light emitting diodes (LEDs) to enhance contrast. The number, size, shape, and placement of markers depend on the type of motion to be captured and the desired quality (accuracy). In many applications, multi-DoF (degree-of-freedom) measurements are required. For instance, in addition to three position coordinates, the angular orientation of an object may also need to be calculated. In these cases, a rigid combination of several markers, called a “rigid body”, is usually used for tracking. Obtaining more detailed rotational information always requires additional markers. Additional markers can also provide redundancy and overcome occlusion problems.
In conventional optical trackers, each marker is tracked by an array of high-resolution high-speed digital cameras that cover a working area. The number of cameras depends on the type of motion capture. To enhance the contrast, each camera is equipped with IR LEDs and IR pass filters over the camera lens. Appropriate software receives the 2D coordinates of markers, as captured by the tracking cameras, and calculates the positions of individual markers (3 DoF) or rigid bodies (6 DoF). Adding more cameras helps alleviate performance issues, but further drives up the system complexity and cost.
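As a rough illustration of how 2D marker observations from several cameras can be turned into a 3D marker position, the following minimal sketch triangulates one marker from two camera rays using the midpoint method. It is not taken from any of the systems discussed here. It assumes that each camera is already calibrated, so that a 2D image coordinate has been converted into a ray (camera center plus direction) in a common world frame. The function name `triangulate_midpoint` and all numeric values are hypothetical.

```python
def triangulate_midpoint(o1, d1, o2, d2):
    """Estimate a marker position from two camera rays (midpoint method).

    o1, o2: camera centers (x, y, z) in a shared world frame (assumed known).
    d1, d2: ray directions from each camera toward the marker (assumed to
            come from calibrated 2D image coordinates).
    Returns the midpoint of the shortest segment joining the two rays.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    w = [a - b for a, b in zip(o1, o2)]          # vector between ray origins
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; position is not observable")

    s = (b * e - c * d) / denom                  # parameter along ray 1
    t = (a * e - b * d) / denom                  # parameter along ray 2
    p1 = [o + s * k for o, k in zip(o1, d1)]     # closest point on ray 1
    p2 = [o + t * k for o, k in zip(o2, d2)]     # closest point on ray 2
    return tuple((u + v) / 2.0 for u, v in zip(p1, p2))


# Hypothetical example: two cameras 1 m apart observing the same marker.
position = triangulate_midpoint((0.0, 0.0, 0.0), (0.5, 0.0, 2.0),
                                (1.0, 0.0, 0.0), (-0.5, 0.0, 2.0))
print(position)  # (0.5, 0.0, 2.0)
```

With noisy measurements the two rays generally do not intersect, which is one reason additional cameras can improve the estimate at the cost of added complexity.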
Optical tracking can be applied to tracking eye movements and measuring (computing) the gaze direction of a person who is looking at a particular point in space or on a display. However, these measurements are difficult to achieve in practice and require high-precision instruments as well as sophisticated data analysis and interpretation. Over the years, a variety of eye-gaze (eye-movement) tracking techniques have been developed, such as Purkinje Image Tracking, Limbus, Pupil, Eyelid Tracking, Cornea and Pupil Reflection Relationship (see, for example, “Remote Eye Tracking: State of the Art and Directions for Future Development”, The 2nd Conference on Communication by Gaze Interaction - COGAIN 2006: Gazing into the Future, pp. 1-5, Sep. 4-5, 2006, Turin, Italy by M. Böhme et al.). Despite the number of developed techniques, the concomitant improvement in performance has been rather modest. Developers are still being challenged by problems related to head movement, tracker over-sensitivity and/or unreliable calibration. Some systems require complicated personal calibration procedures, others are based on wearable head-mounted gear, and some restrict the user's head position to a narrow area. Every remote eye-tracking system (i.e. one in which no head-mounted gear is used) has its own problem of head movement compensation, which must be addressed appropriately. For example, a small 2D mark may be attached to the head, or to a cap worn on it, and used as a reference in order to compensate for head movement. The use of nose-feature image extraction as a reference point has also been explored; see, for example, U.S. Pat. No. 5,686,942 by Ball. A hybrid approach to head movement compensation, combining optical eye-tracking with magnetic or ultrasound trackers, has been explored as well. The drawbacks of these approaches are the need for a separate control unit and the use of bulky transmitters.
In sum, there is still an urgent need for an accurate, unobtrusive, and reliable method and system of real-time gaze-tracking.
Currently, the most promising approach is the near-infrared reflection (NIRM) two-point optical gaze-tracking method. It requires no mounted equipment and allows for small head motions. The tolerance to small head movements is gained by tracking two reflection points (glints) of the eye and distinguishing head movements (points move together without changing their relative position) from eye movements (points move with respect to one another). One of the most common variations of the NIRM gaze-tracking method employs pupil and Purkinje image processing and is a relatively accurate gaze tracking technique (see, for example, “Pupil Detection and Tracking Using Multiple Light Sources”, Image and Vision Computing, Volume: 18, No. 4, pp. 331-336, 2000 by C. H. Morimoto et al.; also “Single camera Eye-Gaze Tracking System with Free Head Motion”, ETRA 2006, pp. 87-94 by G. Hennessey et al.; also U.S. Pat. Nos. 5,604,818 by Saitou, 4,973,149 by Hutchinson, and 6,152,563 by Hutchinson). This method is frequently employed in advanced research and commercial products.
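The distinction between head and eye movements drawn in the two-point method can be illustrated with a minimal sketch. It assumes that the pixel positions of the two tracked points are already available for consecutive frames. The function name `classify_motion`, the tolerance `tol`, and the sample coordinates are hypothetical and do not come from the cited works.

```python
import math


def classify_motion(prev, curr, tol=0.5):
    """Label frame-to-frame motion of two tracked points.

    prev, curr: ((x1, y1), (x2, y2)) pixel positions of the two points in
                the previous and current frame.
    tol:        hypothetical pixel tolerance below which motion is ignored.
    Returns "eye" if the points moved relative to one another, "head" if
    they moved together, and "still" otherwise.
    """
    (p1, p2), (c1, c2) = prev, curr

    # Change in the vector joining the two points (relative motion).
    rel_dx = (c2[0] - c1[0]) - (p2[0] - p1[0])
    rel_dy = (c2[1] - c1[1]) - (p2[1] - p1[1])
    relative_change = math.hypot(rel_dx, rel_dy)

    # Displacement of the midpoint of the two points (common motion).
    mid_dx = (c1[0] + c2[0] - p1[0] - p2[0]) / 2.0
    mid_dy = (c1[1] + c2[1] - p1[1] - p2[1]) / 2.0
    common_shift = math.hypot(mid_dx, mid_dy)

    if relative_change > tol:
        return "eye"
    if common_shift > tol:
        return "head"
    return "still"


previous = ((10.0, 10.0), (14.0, 10.0))
print(classify_motion(previous, ((13.0, 12.0), (17.0, 12.0))))  # head
print(classify_motion(previous, ((10.0, 10.0), (16.0, 11.0))))  # eye
```

A fixed pixel tolerance is the simplest possible decision rule; a practical tracker would likely need to account for measurement noise and for movements in which both effects occur at once.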
Much effort has been dedicated to NIRM-based tracking in an attempt to improve its robustness, mitigate the head-movement problem, and simplify calibration. In particular, physiological aspects and appearance of eyes as well as head/eye motion dynamics were analyzed in “Detecting and Tracking Eyes by using their Physiological Properties, Dynamics, and Appearance”, Proceedings of CVPR, pp. 163-168, 2000 by A. Haro et al. The use of multiple cameras and light sources has also been considered in “A Calibration-Free Gaze Tracking Technique”, In Proceedings of the International Conference on Pattern Recognition (ICPR'00), pp. 201-204, 2000 by S. W. Shih et al. A real-time imaging system reported in “A TV camera system which extracts feature points for non-contact eye movement detection”, In Proceedings of the SPIE Optics, Illumination, and Image Sensing for Machine Vision IV, Vol. 1194, pp. 2-12, 1989 by A. Tomono et al. comprised three CCD cameras and two NIR light sources at two wavelengths, one of which was polarized. The system implemented combined differencing and thresholding and allowed for pupil and corneal reflection segmentation by using different CCD sensitivity and polarization filtering. A relatively robust method for gaze direction detection was realized by combining a real-time stereovision technique (to measure head position) with detection of the limbus (the boundary between a bright sclera and a dark iris); see “An Algorithm for Real-time Stereo Vision Implementation of Head Pose and Gaze Direction Measurement”, In Proceedings of Fourth IEEE International Conference on Automatic Face and Gesture Recognition, pp. 499-504, 2000 by Y. Matsumoto et al. However, this method suffered from a low angular resolution and compromised accuracy. In “Just Blink Your Eyes: A Head-Free Gaze Tracking System”, Proceedings of CHI2003, pp. 950-951, 2003 by T. Ohno et al., a stereo imaging unit (two CCD cameras) for determining eye positioning was complemented by a conventional gaze tracking imaging unit for detecting the pupil and corneal reflection (Purkinje image). As yet another example of prior art, U.S. Pat. No. 7,206,435 by Fujimura et al. discloses a combination of bright and dark pupil images by implementing two spatially separate light-emitting diode arrays (rings): one around the camera lens and another far from the lens. While the robustness of pupil tracking may be improved with this implementation, the problem of head movement compensation remains unresolved.
Since the NIRM invokes the use of illumination, several studies have attempted to optimize its use. It is known that when an on-axis light source is used (i.e. it is positioned coaxially with the camera optical axis), the pupil appears bright because the light reflected from the eye interior is able to reach the camera. On the contrary, illumination by an off-axis source generates a dark pupil image. Commercial remote eye tracking systems mostly rely on a single light source positioned either off-axis or on-axis (point or ring-type). Examples are presented by eye trackers made by ISCAN Incorporated from Burlington, Mass.; LC Technologies from McLean, Va.; ASL from Bedford, Mass. Some other approaches employ multiple cameras with multiple on-axis light sources and attempt to estimate the line of sight without using any of the user-dependent parameters (‘calibration free’), see, for example, U.S. Pat. No. 6,659,611 by Amir et al. and “A Calibration-Free Gaze Tracking Technique”, in Proceedings of the International Conference on Pattern Recognition (ICPR'00), pp. 201-204, 2000 by S. W. Shih et al.
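The bright-pupil and dark-pupil effects described above are often exploited together by differencing an on-axis (bright pupil) frame against an off-axis (dark pupil) frame and thresholding the result. The following minimal sketch illustrates that general idea only and is not the algorithm of any cited system. It assumes two aligned grayscale frames given as lists of rows of 0-255 intensities. The function names `pupil_mask` and `centroid`, the threshold value, and the toy images are hypothetical.

```python
def pupil_mask(bright, dark, threshold=50):
    """Mark pixels that are much brighter under on-axis illumination.

    bright, dark: aligned grayscale frames (lists of rows, values 0-255)
                  captured with on-axis and off-axis light, respectively.
    threshold:    hypothetical minimum intensity difference for a pupil pixel.
    """
    return [[1 if b - d > threshold else 0 for b, d in zip(b_row, d_row)]
            for b_row, d_row in zip(bright, dark)]


def centroid(mask):
    """Return the (x, y) centroid of the marked pixels, or None if empty."""
    points = [(x, y) for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    if not points:
        return None
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


# Toy 3x3 frames: only the center pixel changes between illuminations.
bright_frame = [[40, 40, 40], [40, 200, 40], [40, 40, 40]]
dark_frame = [[40, 40, 40], [40, 30, 40], [40, 40, 40]]
print(centroid(pupil_mask(bright_frame, dark_frame)))  # (1.0, 1.0)
```

Because the background looks similar under both illuminations, it largely cancels in the difference, which is why the pupil stands out. The sketch ignores motion between the two frames, which a real system would have to handle.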
It is worthwhile to note that despite the obvious advantages of using bright pupil conditions, this type of illumination inherently limits tracking resolution due to a relatively high sensitivity to head movements. This is especially true of head movements along the line of sight. Another NIRM-based approach invokes the use of two optical point sources for simultaneous eye position and gaze detection, as disclosed in U.S. Pat. No. 5,604,818 by Saitou et al. and in “Single camera Eye-Gaze Tracking System with Free Head Motion”, ETRA 2006, pp. 87-94 by G. Hennessey et al. However, the accuracy of this approach is compromised in cases when glints are located close to the optical axis (i.e. the straight line connecting the origin (camera) and object (cornea center)). In these cases, the error is proportional to the square of the angle between the optical axis and the direction to the glints. Thus, the error increases rapidly with the distance between glints.
While the aforementioned prior art methods are useful advances in the field of eye-gaze tracking, they are adversely affected by short tracking ranges, frequent recalibration, low stability (frequent loss of tracking) and high cost. All these deficiencies prevent object/eye trackers from being used by the general public and limit their applications to research laboratories and boutique markets.
In light of the above review of prior art, there is an apparent need for an unobtrusive optical tracking method that is robust under various light conditions and stable within a wider range of tracking. Such a method should obviate recalibration and have high speed and good angular resolution within an extended range of tracking. Finally, the method should lend itself to making an inexpensive optical tracking device suitable for mass production.