In the past, classifying the gaze direction of a vehicle driver has been important for detecting, among other things, a drowsy driver. Moreover, gaze detection systems, used in conjunction with external sensors such as infra-red, microwave or sonar ranging that detect obstacles in the path of the vehicle, are useful for determining whether a driver is paying attention to possible collisions. If the driver is not paying attention and is instead looking away from the direction of travel, it is desirable to sound an alarm automatically. Such automatic systems are described in "Sounds and scents to jolt drowsy drivers", Wall Street Journal, page B1, May 3, 1993. Furthermore, more sophisticated systems might attempt to learn the characteristic activity of a particular driver prior to maneuvers, enabling anticipation of those maneuvers in the future.
An explicit quantitative approach to this problem involves (a) calibrating the camera used to observe the driver, modeling the interior geometry of the car, and storing this as a priori information, and (b) making an accurate 3D metric computation of the driver's location, head pose and gaze direction. Generating a 3D ray for the driver's gaze direction in the car coordinate frame then determines what the driver is looking at.
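By way of illustration only, the final step of this quantitative approach can be sketched as follows. The sketch assumes that the extrinsic calibration is available as a rotation matrix and a translation vector mapping camera coordinates to car coordinates, and that the interior of the car is modeled as a set of flat rectangular regions. The names `Region`, `camera_to_car` and `gaze_target`, and all dimensions used, are hypothetical and are not taken from any particular system.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Vec = Tuple[float, float, float]

def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rotate(R: Sequence[Vec], v: Vec) -> Vec:
    # Multiply a 3x3 rotation matrix (given as three rows) by a vector.
    return (dot(R[0], v), dot(R[1], v), dot(R[2], v))

@dataclass
class Region:
    # Hypothetical flat rectangular patch of the car interior, in car coordinates.
    name: str
    center: Vec
    u: Vec          # unit vector along one edge
    v: Vec          # unit vector along the other edge, orthogonal to u
    half_u: float   # half-length along u, in meters
    half_v: float   # half-length along v, in meters

def camera_to_car(R: Sequence[Vec], t: Vec, eye_cam: Vec, gaze_cam: Vec) -> Tuple[Vec, Vec]:
    # Apply the assumed extrinsic calibration: points are rotated and
    # translated, directions are rotated only.
    p = rotate(R, eye_cam)
    eye_car = (p[0] + t[0], p[1] + t[1], p[2] + t[2])
    return eye_car, rotate(R, gaze_cam)

def gaze_target(eye: Vec, gaze: Vec, regions: List[Region]) -> Optional[str]:
    # Intersect the gaze ray with each region's plane and keep the nearest
    # hit that lies in front of the driver and inside the rectangle.
    best_name, best_s = None, float("inf")
    for r in regions:
        n = cross(r.u, r.v)
        denom = dot(gaze, n)
        if abs(denom) < 1e-9:      # ray parallel to the plane
            continue
        s = dot(sub(r.center, eye), n) / denom
        if s <= 0 or s >= best_s:  # behind the eye, or farther than best hit
            continue
        hit = (eye[0] + s * gaze[0], eye[1] + s * gaze[1], eye[2] + s * gaze[2])
        rel = sub(hit, r.center)
        if abs(dot(rel, r.u)) <= r.half_u and abs(dot(rel, r.v)) <= r.half_v:
            best_name, best_s = r.name, s
    return best_name

# Illustrative geometry only: x forward, y left, z up, in meters.
REGIONS = [
    Region("windshield", (1.0, 0.0, 1.2), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), 0.7, 0.3),
    Region("left_mirror", (0.8, 0.9, 1.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), 0.1, 0.1),
]
IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
```

With these assumed values, calling `camera_to_car(IDENTITY, (0.5, 0.3, 1.0), (-0.5, 0.0, 0.2), (1.0, 0.0, 0.0))` and passing the result to `gaze_target` together with `REGIONS` identifies the region struck by the gaze ray. Every quantity fed into this computation (the rotation, the translation, the eye position and the gaze direction) must be metrically accurate for the result to be meaningful.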
There are problems with this approach. Firstly, although the geometry of the car's interior will usually be known from the manufacturer's design data, the camera's intrinsic parameters, such as focal length, and extrinsic parameters, such as location and orientation relative to the car coordinate frame, need to be calibrated. The extrinsic calibration may change over time due to vibration. Furthermore, the location of the driver's head, the head pose and the eye direction must be computed in the car coordinate frame at run-time. This is difficult to do robustly, and is computationally intensive for the typical low-power processor installed in a car.