Interaction with computing devices is a fundamental action in today's world. Computing devices, such as personal computers, tablets, and smartphones, are found throughout daily life. In addition, wearable computing devices, such as wearable headset devices (e.g., virtual reality headsets and augmented reality headsets), are becoming more popular. The systems and methods for interacting with such devices define how they are used and what they are used for.
Advances in eye tracking technology have made it possible to interact with a computing device using a person's gaze information, that is, the location on a display at which the user is gazing. This information can be used as the sole means of interaction, or in combination with a contact-based interaction technique (e.g., using a user input device, such as a keyboard, a mouse, a touch screen, or another input/output interface).
Previously proposed interaction techniques using gaze information can be found in U.S. Pat. No. 6,204,828, United States Patent Application Publication 20130169560, U.S. Pat. No. 7,113,170, United States Patent Application Publication 20140247232, and U.S. Pat. No. 9,619,020. The full specifications of these patents and applications are herein incorporated by reference.
Generally, gaze-based interaction techniques rely on detecting the gaze of a user on a gaze point. Existing systems and methods can accurately detect two-dimensional (2D) gaze. Recently, neural networks have been implemented to detect such 2D gaze.
Attempts have been made to extend existing techniques that rely on neural networks to three-dimensional (3D) gaze. However, the accuracy of 3D gaze prediction is lower than that of 2D gaze prediction. Absent accurate 3D gaze tracking, support for stereoscopic displays and 3D applications is significantly limited. Further, even in the 2D domain, a neural network is typically trained for a specific camera and screen configuration (e.g., image resolution, focal length, distance to a screen of a computing device, a size of the screen, and the like). Thus, any time the configuration changes (e.g., different image resolution, different screen size, and the like), the neural network can no longer predict 2D gaze at an acceptable accuracy, and the neural network would need to be re-trained for the new configuration.