The ideal human-computer interaction system should function robustly with as few constraints as those found in human-to-human interaction. One of the most effective means of interaction is through the behavior of the eye. Specifically, knowledge of the viewing direction and thus the area of regard offers insight into the user's intention and mental focus, and consequently this information is vital for the next generation of user interfaces [21][22]. Applications in this vein can be found in the fields of HCI, security, advertising, psychology, and many others [5][21].
As such, there has been intensive research on eye tracking and gaze estimation for the past 30 years [5]. However, with the conventional single-camera system, either the user must keep his or her head locked in place (so that it remains within the narrow field of view), the camera must be strapped to the subject's head (a common approach for eye detection), or some other marker (like a small infrared light) must be worn by the subject [2]. The resulting situation can be prodigiously inconvenient and uncomfortable for the user. Because of the unique reflective properties of the pupil to infrared (IR) light, some systems have opted to detect the face by first finding the eyes [2][10][11]. While robust to visible light conditions, these methods have issues with changes in head pose, reflections off glasses, and even decreased reflectivity of the retina from contact lenses [9]. An alternative approach is to use stereo cameras [2][11]; while robust, these systems generally require substantial calibration time. Some complex systems (like “smart rooms” [1]) require expensive and/or non-portable setups. Even existing, non-stereo, two-camera systems [16] often restrict the user to a preset location in the room.
The overwhelming majority of gaze estimation approaches rely on glints (the reflection of light off the cornea) to construct 2D or 3D gaze models [5]. Alternatively, eye gaze may be determined from the pupil or iris contours [18] using ellipse fitting approaches [3][15]. One can also leverage the estimated iris center directly and use its distance from some reference point (e.g., the eye corners) for gaze estimation [12][19]. Indeed, the entire eye region may be segmented into the iris, sclera (white of the eye), and the surrounding skin; the resulting regions can then be matched pixel-wise with 3D rendered eyeball models (with different parameters) [17][20]. However, different subjects, head pose changes, and lighting conditions could significantly diminish the quality of the segmentation [20].
U.S. Pat. No. 8,077,217 provides an eyeball parameter estimating device and method, for estimating, from a camera image, as eyeball parameters, an eyeball central position and an eyeball radius which are required to estimate a line of sight of a person in the camera image. An eyeball parameter estimating device includes: a head posture estimating unit for estimating, from a face image of a person photographed by a camera, position data corresponding to three degrees of freedom (x-, y-, z-axes) in a camera coordinate system, of an origin in a head coordinate system and rotation angle data corresponding to three degrees of freedom (x-, y-, z-axes) of a coordinate axis of the head coordinate system relative to a coordinate axis of the camera coordinate system, as head posture data in the camera coordinate system; a head coordinate system eyeball central position candidate setting unit for setting candidates of eyeball central position data in the head coordinate system based on coordinates of two feature points on an eyeball, which are preliminarily set in the head coordinate system; a camera coordinate system eyeball central position calculating unit for calculating an eyeball central position in the camera coordinate system based on the head posture data, the eyeball central position candidate data, and pupil central position data detected from the face image; and an eyeball parameter estimating unit for estimating an eyeball central position and an eyeball radius based on the eyeball central position in the camera coordinate system so as to minimize deviations of position data of a point of gaze, a pupil center, and an eyeball center from a straight line joining original positions of the three pieces of position data.
U.S. Pat. No. 7,306,337, expressly incorporated herein by reference, determines eye gaze parameters from eye gaze data, including analysis of a pupil-glint displacement vector from the center of the pupil image to the center of the glint in the image plane. The glint is a small bright spot near the pupil image resulting from a reflection of infrared light from an infrared illuminator off the surface of the cornea.
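By way of illustration only, the pupil-glint displacement vector described above can be sketched as a simple image-plane subtraction. The function name and the pixel coordinates below are hypothetical and are not taken from the referenced patent; detection of the pupil and glint centers is assumed to have been done elsewhere.

```python
import math

def pupil_glint_vector(pupil_center, glint_center):
    """Return the displacement (dx, dy) from the pupil center to the
    glint center, both given as (x, y) image-plane coordinates."""
    px, py = pupil_center
    gx, gy = glint_center
    return (gx - px, gy - py)

# Hypothetical detections, in pixels.
vec = pupil_glint_vector((100.0, 80.0), (103.0, 76.0))
length = math.hypot(*vec)  # magnitude of the displacement
```

In a glint-based system, a vector of this kind would typically be fed to a calibrated mapping to obtain gaze parameters; that mapping is not shown here.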
U.S. Pat. Pub. 2011/0228975, expressly incorporated herein by reference, determines a point-of-gaze of a user in three dimensions, by presenting a three-dimensional scene to both eyes of the user; capturing image data including both eyes of the user; estimating line-of-sight vectors in a three-dimensional coordinate system for the user's eyes based on the image data; and determining the point-of-gaze in the three-dimensional coordinate system using the line-of-sight vectors. It is assumed that the line-of-sight vector originates from the center of the cornea estimated in space from image data. The image data may be processed to analyze multiple glints (Purkinje reflections) of each eye.
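One plausible way to determine a three-dimensional point-of-gaze from two line-of-sight vectors, offered here only as a sketch, is to take the midpoint of the shortest segment joining the two lines, since measured lines of sight rarely intersect exactly. The referenced publication does not necessarily use this construction; the function name, the origins (standing in for estimated cornea centers), and the numeric values are assumptions.

```python
def point_of_gaze(o1, d1, o2, d2, eps=1e-9):
    """Midpoint of the shortest segment between the lines o1 + s*d1 and
    o2 + t*d2. Returns None when the lines are (nearly) parallel."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    w = tuple(p - q for p, q in zip(o1, o2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < eps:
        return None  # parallel lines of sight: no unique closest point
    s = (b * e - c * d) / denom  # parameter along the first line
    t = (a * e - b * d) / denom  # parameter along the second line
    p1 = tuple(o + s * v for o, v in zip(o1, d1))
    p2 = tuple(o + t * v for o, v in zip(o2, d2))
    return tuple((u + v) / 2.0 for u, v in zip(p1, p2))

# Hypothetical eyes 6 units apart, both looking at a target 60 units ahead.
pog = point_of_gaze((-3.0, 0.0, 0.0), (3.0, 0.0, 60.0),
                    (3.0, 0.0, 0.0), (-3.0, 0.0, 60.0))
```

The midpoint is a convenient choice because it degrades gracefully with noise; other estimators (for example, weighting one eye more heavily) are equally possible.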
U.S. Pat. No. 6,659,611, expressly incorporated herein by reference, provides eye gaze tracking without calibrated cameras, direct measurements of specific users' eye geometries, or requiring the user to visually track a cursor traversing a known trajectory. One or more uncalibrated cameras imaging the user's eye and having on-axis lighting, capture images of a test pattern in real space as reflected from the user's cornea, which acts as a convex spherical mirror. Parameters required to define a mathematical mapping between real space and image space, including spherical and perspective transformations, are extracted, and subsequent images of objects reflected from the user's eye through the inverse of the mathematical mapping are used to determine a gaze vector and a point of regard.
U.S. Pat. No. 5,818,954, expressly incorporated herein by reference, provides a method that calculates a position of the center of the eyeball as a fixed displacement from an origin of a facial coordinate system established by detection of three points on the face, and computes a vector therefrom to the center of the pupil. The vector and the detected position of the pupil are used to determine the visual axis.
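The general idea of a facial coordinate system with a fixed eyeball offset can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the referenced patent: the way the three face points define the axes, the function names, and the offset values are all hypothetical, and the visual axis is approximated here as the unit vector from the eyeball center through the pupil center.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _unit(a):
    n = math.sqrt(sum(x * x for x in a))
    return tuple(x / n for x in a)

def eyeball_center(p0, p1, p2, offset):
    """Build an orthonormal face frame from three non-collinear face points
    (assumed convention: origin at p0, x-axis toward p1, z-axis normal to
    the face plane) and place the eyeball center at a fixed offset
    expressed in that frame."""
    x_axis = _unit(_sub(p1, p0))
    z_axis = _unit(_cross(x_axis, _sub(p2, p0)))
    y_axis = _cross(z_axis, x_axis)
    return tuple(p0[i] + offset[0] * x_axis[i] + offset[1] * y_axis[i]
                 + offset[2] * z_axis[i] for i in range(3))

def visual_axis(center, pupil):
    """Unit vector from the eyeball center through the pupil center."""
    return _unit(_sub(pupil, center))

# Hypothetical face points, offset, and pupil position.
center = eyeball_center((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
                        (1.0, 2.0, 3.0))
axis = visual_axis(center, (1.0, 2.0, 5.0))
```

Because the offset is expressed in the face frame, the eyeball center moves rigidly with the head as the three face points move.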
U.S. Pat. No. 7,963,652, expressly incorporated herein by reference, provides eye gaze tracking without camera calibration, eye geometry measurement, or tracking of a cursor image on a screen by the subject through a known trajectory. See also, U.S. Pat. No. 7,809,160, expressly incorporated herein by reference. One embodiment provides a method for tracking a user's eye gaze at a surface, object, or visual scene, comprising: providing an imaging device for acquiring images of at least one of the user's eyes; modeling, measuring, estimating, and/or calibrating for the user's head position; providing one or more markers associated with the surface, object, or visual scene for producing corresponding glints or reflections in the user's eyes; analyzing the images to find said glints or reflections and/or the pupil; and determining eye gaze of the user upon a said one or more marker as indicative of the user's eye gaze at the surface, object, or visual scene.
One application of eye gaze tracking is in small or large surfaces, particularly large displays or projected wall or semi-transparent surfaces, including but not limited to LCD screens, computer screens, SMART boards, tabletop displays, projection screens of any type, plasma displays, televisions, any computing appliance, including phones, PDAs, and the like, and head-mounted and wearable displays and the like. In addition, any surface, including, for example, walls, tables, furniture, architectural ornaments, billboards, windows, semi-transparent screens, window displays, clothing racks, commercial displays, posters, stands, any commercial or other goods, clothing, car dashboards, car windows, and the like, may be the gaze target.
By augmenting any shopping display, such as, for example, computer or television screen-based, projected, static surface, objects, goods (e.g., clothing, furniture), with eye gaze determination, eye gaze behavior of subjects (i.e., shoppers) can be tracked for the purpose of registering whether individuals are interested in the goods on display. This can be used for evaluating the design or arrangement of advertisements or arrangements of goods, or for disclosing more information about products or objects to the subject. The following scenario illustrates this application. A clothes rack is augmented with one or more eye tracking cameras, and the clothes or hangers (or any other goods). Cameras detect which item the shopper is interested in by tracking the eye gaze of the shopper. According to one option, when the duration of an eye fixation on an object reaches a threshold, a projection unit displays more information about the goods. Alternatively, in response to a fixation, the subject may be addressed using a recorded message or synthesized computer voice associated with the object of interest, which acts as an automated sales assistant. Alternatively, information about user interest in an article or advertisement may be conveyed to a sales assistant or third party.
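The fixation-duration threshold mentioned in the scenario above can be sketched as a dwell-time detector. This is an illustrative sketch only: the sample format (timestamp in seconds paired with the item currently gazed at, or None), the function name, and the threshold value are assumptions rather than details from any referenced system.

```python
def dwell_triggers(samples, threshold_s):
    """Return (time, item) events at which continuous gaze on one item
    first reaches threshold_s seconds. Each item fires at most once per
    uninterrupted fixation."""
    events = []
    current, start, fired = None, None, False
    for t, item in samples:
        if item != current:
            # Gaze moved to a different item (or away): restart the timer.
            current, start, fired = item, t, False
        if current is not None and not fired and t - start >= threshold_s:
            events.append((t, current))
            fired = True
    return events

# Hypothetical gaze samples from a clothes-rack camera.
samples = [(0.0, "shirt"), (0.5, "shirt"), (1.0, "shirt"), (1.5, None),
           (2.0, "hat"), (2.4, "hat"), (3.0, "hat"), (3.5, "hat")]
events = dwell_triggers(samples, threshold_s=1.0)
```

Each returned event could be used to trigger the projection unit, a recorded message, or a notification to a sales assistant, as described above.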
Interactive or non-interactive home appliances can be augmented using eye tracking, to determine the availability of users for communications with other people or devices. Subjects may direct the target of speech commands to the appliance, or initiate speech dialogue or other forms of disclosure by the appliance through establishing eye gaze fixation with the appliance.
Eye tracking incorporated into a gaming device, portable or otherwise, may provide extra channels of interaction for determining interest in embodied gaming characters. Characters or objects in games can then observe whether they are being looked at by the user and adjust their behavior accordingly, for example by avoiding being seen or by attracting user attention. Alternatively, characters or objects can respond verbally or nonverbally to fixations by the user, engaging the user in verbal, nonverbal, textual, graphical, or other forms of discourse. In the case of speech recognition agents or online human interlocutors, the discourse can be mutual. Alternatively, the technology can be used to allow gaming applications to make use of eye gaze information for any control purpose, such as moving on-screen objects with the eyes, or altering story disclosure or screen-play elements according to the viewing behavior of the user. In addition, any of the above may be incorporated into robotic pets, board games, and toys, which may operate interactively at any level.
By incorporating eye tracking into a television display or billboard (e.g., a screen, paper, or interactive display), broadcasters and/or advertisers can determine what (aspects of) advertisements are viewed by, and hence of interest to, a subject. Advertisers may use this information to focus their message on a particular subject or perceived interest of that subject, or to determine the cost per view of the advertisement, for example, but not limited to, cost per minute of product placements in television shows. For example, this method may be used to determine the amount of visual interest in an object or an advertisement, and that amount of interest used to determine a fee for display of the object or advertisement. The visual interest of a subject looking at the object or advertisement may be determined according to the correlation of the subject's optical axis with the object over a percentage of time that the object is on display. In addition, the method may be used to change the discourse with the television, or any appliance, by channeling user commands to the device or part of the display currently observed. In particular, keyboard or remote control commands can be routed to the appropriate application, window or device by looking at that device or window, or by looking at a screen or object that represents that device or window. In addition, TV content may be altered according to viewing patterns of the user, most notably by incorporating multiple scenarios that are played out according to the viewing behavior and visual interest of the user, for example, by telling a story from the point of view of the most popular character. Alternatively, characters in paintings or other forms of visual display may begin movement or engage in dialogue when receiving fixations from a subject user. 
Alternatively, viewing behavior may be used to determine what aspects of programs should be recorded, or to stop, mute or pause playback of a content source such as DVD and the like.
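The visual-interest measure described above for advertisements, namely the share of display time during which the subject's optical axis coincides with the object, can be sketched as follows. Uniformly spaced boolean gaze samples, the function names, and the linear fee rule are all assumptions made for illustration; the text does not specify how the amount of interest is converted into a fee.

```python
def visual_interest(on_object_samples):
    """Fraction of uniformly spaced samples, taken while the object is on
    display, in which the subject's gaze was on the object."""
    if not on_object_samples:
        return 0.0
    return sum(1 for hit in on_object_samples if hit) / len(on_object_samples)

def display_fee(on_object_samples, full_attention_fee):
    """Hypothetical linear pricing: the fee scales with visual interest."""
    return full_attention_fee * visual_interest(on_object_samples)

# Hypothetical samples: gaze was on the advertisement in 3 of 4 samples.
interest = visual_interest([True, True, False, True])
fee = display_fee([True, True, False, True], full_attention_fee=100.0)
```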
Eye tracking technology can be used to control the location, size, transparency, shape, or motion of visible notification dialogs on large or small screens according to viewing behavior of the user. In particular, on large screens the technology allows the establishment of peripheral vision boundaries of the user's eyes, ensuring that a window is placed in view. On small screens, notification windows can be placed out of the way of the user's foveal vision, and can be acknowledged and removed after the user has viewed them. In addition, the control of any hidden or visible cursor on a display can be used to communicate attention to underlying applications or systems. In addition, activation and zooming or resizing of focus windows, and the reorganization of windows on a display, can be implemented according to the viewing behavior of the user or the movement of the user in front of the display. The latter may be accomplished by allowing users to look at the subsequent focus window, after which a key is pressed to activate this window and make it the front window. This may incorporate zooming of the front window according to an elastic tiled windowing algorithm, or fisheye view zoom of the front window. In addition, the disclosing of attention of others for notes on a public display board, by modulating aspects of size, shape or color of displayed notes, may be accomplished according to the number of times they have been viewed.
Eye tracking can be used to make the content of a display visible only to the current user, by using eye fixations to position a gaze-contingent blurring lens, directional lens, or obstruction that projects the image at the fixation point of that user but not elsewhere. This results in a screen that can only be read by the current user, and not by any other onlooker. Alternatively, the state of the screen may be altered by, for example, but not limited to, darkening, wiping, or changing its contents. Further, visual or auditory notification may be provided upon detecting more than one pair of eyes looking at the display. This is particularly useful when computing devices are used in public, for private matters. Eye tracking may also be used to modulate transparency of surfaces, for example, but not limited to, cubicle walls, upon orientation or co-orientation of the eyes, face(s), or head(s) of a subject or subjects towards that surface.
Eye tracking may also be used in advanced hearing aids to provide aiming information for directional microphones or noise cancelling microphones.
Eye tracking may be incorporated invisibly and without restrictions into vehicles to control dashboard or instrument cluster operation, to alter lighting conditions of vehicle illumination or dashboard indicators and instruments, to reduce impact on visual attention. Displays (including projections on windows) may be altered according to viewing behavior, for example, to ensure that eyes remain focused on the road, or to direct the destination of speech commands to appliances or objects within or outside the vehicle. In addition, the detection of fatigue, the operation of vehicle navigation systems, entertainment systems, visual display units including video or televisions, the selection of channels on a radio or entertainment system, and the initiation and management of remote conversations may all be carried out using the invention, according to the visual attention of the user.
Eye tracking may be used for sensing attention in remote or same-place meetings, for editing recordings of such meetings, or for the purpose of detecting presence or initiating interactions with remote or co-present attendees, or for communicating attendee attention in order to optimize a turn taking process among several remote attendees.
Eye tracking may also be used for sensing user attention towards any mobile or portable computing device to determine when a user is paying attention to the visual information provided on the device. Audiovisual media played on the device may be paused or buffered automatically upon the user looking away from the device. The device may continue playing the buffered audiovisual stream whenever the user resumes looking at the device. For example, a mobile device may provide speed reading facilities. The device streams words across a display screen in a timed manner, allowing the user to read without producing fixations. When the user looks away, the stream of words is paused, and when the user looks back at the device, the stream of words continues.
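The gaze-gated word stream described above can be sketched as a small simulation in which one word is shown per display tick while the user is looking, and the stream pauses otherwise. The function name and the per-tick boolean attention signal are hypothetical; a real device would obtain that signal from its eye tracker and would handle timing itself.

```python
def stream_words(words, looking_per_tick):
    """Return what is shown at each tick: the next word while the user is
    looking at the device, or None while the stream is paused."""
    shown = []
    index = 0
    for looking in looking_per_tick:
        if index >= len(words):
            break  # nothing left to display
        if looking:
            shown.append(words[index])
            index += 1
        else:
            shown.append(None)  # user looked away: hold position
    return shown

# Hypothetical session: the user looks away for two ticks after the first word.
shown = stream_words(["eye", "tracking", "works"],
                     [True, False, False, True, True, True])
```

Because the position in the word list advances only on ticks where the user is looking, no words are skipped when the user looks away and back again.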
Eye contact sensing objects provide context for action, and therefore a programmable system may employ eye tracking or gaze estimation to determine context. A display may be presented, optimized to present different available contexts, from which the user may select by simply looking. When there are multiple contexts, or hybrid contexts, the user may have a complex eye motion pattern which can be used to determine complex contexts.
See, U.S. Pat. Nos. 3,689,135; 4,075,657; 4,102,564; 4,145,122; 4,303,394; 4,641,349; 4,651,145; 4,702,575; 4,755,045; 4,836,670; 4,973,149; 4,975,969; 5,008,946; 5,016,282; 5,231,674; 5,287,437; 5,325,133; 5,331,149; 5,345,281; 5,360,971; 5,428,413; 5,471,542; 5,481,622; 5,583,795; 5,638,176; 5,649,061; 5,668,622; 5,726,916; 5,797,046; 5,805,167; 5,818,954; 5,898,423; 5,912,721; 5,926,251; 5,984,475; 5,991,085; 6,120,461; 6,152,563; 6,154,559; 6,163,336; 6,204,828; 6,215,898; 6,220,706; 6,243,076; 6,246,779; 6,299,307; 6,323,884; 6,369,952; 6,381,339; 6,393,136; 6,397,137; 6,456,737; 6,477,267; 6,478,425; 6,546,121; 6,568,809; 6,578,962; 6,603,504; 6,608,615; 6,634,749; 6,659,611; 6,727,866; 6,753,847; 6,843,564; 6,853,854; 6,943,754; 6,989,754; 6,999,071; 7,040,759; 7,043,056; 7,076,118; 7,084,838; 7,090,348; 7,091,931; 7,138,997; 7,161,596; 7,190,825; 7,197,165; 7,239,293; 7,306,337; 7,307,609; 7,315,324; 7,324,085; 7,331,929; 7,345,664; 7,388,580; 7,391,887; 7,396,129; 7,401,920; 7,423,540; 7,428,001; 7,448,751; 7,460,940; 7,490,941; 7,501,995; 7,503,653; 7,520,614; 7,528,823; 7,538,746; 7,542,210; 7,554,541; 7,556,377; 7,567,702; 7,572,008; 7,583,252; 7,600,873; 7,620,216; 7,626,569; 7,633,493; 7,650,034; 7,665,845; 7,693,256; 7,701,441; 7,702,660; 7,703,921; 7,705,876; 7,706,575; 7,724,251; 7,731,360; 7,742,623; 7,753,523; 7,762,665; 7,766,479; 7,768,528; 7,787,009; 7,801,686; 7,809,160; 7,810,926; 7,815,507; 7,819,525; 7,834,912; 7,839,400; 7,857,452; 7,860,382; 7,862,172; 7,866,818; 7,869,848; 7,872,635; 7,880,739; 7,916,977; 7,925,077; 7,938,540; 7,948,451; 7,963,652; 7,970,179; 7,974,787; 7,983,733; 7,986,318; 7,999,844; 8,014,571; 8,020,993; 8,065,240; 8,069,125; 8,077,217; 8,077,914; 8,099,748; 8,100,532; 8,121,356; 8,130,260; 8,150,796; 8,154,781; 8,155,479; 8,170,293; 8,175,374; 8,195,593; 8,199,186; 8,219,438; U.S. Pub. App. Nos.
20030098954; 20030123027; 20040174496; 20040239509; 20050073136; 20050175218; 20050288564; 20060028400; 20060110008; 20060227103; 20080002262; 20080287821; 20090018407; 20090024050; 20090110245; 20090112616; 20090112617; 20090112620; 20090112621; 20090118593; 20090119154; 20090132275; 20090156907; 20090156955; 20090157323; 20090157481; 20090157482; 20090157625; 20090157660; 20090157751; 20090157813; 20090163777; 20090164131; 20090164132; 20090164302; 20090164401; 20090164403; 20090164458; 20090164503; 20090164549; 20090171164; 20090172540; 20090219484; 20090318773; 20100033333; 20100039617; 20100086200; 20100086221; 20100086278; 20100149073; 20100208205; 20100280372; 20100295774; 20110007275; 20110018903; 20110170065; 20110178784; 20110182472; 20110228975; 20120026276; 20120092618; 20120105486; 20120116559; 20120120498; 20120120499; 20120133889; 20120134548; 20120154633; 20120154920; 20120164613; and Foreign Patent Nos. JP1990224637; JP1995055941; JP2003015816; JP2004504684; JP2005230049; JP2006328548; JP2007055941; JP2007073273; WO2002009025; WO2004045399; WO2008069158, each of which is expressly incorporated herein by reference.