Augmented reality (AR) is an endeavor to transpose legacy interaction methods to a new layer of interfaces superimposed on the world's real objects. The core complexity of this task lies in detecting environment surfaces and semantics across different locations, materials and lighting conditions.
In contrast to most traditional computer interfaces, displaying graphical interfaces in this new medium is not simple, since it does not correspond to a regular square canvas with predictable boundaries. The augmented reality canvas is arbitrary for the most part and generally has to balance three main constraints in order to remain visible and usable: the environment surfaces, the device's field of view (FOV), and general computer resources.
Detecting environment surfaces is as fundamental as it is complex. Commonly built on Simultaneous Localization & Mapping (SLAM) algorithms supported by Inertial Measurement Unit (IMU) data, these systems continuously identify the user's point of view (POV) within a scene, generally by interpreting a sequence of scene features in conjunction with readings from accelerometers and other specialized components. Once the point of view is known, it is still necessary to understand the spatial relations between scene objects and to identify the available planes and surfaces. This whole operation needs to repeat many times per second (desirably 60) in order to provide continuity and responsiveness to the user's ever-moving head pose, which demands heavy R&D investment in hardware design, performance and autonomy, especially for the Head-mounted Displays (HMDs) of wearable smart glasses.
The present invention relates to the natural consequences of all these constraints, especially the field of view limitation and the spatial alienation that users experience when Graphical User Interfaces (GUIs) go beyond their narrow HMD screen area, by providing a novel interaction model that is able to signal and open GUI elements in areas adjacent to the one the user is currently looking at.
In the current state of the art, the following technologies that enable the implementation (technical viability) of the present method can be found:
1. Mixed Reality (MR) is the space between augmented reality and virtual reality, which allows real and virtual elements to be combined in varying degrees. Mixed reality is made possible by improvements in computer vision, graphical processing power, display technology, and input systems.
2. Augmented Reality (AR) contains primarily real elements and, therefore, is closer to reality. For example, a user with an AR application on a smartphone will continue perceiving the real world in the normal way, but with additional elements that are displayed through the smartphone—the real-world experience is dominant.
3. Virtual Reality (VR) immerses the user in a completely computer-generated environment, removing any restrictions as to what a user can do or experience.
4. Input/output (I/O) is the communication between an information processing system (e.g. computer) and the outside world. Inputs are the signals or data received by the system and outputs are the signals sent from it.
5. An input device is any device that enters information into an information processing system from an external source (e.g. keyboards, touch screens, mice, microphones, scanners).
6. Gaze is a form of input and a primary form of targeting within mixed reality. Gaze indicates where the user is looking in the real world and that allows the system to determine the user's intent. It is important to note that MR headsets use the position and orientation of the user's head, not eyes, to determine their gaze vector.
7. On the other side of the input-process-output (IPO) model is the Graphical User Interface (GUI). Almost all digital interfaces nowadays are GUIs. An interface is a set of data, commands/controls and/or menus/navigation displayed on a screen, through which a user communicates with a program; in other words, the GUI is the part of a system through which the user interacts.
8. AR HMDs (Holographic devices and Immersive devices) are the devices that deliver MR/AR experiences. Holographic devices are characterized by the device's ability to place digital content in the real world and Immersive devices are characterized by the device's ability to hide the physical world and replace it with a digital experience, while creating a sense of presence.
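Item 6 above states that the gaze vector is derived from the position and orientation of the user's head. The following is a minimal sketch of that derivation, not an implementation taken from any particular headset. It assumes a right-handed coordinate system with +Y up and -Z as the forward direction at zero rotation, and a head orientation given as yaw and pitch angles; the function name `gaze_ray` and these conventions are hypothetical and chosen only for illustration.

```python
import math


def gaze_ray(head_position, yaw_deg, pitch_deg):
    """Return (origin, direction) of a head-gaze ray.

    Assumed conventions (illustrative only): right-handed axes, +Y up,
    forward is -Z when yaw and pitch are zero, positive yaw turns toward
    +X, and positive pitch tilts the head upward.
    """
    yaw = math.radians(yaw_deg)
    pitch = math.radians(pitch_deg)
    # Unit direction obtained by rotating the forward axis by pitch, then yaw.
    direction = (
        math.sin(yaw) * math.cos(pitch),
        math.sin(pitch),
        -math.cos(yaw) * math.cos(pitch),
    )
    # The ray starts at the head position; eye position is not used.
    return tuple(head_position), direction


if __name__ == "__main__":
    origin, direction = gaze_ray((0.0, 1.7, 0.0), yaw_deg=30.0, pitch_deg=-10.0)
    print(origin, direction)
```

Because the ray depends only on head pose, a sketch like this works the same way whether or not the device also tracks the eyes.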
Solutions that have some similarities with the method of the present invention, but that are technically different or have different objectives/purposes, are described below.
The patent document U.S. Pat. No. 9,317,113 B1, titled “GAZE ASSISTED OBJECT RECOGNITION”, granted on Apr. 19, 2016, by Amazon Technologies, Inc., discloses a method that uses a user's gaze to visually recognize objects in a given scene and to offer context information about a specific object. This method would work with devices such as mobile phones or AR devices, operated with or without the aid of a peripheral device/accessory. The interaction method of the present invention differs in that it describes in detail how this context or any other information would be accessed from the initial reference point, whether that is a surface, an object, a composition, a pattern, the user's body, belongings or accessories, or yet another virtual element.
The patent document US 2014/0168056 A1, titled “ENABLING AUGMENTED REALITY USING EYE GAZE TRACKING”, filed on Mar. 15, 2013, by QUALCOMM INCORPORATED, takes eye-tracking data into consideration to limit object/image recognition to a certain portion of what the user sees at a given moment. In that sense, the recognition results would be shown based on the area of interest instead of just on the general gaze direction. It deviates from the method of the present invention in the very nature of the proposition, since the method of the present invention is completely agnostic to the accuracy of, and the underlying technology supporting, the capture of gaze inputs.
The patent document EP 1 679 577 A1, titled “ADAPTIVE DISPLAY OF EYE CONTROLLABLE OBJECTS”, filed on Jan. 10, 2005, by Tobii Technology AB, discloses the use of eye-tracking input to display an array of items whose sizes change as users look at different portions of an arbitrary modification zone of a computer-based display. While that interaction governs the behavior of an array of elements as the user's eyes are tracked within a certain modification zone, the method of the present invention is completely agnostic to what is to be controlled, to the gaze methods applied, and to the zones used to control change. Instead, it focuses on how one clue can signal and give access to a related, hidden feature, using a flexible gaze-based distance relation to determine when the given feature should be signaled and then delivered to the device's field of view.
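The gaze-based distance relation mentioned above can be illustrated with a short sketch. This is one possible reading under stated assumptions, not the claimed method: distance is taken to be the angle between the gaze direction and the direction toward the clue, and the two thresholds (`signal_deg`, `deliver_deg`), the state names and the function names are all hypothetical.

```python
import math


def angular_distance_deg(a, b):
    """Angle in degrees between two non-zero 3D vectors."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("direction vectors must be non-zero")
    cos_angle = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))


def feature_state(gaze_dir, clue_dir, signal_deg=30.0, deliver_deg=10.0):
    """Classify a hidden feature by how close the gaze is to its clue.

    The thresholds are illustrative assumptions: within deliver_deg the
    feature is delivered to the field of view, within signal_deg it is
    only signaled, and beyond that it stays hidden.
    """
    distance = angular_distance_deg(gaze_dir, clue_dir)
    if distance <= deliver_deg:
        return "delivered"
    if distance <= signal_deg:
        return "signaled"
    return "hidden"


if __name__ == "__main__":
    gaze = (0.0, 0.0, -1.0)
    clue = (math.sin(math.radians(20.0)), 0.0, -math.cos(math.radians(20.0)))
    print(feature_state(gaze, clue))
```

Expressing the relation as an angle keeps the sketch independent of how the gaze direction was captured, which matches the agnostic stance described above; a real system could equally use screen-space or world-space distance.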
The patent document US 2017/0184848 A1, titled “AUGMENTED REALITY DISPLAY SYSTEM WITH VARIABLE FOCUS”, filed on Dec. 29, 2015, by TUOMAS VALLIUS, describes one specific composition for lenses of near-eye displays that enables simulated focal variation in augmented reality applications. It is far removed from the essence of the present invention, since it focuses on a specific composition of materials for near-eye displays, with lenses that can reproduce sensitive focal variation, whereas the present invention concerns how users can interact with the projections that those (and many other) kinds of lenses are able to reproduce.
The patent document US 2016/0284129 A1, titled “DISPLAY, CONTROL, METHOD OF DISPLAY, AND PROGRAM”, filed on Feb. 22, 2016, by SEIKO EPSON CORPORATION, describes the basic concept of an HMD device able to display and update the synchronized images served to each eye according to inputs of eye distance and controller devices. The description of HMD hardware and concept is, however, outside the scope of the interaction method of the present invention, which only uses these devices as a platform for the software layer and is mostly agnostic to the component concept or structure underneath it.
The patent document U.S. Pat. No. 9,761,057 B2, titled “INDICATING OUT-OF-VIEW AUGMENTED REALITY IMAGES”, filed on Nov. 21, 2016, by MICROSOFT TECHNOLOGY LICENSING, LLC, discloses that positional information can be presented on an AR device to inform users about the path to, or simply the presence of, environment-related objects and elements that are not visible at a given moment. It is different from the method of the present invention, which instead defines a specific way to activate and interact not only with those, but with any type of information or feature, including elements that are not related to one specific environment, such as general-purpose GUIs and projections. The method of the present invention also serves the purpose of manipulating the perception of common FOV limitations when using augmented reality devices.