Unless otherwise indicated herein, the materials described in this section are not prior art to the claims in this application and are not admitted to the prior art by inclusion in this section.
Augmented reality (AR) systems use computerized devices to provide a human-machine interface that enables a human to view a real-world physical environment while displaying virtual graphics that “augment” features of the physical environment. A common type of AR device includes transparent glasses, with one or more sensors and a video projection device, that a human operator wears. The transparent glasses enable the human to view the real-world physical environment, and the AR device projects graphical data onto the glasses or directly onto the eyes of the user at locations corresponding to objects, locations, or other features in the physical environment. In some instances, the graphics display information or otherwise add graphical elements to “augment” a physical object in the physical environment, while in other instances the graphics provide a two-dimensional or three-dimensional rendition of information or of a virtual object that does not actually exist in the physical environment. While augmented reality systems share some features with “virtual reality” (VR) systems, one distinction between AR and VR systems is that AR systems provide a visual depiction of, and graphical interaction with, a real-world physical environment that is neither generated by a computer nor under the control of a computer, whereas VR systems produce graphical displays of completely computer-generated environments. As such, many operations in AR systems require additional processing to measure parameters of the physical environment around the AR system in order to provide accurate augmented graphics.
One function of an AR system is “localization”, which identifies the locations of the sensors in the AR system in relation to the environment around the system. In particular, many AR systems use both a camera system and inertial sensors, such as MEMS accelerometers and gyroscopes, and combine the outputs of the camera and the inertial sensors to perform localization. Since the sensors are generally integrated into a device that is worn by a user, localization of the sensor locations also provides localization of the user in the environment. Most prior-art AR systems assume a static environment and handle only the primary motion consistent with the inertial coordinate frame, which is to say that these systems perform localization in a static, non-moving environment (the “inertial coordinate frame”) and then use the input of the AR sensors to identify the motion of the AR system and the user with respect to the static inertial coordinate frame (the “primary motion”). A simple example is measuring the movement of a user who wears an AR system in a stationary room that provides the inertial coordinate frame.
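The inertial half of the localization described above can be illustrated with a minimal dead-reckoning sketch. This is not the method of the present application; it is a simplified illustration that assumes the accelerometer samples have already been gravity-compensated and rotated into the static inertial coordinate frame, and the function name `dead_reckon` is hypothetical.

```python
import numpy as np

def dead_reckon(accels, dt, v0=None, p0=None):
    """Integrate accelerometer samples (assumed gravity-compensated and
    expressed in the inertial frame) into velocity and position estimates
    by simple Euler integration."""
    v = np.zeros(3) if v0 is None else np.asarray(v0, dtype=float)
    p = np.zeros(3) if p0 is None else np.asarray(p0, dtype=float)
    for a in accels:
        v = v + np.asarray(a, dtype=float) * dt  # integrate acceleration
        p = p + v * dt                           # integrate velocity
    return p, v

# A device accelerating at 1 m/s^2 along x for 1 s (100 samples at 100 Hz)
# travels roughly 0.5 m, consistent with p = 0.5 * a * t^2.
p, v = dead_reckon([[1.0, 0.0, 0.0]] * 100, dt=0.01)
```

In a static environment this integration, fused with camera observations, suffices for primary motion tracking; the following paragraphs describe why it breaks down in a moving vehicle.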
Existing AR systems are substantially less effective in handling dynamic environments in which there are multiple types of motion and multiple coordinate frames that produce inconsistent or “conflicting” sensor data inputs. To cite a common, non-limiting example, when a user wears an AR device in a moving motor vehicle, the true inertial reference coordinate frame, such as the non-moving road, appears to be moving in the visual input from the camera system sensors, while the inertial sensors in the AR device might register no movement whatsoever if the wearer of the device sits still and does not move relative to the interior of the vehicle while the vehicle travels at a constant velocity (which produces no acceleration). The interior of the vehicle is said to be a “local coordinate frame” because the local movement of the user and the AR system is relative to the interior of the vehicle, even if the entire vehicle and the AR system are also moving relative to the inertial coordinate frame of the road. Furthermore, any movement of the user and AR system within the vehicle produces inertial motion data that do not match the perceived movement from the camera system, because the movement of the vehicle relative to the inertial coordinate frame of the road differs substantially from the movement of the user relative to the local coordinate frame of the vehicle. Even the input data from the camera system are typically inconsistent, because a portion of each generated image of video data includes the local coordinate frame of the interior of the vehicle, which is static relative to the user unless the user moves, while another portion of the image includes the inertial coordinate frame of the exterior environment, which appears to be moving relative to the local coordinate frame of the vehicle.
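The sensor conflict in the constant-velocity vehicle example can be made concrete with a short numerical sketch. The figures below (20 m/s vehicle speed, 100 Hz sampling) are illustrative assumptions, not values from the application: the inertial sensor reports zero acceleration, so dead reckoning from a zero-velocity prior reports no motion, while the camera observes the road translating by the vehicle's full displacement.

```python
import numpy as np

dt, steps = 0.01, 100                 # 1 s of data at 100 Hz (assumed)
imu_accel = np.zeros((steps, 3))      # constant velocity => zero acceleration

# Dead reckoning under the (incorrect) assumption the device started at rest:
v = np.zeros(3)
p = np.zeros(3)
for a in imu_accel:
    v += a * dt
    p += v * dt                       # estimate stays at the origin

# Camera view: road features translate by the vehicle's true displacement.
true_velocity = np.array([20.0, 0.0, 0.0])           # m/s, assumed
camera_displacement = true_velocity * (steps * dt)   # 20 m over 1 s

# Disagreement between the two sensor modalities over one second:
conflict = np.linalg.norm(camera_displacement - p)
```

The 20-meter disagreement in `conflict` is precisely the kind of inconsistent input that causes prior-art localization to fail in a moving local coordinate frame.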
The aforementioned problems in existing AR systems reduce the accuracy of measurements of the primary motion of the sensors in the AR system and present further challenges in accurately tracking “secondary motions”, which refer to movements of the sensors in the AR system relative to a local coordinate frame. In the moving vehicle example above, secondary motions occur in the local coordinate frame of the interior of the vehicle, such as movement of the AR system itself or of another object within the interior of the vehicle. Accurate tracking of the secondary motion requires accurate tracking of the primary motion, which is difficult in a local coordinate frame due to the sensor conflicts described above. For example, techniques are known to the art for motion tracking of moving objects in video data in an inertial coordinate frame, but if the camera is moving in an unknown manner due to inaccurate primary motion tracking arising from conflicting sensor data, then accurate movement tracking of the camera relative to an external object in a local coordinate frame using the video data becomes substantially more difficult or impractical. Additionally, while some techniques are known to the art to improve the accuracy of primary motion tracking in situations that include multiple coordinate frames, these techniques rely on identifying and rejecting potentially conflicting sensor data, which improves the accuracy of primary motion detection based on the inertial coordinate frame but prevents accurate secondary motion detection for the local coordinate frame. Furthermore, while systems and methods exist for tracking relative movement using active sensors such as RADAR and LIDAR, these active sensors are impractical for use in many AR systems.
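The relationship between primary and secondary motion can be viewed as a composition of rigid transforms: the device pose in the inertial (road) frame is the vehicle pose in the road frame (primary motion) composed with the device pose in the vehicle cabin (secondary motion). The following 2D sketch uses hypothetical poses chosen only for illustration; the frame names and the `se2` helper are assumptions, not part of the application.

```python
import numpy as np

def se2(theta, tx, ty):
    """Homogeneous 2D rigid transform: rotation theta, translation (tx, ty)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, tx],
                     [s,  c, ty],
                     [0.0, 0.0, 1.0]])

road_T_vehicle = se2(0.0, 100.0, 0.0)   # primary motion: vehicle 100 m down the road
vehicle_T_device = se2(0.0, 0.5, 0.3)   # secondary motion: wearer shifted inside the cabin

# Composed pose: where the sensors actually are in the inertial frame.
road_T_device = road_T_vehicle @ vehicle_T_device
```

An error in `road_T_vehicle` (the primary motion) propagates directly into `road_T_device`, which illustrates why accurate secondary motion tracking requires accurate primary motion tracking.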
As described above, prior-art AR systems encounter difficulties in performing localization and tracking primary and secondary movement in situations where video and inertial sensor data experience conflicts due to the relative motion produced by both an inertial coordinate frame and a local coordinate frame in the sensor data. Consequently, improvements to AR systems, and to methods of operation thereof, that increase the accuracy of both primary and secondary movement detection would be beneficial.