Technical Field
The present disclosure relates generally to augmented reality, and more specifically, to augmented reality techniques that provide improved tracking of a camera's pose during an augmentation session.
Background Information
Augmented reality is a technology in which a view of a physical environment (i.e., a real-life environment) captured by a camera is merged with computer-generated graphics, text or other information (hereinafter “computer-generated elements”), such that the computer-generated elements appear as if they are part of the physical environment. The result of this merging is an augmented view. The augmented view may be static (e.g., a still image) or dynamic (e.g., a video). In contrast to virtual reality, where a simulated environment is shown to a user instead of the physical environment, augmented reality blends the virtual with the real to enhance a user's perception of the physical environment.
In order to create an augmented view, the position and orientation of the camera (hereinafter the camera's “pose”) that captures the view of the physical environment is generally used to position and orient a virtual camera within a computer-generated environment. It is then determined which portions of the computer-generated environment are visible to the virtual camera. These portions are used to produce the computer-generated elements that are merged with the view of the physical environment, resulting in the augmented view.
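By way of illustration only, the visibility determination described above may be sketched as follows. The sketch assumes a simple pinhole-style virtual camera that looks along its own +z axis, with its pose given as a position and a 3x3 rotation matrix. The function names, the axis convention, and the field-of-view parameters are hypothetical and are not taken from any particular implementation.

```python
import math

def rotation_about_y(yaw_rad):
    """Row-major 3x3 rotation for a yaw about the vertical (y) axis (assumed convention)."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]

def world_to_camera(point, cam_position, cam_rotation):
    """Express a world-space point in the virtual camera's coordinate frame.

    cam_rotation maps camera axes to world axes, so its transpose maps
    world-space offsets back into the camera frame.
    """
    d = [point[i] - cam_position[i] for i in range(3)]
    # Multiply the offset by the transpose of cam_rotation.
    return [sum(cam_rotation[r][c] * d[r] for r in range(3)) for c in range(3)]

def is_visible(point, cam_position, cam_rotation, hfov_deg, vfov_deg):
    """True if the point is in front of the camera and inside its field of view."""
    x, y, z = world_to_camera(point, cam_position, cam_rotation)
    if z <= 0.0:
        return False  # behind (or level with) the camera plane
    half_h = math.tan(math.radians(hfov_deg) / 2.0)
    half_v = math.tan(math.radians(vfov_deg) / 2.0)
    return abs(x / z) <= half_h and abs(y / z) <= half_v

# Hypothetical usage: a virtual camera at the origin, first facing +z, then yawed 90 degrees.
origin = [0.0, 0.0, 0.0]
facing_z = rotation_about_y(0.0)
facing_x = rotation_about_y(math.pi / 2.0)
print(is_visible([0.0, 0.0, 5.0], origin, facing_z, 60.0, 45.0))
print(is_visible([5.0, 0.0, 0.0], origin, facing_x, 60.0, 45.0))
```

In this sketch, an error in the pose moves the virtual camera relative to the computer-generated environment, so a different set of points passes the visibility test and the computer-generated elements no longer line up with the view of the physical environment.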
In many cases, the camera capturing the view of the physical environment is not limited to a fixed position. The camera may have an initial pose, and subsequently may be moved about the physical environment by a user, taking on subsequent poses. In order to create an augmented view, the initial pose of the camera, and changes to the initial pose, generally need to be accurately determined. Operations to determine the camera's initial pose are generally referred to as “initialization”, while operations for updating the initial pose to reflect movement of the physical camera are generally referred to as “tracking.” A number of techniques have been developed to attempt to address the challenges presented by initialization. Techniques have also been developed to attempt to address the challenges presented by tracking. However, these techniques suffer from shortcomings.
Many tracking techniques rely upon detection of features in the view of the physical environment captured by the camera. The features may be lines, corners, or other salient details that can be reliably detected. A tracking algorithm may then be applied to these features. However, while a tracking algorithm may operate acceptably when there are a large number of detected features, reliability may decrease as the number of features decreases. Many cameras have very limited fields of view, limiting the portion of the physical environment they can see at a given moment in time. For example, many “standard” cameras have a field of view of about 40° to 60° in a given orientation. If the portion of the physical environment visible within this field of view happens to have few features, a tracking algorithm may struggle to track the camera's pose, leading to an inaccurate and potentially jittery augmented view. For example, if a camera is directed towards a uniform white wall having no features, tracking may be lost. Further, if a substantial portion of the physical environment is occluded, for example by a person or object temporarily moving in front of the camera, tracking may be lost. Still further, even if a reasonable number of features are detected, they are typically grouped closely together, along one general orientation, which may hinder calculation of an accurate pose.
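By way of illustration only, the effect of a limited field of view on the number of available features may be sketched as follows. The sketch assumes that each feature is reduced to a horizontal bearing, in degrees, around the camera. The function names and the minimum-feature threshold are hypothetical; an actual tracking algorithm would apply its own criteria.

```python
def offsets_in_view(feature_bearings_deg, heading_deg, fov_deg):
    """Return the signed angular offsets of the features inside the field of view."""
    half = fov_deg / 2.0
    visible = []
    for bearing in feature_bearings_deg:
        # Signed offset from the camera heading, wrapped to [-180, 180).
        offset = (bearing - heading_deg + 180.0) % 360.0 - 180.0
        if abs(offset) <= half:
            visible.append(offset)
    return visible

def has_enough_features(visible_offsets, min_features=8):
    """Hypothetical reliability check: require a minimum number of visible features."""
    return len(visible_offsets) >= min_features

# Hypothetical scene: one feature every 10 degrees around the camera.
bearings = [float(b) for b in range(0, 360, 10)]
narrow = offsets_in_view(bearings, heading_deg=0.0, fov_deg=50.0)
full = offsets_in_view(bearings, heading_deg=0.0, fov_deg=360.0)
print(len(narrow), has_enough_features(narrow))
print(len(full), has_enough_features(full))
```

Under these assumptions, the 50° view retains only a handful of the features present around the camera, all within a narrow range of offsets, which corresponds to the situation described above in which the detected features are few and grouped along one general orientation.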
Accordingly, there is a need for augmented reality techniques that provide improved tracking of a camera's pose during an augmentation session.