The discussion below is merely provided for general background information and is not intended to be used as an aid in determining the scope of the claimed subject matter.
Due to the increasing capabilities of multimedia devices, mobile augmented reality (AR) applications are rapidly expanding. These AR applications allow enrichment (augmentation) of a real scene with additional content, which may be displayed to a user by augmenting a camera image of the real-world scenery with computer generated graphics. The augmentation thereby provides an “augmented reality” user-experience.
Augmented reality platforms, such as the Layar Vision platform, allow an AR application to recognize an object in an image frame and to render and display content together with the recognized object. In particular, an AR application may use vision-based object recognition processes to recognize whether a particular object is present in the scene. Furthermore, the AR application may use a pose estimation process to determine position and/or orientation (pose information) of the object based on information in the image frame and sensor and/or camera parameters. The pose information is then used to generate the augmentation for the object.
Examples of known image processing algorithms for object recognition and tracking are described in the article by Duy-Nguyen Ta et al., “SURFTrac: Efficient Tracking and Continuous Object Recognition using Local Feature Descriptors,” IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'09), Miami, Fla., Jun. 20-25, 2009. Object recognition may include extracting features from the image frame and matching these extracted features with reference features associated with objects stored in a database. By matching these reference features with the extracted features, the algorithm may determine that an object is “recognized”. Thereafter, the recognized object may be subjected to a pose estimation (tracking) process wherein the new state of the object is estimated on the basis of new observables (e.g. a new image frame and its extracted features) and the previous state of the object determined on the basis of a previous image frame. Computer generated graphics are generated using pose information estimated in the tracking process, and the computer generated graphics are then composed with the camera image of the real world scenery. As a result, the computer generated graphics appear to the user to be “stuck” onto the object.
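By way of illustration only, the matching step described above may be sketched as follows. The sketch is not taken from the cited article; it assumes that features are plain numeric descriptor tuples, and the nearest-neighbour ratio test, the function names `match_features` and `is_recognized`, and the thresholds `ratio` and `min_matches` are hypothetical choices made for the example.

```python
from math import dist


def match_features(extracted, reference, ratio=0.8):
    """Pair each extracted descriptor with its nearest reference descriptor.

    A pair is kept only if the nearest reference descriptor is clearly
    closer than the second nearest (ratio test, an assumed criterion).
    Returns a list of (extracted_index, reference_index) tuples.
    """
    matches = []
    for i, descriptor in enumerate(extracted):
        # Rank reference descriptors by Euclidean distance to this descriptor.
        ranked = sorted(range(len(reference)),
                        key=lambda j: dist(descriptor, reference[j]))
        if len(ranked) < 2:
            continue  # the ratio test needs at least two candidates
        best, second = ranked[0], ranked[1]
        if dist(descriptor, reference[best]) < ratio * dist(descriptor, reference[second]):
            matches.append((i, best))
    return matches


def is_recognized(extracted, reference, min_matches=3):
    """Treat the object as "recognized" when enough features match (assumed rule)."""
    return len(match_features(extracted, reference)) >= min_matches


# Toy two-dimensional descriptors standing in for real feature descriptors.
reference_features = [(0, 0), (10, 0), (0, 10), (10, 10)]
frame_features = [(0.1, 0), (10, 0.2), (0, 9.9)]
recognized = is_recognized(frame_features, reference_features)
```

In this toy example each extracted descriptor lies close to exactly one reference descriptor, so three matches are found and the object is treated as recognized. An ambiguous descriptor that is equally distant from several reference descriptors would be rejected by the ratio test.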
Content creation for augmented reality may be time consuming if a large number of real world objects are to be augmented, and may require complex technical expertise on the part of the content author. One possible solution to these problems is to place content (i.e. the computer generated graphics), such as text, images, videos, advertisements, etc., automatically, either anywhere or at random in relation to the object. However, such a method has disadvantages. Automatically-placed content can unintentionally cover or obscure an important part of the object, thereby negatively affecting user experience. Further, if the image frame of the real world scenery depicts only a small part of the object (e.g., if the user is zoomed in on or looking mostly at the automatically-placed content), then the augmented reality system may be unable to recognize the object or perform tracking. In other words, the content may disappear from the display when it has been placed near an area of the object where there are insufficient features to enable object recognition and/or tracking.
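The second disadvantage may be illustrated with a minimal sketch, which assumes that the features of an object are available as (x, y) keypoint positions and that tracking needs some minimum number of them inside the visible part of the object. The names `features_in_view` and `can_track` and the threshold `min_features` are hypothetical and serve only to make the example concrete.

```python
def features_in_view(keypoints, view):
    """Count keypoints (x, y) inside view = (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = view
    return sum(1 for x, y in keypoints
               if x_min <= x <= x_max and y_min <= y <= y_max)


def can_track(keypoints, view, min_features=4):
    """Assumed rule: tracking needs at least min_features visible keypoints."""
    return features_in_view(keypoints, view) >= min_features


# Toy object whose features cluster in one corner; the opposite corner is sparse.
object_keypoints = [(1, 1), (2, 2), (3, 1), (2, 3), (9, 9)]
whole_object = (0, 0, 10, 10)
zoomed_on_content = (8, 8, 10, 10)  # content placed in the feature-poor corner

trackable_full = can_track(object_keypoints, whole_object)
trackable_zoomed = can_track(object_keypoints, zoomed_on_content)
```

With the whole object in view, all five keypoints are visible and tracking is possible under the assumed rule. When the view is limited to the corner in which the content was placed, only one keypoint remains visible, so tracking fails and the content can no longer be displayed.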
Accordingly, there is a need to provide improved methods and systems that alleviate at least some of these problems.