Users are increasingly utilizing electronic devices to obtain various types of information. For example, a user wanting to learn the name of a song playing in the background can cause a sample of that song to be recorded by an electronic device and uploaded to a song identification service for analysis. Likewise, a user wanting an answer to a question can ask the question aloud, such that the device processes the user's voice and retrieves the answer to the question. In a similar fashion, a user wanting to determine the availability of a book can capture an image of the book and upload that image to a book identification service for analysis.
Accordingly, automated object recognition systems that recognize and track objects in an image or across multiple image frames are becoming ever more sophisticated. Conventional systems have utilized feature-based object tracking algorithms, such as the Scale-Invariant Feature Transform (SIFT) or Speeded Up Robust Features (SURF) algorithms, to identify distinguishing features (which are usually corners) and calculate descriptors (unique fingerprints) for each feature point. For example, these systems identify hundreds of feature points and compute their corresponding descriptors for each frame in a video sequence, and a computationally intensive algorithm, such as brute-force matching or the Random Sample Consensus (RANSAC) algorithm, is used to track these points from frame to frame. This has enabled various augmented reality applications to identify objects and points of interest within a live view and to provide information about those objects or points of interest in an overlay. In order to match the feature points identified by these algorithms to real-world objects, a computing device, or a system in communication therewith, must compare the feature points to images stored for these real-world objects.
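By way of illustration only, the frame-to-frame tracking described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the method of any particular conventional system: the function names `brute_force_match` and `ransac_translation`, the tolerance and iteration parameters, and the toy three-element descriptors are all hypothetical. Real SIFT or SURF descriptors are much longer vectors, and the motion between frames is assumed here to be a pure two-dimensional shift, whereas practical systems typically fit a richer model such as a homography.

```python
import math
import random


def brute_force_match(desc_a, desc_b):
    """Pair each descriptor in desc_a with its nearest neighbor in desc_b."""
    matches = []
    for i, da in enumerate(desc_a):
        # Exhaustive search: compare against every descriptor in the other frame.
        j = min(range(len(desc_b)), key=lambda k: math.dist(da, desc_b[k]))
        matches.append((i, j))
    return matches


def ransac_translation(pts_a, pts_b, matches, tol=2.0, iterations=100, seed=0):
    """Estimate a 2-D shift between frames while ignoring bad matches."""
    rng = random.Random(seed)
    best = []
    for _ in range(iterations):
        # Hypothesize a shift from one randomly chosen match.
        i, j = rng.choice(matches)
        dx, dy = pts_b[j][0] - pts_a[i][0], pts_b[j][1] - pts_a[i][1]
        # Keep the matches that agree with the hypothesis within tol pixels.
        inliers = [
            (p, q) for p, q in matches
            if math.hypot(pts_a[p][0] + dx - pts_b[q][0],
                          pts_a[p][1] + dy - pts_b[q][1]) <= tol
        ]
        if len(inliers) > len(best):
            best = inliers
    # Refine the shift as the mean displacement over the consensus set.
    n = len(best)
    dx = sum(pts_b[q][0] - pts_a[p][0] for p, q in best) / n
    dy = sum(pts_b[q][1] - pts_a[p][1] for p, q in best) / n
    return (dx, dy), best


# Toy data (hypothetical): five feature points in frame A and frame B.
pts_a = [(10, 10), (50, 20), (30, 60), (80, 80), (15, 70)]
desc_a = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0),
          (1.0, 1.0, 0.0), (0.0, 1.0, 1.0)]
# Frame B lists the same features in a different order, shifted by (5, -3),
# except index 2, whose descriptor matches but whose position is wrong.
pts_b = [(35, 57), (15, 7), (90, 5), (55, 17), (85, 77)]
desc_b = [(0.0, 0.1, 0.9), (0.9, 0.0, 0.1), (0.1, 1.0, 0.9),
          (0.0, 0.9, 0.1), (1.0, 0.9, 0.0)]

matches = brute_force_match(desc_a, desc_b)
shift, inliers = ransac_translation(pts_a, pts_b, matches)
print(shift, inliers)
```

The exhaustive search compares every descriptor in one frame against every descriptor in the other, and the consensus step re-evaluates every match for each hypothesis, which suggests why this style of tracking becomes computationally intensive when hundreds of feature points are processed per frame.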
Because there are so many objects and points of interest, image databases often lack images from all possible angles and under various lighting conditions. This often leads to unrecognized or misrecognized information and, in turn, to the misplacement of overlay information. This is particularly problematic when panning across a scene, since the misplaced and misrecognized information often results in an erratic and discontinuous display of augmented reality information. Therefore, as technology advances and as people increasingly use portable computing devices in a wider variety of ways, it can be advantageous to adapt the ways in which these image databases are populated, so that they contain sufficient image coverage of objects and points of interest from various angles, zoom levels, and elevations and under various lighting conditions. Such coverage supports accurate recognition and overlay placement of information, providing users with a smooth and continuous viewing experience.