Users are increasingly utilizing electronic devices to obtain various types of information. For example, a user wanting to learn the name of a song playing in the background can cause a sample of that song to be recorded by an electronic device and uploaded to a song identification service for analysis. Similarly, a user wanting to determine the availability of a book can capture an image of the book and upload that image to a book identification service for analysis. Accordingly, automated object recognition systems that recognize and track objects in an image or across multiple image frames are becoming ever more sophisticated. Conventional systems have utilized feature-based object tracking algorithms, such as the Scale-Invariant Feature Transform (SIFT) or Speeded Up Robust Features (SURF) algorithm, to identify distinguishing features (which are usually corners) and calculate descriptors (unique fingerprints) for each feature point. For each frame in a sequence of video, for example, these systems identify hundreds of feature points and compute their corresponding descriptors, and a computationally intensive algorithm, such as brute-force matching or the Random Sample Consensus (RANSAC) algorithm, is then used to track these points from frame to frame. On many electronic devices, such as mobile devices, CPU and memory resources are tightly constrained, making the use of computationally expensive feature-based object tracking algorithms impractical. The most computationally demanding aspect of these algorithms is calculating the descriptors. It would, therefore, be advantageous to devise an object tracking method that obviates the need to calculate these descriptors.
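To make the cost of the conventional approach concrete, the following is a minimal sketch of brute-force descriptor matching between two frames. It is illustrative only: the function names `squared_distance` and `brute_force_match` are hypothetical, descriptors are assumed to be plain lists of floats rather than actual SIFT or SURF output, and no outlier rejection step such as RANSAC is included.

```python
from typing import List, Sequence, Tuple

Descriptor = Sequence[float]


def squared_distance(a: Descriptor, b: Descriptor) -> float:
    """Squared Euclidean distance between two equal-length descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_match(
    prev_desc: List[Descriptor], curr_desc: List[Descriptor]
) -> List[Tuple[int, int, float]]:
    """Match each descriptor of the previous frame to its nearest
    neighbour in the current frame.

    Returns (prev_index, curr_index, squared_distance) triples.
    Every descriptor is compared against every other, so the cost is
    O(N * M * D) for N and M descriptors of dimension D.
    """
    matches: List[Tuple[int, int, float]] = []
    if not curr_desc:
        return matches
    for i, d in enumerate(prev_desc):
        # Exhaustively scan the current frame for the closest descriptor.
        best_j, best_dist = min(
            ((j, squared_distance(d, c)) for j, c in enumerate(curr_desc)),
            key=lambda pair: pair[1],
        )
        matches.append((i, best_j, best_dist))
    return matches


# Toy two-dimensional descriptors (assumed values, for illustration only).
prev_frame = [[0.0, 0.0], [1.0, 1.0]]
curr_frame = [[1.0, 0.9], [0.1, 0.0], [5.0, 5.0]]
matches = brute_force_match(prev_frame, curr_frame)
```

With hundreds of feature points per frame and descriptors of many dimensions, this exhaustive comparison is repeated for every frame, and it comes on top of the cost of computing the descriptors in the first place.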