Augmented reality (AR) processing of video sequences may be performed in order to provide real-time information about one or more objects that appear in the video sequences. With AR processing, objects that appear in video sequences may be identified so that supplemental information (i.e., augmented information) about the objects can be displayed to a user. The supplemental information may comprise graphical or textual information overlaid on the frames of the video sequence so that objects are identified, defined, or otherwise described to the user. In this way, AR may provide an enhanced real-time experience to the user with respect to video sequences that are captured and displayed in real time.
Unfortunately, AR processing can be computationally complex and may require extensive processing capabilities. Furthermore, in AR processing, it may be difficult to distinguish objects within a video sequence that are of interest to the user from objects that are irrelevant to the user. The supplemental AR information may be desirable for objects of interest, but may be less desirable, or even undesirable, for irrelevant objects. AR processing may be particularly challenging in hand-held devices that support video capture, such as cellular telephones, smartphones, and digital cameras, where processing capabilities and battery power are limited.