Traditional video viewing is an inherently passive experience. Regardless of the device used to watch video (TV, movie screen, mobile device, computer, tablet computer, etc.), viewers are unable to directly interact with the visual objects in that video as those objects appear in one or more of its frames. Here, objects in the video refer to, but are not limited to, people, clothing, landscape, buildings, cars, electronic or other devices, food or drink items, furniture, actors, faces, jewelry, hair, etc.
Currently, there is no system or method that identifies, encodes, and tracks visual objects in video and allows viewers to interact with those objects (whether by clicking, touching, pointing, waving, or a similar interaction method, hereinafter referred to as “clicking”) in order to: (i) discover the identity and related metadata of said object, (ii) be provided with an opportunity to purchase that object, (iii) be served an advertisement directly based on the identity of said object, and/or (iv) be offered a richer content experience based on the identity of said object.