Videos are composed of a series of still images, each representing photometric qualities. For videos to be useful in many applications, there is a need for automated correlation between these photometric features and physically cognizable objects, e.g., people or scenery. Identification of physical objects may be accomplished by object extraction. With the proliferation of videos (e.g., on the Internet), there is an increasing need for efficient methods and apparatus for extracting objects to support object-based tagging and searching of videos.