With the advent of the Internet, digital still cameras, and digital video cameras, individuals routinely assemble large collections of “digital media.” As those collections grow, it becomes increasingly difficult to quickly locate and identify a desired item of media for review and/or editing.
Several techniques have been devised in an effort to resolve this problem. For example, some techniques identify a “keyframe” to serve as a representative of each item of media. However, one problem with current keyframe selection techniques is that similar items of digital media (i.e., those containing similar content) often yield keyframes that are similar to the point of being indistinguishable. That situation is quite common even in professionally produced digital video. For example, a common film technique is to compose a dialog as a sequence of alternating shots of each speaker. After segmentation, the shots of the same speaker will be quite similar to one another, as each is taken from the same angle of the same subject with the same lighting, background, etc. Many common video sources share this problem, such as short video clips from a digital camera or pre-segmented results from a segment-based video repository.
Therefore, it is desirable to provide a system and method that automatically selects keyframes that are both representative of the digital media and distinctive from other selected keyframes.