Video is an effective way to capture a scene or an unfolding event. People often capture video sequences of birthday parties, weddings, travel and sports events. Unlike still images, video has the advantage of capturing evolving, unstructured events, such as natural facial expressions and human interactions (e.g., talking, mutual smiling, kissing, hugging, handshakes). It is often desirable to select individual frames from a sequence of video frames for display, or for use as content in printed books, in the same way as still images are used.
With the increasing demand for, and accessibility of, mobile phones and other consumer-oriented camera devices, more and more video data is being captured and stored. Videos present a problem because of the large number of frames in a video sequence that are candidates for selection for printing or display. A ten-minute video captured at 30 frames per second, for example, has eighteen thousand frames.
A common scenario for frame selection is that a user selects a number of video sequences and requests that a selection system process the selected video sequences to select frames for printing or display. An example is a user providing a set of video sequences captured within a particular year and requesting a photobook for that year made up of frames selected from those video sequences. The user expects the selection system to operate in a timely manner. The user might expect, for example, that the selection system process an hour-long set of video sequences in less than ten minutes. Such an expectation presents a challenge when the selection system may be implemented on a personal computer or mobile device.
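The scale of the problem can be illustrated with a short calculation. The following is a minimal sketch only, assuming a frame rate of 30 frames per second (an assumption that is consistent with eighteen thousand frames in ten minutes); the function names are hypothetical and are used purely for illustration.

```python
def frame_count(duration_s: float, fps: float = 30.0) -> int:
    """Number of frames in a video of the given duration (fps is assumed)."""
    return round(duration_s * fps)


def per_frame_budget_ms(duration_s: float, deadline_s: float, fps: float = 30.0) -> float:
    """Average time available per frame if every frame must be examined
    within the given processing deadline."""
    return 1000.0 * deadline_s / frame_count(duration_s, fps)


# A ten-minute video at the assumed 30 frames per second.
frames_ten_minutes = frame_count(10 * 60)

# An hour-long set of video sequences processed in under ten minutes.
frames_one_hour = frame_count(60 * 60)
budget_ms = per_frame_budget_ms(60 * 60, 10 * 60)

print(frames_ten_minutes, frames_one_hour, round(budget_ms, 2))
```

Under these assumptions, a system that examined every frame of an hour-long set of video sequences would have, on average, only a few milliseconds per frame, which indicates why the expectation is demanding for a personal computer or mobile device.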