In recent years, there has been a proliferation of consumer digital cameras and camera-equipped mobile devices (e.g., smartphones and tablets). The cost of such devices and digital media storage continues to decrease, while usage continues to increase. Accordingly, there has been an explosion in the amount of digital image and video data produced and stored. However, much of this data consists of long-running or unedited content such as unsorted photo collections, home videos, or surveillance feeds.
A photo or video summary can provide a visual synopsis, or a “trailer” of sorts, to quickly reveal the subject matter of media content and highlight the more salient portions. When browsing digital content, for example on a social content-sharing platform, being able to quickly identify the more interesting parts of a video can save significant time. Moreover, video summaries can provide a compact but rich feature set for activity-recognition heuristics.
To create compact summaries from unedited content, existing summarization techniques may prioritize selection of higher-quality content or segments with significant motion or activity. However, conventional techniques may overemphasize these factors to the detriment of providing comprehensive coverage of the input content. Moreover, conventional techniques are generally too computationally intensive to be implemented effectively on resource-constrained mobile devices. Accordingly, content captured at a camera-equipped mobile device often must be offloaded to another computing device for effective summarization.