The use of body worn wearable cameras (referred herein as egocentric cameras) is on the rise. These cameras are typically operated in a hands-free, always-on manner, allowing the wearers to concentrate on their activities. While more and more egocentric videos are being recorded, watching such videos from start to end is difficult due to two aspects: (i) the videos tend to be long and boring; and (ii) camera shake induced by natural head motion further disturbs viewing. These aspects call for automated tools to enable faster access to the information in such videos.
Fast forward is a natural choice for faster browsing of egocentric videos. The speed factor depends on the cognitive load a user is interested in taking. Naïve fast forward uses uniform sampling of frames, and the sampling density depends on the desired speed up factor. Adaptive fast forward approaches known in the art try to adjust the speed in different segments of the input video so as to equalize the cognitive load. For example, sparser frame sampling giving higher speed up is possible in stationary scenes, and denser frame sampling giving lower speed ups is possible in dynamic scenes. In general, content aware techniques adjust the frame sampling rate based upon the importance of the content in the video. Typical importance measures include scene motion, scene complexity, and saliency. None of the aforementioned methods, however, can handle the challenges of egocentric videos.