Videos of events such as weddings and travel are captured using devices such as cameras, smartphones and camcorders. The videos may vary in length, and often need to be combined to tell the whole story of the event. Sometimes, the combined videos can be long and time-consuming to watch. One method of providing a quick summary of an event, or of telling its story, is to create a photo book using frames taken from the video. However, photo books generally do not convey a natural sense of the motion present in the video, because doing so with conventional methods would require many frames, especially to represent a complex motion.
Additionally, if there are too many frames, it is difficult to fit the frames onto a page, such as an A4 page. Hence, most layouts either focus on the aesthetics of the photo book or depict only simple motion.
In one conventional method, images are arranged in a photo book using pre-defined layouts based on time and direction. Each layout is characterised by a path with pre-defined arrangement points, and the arrangement points are also characterised by time or direction. In this photo book method, the images are arranged along the path by matching time/direction information extracted from the images with the time/direction information associated with the arrangement points. However, such a method has the disadvantage that the layout paths have to be pre-defined. Also, the method can be used only to describe simple motion, and the layout does not change based on the local motion characteristics.
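The matching step of this first conventional method can be illustrated with a minimal sketch. The sketch assumes that only time information is used, that capture times are normalised to the range 0 to 1, and that each image goes to the arrangement point with the nearest time value; the names ArrangementPoint, assign_images and PATH are hypothetical and are not taken from the method described above.

```python
from dataclasses import dataclass


@dataclass
class ArrangementPoint:
    x: float  # horizontal page position (assumed to be a fraction of page width)
    y: float  # vertical page position (assumed to be a fraction of page height)
    t: float  # normalised time in [0, 1] associated with this point


def assign_images(timestamps, points):
    """Map each image index to the index of the arrangement point whose
    time value is closest to the image's normalised capture time."""
    t0, t1 = min(timestamps), max(timestamps)
    span = (t1 - t0) or 1.0  # avoid division by zero when all times are equal
    assignment = {}
    for i, ts in enumerate(timestamps):
        norm = (ts - t0) / span
        assignment[i] = min(
            range(len(points)), key=lambda j: abs(points[j].t - norm)
        )
    return assignment


# A hypothetical pre-defined layout path with three arrangement points.
PATH = [
    ArrangementPoint(0.1, 0.5, 0.0),
    ArrangementPoint(0.5, 0.2, 0.5),
    ArrangementPoint(0.9, 0.5, 1.0),
]
```

Because PATH is fixed in advance, the same page positions are used regardless of how the subject actually moves in the video, which is the limitation noted above.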
In a second conventional method, a mosaicing system is used to generate visual narratives from videos depicting the motion of one or more actors. In this second known method, the foreground and background regions of video frames are composited to produce a single panoramic image, using a series of spatio-temporal masks. The user selects the frames to create a linear panoramic image. However, even though the images on the layout path indicate motion, this method has the disadvantage that the user has to select the layout path and the images to create a panoramic image. In addition, the panoramic image is linear and requires many frames to express a complex motion.
In a third conventional method, a graphical user interface is provided to select the page size, decisive frames (i.e., frames that are salient in an action sequence that is characterised by sudden motion changes) and frames surrounding each decisive frame. A template is used in this third conventional method and frames are made to fit the template. However, this third method, like the other methods described above, has the disadvantage that the selection of the layout path is not dynamic. The placement of images on the layout path is also not dynamic. The layout path and the image characteristics do not express the motion.
A similar problem occurs when trying to represent the track of a moving object, such as a person, in a video summary image. In one method, images of the moving objects, or blobs representing the objects, are shown with varying opacities on the object track. The opacities may combine due to the actions of the object, such as moving or stopping. This varying opacity method has the advantage that the layout path is dynamic and the blobs on the layout path show motion. However, adapting this varying opacity method to a photo book, by using the video frames in place of the moving object images or blobs, would produce just a single image showing a simple motion. In addition, with the varying opacity method, it is difficult to show all the blobs' characteristics at every track position in a single image.
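How opacities might combine along a track can be illustrated with a minimal sketch. The text above does not specify a combination rule, so the sketch assumes standard "over" alpha compositing of single-cell blobs on a small grid; the function accumulate_opacity and the parameter alpha are hypothetical names introduced only for illustration.

```python
def accumulate_opacity(track, width, height, alpha=0.25):
    """Return a height-by-width grid of opacities after drawing one blob of
    opacity `alpha` at every (x, y) position in `track`.

    Assumption: blobs combine by "over" compositing, so repeated visits to
    the same cell (for example, when the object stops) raise its opacity
    towards 1.0 without ever exceeding it.
    """
    canvas = [[0.0] * width for _ in range(height)]
    for x, y in track:
        if 0 <= x < width and 0 <= y < height:
            current = canvas[y][x]
            canvas[y][x] = current + alpha * (1.0 - current)
    return canvas


# The object moves right, pauses for one extra frame at x = 1, then moves on.
summary = accumulate_opacity([(0, 0), (1, 0), (1, 0), (2, 0)], 4, 1, alpha=0.5)
```

In this sketch the cell where the object pauses ends up more opaque than its neighbours, so stopping is visible in the summary, but the whole track still collapses into one image.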
Thus, it is difficult to express motion in a photo book page layout, when laying out video frame images.