Before the background of the invention is set forth, it may be helpful to define certain terms that will be used hereinafter.
The term “object” as used herein is defined as an entity in an image (e.g., a photo), or in a set of images, that corresponds to a real object in the world, e.g., a person, a pet, or even an inanimate object such as a car. Therefore, a single person who is recognized in multiple images is considered a single object having several instances.
As smartphones with cameras become ubiquitous, people take photos on a daily basis. A compelling way to view these images is by creating a music clip from them. However, the process of editing the images and providing a soundtrack is both time-consuming and dependent on editing knowledge. One of the most important challenges is to order the images in a way that will “tell a story” rather than present them as an unarranged sequence. The term “story telling” as used herein is sometimes referred to elsewhere as “cinematography”.
It would be advantageous to use the computing power of a computer to decide the ordering and arrangement of the captured images, so as to create a clip that is interesting to watch in the sense of a “story told” by the clip.