Image capturing devices have become prevalent in recent years as mobile devices having cameras or other image capturing devices, such as cellular telephones, video recorders or the like, have multiplied. As such, it has become common for a plurality of people who are attending the same event to separately capture video of the event. For example, multiple people at a sporting event, a concert, a theater performance or the like may capture video of the performers. Although each of these people may capture video of the same event, the video captured by each person may be somewhat different. For instance, the video captured by each person may be from a different angle or perspective and/or from a different distance relative to the playing field, the stage, or the like. Additionally or alternatively, the video captured by each person may focus upon different performers or different combinations of the performers.
In order to provide a more complete video of an event, it may be desirable to mix the videos captured by different people. However, mixing the videos that a number of different people have captured of the same event has proven to be challenging, particularly in instances in which the people who are capturing the video are unconstrained with regard to their position relative to the performers and with regard to which performers are in the field of view of the videos.