Many types of events that generate, or are capable of generating, different types of multimedia data are held every day. For example, consider the typical sporting event or music concert. Such events may be the subject of live broadcasting, filming, or streaming over the internet. The video of the event may be recorded from multiple camera angles, and focused on many different subjects from different parts of the music stage or sports field. In addition, for the concert example, sound recordings may be taken from many different locations, performers, or instruments. Still photographs are yet another type of media that may be captured from many locations to record many different scenes at the event.
As is evident, any event may be associated with multiple sources of data that are created or recorded for that event. The data may be of different types and formats, e.g., sound, video, photographs, etc.
While these many devices are capturing data relating to the exact same event, conventionally these capture devices are completely independent from one another. The conventional media used to capture these events, e.g., film, MPEG4, MPEG3, etc., inherently include only information specific to each individual recording device and medium. Therefore, while an MPEG4 video recording may provide an accurate video of a very specific recording subject from a very specific camera angle, there is no inherent way to correlate or relate that recording with any other video recording of the exact same event that may have been made from another camera angle, with an audio or still photo recording of the same subject, or with recordings in multiple media that are directed at another recording subject.
Existing solutions to this problem are highly manual in nature, costly, and generally imprecise. For example, the broadcast of a sporting event may involve the strategic positioning of video cameras at different locations within the sporting arena. A production crew is charged with the task of knowing the locations of these cameras and the subjects that are being recorded with these cameras. During either a live broadcast or later production of an aggregated film clip, the production/editing crew must manually review the video recordings to determine the exact subject being recorded, and must essentially estimate or re-generate the relations between the different recordings. Therefore, any attempt to integrate the data from the multiple sources is essentially done in an ad hoc manner using highly manual techniques that generally “guess” at the recording parameters of each recording.
This problem is further complicated by the modern trend of audience members bringing portable electronic devices that are capable of capturing and recording the live event. For example, audience members may bring mobile phones that have image, video, or sound capture capability, and use those mobile devices to capture data relating to the event. Those mobile devices may be recording videos, images, or sound from different angles and of different subjects at the event. Even though these portable recording devices are not “officially” recording the event on behalf of the event promoters, those recordings may still be of great interest to those who wish to provide a live broadcast or later production of a film for the event. This is because the mobile devices may be capturing videos or photographs that were not captured by the “official” recording devices, and which would be useful or desirable to include in the live broadcast or later production. For example, the mobile device may have captured the scene of a disputed referee call at a sporting event from a very useful angle, or captured the recording of a musical performance from a unique angle or recording posture.
Conventionally, these mobile devices are completely outside the control of, or even access by, the production crew for the event. Therefore, there are no known approaches that would allow automated integration of recordings from these mobile devices with the “official” recordings of the event. Even if the production/editing crew for the event has access to the recordings from the mobile devices, the same issues mentioned before would arise with regard to the lack of a mechanism to easily and efficiently relate the different recordings together in a consistent manner, e.g., with regard to temporal and spatial positioning.