Multimedia capturing (recording) devices have become increasingly popular in recent years. Multiple recording devices may record the same subject matter at the same time and at the same event (e.g., accidents, parties, performances, and sporting events). The contents recorded by these recording devices may have temporal and/or spatial correlations. That is, the recorded contents may be related in time since they are created at approximately the same time. In addition, the recorded contents may also be related in space since the recording devices may record the same subject matter in a nearby environment. Thus, the recorded contents of a subject matter from the different recording devices may contain similar information, which together may provide a broader perspective of the subject matter.
Although the contents recorded by these recording devices are similar or related, the obtained multimedia objects (e.g., audio, video, and images) are often isolated from each other. The temporal and spatial correlations that exist among these recorded contents are hard to maintain across the different recording devices, and may be lost once the recorded contents are subsequently distributed. Without these correlations, it is hard to search for related recorded contents, and a user's comprehensive understanding of the recorded contents is limited.