The ability to automatically analyze, understand and annotate multimedia content is becoming increasingly important given the rapid growth in security, entertainment, medical and education applications. A number of analysis tools (or agents) have been proposed and developed in this area, such as cut detection, object tracking in video, emotion detection, speaker/gender identification in audio and color/texture/shape feature extraction in images. However, further progress in these areas remains difficult to achieve, primarily for two reasons.
First, a gap exists between the intelligence level of automatic algorithms and the requirements of real applications. For example, in an airport camera monitoring system, it is desirable to detect suspicious events like “A man leaves luggage on a chair and walks away.” However, most algorithms proposed to date address only the detection and characterization of low-level features like color, texture or motion. Second, the reliability of such systems is challenged by environmental complexity and many other factors. For example, a little makeup or another disguise can easily fool even the most advanced face recognition algorithms. The development of robust and powerful multimedia content understanding will therefore require the collaboration of a set of specialized, effective and relatively primitive annotation tools. By integrating such tools in a hierarchical scheme, more intelligent systems can be built.
In practice, however, the feasibility of this strategy is hindered by technical and proprietary issues. Multimedia content analysis requires expertise in a number of fields such as image and video processing, audio processing, speech recognition, linguistics, information retrieval and knowledge management. The required expertise ranges from digital signal processing techniques for feature extraction to methods for knowledge representation, integration and inference. It remains unlikely that a single researcher or research laboratory can cover this range of expertise and develop a multimedia analysis system from scratch. Usually, each lab concentrates on its own research agenda, using commercial tools (if available) or borrowing experimental tools from other researchers to develop a multimedia analysis prototype. Borrowing from others is not easy due to the variety of platforms, programming languages and data exchange formats, and to the unwillingness of developers to disseminate their intellectual property in an unprotected fashion. In short, research efforts remain fragmented and cooperation is difficult to achieve.
Therefore, it would be advantageous to provide a technique for fostering collaboration between researchers working in the area of multimedia content analysis. Such a technique should allow the research tools developed by collaborators to be widely distributed, but at the same time protect the intellectual property rights of the respective collaborators.