Due to decreasing costs of storage devices, higher data transmission rates, and improved data compression techniques, digital multimedia content is accumulating at an ever-increasing rate. Because of the content's large data volume and unstructured data format, access to multimedia content remains inefficient to this day.
For example, although it may be misconceived as an easy task, efficiently accessing multimedia content by perceiving the various information sources present in it, such as audio, video and text, remains a very complicated process for a computer to emulate. The reasons are the limitations of machine analysis of multimedia in unconstrained environments and the unstructured nature of the media data. For instance, most current digital video players provide only basic functions such as fast forward, rewind, pause and stop for linear content search. Very few support non-linear access such as random seek based on the content of the video.
While a DVD player allows users to jump to a particular access point such as a scene or a chapter, most of the indexing information that facilitates that jump is either content-blind or manually generated. Manual generation is labor-intensive and becomes impractical for a large multimedia collection. Therefore, there is a need in the art for a comprehensive multimedia analysis system that automatically extracts content semantics at multiple, different resolutions to facilitate efficient content access, indexing, browsing and retrieval.