Multimedia resources, including audio and video files, are becoming increasingly important as sources of information, communication and entertainment. For example, whereas the Internet originally provided primarily text-based communication and information services, this has evolved over time to include not only still images, but also multimedia content, including audio, video and other rich media resources. Popular Web sites such as YouTube facilitate the publication of multimedia content by all users, and audiovisual material posted to such sites ranges from personal diary and commentary, through current events and political footage, to educational and informative content, and everything in between. In general, multimedia content, delivered via the Internet and other channels, is increasingly being utilised for purposes of education, information, advertising, entertainment and so forth.
One problem raised by this proliferation of multimedia content is the difficulty of identifying particular multimedia resources, and/or portions thereof, that are of particular value or interest to users. This may be appreciated by considering the comparative maturity of text-based information systems. Text-based searches, including search facilities of databases and other information resources, as well as Internet search engines such as Google, enable the rapid identification of documents of interest, and of particular portions within those documents, particularly by reference to specified search terms or keywords. Presently, no comparably sophisticated system exists for multimedia resources, such as audio or video files. In some cases, metadata associated with a multimedia resource, such as a title and/or brief description of the content, may enable the resource to be identified as being of potential interest using text-based search systems. However, such limited information does not enable specific portions of the multimedia content to be identified and accessed, based upon a user search.
Accordingly, users seeking information within multimedia resources, such as video files, must typically review the entire content in order to identify portions of particular interest. This generally involves listening to, or viewing, the multimedia content in its entirety. While it may be possible to fast-forward through segments of no interest, and to rewind and review segments of particular interest, this process is substantially passive, linear, and relatively time-intensive.
In cases in which a user may be aware of a particular multimedia content file that contains information of interest, it is presently up to the user to remember, or record, the location within the multimedia content at which the interesting information is located. If the precise time of the required passage is not known, a manual search, via fast-forward/rewind/play, may be necessary in order to find the content of interest. This task becomes more time-consuming if the user is unable to recall which multimedia file of a collection of related files contains the specific content of interest.
As multimedia content becomes increasingly prevalent as a source of critical information to users, both in their personal and business lives, more precise methods for searching and retrieving information from multimedia files will be required.
An emerging solution to this problem involves the provision of textual information that is associated with a corresponding multimedia content file. For example, a video file may have associated with it a transcript, commentary, or other text description corresponding with sequences within the video content. The text may be searchable, enabling more detailed identification of the content of the video file. While this partially addresses user needs, it remains desirable to associate particular passages within the textual description with corresponding sequences, time intervals and/or cue points within the video content.
Known methods have been implemented for associating text with content within a multimedia file. In one approach, the video file is modified so as to embed time cues therein. This enables external programs to re-cue the video playback on demand to the designated points. These points may then be associated with a corresponding textual description. The video content corresponding with a particular passage in the description may be replayed by cueing the video file to the closest embedded time cue. In another known method, an additional file is created (known as a concordance file) which contains references to positions within the textual file and corresponding references to elapsed time within the associated video file. These known methods are reasonably practical for applications that do not require changes to the textual description file. If the text is to be edited, however, these methods require either reprocessing the video file, in order to update the embedded time cues, or reprocessing the concordance file, in order to update the corresponding references to both the description and video files.
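The concordance-file approach described above can be sketched in a few lines of code. The following is a minimal illustration only; the entry structure, field names, and lookup function are assumptions made for the sake of the example, not a standardized concordance format.

```python
import bisect
from dataclasses import dataclass

# Illustrative concordance entry: pairs a character offset within the
# textual description with an elapsed time (in seconds) in the video file.
# The field names here are assumptions for illustration.
@dataclass
class ConcordanceEntry:
    text_offset: int   # character position in the description file
    video_time: float  # elapsed time in the associated video, in seconds

def cue_time_for(entries: list[ConcordanceEntry], text_offset: int) -> float:
    """Return the cue time of the closest entry at or before text_offset.

    Entries are assumed to be sorted by text_offset.
    """
    offsets = [e.text_offset for e in entries]
    i = bisect.bisect_right(offsets, text_offset) - 1
    return entries[max(i, 0)].video_time

# Example: three cue points in a short description.
concordance = [
    ConcordanceEntry(0, 0.0),
    ConcordanceEntry(120, 35.5),
    ConcordanceEntry(480, 92.0),
]
print(cue_time_for(concordance, 200))  # → 35.5 (within the second passage)
```

As the passage notes, the weakness of this scheme is that every `text_offset` value becomes stale as soon as the description text is edited, so the concordance file must be regenerated whenever the text changes.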
There remains a need for improved and/or alternative methods and apparatus for searching and accessing information content within multimedia files.