Field of the Endeavor
The present invention relates to a method and system for indexing and searching timed media based upon relevance intervals. More particularly, the present invention relates to a method and system that return portions of timed media files selected as specifically relevant to given information representations, thereby eliminating the need for manual determination of relevance, replacing manual editing processes, and avoiding the omission of relevant portions. The timed media includes streaming audio, streaming video, timed HTML, animations such as vector-based graphics, slide shows, other timed media, and combinations thereof. The method and system of the present invention determine the relevant portion of the media around each occurrence of the information representation rather than requiring the user to perform such functions.
Background of the Invention
The rapid growth of the Internet has been accompanied by rapid growth in the use of real-time, digital timed media such as Web-based conferencing, e-learning, presentations, training, events, corporate communications, radio broadcasts, and other broadcasting. Such new types of media are becoming commonplace methods of communication. As the use of timed media communication tools continues to gain popularity, storehouses of timed media files are growing to meet this new demand. Organizations require tools capable of capturing, indexing, and retrieving the massive amount of information contained within such media.
Traditionally, search engines create a large table that is indexed by words, phrases, or other information such as hyperlinks. Each word or phrase points to the documents that contain it. Each such pointer is rated by a relevance magnitude calculated by an algorithm that typically takes into account information such as the frequency with which the word or phrase appears and whether it occurs in the title, keywords, etc. Advanced search engines augment the foregoing system by adding the capability to check synonyms or by letting the user indicate the intended definition of the word in question, either by choosing it manually or by entering a natural language query. Other functions are plentiful, such as putting the words searched for in bold in an HTML document or organizing the returned results into customized folders, as is done by Northern Light®.
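By way of illustration only, the word-indexed table described above may be sketched as follows. The function names, the document layout, and the title weighting (TITLE_BONUS) are assumptions introduced for this example and are not drawn from any particular search engine.

```python
from collections import defaultdict

# Assumed weight added when a word occurs in the title (illustrative only).
TITLE_BONUS = 2.0


def build_index(documents):
    """Build a table mapping each word to {doc_id: relevance magnitude}.

    `documents` is a hypothetical layout: doc_id -> {"title": str, "body": str}.
    """
    index = defaultdict(dict)
    for doc_id, doc in documents.items():
        title_words = doc["title"].lower().split()
        body_words = doc["body"].lower().split()
        for word in set(title_words) | set(body_words):
            # Relevance here is the word's frequency in the body,
            # plus a fixed bonus if the word also occurs in the title.
            frequency = body_words.count(word) / max(len(body_words), 1)
            bonus = TITLE_BONUS if word in title_words else 0.0
            index[word][doc_id] = frequency + bonus
    return index


def search(index, word):
    """Return (doc_id, relevance) pairs for `word`, most relevant first."""
    postings = index.get(word.lower(), {})
    return sorted(postings.items(), key=lambda item: item[1], reverse=True)
```

Such a table identifies which documents contain a term and ranks them, but it carries no information about where within a document the term is relevant.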
While the foregoing search engines are excellent models for static documents, their usefulness is minimal for timed media, such as an audio or video file. Because timed media is experienced sequentially in time, a user cannot simultaneously experience more than one point of a timed media file. As a result, the ability of the user to find a relevant portion within a timed media file, once the user has found the file itself, is greatly constrained. Therefore, a useful timed media search and retrieval system must not only assist the user in locating a relevant file, but must also assist the user in locating the relevant portions of that file.
Due to the time-dependent nature of viewing such timed media files, locating relevant information contained within timed media files is even more complicated than locating information contained in static text-based files. When searching static text-based files, a user can review the text by seeking occurrences of search terms from within any text viewing application. In contrast, when searching timed media files, a user cannot judge the detailed content of the file any faster than by playing the file through from beginning to end. If only a small portion of a video is of interest to a particular viewer, for example, it is unlikely he or she will identify that portion without viewing the entire file.
Attempts have been made to provide search capability for timed media files. Conventional timed media search systems attempt to solve the foregoing problem by segmenting the timed media files into short sections. The precise length of such sections or scenes is usually determined automatically by sudden visual changes in the timed media, such as those caused by an edit or cut; manually by a human editor; or arbitrarily into clips of roughly uniform length. Each scene is then indexed as if it were a separate document, usually with the help of manually entered keywords. The user can visually skim a list of representative images from the scenes that compose the timed media file, thereby utilizing the visual information inherent in the timed media file itself to select an appropriate starting point for viewing the file. Some timed media search systems also use speech recognition to display a portion of any spoken text from a given scene.
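By way of illustration only, the uniform-length variant of the segmentation approach described above may be sketched as follows. The function names and the keyword layout are hypothetical, and the manually entered keywords are assumed to be supplied per segment number.

```python
def segment_uniform(duration, clip_length):
    """Split a file of `duration` seconds into clips of `clip_length` seconds.

    The final clip may be shorter than the others.
    """
    if clip_length <= 0:
        raise ValueError("clip_length must be positive")
    segments = []
    start = 0.0
    while start < duration:
        end = min(start + clip_length, duration)
        segments.append((start, end))
        start = end
    return segments


def index_segments(segments, keywords_by_segment):
    """Index each segment as if it were a separate document.

    `keywords_by_segment` maps a segment number to its manually
    entered keywords (a hypothetical layout for this sketch).
    """
    index = {}
    for segment_number, keywords in keywords_by_segment.items():
        for keyword in keywords:
            index.setdefault(keyword.lower(), []).append(segments[segment_number])
    return index
```

In this sketch the clip boundaries are fixed before any search takes place, so a returned clip may begin or end in the middle of the material relevant to a given keyword.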
The foregoing method is particularly useful in the field of digital video editing and production processes, as a sequential storyboard is often an ideal presentation of the media. Unfortunately, such an approach is not nearly as useful in the context of factual information searching and retrieval. Users of factual information searching systems are often less interested in the visual information, and a great deal of the factual information-centered timed media content created specifically for the Internet contains little such visual information.
Other conventional timed media systems do not divide a timed media file into segments. Such systems index the precise time at which a particular term is spoken. A user can then search for a particular term and use the search results to begin replaying the timed media file from the precise occurrence of the search term. While this method guarantees that a user can locate the occurrence of the search term, the user still must manually determine how much of the timed media file, before and after the occurrence of the search term, is relevant. Consequently, determining the extent to which the timed media file or particular portions of the timed media file are relevant still requires a significant amount of manual navigation and review of irrelevant content.
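By way of illustration only, an index of the precise times at which terms are spoken may be sketched as follows. The transcript layout, a list of (seconds, word) pairs such as might be produced by speech recognition, and the function names are assumptions introduced for this example.

```python
from collections import defaultdict


def build_time_index(transcript):
    """Map each spoken term to the times (in seconds) at which it occurs.

    `transcript` is a hypothetical list of (seconds, word) pairs.
    """
    index = defaultdict(list)
    for seconds, word in transcript:
        index[word.lower()].append(seconds)
    return index


def find_occurrences(index, term):
    """Return the playback start times for `term`, in ascending order."""
    return sorted(index.get(term.lower(), []))
```

Each result in this sketch is a single point in time. Nothing in the index indicates how far before or after that point the relevant material extends, which is the limitation noted above.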
A further problem exists because of the rigid nature of the aforementioned systems. An important technique for solving the problem of creating a useful timed media search index is to make assumptions about the timed media based upon its origin or intended use. For example, timed media presentations from different industries should use different speech recognition lexicons, and a multi-speaker video conference might be segmented using different processes than a single-speaker speech. The aforementioned systems are fairly limited solutions in that they do not allow the user to customize the processes involved in creating or using a search index. As a result, such systems do not even optimally use their own technologies for indexing particular types of timed media.