1. Field of the Invention
The invention relates to video retrieval and more specifically to developing an indexed library by processing a text stream transmitted with a media presentation to enable later retrieval of one or more portions of the media content.
2. Introduction
Multimedia currently provides many advantages to people with some diminished physical capacity. For example, those who are visually impaired have previously had difficulty enjoying a movie or a program. One recent advance that helps in this regard is a descriptive audio track added to media so that such a person can still fully enjoy a media event. As one example, movies are now available with a separate audio track that describes the actions and events taking place on the screen, so that even those unable to see the screen can still gain an understanding of what is taking place in the movie. Further, these descriptive audio tracks are being included when movies are widely distributed in a digital format, so that those with a visual disability can enjoy movies in the home.
Many digital formats are available to the consumer, including downloadable movies in varying formats, digital movies distributed over cable or fiber optics, high-definition movies, and DVD movies. In all of these formats, a descriptive audio stream can be included in order to appeal to the visually impaired. Regardless of the digital format, the descriptive audio track is typically available in a high-quality digital format that an automatic speech recognition program can convert into text.
It is known in the art to use automatic speech recognition (ASR) to retrieve video about a certain subject that is part of the conversation in the presentation. For example, ASR may be run on a movie and, based on the resulting data, the segments that talk about a dog or cat may be retrieved. This approach has deficiencies, however, in that the conversation in the media may never mention “cat” or “dog” even when one appears on screen, and may thus not provide the best indexing mechanism. What is needed in the art is an improved method of retrieving portions of a video presentation.
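By way of illustration only, the limitation of dialogue-based retrieval described above can be sketched as follows. The segment data, the function names build_index and retrieve, and the timestamp layout are all hypothetical and are not drawn from any particular ASR system; the sketch assumes only that ASR yields time-stamped text for each spoken segment.

```python
from collections import defaultdict

# Hypothetical ASR output for the spoken dialogue of a movie:
# (start_seconds, end_seconds, recognized text).
# Assume a dog is on screen during all three segments.
asr_segments = [
    (0.0, 4.0, "where did he go"),
    (4.0, 9.0, "i saw the dog run into the yard"),
    (9.0, 15.0, "come back here boy"),
]


def build_index(segments):
    """Map each spoken word to the time ranges in which it was recognized."""
    index = defaultdict(list)
    for start, end, text in segments:
        # set() avoids listing the same segment twice for a repeated word.
        for word in set(text.lower().split()):
            index[word].append((start, end))
    return index


def retrieve(index, term):
    """Return the time ranges whose dialogue contains the query term."""
    return index.get(term.lower(), [])


index = build_index(asr_segments)
dog_hits = retrieve(index, "dog")  # only the segment where "dog" is spoken
cat_hits = retrieve(index, "cat")  # empty: the word is never spoken
```

In this sketch, a query for "dog" returns only the one segment in which the word is spoken, even though the animal is assumed to be on screen throughout, and a query for a term that is never spoken returns nothing.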