A video, podcast, conference call recording or any other media file containing audio can be difficult to navigate without some form of index. Such an index is sometimes provided manually, as offset timestamps alongside text extracts that indicate what a particular section of audio covers. A listener can use these extracts and timestamps to move forward and backward in the media application used for playback. For a longer conference call or other lengthy media, finding the section of interest can be difficult. If the intent is to quickly find and listen to all the areas where certain topics are covered, this can become extremely challenging.
Some solutions allow searching of audio. Products such as Dragon Audio Mining™ allow conversion of voice to text for data mining purposes. Other applications, such as Wordle™, produce word clouds from text streams.
Some solutions involve indexing an audio or media file and storing offset timestamps in a database. Services exist to convert audio files to text, and some include timestamp information as well. The customer must still use the database to search within the data, and the search results often reflect only the occurrences of a search term, not its relative importance. More advanced searching may allow a user to search for terms and see where in a video those terms are mentioned. Other approaches provide extracts of text as a word tree, which is essentially a collection of phrase start points that allows a user to explore the text where similar phrases occur.
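The timestamp-index approach described above can be illustrated with a minimal sketch. It assumes a transcript already split into (offset, text) segments; the sample data and the names `SEGMENTS`, `build_index` and `search` are hypothetical and are not taken from any of the products mentioned. An in-memory dictionary stands in for the database.

```python
from collections import defaultdict

# Hypothetical transcript: (offset in seconds, text) pairs.
SEGMENTS = [
    (0.0, "Welcome to the quarterly product call"),
    (12.5, "First we will cover the router line"),
    (47.0, "The router firmware update ships next month"),
    (95.0, "Next is the switch product line"),
]


def build_index(segments):
    """Map each lowercased word to the offsets of the segments containing it."""
    index = defaultdict(list)
    for offset, text in segments:
        # set() so a word repeated within one segment is recorded once
        for word in set(text.lower().split()):
            index[word].append(offset)
    return index


def search(index, term):
    """Return the sorted offsets at which the term occurs (empty if absent)."""
    return sorted(index.get(term.lower(), []))


if __name__ == "__main__":
    index = build_index(SEGMENTS)
    print(search(index, "router"))
```

A lookup of this kind returns only a flat list of offsets where the term occurs. It carries no indication of how prominent the term is relative to other terms, which is the limitation noted above.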
The existing solutions, however, fail to provide an effective visualization of the text that is linked to the media. As an example, a participant in an hour-long conference covering a variety of a vendor's products may want to see which products were discussed. An effective visual summary would identify those products and enable the participant to easily find the corresponding place in the stream.