Using an internet search tool on a networked computer to find information of interest has become an everyday occurrence for most of us. As more and more information is posted on the internet, as well as on private networks, the need to search and access that information efficiently has grown dramatically. Search has become a huge business dominated by Google, Inc. of Mountain View, Calif. In response to a search string or query input by a user, a search engine such as Google's considers a host of factors before it delivers a prioritized list of results. Other companies offering similar search services include Microsoft, Yahoo, and IAC, among others.
The most successful search engines use keywords typed in by the user to comb through millions of web pages for the relevant information that their algorithms are programmed to return. For text-based web pages, this type of searching has proven enormously successful. As bandwidth for internet users has expanded, web pages have become far more sophisticated and dynamic, now hosting many different formats such as audio, video and/or A-V recordings that can be played by a user through their internet connection. For purposes of this specification, the terms audio, video, A-V, media and multimedia each refer to files containing content of the respective type, which may be streamed live or played back from a recording. It should be further understood that any of these types of content lends itself to storage and playback in different file formats, which will be discussed throughout the specification. The use of “audio,” “video,” “A-V,” “media” or “multimedia” individually or together throughout this specification is intended to cover any one or more of these content types in one or more formats where appropriate.
As these different types of media have proliferated across the web and become standard on internet web pages, search engines have failed to keep up in their ability to search the content of non-text-based formats such as audio, video and A-V recordings. While text-based pages are predominantly in one of a few formats such as XML, HTML, DOC, or PDF that allow strings of characters to be identified and compared, searching the content of an audio, video or A-V recording is far more challenging.
Of the media players available on the market today, none allows simple, seamless searching and synchronized playing of a selected segment of audio, video and/or A-V content directly from the popular search engines.
However, once the content of recorded material that contains speech, sounds or visual cues is converted to text, that text is searchable by search engines available in the market today. It is worth noting that the search results produced by a search engine analyzing a transcribed audio, video or A-V track are only as accurate as the speech-to-text, sound-to-text or visual-to-text transcription that is performed.
While text-based search engines are widely available for network use, audio, video and/or A-V search tools are not. An example of a limited capability search tool for video is the experimental video search “gadget” (formerly the “Gaudi gadget”) provided by Google of Mountain View, Calif. On Google's web pages dedicated to political videos, a user may search the videos on a limited set of web pages by entering a search term in the search query box. The results from the limited set of web pages are listed, and a user can select a result to be taken to the beginning of the video that contains that term.
Various companies offer software that uses algorithms to automatically produce transcripts from audio, which are then synchronized with the video containing the audio. One such product is MetaPlayer, produced by RAMP, Inc. of Woburn, Mass. (formerly Everyzing, Inc. of Cambridge, Mass.). Companies like RAMP offer search within an individual video by matching a search string against the text of the transcription. When a search string is entered for a particular video, the results are listed. When a result is selected, the video is played from that occurrence of the searched string.
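By way of illustration only, search within a single video of the kind described above may be sketched as a string match against a transcript whose segments carry time offsets. The sketch below assumes a transcript already synchronized with the recording. The names `TranscriptSegment`, `search_transcript` and `playback_offset`, and the sample transcript, are hypothetical and are not drawn from MetaPlayer or any other product.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TranscriptSegment:
    """One piece of a synchronized transcript (hypothetical structure)."""
    start_seconds: float  # offset into the recording where this text begins
    text: str             # transcribed text for this segment


def search_transcript(segments: List[TranscriptSegment],
                      query: str) -> List[TranscriptSegment]:
    """Return the segments whose text contains the query, ignoring case."""
    needle = query.casefold()
    if not needle:
        return []  # an empty query matches nothing
    return [seg for seg in segments if needle in seg.text.casefold()]


def playback_offset(result: TranscriptSegment) -> float:
    """Offset, in seconds, from which a player would start a selected result."""
    return result.start_seconds


# Illustrative transcript; the text and timings are invented for this sketch.
segments = [
    TranscriptSegment(0.0, "Welcome to the town hall meeting."),
    TranscriptSegment(12.5, "Tonight we discuss the education budget."),
    TranscriptSegment(47.0, "The budget vote is scheduled for Friday."),
]

hits = search_transcript(segments, "budget")
offsets = [playback_offset(hit) for hit in hits]
print(offsets)
```

In this sketch each listed result corresponds to one matching segment, and selecting a result simply yields the offset at which playback would begin. As noted above, the quality of such results depends on the accuracy of the underlying transcription.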
The present invention recognizes the desirability of producing advanced search capabilities for audio-only and/or audio-video content, as well as the use of those same capabilities enhanced with accurate transcription and synchronization.