Speech can be recorded for a variety of purposes and by many different techniques. Recorded speech can be used as a way of storing information, and it is often desirable to retrieve portions of that recorded speech later for reference purposes. Recorded speech is valuable as stored and retrieved information for a number of reasons. First, in most cases, speech is the most natural way to communicate. Second, transcribing speech to text is expensive. Third, listening to recorded speech is possible even while a person is busy with something else (e.g., while driving). Fourth, compared to text, speech contains additional information about the speaker's mood and feelings. Fifth, storing recorded speech is inexpensive since it consumes only a small amount of storage capacity.
However, it can be difficult to locate specific content within a large amount of recorded speech. For this reason, saving hours of recorded speech as an information reference source has to date been ineffective and inefficient, because finding the relevant information has required listening to hours of recording in order to locate the segment of speech that contains it. Consequently, recorded speech has rarely been used as a reference source.
For example, the media network CNN, which provides a 24-hour news broadcast, produces 24 hours of recorded speech every day. The majority of this content is informative and would constitute an excellent reference source for students and researchers. Currently, however, the raw recordings are not searchable, making it impossible to use the audio track as a reference source. Accordingly, what is needed is a system and method that overcomes the above-identified problem. The present invention addresses such a need.