1. Technical Field
The present application relates generally to a system and method for managing an archive of audio data and, more particularly, to an audio processing system and method for segmenting and indexing audio or multimedia files based on audio information such as speaker identity, background and/or channel, for storage in a database, and an information retrieval system and method which utilizes the indexed audio information to search the database and retrieve desired segments of audio/multimedia files.
2. Description of the Related Art
In general, management of an archive is important for maximizing the potential value of the archive. Database management is especially challenging for owners of audio/multimedia archives due to the increasing use of digital media. Indeed, the continuing increase in consumer use of audio and multimedia recording devices for memorializing various events such as radio and television broadcasts, business meetings, lectures, and courtroom testimony, has resulted in a vast amount of digital information that the consumers desire to maintain in an audio/multimedia archive for subsequent recall.
This increasing volume of digital information compels database owners to continuously seek techniques for efficiently indexing and storing such audio data in their archives in some structured form so as to facilitate subsequent retrieval of desired information. Accordingly, a system and method for indexing and storing audio data, and an information retrieval system which provides immediate access to audio data stored in the archive through a description of the content of an audio recording, the identity of speakers in the audio recording, and/or a specification of circumstances surrounding the acquisition of the recordings, are desirable.
The present application is directed to a system and method for managing a database of audio/multimedia data. In one aspect of the present invention, a system for managing a database of audio data files comprises:
a segmenter for dividing an input audio data file into segments by detecting speaker changes in the input audio data file;
a speaker identifier for identifying a speaker of each segment and assigning at least one identity tag to each segment;
a speaker verifier for verifying the at least one identity tag of each segment; and
an indexer for indexing the segments of the audio data file for storage in a database in accordance with the identity tags of verified speakers.
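The four components listed above can be pictured as a simple pipeline. The following Python sketch is illustrative only and is not the disclosed implementation: it assumes each audio frame is reduced to a single number, detects a speaker change where consecutive frames differ by more than a threshold, and stands in for speaker models with one enrolled mean value per speaker. The names `Segment`, `segment`, `identify`, `verify`, `index_file`, and both thresholds are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Segment:
    start: int                      # index of the first frame in the segment
    end: int                        # index one past the last frame
    frames: list                    # toy 1-D features, one float per frame
    tags: list = field(default_factory=list)           # candidate identity tags
    verified_tags: list = field(default_factory=list)  # tags that passed verification


def _mean(values):
    return sum(values) / len(values)


def segment(frames, change_threshold=1.0):
    """Segmenter (toy): split wherever consecutive frames differ sharply."""
    segments, start = [], 0
    for i in range(1, len(frames)):
        if abs(frames[i] - frames[i - 1]) > change_threshold:
            segments.append(Segment(start, i, frames[start:i]))
            start = i
    if frames:
        segments.append(Segment(start, len(frames), frames[start:]))
    return segments


def identify(seg, enrolled):
    """Speaker identifier (toy): tag with the closest enrolled speaker."""
    m = _mean(seg.frames)
    seg.tags = [min(enrolled, key=lambda name: abs(enrolled[name] - m))]


def verify(seg, enrolled, accept_threshold=0.5):
    """Speaker verifier (toy): keep a tag only if the match is close enough."""
    m = _mean(seg.frames)
    seg.verified_tags = [
        t for t in seg.tags if abs(enrolled[t] - m) <= accept_threshold
    ]


def index_file(file_id, frames, enrolled):
    """Indexer: map each verified identity tag to (file, start, end) entries."""
    index = defaultdict(list)
    for seg in segment(frames):
        identify(seg, enrolled)
        verify(seg, enrolled)
        for tag in seg.verified_tags:
            index[tag].append((file_id, seg.start, seg.end))
    return dict(index)
```

In this sketch a segment whose best-matching speaker fails verification is simply left out of the index; how such segments are handled in practice is a design decision the sketch does not settle.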
In another aspect of the present invention, the system further comprises a search engine for retrieving one or more segments from the database by processing a user query based on an identity of a desired speaker.
In another aspect of the present invention, the system for managing a database of audio/multimedia files further indexes audio/multimedia files and data streams according to audio information such as background environment (music, street noise, car noise, telephone, studio noise, speech plus music, speech plus noise, speech over speech), channel (microphone, telephone), and/or the transcription of the spoken utterances. The user may retrieve stored audio segments from the database by formulating queries based on one or more parameters corresponding to such indexed information.
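As a rough illustration of querying on one or more indexed parameters, the Python sketch below filters an in-memory list of segment records. It is a minimal sketch under stated assumptions, not the disclosed search engine: the record fields, the `IndexedSegment` and `search` names, and the exact-match and substring-match semantics are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedSegment:
    file_id: str
    start_s: float      # segment start time in seconds
    end_s: float        # segment end time in seconds
    speaker: str        # verified identity tag
    background: str     # e.g. "music", "street noise", "speech plus music"
    channel: str        # e.g. "microphone", "telephone"
    transcript: str     # transcription of the spoken utterances


def search(db, speaker=None, background=None, channel=None, keyword=None):
    """Return segments matching every parameter that was supplied.

    Parameters left as None are ignored, so a query may use any
    combination of speaker, background, channel, and transcript keyword.
    """
    results = []
    for seg in db:
        if speaker is not None and seg.speaker != speaker:
            continue
        if background is not None and seg.background != background:
            continue
        if channel is not None and seg.channel != channel:
            continue
        if keyword is not None and keyword.lower() not in seg.transcript.lower():
            continue
        results.append(seg)
    return results
```

A query such as `search(db, speaker="alice", channel="telephone")` would then return only the segments tagged with that speaker and recorded over that channel.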
These and other aspects, features and advantages of the present invention will be discussed and become apparent from the following detailed description of preferred embodiments, which is to be read in connection with the accompanying drawings.