1. Field of the Invention
This invention relates generally to the search of a digital audio database such as a series of voice messages stored in a telephone answering device, compact disk (CD), audio cassette tape, etc. More particularly, it relates to an efficient and useful technique and apparatus for searching an audio database for the identification and/or playback of a particular voice message, audio track, etc. containing a particular voice clip.
2. Background of Related Art
Digital audio databases in common use today include such devices as the Flash memory of a telephone answering device, the hard drive of a voice messaging system, the CD of a stereo system, or the audio cassette tape of a tape player system. While each of these systems is capable of storage and/or playback of uncompressed digital audio, data compression techniques are often employed to maximize the storage capacity of a given digital audio storage medium.
Whether or not digital compression is employed, the audio can be characterized as being stored as a representation of an analog waveform signal, which does not lend itself to digital searches for particular audio content.
The Flash memory, hard drive, CD, cassette tape, etc. are capable of storage of a significant amount of audio (e.g., from 30 minutes to hours or more).
Conventionally, the entries of a digital audio database (e.g., on a CD, in Flash memory, etc.) are separated either by silent periods or beeps, or by track numbers. To advance or rewind to a particular message, audio track, etc., a user can either designate an absolute message number or audio track number for playback, or designate a relative number of messages or tracks to skip forward or backward. In either case, the conventional search mechanism for such a digital audio database is limited to the identification of a particular message or audio track based on its position within the database (i.e., message number or audio track number), and not based on the substance within any particular message.
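By way of illustration only, the position-based selection described above might be sketched as follows in Python. The class name PositionalPlayer, its method names, and the clamping behavior at the ends of the database are assumptions made for this sketch and are not taken from any particular conventional device.

```python
from dataclasses import dataclass


@dataclass
class PositionalPlayer:
    """Hypothetical player that can locate entries only by their position."""

    track_count: int
    current: int = 1  # 1-based number of the current message or audio track

    def select_absolute(self, number: int) -> int:
        """Jump to an absolute message number or audio track number."""
        if not 1 <= number <= self.track_count:
            raise ValueError(f"no entry numbered {number}")
        self.current = number
        return self.current

    def skip_relative(self, offset: int) -> int:
        """Skip forward (positive) or backward (negative) by a number of entries.

        Assumption: skipping past either end stops at the first or last entry.
        """
        self.current = max(1, min(self.track_count, self.current + offset))
        return self.current
```

Nothing in such an interface refers to what is said within a message or track, which is the limitation noted above.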
Conventional search mechanisms for digital audio databases work sufficiently for applications where a user knows which particular voice message or audio track the user would like to listen to. However, if a user does not know which particular voice message or audio track contains a particular passage which the user desires to hear, the user typically must listen to all messages or audio tracks until the desired message or audio track is found.
For instance, FIG. 5 shows a conventional voice messaging machine (e.g., a telephone answering device) including a digital audio database comprised in voice message memory.
More particularly, in FIG. 5, a telephone answering device 11 is connected to a telephone company central office 13 via a telephone line 15. A telephone line interface (TLI) 17 in the telephone answering device 11 provides the conventional isolation, DC and AC impedance as required by telephone company standards. The telephone line interface 17 also provides a ring detect signal to a controller 19. The controller 19 is typically a suitable microprocessor, microcontroller, or digital signal processor (DSP). The ring detect signal indicates to the controller 19 the ringing of an incoming telephone call on the telephone line 15.
After a desired number of ring signals, the telephone answering device 11 causes the telephone line interface 17 to place the telephone line in an off-hook state, and instructs a voice recorder/playback module 21 to play a pre-recorded outgoing greeting message over the telephone line 15 to the calling party. Upon completion of the outgoing greeting message, the calling party may leave a voice message in voice message memory 23 under the control of the controller 19. A plurality of voice messages recorded by a corresponding plurality of calling parties form a plurality of digital audio segments forming a database within the voice message memory 23.
A user of the telephone answering device 11 later selects a particular voice message (i.e., digital audio segment) from the database in the voice message memory 23 by message number, or perhaps using Caller ID text information associated with a particular underlying voice message.
Upon manual selection, the user retrieves the recorded voice message from the voice message memory, using the user control keys 25 or other buttons or controls for selecting various modes of operation, and then deletes the voice message if desired. The user control keys 25 include an alphanumeric twelve-key keypad 25a to allow the user to manually dial a telephone number and use the telephone answering device 11 as an otherwise conventional telephone (using a handset, not shown). The user control keys 25 further include voice message playback control buttons such as REW, FF, STOP, PLAY 25d, and REC.
To make room for new voice messages, voice messages may be deleted using a delete message button 25c or other appropriate control. When deleted, the entire voice message is effectively erased from the voice message memory 23 (e.g., by allowing new voice messages to overwrite all portions of the deleted voice message).
A voice message number display 200 indicates a sequential message number, e.g., 1, 2, 3 . . . to assist the user in selection of a particular voice message from the voice message memory 23.
FIG. 6A illustrates an exemplary voice message table 800 contained in one sector of the voice message memory 23.
In particular, in FIG. 6A, the message table 800 contains various header information relating to an underlying voice message stored in the same or linked page of voice message memory 23. Conventional header type information includes a time/date stamp 802 indicating the time and date when an underlying speech message was stored. TAG information 804 in the header contains user defined data. Typically, to maximize efficiency in the conventional digital answering machine 11, the speech data is encoded. Thus, the header includes coder information 806 identifying the type of encoding used to encode the underlying voice message data, e.g., the particular coded data rate. The new/old information 808 entry in the header of the message table 800 relates to whether or not the underlying speech message has been reviewed at least once by the user of the digital answering machine 11. The deleted/non-deleted information 810 in the header conventionally indicates whether or not the underlying voice message has been deleted by the user. The number of bytes in the last sector information 812 relates to the length of the voice message in the last sector in which the voice message is stored, avoiding replay of the unused end portion of a partially used last sector when replaying the relevant voice message. Link list information 814 in the header indicates the addresses of all sectors used to store the relevant voice message. Some systems include additional header information 816 in the message table 800.
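For illustration, the header entries of the message table 800 might be modeled in Python roughly as shown below. The field names, the field types, and the helper count_new_messages are assumptions of this sketch; the description above specifies only what each entry represents, not how it is laid out in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class MessageTable:
    """Illustrative model of one message table (names and types are assumed)."""

    time_date_stamp: datetime      # 802: when the speech message was stored
    tag: bytes                     # 804: user defined data
    coder_info: str                # 806: type of encoding, e.g., coded data rate
    is_new: bool                   # 808: True until reviewed at least once
    is_deleted: bool               # 810: True once deleted by the user
    bytes_in_last_sector: int      # 812: used length of the last sector
    link_list: List[int] = field(default_factory=list)  # 814: sector addresses


def count_new_messages(tables: List[MessageTable]) -> int:
    """Count messages that are neither reviewed nor deleted."""
    return sum(1 for t in tables if t.is_new and not t.is_deleted)
```

The additional header information 816 is omitted from the sketch because its content is not specified.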
FIG. 6B shows an exemplary speech data sector 900 in the voice message memory 23 containing the underlying voice message 902-908. The speech data sector 900 shown in FIG. 6B is the first listed in the link list 814 of the message table 800 for the underlying voice message. Zero, one or more pages of speech data 902-908 may be listed in the link list 814 of a message table 800 for a single voice message.
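The manner in which the link list 814 and the number of bytes in the last sector 812 could together be used to recover a voice message may be sketched as follows. The function read_message, the dictionary standing in for the voice message memory 23, and the eight-byte sector size in the usage example are hypothetical and chosen only to keep the example small.

```python
from typing import Dict, List


def read_message(
    memory: Dict[int, bytes],
    link_list: List[int],
    bytes_in_last_sector: int,
) -> bytes:
    """Concatenate the sectors named in the link list, in listed order.

    Only the first `bytes_in_last_sector` bytes of the final sector are kept,
    so the unused end portion of a partially used last sector is not replayed.
    """
    if not link_list:
        return b""  # no sectors listed, so there is no speech data to return
    *full_sectors, last_sector = link_list
    data = b"".join(memory[address] for address in full_sectors)
    return data + memory[last_sector][:bytes_in_last_sector]
```

For example, with memory = {3: b"AAAAAAAA", 7: b"BBBBBBBB", 2: b"CCC" + bytes(5)}, the call read_message(memory, [3, 7, 2], 3) returns the sixteen bytes of sectors 3 and 7 followed by b"CCC".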
Voice messages stored in conventional digital audio databases such as the Flash memory voice message memory 23 of a telephone answering device are selected using identification information such as a message number or Caller ID information. However, if the user does not have knowledge of the substantive content of a particular voice message, they are not able to substantively search the particular voice message but rather must listen to the voice message to manually determine the substantive content of the message.
There is a need for an improved search technique and apparatus for automatically locating a particular message or audio track containing particular content within a larger digital audio database such as a CD, Flash memory, hard drive, cassette tape, etc., without requiring a user to manually browse through the audio tracks or messages stored in the digital audio database.
In accordance with the principles of the present invention, a digital audio search module comprises a voice clip search module, a digital audio database including a plurality of audio segments, and a text storage medium including a plurality of textual information relating to a corresponding plurality of portions of at least one of the plurality of audio segments.
A method of marking individual entries of a digital audio storage medium for textual search in accordance with another aspect of the present invention comprises associating individual text with each of a plurality of portions of each audio segment stored in the digital audio storage medium. A time stamp is provided with each associated individual text.
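A minimal sketch of such marking is given below, under stated assumptions. The names TextMark and TextStore are hypothetical, and the time stamp is assumed here to be an offset in seconds from the start of the audio segment. How the text for each portion is obtained is outside the scope of the sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TextMark:
    """Text associated with one portion of one audio segment (assumed layout)."""

    segment_id: int      # which audio segment the portion belongs to
    text: str            # individual text associated with the portion
    time_stamp_s: float  # assumed: offset of the portion within the segment


class TextStore:
    """Hypothetical text storage medium holding the marks for all segments."""

    def __init__(self) -> None:
        self._marks: List[TextMark] = []

    def mark(self, segment_id: int, text: str, time_stamp_s: float) -> TextMark:
        """Associate text, together with a time stamp, with a segment portion."""
        if time_stamp_s < 0:
            raise ValueError("time stamp must not be negative")
        entry = TextMark(segment_id, text, time_stamp_s)
        self._marks.append(entry)
        return entry

    def marks_for(self, segment_id: int) -> List[TextMark]:
        """Return the marks of one segment, ordered by time stamp."""
        return sorted(
            (m for m in self._marks if m.segment_id == segment_id),
            key=lambda m: m.time_stamp_s,
        )
```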
A method of searching for an individual audio segment stored on a digital audio storage medium in accordance with yet another aspect of the present invention comprises entering a desired text string for search. A text storage medium is searched for at least a close match of the desired text string. A location of a corresponding audio portion of the desired text string is determined in the individual audio segment.
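One possible way to carry out such a search is sketched below. The use of difflib.SequenceMatcher from the Python standard library as the measure of a close match, the 0.8 threshold, and the (segment, time stamp, text) entry layout are all assumptions of this sketch; the method described above does not prescribe a particular matching technique.

```python
from difflib import SequenceMatcher
from typing import Iterable, Optional, Tuple

# Assumed entry layout: (segment_id, time_stamp_s, text)
Entry = Tuple[int, float, str]


def find_voice_clip(
    entries: Iterable[Entry],
    desired_text: str,
    threshold: float = 0.8,  # assumed cut-off for "at least a close match"
) -> Optional[Tuple[int, float]]:
    """Return (segment_id, time_stamp_s) of the closest matching text entry.

    Returns None when no stored text is similar enough to the desired text.
    """
    wanted = desired_text.strip().lower()
    best_location: Optional[Tuple[int, float]] = None
    best_score = 0.0
    for segment_id, time_stamp_s, text in entries:
        score = SequenceMatcher(None, wanted, text.strip().lower()).ratio()
        if score > best_score:
            best_location, best_score = (segment_id, time_stamp_s), score
    return best_location if best_score >= threshold else None
```

In this sketch the returned time stamp stands for the location of the corresponding audio portion within the identified audio segment.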
After the location of the voice clip is found, the audio track or voice message can be played back from the beginning of the audio track or voice message, from the location of the voice clip, or from another point in relation to the located voice clip.
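The choice of playback starting point might be expressed as in the following sketch. The enumeration StartMode, the function playback_start, and the representation of positions as seconds from the start of the audio track or voice message are hypothetical and serve only to illustrate the alternatives named above.

```python
from enum import Enum
from typing import Optional


class StartMode(Enum):
    """Assumed names for the playback starting points."""

    SEGMENT_START = "segment_start"        # beginning of the track or message
    CLIP_LOCATION = "clip_location"        # location of the located voice clip
    RELATIVE_TO_CLIP = "relative_to_clip"  # some offset from that location


def playback_start(
    clip_time_s: float,
    mode: StartMode,
    offset_s: float = 0.0,
    segment_length_s: Optional[float] = None,
) -> float:
    """Return the position, in seconds, at which playback should begin."""
    if mode is StartMode.SEGMENT_START:
        return 0.0
    start = clip_time_s
    if mode is StartMode.RELATIVE_TO_CLIP:
        start += offset_s  # a negative offset starts before the voice clip
    # Keep the result inside the segment (the upper bound is optional).
    start = max(0.0, start)
    if segment_length_s is not None:
        start = min(start, segment_length_s)
    return start
```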