The invention relates to a method and a system for generating a speech recognition dictionary based on greeting recordings in a voice messaging system. The invention finds practical applications in telephone systems, such as Private Branch Exchange (PBX) systems, also called “Key systems”, that have a voice messaging capability and also speech recognition functions, such as the ability to connect a caller to a subscriber of the telephone system (called party) by recognizing the name of the subscriber uttered by the calling party.
Modern telephony brings to consumers a broad range of enhanced functions above the basic telephone service, such as the ability to establish a communication link between two remote locations in a network. Specific examples of such enhanced call-related functions include speech recognition and voice messaging, among many others. An example of speech recognition services that are available today is the ability of a telephone system, such as a PBX system, to effect a connection when the caller utters the name of the subscriber he/she wishes to call. The telephone system uses a speech recognition unit which processes the signal derived from the spoken utterance and tries to match this utterance to vocabulary items in a speech recognition dictionary. The vocabulary items in the speech recognition dictionary are representations of the names of the subscribers serviced by the telephone system. When the speech recognition unit finds the best match to the spoken utterance, the connection with the subscriber associated with the chosen vocabulary item is effected either immediately or after completion of a confirmation dialogue with the caller.
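The matching step described above can be illustrated with a minimal sketch. It assumes that the spoken utterance has already been decoded into a sequence of phoneme symbols, and it substitutes a simple edit distance for the acoustic scoring that an actual speech recognition unit would perform. The names `edit_distance`, `recognize` and `max_cost_without_confirmation`, as well as the phoneme symbols, are hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Optional, Tuple

Transcription = List[str]

def edit_distance(a: Transcription, b: Transcription) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution or match
        prev = curr
    return prev[-1]

def recognize(
    decoded: Transcription,
    dictionary: Dict[str, List[Transcription]],
    max_cost_without_confirmation: int = 0,
) -> Tuple[Optional[str], Optional[int], bool]:
    """Return (best vocabulary item, its cost, whether to confirm with the caller)."""
    best_name: Optional[str] = None
    best_cost: Optional[int] = None
    for name, transcriptions in dictionary.items():
        for transcription in transcriptions:
            cost = edit_distance(decoded, transcription)
            if best_cost is None or cost < best_cost:
                best_name, best_cost = name, cost
    # Assumption: an imperfect match triggers a confirmation dialogue.
    needs_confirmation = best_cost is None or best_cost > max_cost_without_confirmation
    return best_name, best_cost, needs_confirmation

# Illustrative dictionary: subscriber name -> list of transcriptions.
dictionary = {
    "Tom": [["T", "OW", "M"]],
    "Linda": [["L", "IH", "N", "D", "AH"]],
}

exact = recognize(["T", "OW", "M"], dictionary)
close = recognize(["T", "AA", "M"], dictionary)
```

In this sketch an exact match would lead to an immediate connection, whereas a near match would first go through a confirmation dialogue, mirroring the two alternatives mentioned above.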
During the commissioning phase of the telephone system, the speech recognition dictionary is built. Typically, a text-to-transcription unit processes orthographic representations of vocabulary items associated with respective subscriber names. For each vocabulary item, the text-to-transcription unit outputs at least one transcription indicative of the pronunciation of the vocabulary item. Each transcription comprises a plurality of sub-word units, each sub-word unit being associated with a respective speech model. Typically, a speaker-independent model set trained on the basis of a plurality of speakers is used.
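As a rough illustration of the conventional approach just described, the following sketch derives one transcription per subscriber name from its orthography. The letter-to-phoneme table is a deliberately tiny, hypothetical stand-in for the rule set of a text-to-transcription unit, and the function names are assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical letter-to-sound table; a real unit would apply far richer,
# context-dependent rules or consult a pronunciation lexicon.
LETTER_TO_PHONEME: Dict[str, str] = {
    "a": "AE", "b": "B", "d": "D", "e": "EH", "i": "IH", "l": "L",
    "m": "M", "n": "N", "o": "OW", "r": "R", "s": "S", "t": "T",
}

def text_to_transcription(name: str) -> List[str]:
    """Map an orthographic name to a sequence of sub-word units (phonemes).

    Letters missing from the toy table are skipped.
    """
    return [LETTER_TO_PHONEME[ch] for ch in name.lower() if ch in LETTER_TO_PHONEME]

def build_dictionary(names: List[str]) -> Dict[str, List[List[str]]]:
    """Associate each vocabulary item with a list of transcriptions (here, one)."""
    return {name: [text_to_transcription(name)] for name in names}

dictionary = build_dictionary(["Tom", "Linda"])
```

Because every transcription is produced by the same fixed rules, a sketch of this kind yields a single pronunciation per name regardless of how the subscriber actually pronounces it.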
A deficiency of the above-described method is that variations in pronunciations of the subscriber names are not usually provided by the text-to-transcription unit. This problem is particularly noticeable when a subscriber's name is in a language of origin different from that supported by the text-to-transcription unit. In such situations, the pronunciation derived by the text-to-transcription unit may not properly describe the actual pronunciation of the subscriber name. Consequently, the recognition performance for such a name is poor.
Against this background, it is clearly apparent that there exists a need in the industry to provide an improved method and system to generate a speech recognition dictionary, particularly for use in the context of telephone systems that offer speech recognition services to users.
The invention provides a system and a method for generating a speech recognition dictionary by making use of the audio greetings recorded by telephone system subscribers. The audio greetings are played before allowing callers to leave messages in a voice mailbox of subscribers. An individual greeting is audio information that contains the name of the subscriber. This audio information can be processed to generate a transcription indicative of a pronunciation of a vocabulary item in a speech recognition dictionary representative of the subscriber name.
In a specific example of implementation, the individual greeting is an identification message consisting essentially of a signal representative of the name of the subscriber.
Advantageously, using an individual greeting to generate a transcription associated with a vocabulary item allows the speech recognition dictionary to capture a pronunciation of the subscriber name as the subscriber would pronounce it.
In a specific example of implementation, the telephone system is a PBX system including a speech recognition unit capable of effecting a connection when a caller utters the name of the called party (subscriber). The speech recognition process is effected based on the speech recognition dictionary containing the vocabulary items representative of the subscriber names, which have been generated from the individual greetings. As a variant, the vocabulary items are further associated with alternative pronunciations derived on the basis of the orthographic representation of the subscriber name as well as text-to-phoneme rules.
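The variant described above, in which greeting-derived transcriptions are supplemented by rule-derived ones, might be sketched as follows. The sketch assumes that both sets of transcriptions have already been produced elsewhere; the function name `merge_pronunciations`, the example names and the phoneme sequences are hypothetical and purely illustrative.

```python
from typing import Dict, List

Transcription = List[str]
Dictionary = Dict[str, List[Transcription]]

def merge_pronunciations(greeting_derived: Dictionary, rule_derived: Dictionary) -> Dictionary:
    """Combine both sources, listing greeting-derived transcriptions first
    and dropping exact duplicates."""
    merged: Dictionary = {}
    for source in (greeting_derived, rule_derived):
        for name, transcriptions in source.items():
            entry = merged.setdefault(name, [])
            for transcription in transcriptions:
                if transcription not in entry:
                    entry.append(transcription)
    return merged

# Assumed inputs: one transcription obtained from the subscriber's greeting,
# and transcriptions obtained from text-to-phoneme rules.
greeting_derived = {"Nguyen": [["W", "IH", "N"]]}
rule_derived = {
    "Nguyen": [["N", "G", "UW", "Y", "EH", "N"]],
    "Tom": [["T", "AA", "M"]],
}

speech_dictionary = merge_pronunciations(greeting_derived, rule_derived)
```

A subscriber without a recorded greeting, such as the second entry here, would simply keep the rule-derived pronunciation.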
The present invention allows the generation of a speech recognition dictionary when the individual greetings are available.
The invention also extends to a telephone system with voice messaging capability that can generate a speech recognition dictionary from the audio greetings of the subscribers.