The present invention relates generally to speech recognition and speech synthesis systems. More particularly, the invention relates to developing word-pronunciation pairs.
Computer-implemented and automated speech technology today involves a confluence of many areas of expertise, ranging from linguistics and psychoacoustics to digital signal processing and computer science. The problems of text-to-speech (TTS) synthesis and automatic speech recognition (ASR) present many opportunities to share technology. Traditionally, however, speech recognition and speech synthesis have been addressed as entirely separate disciplines, drawing very little on the benefits that cross-pollination could bring to both.
We have discovered techniques, described in this document, for combining speech recognition and speech synthesis technologies to the mutual advantage of both disciplines in generating pronunciation dictionaries. Having a good pronunciation dictionary is key to both text-to-speech and automatic speech recognition applications. In the case of text-to-speech, the dictionary serves as the source of pronunciation for words entered by graphemic or spelled input. In automatic speech recognition applications, the dictionary serves as the lexicon of words that are known by the system. When training the speech recognition system, this lexicon identifies how each word is phonetically spelled, so that the speech models may be properly trained for each of the words.
In both speech synthesis and speech recognition applications, the quality and performance of the application may be highly dependent on the accuracy of the pronunciation dictionary. Typically, it is expensive and time-consuming to develop a good pronunciation dictionary, because the only way to obtain accurate data has heretofore been through the use of professional linguists, preferably a single linguist to guarantee consistency. The linguist painstakingly steps through each word and provides its phonetic transcription.
Phonetic pronunciation dictionaries are available for most of the major languages, although these dictionaries typically have limited word coverage and do not adequately handle proper names, unusual and compound nouns, or foreign words. Publicly available dictionaries likewise fall short when used to obtain pronunciations for a dialect different from the one for which the system was trained or intended.
Currently available dictionaries also rarely match all of the requirements of a given system. Some systems (such as text-to-speech systems) need high accuracy, whereas other systems (such as some automatic speech recognition systems) can tolerate lower accuracy but may require multiple valid pronunciations for each word. In general, the diversity in system requirements compounds the problem. Because there is no "one size fits all" pronunciation dictionary, the construction of good, application-specific dictionaries remains expensive.
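By way of illustration only, the two roles described above (a single preferred pronunciation for text-to-speech, and possibly several valid pronunciations for speech recognition) can be sketched as a simple lookup structure. The following Python sketch is not part of the disclosed system; the class name, method names, and the ARPAbet-style phoneme symbols are all assumptions chosen for the example.

```python
from collections import defaultdict


class PronunciationDictionary:
    """Hypothetical word-pronunciation store; names are illustrative only."""

    def __init__(self):
        # Maps a lowercased word to an ordered list of pronunciations,
        # each pronunciation being a tuple of phoneme symbols.
        self._entries = defaultdict(list)

    def add(self, word, phonemes):
        """Record a word-pronunciation pair, ignoring exact duplicates."""
        key = word.lower()
        pronunciation = tuple(phonemes)
        if pronunciation not in self._entries[key]:
            self._entries[key].append(pronunciation)

    def pronunciations(self, word):
        """All known pronunciations, as a recognizer lexicon might need."""
        return list(self._entries.get(word.lower(), []))

    def preferred(self, word):
        """First pronunciation added, as a synthesizer might use; None if unknown."""
        known = self.pronunciations(word)
        return known[0] if known else None


# Assumed example data: two pronunciations of one word.
lexicon = PronunciationDictionary()
lexicon.add("tomato", ["T", "AH", "M", "EY", "T", "OW"])
lexicon.add("tomato", ["T", "AH", "M", "AA", "T", "OW"])
```

In this sketch a synthesizer would call `preferred("tomato")` to obtain one pronunciation, while a recognizer would call `pronunciations("tomato")` to obtain every variant. A word absent from the dictionary yields `None` or an empty list, which corresponds to the coverage gap described above for proper names and foreign words.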
The present invention provides a system and method for developing word-pronunciation pairs for use in a pronunciation dictionary. The invention provides a tool that builds upon a window environment to provide a user-friendly methodology for defining, manipulating and storing the phonetic representation of word-pronunciation pairs in a pronunciation dictionary. Unlike other phonetic transcription tools, the invention requires no specific linguistic or phonetic knowledge to produce the pronunciation lexicon. It utilizes various techniques to quickly provide the best phonetic representation of a given word, along with different means for "fine tuning" this phonetic representation to achieve the desired pronunciation. Immediate feedback to validate word-pronunciation pairs is also provided by incorporating a text-to-speech synthesizer. Applications will quickly become apparent as developments expand in areas where exceptions to the rules of pronunciation are common, such as streets, cities, proper names and other specialized terminology.
For a more complete understanding of the invention, its objects and advantages, refer to the following specification and to the accompanying drawings.