The present invention deals with a lexicon for use by speech recognition and speech synthesis technology. In particular, the present invention relates to an apparatus and method for compressing a lexicon and accessing the compressed lexicon, as well as the compressed lexicon data structure.
Speech synthesis engines typically include a decoder which receives textual information and converts it to audio information which can be synthesized into speech on an audio device. Speech recognition engines typically include a decoder which receives audio information in the form of a speech signal and identifies a sequence of words from the speech signal.
In speech recognition and text-to-speech (speech synthesis) systems, a lexicon is used. The lexicon can contain a word list and word-dependent data, such as pronunciation information and part-of-speech information (as well as a wide variety of other information). The lexicon is accessed by a text-to-speech system, for example, in order to determine the proper pronunciation of a word which is to be synthesized.
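By way of illustration only, an uncompressed lexicon of the kind described above might be sketched as a mapping from each word to its word-dependent data. The names `WordInfo`, `lexicon`, and `lookup_pronunciations`, the phoneme strings, and the field layout are all hypothetical and are not taken from the present description; the sketch merely assumes that each word carries a list of pronunciations and a list of parts of speech.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class WordInfo:
    """Word-dependent data for one lexicon entry (hypothetical layout)."""
    pronunciations: List[str] = field(default_factory=list)
    parts_of_speech: List[str] = field(default_factory=list)


# Hypothetical uncompressed lexicon: word -> word-dependent data.
# The phoneme strings are illustrative placeholders only.
lexicon: Dict[str, WordInfo] = {
    "read": WordInfo(
        pronunciations=["r iy d", "r eh d"],
        parts_of_speech=["verb"],
    ),
    "lexicon": WordInfo(
        pronunciations=["l eh k s ih k aa n"],
        parts_of_speech=["noun"],
    ),
}


def lookup_pronunciations(word: str) -> List[str]:
    """Return the known pronunciations of a word, or an empty list if absent."""
    info = lexicon.get(word.lower())
    return info.pronunciations if info else []
```

A text-to-speech system could, for example, call `lookup_pronunciations("read")` before synthesizing the word. A structure of this kind is easy to access and extend, but with a large vocabulary it holds every word and all of its associated data in memory in uncompressed form.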
In such systems (speech recognition and text-to-speech) a large vocabulary lexicon is typically a highly desirable feature. However, it is also desirable to perform speech recognition and speech synthesis tasks very quickly. Due to the large number of words which can be encountered by such systems, the lexicon can be extremely large. This can take an undesirable amount of memory, which can make it desirable to compress the lexicon.
Compression of data, however, brings its own disadvantages. For example, many compression algorithms make it cumbersome to recover the compressed data. This often requires an undesirable amount of time, especially with respect to the desired time limitations imposed on speech recognition and speech synthesis tasks. Further, since a conventional lexicon may contain in excess of 100,000 words, along with each word's associated word-dependent data, it can take an undesirable amount of time to build the compressed lexicon based upon an input text file containing the uncompressed lexicon. Similarly, many compression algorithms can render the compressed text non-extensible, or can make it quite cumbersome to extend the compressed data. However, it may be desirable to modify the lexicon by adding or deleting words. Similarly, it may be desirable to add additional word-dependent data to the lexicon or delete certain types of word-dependent data from the lexicon. Therefore, limiting the extensibility of the lexicon is highly undesirable in speech-related systems.