1. Field of the Invention
This invention relates generally to data compression, and more particularly to a system and method using correspondence techniques to compress a pronunciation guide.
2. Description of the Background Art
Computer Random Access Memory (RAM) and disk space are becoming more available and affordable in desktop computer systems. A typical desktop computer system currently provides on the order of sixteen megabytes of RAM and one gigabyte of hard disk storage. This increasing availability gives programmers the freedom to create application programs and data files that occupy several megabytes. However, minimizing the size of data files remains important for optimizing system performance and the use of memory resources.
To minimize storage requirements, programmers compress large data files. One type of large file is a pronunciation dictionary, which includes dictionary words for a language such as American English and dictionary phonemes (phonetic sounds) representing the pronunciation of each of the dictionary words. A typical uncompressed pronunciation dictionary occupies up to about ten megabytes of memory.
Information such as a pronunciation dictionary can be compressed using certain symbols to replace redundant data. For example, a typical compression technique assigns symbols to represent particular patterns of redundant data such as multiple zeros or ones. Multiple compression techniques may be performed successively to eliminate more redundancies and compress data further. Accordingly, a pronunciation dictionary may be compressed to around thirty percent or less of its original size.
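As an illustration of the kind of symbol substitution described above, the following minimal sketch uses run-length encoding, in which each run of repeated zeros or ones is replaced by a (symbol, count) pair. Run-length encoding is chosen here only as one familiar example and is not necessarily the technique used by any particular prior system; the function names rle_encode and rle_decode and the sample data are hypothetical.

```python
from itertools import groupby


def rle_encode(bits: str) -> list[tuple[str, int]]:
    """Collapse each run of identical characters into a (symbol, count) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(bits)]


def rle_decode(pairs: list[tuple[str, int]]) -> str:
    """Expand (symbol, count) pairs back into the original string."""
    return "".join(symbol * count for symbol, count in pairs)


# Hypothetical sample data: long runs of zeros and ones.
data = "0" * 8 + "1" * 6 + "0" * 12 + "1" * 2
encoded = rle_encode(data)

# The 28-character input is represented by four pairs,
# and decoding restores the input exactly.
print(encoded)
print(rle_decode(encoded) == data)
```

Because the encoding is lossless, a second technique can be applied to the encoded output, which is how successive passes remove further redundancy.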
Previous techniques for compressing pronunciation dictionaries do not take into account the redundancies inherent in the relationship between dictionary words and dictionary phonemes. Therefore, as a complement to other techniques for compressing a pronunciation dictionary, it is desirable to have a system and method for taking advantage of redundancies in pronunciation.