Modern technologies have made it possible to communicate using different devices and in different forms. Among all possible forms of communication, speech is often the preferred one. For example, service companies increasingly deploy interactive response (IR) systems in their call centers that automate the process of answering customers' inquiries. This may save these companies millions of dollars that would otherwise be needed to run a call center staffed by human operators. In situations where a communication device lacks display real estate, speech may become the only meaningful way to communicate. For example, a person may check electronic mail using a cellular phone. In this case, the electronic mail messages may be read (instead of displayed) to the person through text-to-speech processing. That is, electronic mail messages in text form are converted into a synthesized speech waveform, which is then played back to the person via the cellular phone.
When speech is used for communication, it is desirable that the synthesized speech sound natural. One approach to generating natural-sounding synthesized speech is to select phonetic units from a large unit database. However, the size of a unit database used by a text-to-speech processing mechanism may be constrained by factors related to the device (e.g., a computer, a laptop, a personal digital assistant, or a cellular phone) on which the text-to-speech processing mechanism is deployed. For example, the memory size of the device may limit the size of the unit database.