This invention relates to digital speech synthesis circuits capable of being implemented in an integrated circuit device. More specifically, this invention relates to interpolation circuitry utilized to increase the effective data rate in speech synthesis circuits.
Several techniques are known in the prior art for digitizing human speech. For example, pulse code modulation, differential pulse code modulation, adaptive predictive coding, delta modulation, channel vocoders, cepstrum vocoders, formant vocoders, voice excited vocoders, and linear predictive coding techniques of speech digitization are known. The techniques are briefly explained in "Voice Signals; Bit by Bit" on pages 28-34 of the October, 1973 issue of IEEE Spectrum.
In certain applications, and particularly those in which digitized speech is to be stored in a memory, most researchers tend to use the linear predictive coding technique because it produces very high quality speech at rather low data rates. An excellent example of a linear predictive coding system implementable with integrated circuit techniques may be seen in U.S. patent application Ser. No. 901,393, filed Apr. 28, 1978, now U.S. Pat. No. 4,209,836 issued June 24, 1980. The speech synthesis system described in the aforementioned U.S. Pat. No. 4,209,836 utilizes frames of data comprising digital representations of pitch, energy and certain linear predictive coefficients which are utilized to control a digital filter. That system is capable of producing high quality synthetic human speech at a bit rate as low as 1200 bits per second, utilizing a fixed rate of data frame entry. A more accurate representation of human speech may be obtained by increasing the frame rate to a level significantly higher than that described in U.S. Pat. No. 4,209,836; however, a corresponding increase is experienced in the number of bits which must be stored in memory to synthesize a given quantity of human speech. Further, certain aspects of human speech are quite redundant, and may be accurately synthesized utilizing a data rate significantly lower than that disclosed in the aforementioned U.S. Pat. No. 4,209,836. An ideal solution to the aforementioned problem would require a speech synthesis system capable of synthesizing human speech from frames of data which change rapidly during complex periods of human speech and change slowly during redundant periods, thereby minimizing the required bit storage.
A problem encountered in attempting to utilize variable frame rate data in speech synthesis circuits occurs when interpolation calculation is utilized between frames of data to enhance data rate capability. A fixed interpolation system such as that described in U.S. Pat. No. 4,209,836, wherein eight interpolation calculations take place between successive frames of data, is adequate for fixed frame rate systems; however, a variable frame rate system requires much more sophistication in interpolation circuitry. Specifically, during slowly changing periods of speech data, a more accurate portrayal of the human speech waveform may be achieved by increasing the number of interpolation steps between frames. Conversely, during rapidly changing aspects of human speech, few or no interpolations between frames of data are required to accurately synthesize human speech. Thus, in order to solve the aforementioned problem, a speech synthesis circuit must be able to vary the number of interpolation calculations taken between successive frames of speech data. Further, it has been discovered that in certain aspects of the synthesis of human speech, linear interpolation between frames of data portrays human speech more accurately, while in other circumstances nonlinear interpolation may provide greater accuracy.
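The requirement of varying the number of interpolation calculations between successive frames may be illustrated with a minimal software sketch. The sketch is not the circuit of this invention; the function name `interpolate_parameter`, its arguments, and the use of floating point values are assumptions made only for illustration, and simple linear spacing is assumed.

```python
def interpolate_parameter(prev, target, n_interp):
    """Return the values a parameter takes while moving from prev to target.

    n_interp is the number of intermediate (interpolated) values inserted
    between the two frames. It is a hypothetical per-frame setting:
    0 means the parameter jumps straight to the new frame value.
    """
    if n_interp < 0:
        raise ValueError("n_interp must be non-negative")
    total = n_interp + 1  # intermediate values plus the final frame value
    # Evenly spaced intermediate values (linear spacing is assumed here).
    values = [prev + (target - prev) * k / total for k in range(1, total)]
    values.append(target)  # the last value is exactly the new frame value
    return values


# Slowly changing speech: many interpolation calculations between frames.
slow = interpolate_parameter(0.0, 1.0, 15)

# Rapidly changing speech: no interpolation, the new value is used directly.
fast = interpolate_parameter(0.2, 0.8, 0)
```

In this sketch a larger `n_interp` yields a smoother trajectory between two widely spaced frames, while `n_interp` of zero reproduces the frame data without any smoothing.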
It is therefore one object of this invention to improve speech synthesis technology.
It is another object of this invention to provide a speech synthesis system capable of accurately synthesizing human speech over variable frame rates.
It is still another object of this invention to provide a speech synthesis system capable of providing specialized interpolation calculations between adjacent frames of data which are utilized at varying rates.
The foregoing objects are achieved as now described. A speech synthesis system is constructed with a linear predictive filter utilizing coded reflection coefficients to produce digital signals representative of human speech. A variable interpolation circuit within the linear predictive filter allows a variable number of interpolation steps to be calculated between successive values of reflection coefficients. Additionally, a user programmable option allows the user to select a linear, nonlinear, or combination form of interpolation.
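The difference between a linear and a nonlinear form of interpolation may be illustrated with a short software sketch. It is offered only as an illustration under stated assumptions: the function name `interpolation_path` and the `mode` argument are hypothetical, and the nonlinear rule shown (each step covers one half of the remaining distance to the new value) is one possible nonlinear form chosen for simplicity, not necessarily the form used by the circuit. The combination form is not sketched because it is not defined at this point in the text.

```python
def interpolation_path(prev, target, n_steps, mode="linear"):
    """Return n_steps successive values of a reflection coefficient.

    mode="linear":    equal increments, ending exactly on target.
    mode="nonlinear": assumed rule in which each step moves half of the
                      remaining distance, so early steps are large and
                      later steps are small; target is approached but
                      not reached exactly.
    """
    if n_steps < 1:
        raise ValueError("n_steps must be at least 1")
    if mode == "linear":
        return [prev + (target - prev) * k / n_steps
                for k in range(1, n_steps + 1)]
    if mode == "nonlinear":
        values = []
        current = prev
        for _ in range(n_steps):
            current += (target - current) / 2  # halve the remaining gap
            values.append(current)
        return values
    raise ValueError("mode must be 'linear' or 'nonlinear'")


linear_path = interpolation_path(0.0, 1.0, 4, mode="linear")
nonlinear_path = interpolation_path(0.0, 1.0, 3, mode="nonlinear")
```

The linear path changes by the same amount at every step, whereas the assumed nonlinear path changes quickly at first and then settles toward the new coefficient value.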