This invention relates to the synchronous control of the transfer of digital speech coefficients in a speech synthesis circuit and particularly a speech synthesis circuit capable of being implemented on one, or a few, integrated circuit chips.
Several techniques are known in the prior art for digitizing human speech. For example, pulse code modulation, differential pulse code modulation, adaptive predictive coding, delta modulation, channel vocoders, cepstrum vocoders, formant vocoders, voice excited vocoders and linear predictive coding techniques of speech digitization are known. The techniques are briefly explained in "Voice Signals: Bit by Bit" on pages 28-34 of the October 1973 issue of IEEE Spectrum.
In certain applications and particularly those in which the digitized speech is to be stored in a memory, most researchers tend to use the linear predictive coding technique because it produces very high quality speech using rather low data rates. Linear Predictive Coding systems usually make use of a multi-stage digital filter. In the past, the digital filter has typically been implemented by appropriately programming a large scale digital computer. However, in U.S. patent application Ser. No. 807,461, filed June 17, 1977, since abandoned in favor of continuation U.S. application Ser. No. 905,328 filed May 12, 1978, now U.S. Pat. No. 4,209,844 issued June 24, 1980, there is taught a particularly useful digital filter for a speech synthesis circuit, which digital filter may be implemented on an integrated circuit using standard MOS or equivalent technology. A theoretical discussion of linear predictive coding can be found in "Speech Analysis and Synthesis by Linear Prediction of the Speech Wave" at Volume 50, number 2 (part 2) of The Journal of the Acoustical Society of America.
Disclosed herein is a talking learning aid which utilizes speech synthesis technology for producing human speech. A complete talking learning aid is disclosed, so, in addition to describing the speech synthesis circuits in detail, the details of the controller for the learning aid and the Read-Only-Memory devices used to store the digitized speech are also disclosed. Of course, those practicing the present invention may wish to practice the invention in conjunction with a talking learning aid, such as that described herein, other learning aids or any other application wherein the generation of human speech from digital data is desirable. Using the techniques described in the aforementioned U.S. Pat. No. 4,209,844 and the teachings disclosed herein will permit those desiring to make use of digital speech technology to do so with one, or a small number of relatively inexpensive integrated circuit devices.
The present invention relates to the synchronous control of the transfer of speech coefficients in the speech synthesis circuits, as aforementioned. During the development of the speech synthesis circuits described herein, it was discovered that by synchronously timing the transfer of data, as opposed to asynchronously timing the transfer of data, the circuits used to implement the speech synthesizer could be significantly simplified. Such simplification is an important objective in any electronic device, including integrated circuits, because it tends to (1) reduce the size of the device and hence the cost thereof and (2) improve device yield rates during manufacture.
It was, therefore, one object of this invention to simplify speech synthesis circuits.
It was another object to reduce the physical size of speech synthesis integrated circuit devices.
It was yet another object to improve yield rates during the manufacture of speech synthesis integrated circuit devices.
The foregoing objects are achieved as is now described. The speech synthesis circuit has an input port for receiving frames of digital speech coefficients and preferably an interpolator circuit. The interpolator circuit slowly interpolates the data received so that the digital speech coefficients used by a digital filter of the speech synthesis circuit need be updated from memory less frequently than would otherwise be the case. This further reduces the amount of memory required to store the digital speech coefficients from which the speech synthesis circuit generates digital speech signals representative of human speech. The generation of synthesized speech of high quality depends upon an absence of abrupt changes in the speech parameters (i.e., digital speech coefficients) which control the digital filter of the speech synthesis circuit. The interpolator circuit enables the speech parameter values to be changed in a consistent and smooth manner. Without interpolation, the speech parameters would have to be updated more frequently from the memory in which the digital speech coefficients are stored, which would result in a higher data rate and increased memory storage requirements. The interpolator circuit is effective to produce a plurality of intermediate estimated values, or interpolated values, of digital speech coefficients for each of a plurality of speech parameters in the time interval between receipt of successive frames of digital speech coefficients comprising the speech parameters. The interpolated values derived by the interpolator circuit are therefore estimated values of digital speech coefficients lying between the values of the speech coefficients of the previous frame of data and the values of the speech coefficients of the current frame of data. A memory coupled to the interpolator circuit stores the interpolated values of the speech coefficients.
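By way of illustration only, the production of intermediate coefficient values between two successive frames may be sketched in software as follows. The sketch assumes simple linear interpolation in equal steps, which is an assumption made for clarity and is not necessarily the scheme used by the interpolator circuit disclosed herein; the function name `interpolate_frame` and its parameters are hypothetical.

```python
def interpolate_frame(previous, current, steps=8):
    """Return `steps` sets of intermediate coefficient values.

    `previous` and `current` are the coefficient lists of two successive
    frames. Linear interpolation in equal steps is an assumption of this
    sketch; the final set equals `current`.
    """
    if len(previous) != len(current):
        raise ValueError("frames must hold the same number of coefficients")
    if steps < 1:
        raise ValueError("steps must be at least 1")

    interpolated = []
    for k in range(1, steps + 1):
        fraction = k / steps  # runs from 1/steps up to 1.0
        interpolated.append(
            [p + (c - p) * fraction for p, c in zip(previous, current)]
        )
    return interpolated


if __name__ == "__main__":
    # Two hypothetical frames of two coefficients each.
    for values in interpolate_frame([0, 8], [8, 0], steps=8):
        print(values)
```

In this sketch each coefficient moves toward its new value in eight equal increments, so no single update changes a coefficient by more than one eighth of the difference between the two frames.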
A synchronous timing circuit is provided for generating a data frame timing signal, interpolation count timing signals and parameter count timing signals at predetermined times. The rate of the parameter count timing signals is a multiple of the rate of the interpolation count timing signals, which is, in turn, a multiple of the rate of the data frame timing signal. In the embodiment disclosed, these timing signals are generated by Programmed Logic Arrays (PLA's) which are driven by an interpolation counter and a parameter counter. Specifically, the frame timing signal may be generated every 20 milliseconds, the interpolation count timing signals may be generated 8 times between each frame timing signal, or approximately every 2.5 milliseconds, and the parameter count timing signals may be generated 13 times for each interpolation timing period, or approximately every 0.2 milliseconds. The data frame signal controls the receipt of a new frame of data at the input port while the interpolation count timing signal controls the initiation of a sequence of interpolations by the interpolator circuit. The parameter count timing signals control when each coefficient is received at the input port after a data frame timing signal has occurred and also control the transferring of particular speech parameters between the interpolator circuit and the memory. Preferably, an input memory is coupled to the input port for storing the most recently received speech coefficients from a frame of data.
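The relationship among the three timing signals may be illustrated by the following software sketch, which models the interpolation counter and the parameter counter as nested loops. It assumes the example figures given above (a 20 millisecond frame, 8 interpolation counts per frame and 13 parameter counts per interpolation count); the names `timing_events`, `FRAME_PERIOD_MS` and the like are hypothetical and do not correspond to signals of the disclosed embodiment.

```python
FRAME_PERIOD_MS = 20.0             # example frame period from the text
INTERPOLATIONS_PER_FRAME = 8       # interpolation counts per frame
PARAMETERS_PER_INTERPOLATION = 13  # parameter counts per interpolation count


def timing_events(frames=1):
    """Yield one tuple per parameter count timing signal.

    Each tuple is (time_ms, interpolation_count, parameter_count,
    frame_signal, interpolation_signal). Every signal is derived from the
    same pair of counters, which is what makes the scheme synchronous.
    """
    ticks_per_frame = INTERPOLATIONS_PER_FRAME * PARAMETERS_PER_INTERPOLATION
    tick_ms = FRAME_PERIOD_MS / ticks_per_frame
    tick = 0
    for _ in range(frames):
        for ic in range(INTERPOLATIONS_PER_FRAME):
            for pc in range(PARAMETERS_PER_INTERPOLATION):
                frame_signal = ic == 0 and pc == 0  # once per frame
                interpolation_signal = pc == 0      # once per interpolation count
                yield (tick * tick_ms, ic, pc, frame_signal, interpolation_signal)
                tick += 1


if __name__ == "__main__":
    for event in timing_events(frames=1):
        if event[4]:  # print only the start of each interpolation period
            print(event)
```

Because the parameter count rate is an exact multiple of the interpolation count rate, and that in turn an exact multiple of the frame rate, a single count sequence suffices to determine when each transfer occurs.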