This invention relates to the generation of complex waveforms using digital signals and more specifically to the synthesis of speech by digital circuits using linear prediction methods. Disclosed is a digital filter having an array multiplier for use in speech synthesis or waveform generation circuits. The disclosed speech synthesis circuit may be integrated on a single integrated circuit, thereby facilitating its use in various applications in the communication handling industry, including such applications as: teaching machines, communication equipment (e.g., telephones, voice cryptographic equipment, radios, televisions, etc.), and other equipment which generates the sound of a human voice.
Several methods are currently being used and experimented with to digitize human speech. For example, pulse code modulation, differential pulse code modulation, adaptive predictive coding, delta modulation, channel vocoders, cepstrum vocoders, formant vocoders, voice excited vocoders, and linear predictive coding methods of speech digitalization are known. These methods are briefly explained in "Voice Signals: Bit by Bit" at pages 28-34 in the October 1973 issue of IEEE Spectrum.
Computer simulations of the various speech digitalization methods have generally shown that the linear predictive methods of digitizing speech can produce speech having greater voice naturalness than the previous vocoder systems (e.g., channel vocoders) and at a lower data rate than the pulse code modulation systems. As will be seen, the linear predictive systems often make use of a multi-stage digital filter, and as the number of stages of the digital filter increases, the resulting generated speech becomes more natural sounding.
An early application of linear predictive methods to digital speech synthesis occurred in the late 1960's and early 1970's. A historical analysis of some of this early work is set forth in Markel and Gray, "Linear Prediction of Speech" (Springer-Verlag: New York 1976) at pages 18-20.
The multi-stage digital filter used in linear predictive coding is preferably an all pole filter with all roots preferably occurring within the unit circle |z|=1 when the mathematical transfer function of the filter is expressed as a Z-transform. The filter itself may take the form of a lattice filter of the type depicted in FIGS. 2a and 2b; however, other filters, including ladder filters, normalized ladder filters and others, are known, as set forth in Chapter 5 of "Linear Prediction of Speech". As will be seen, each stage of the lattice filter requires two addition operations, two multiplication operations and a delay operation. The filter is excited from either a periodic digital source for voiced sounds or a random digital source for unvoiced sounds. The filter coefficients are preferably updated every few milliseconds while the excitation signal is updated at a faster rate.
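The per-stage arithmetic noted above (two multiplications, two additions and one delay) can be illustrated in software. The following Python sketch is not taken from this disclosure; it assumes the conventional all pole lattice recursion, in which each stage subtracts a coefficient-weighted delayed backward signal from the forward signal and then forms a new backward signal. The function name `lattice_synthesize` and the variable names are hypothetical.

```python
def lattice_synthesize(excitation, k):
    """Run an excitation sequence through an N-stage all pole lattice filter.

    Assumed recursion (conventional form, not quoted from the disclosure):
        f[i-1](n) = f[i](n) - k[i] * b[i-1](n-1)
        b[i](n)   = b[i-1](n-1) + k[i] * f[i-1](n)
    with f[N](n) = excitation sample and output y(n) = f[0](n) = b[0](n).
    """
    n_stages = len(k)
    # b[i] holds the backward signal of stage i from the previous sample;
    # this stored value is the one-sample delay of each stage.
    b = [0.0] * (n_stages + 1)
    output = []
    for u in excitation:
        f = u
        # Work from the input-side stage (N) toward the output-side stage (1).
        for i in range(n_stages, 0, -1):
            f = f - k[i - 1] * b[i - 1]      # one multiplication, one subtraction
            b[i] = b[i - 1] + k[i - 1] * f   # one multiplication, one addition
        b[0] = f  # the output sample feeds the first delay
        output.append(f)
    return output


if __name__ == "__main__":
    # Unit impulse through a two-stage filter with illustrative coefficients.
    print(lattice_synthesize([1.0, 0.0, 0.0, 0.0], [0.5, 0.25]))
```

Because the loop runs from the highest stage down, each `b[i - 1]` is read before it is overwritten, so the stored values act as the one-sample delays without a second buffer.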
In the prior art, the lattice filter network of FIG. 2a has been implemented by appropriately programming large digital computers. Exemplary Fortran programming of a computer for speech synthesis purposes is set forth in the aforementioned "Linear Prediction of Speech". Given the data rate of the excitation signal and the large number of arithmetic operations, i.e., two multiplications and two additions for each stage of a multi-stage filter, and given that increasing the number of stages thereof increases the naturalness of the generated speech, high speed digital computers have been utilized in most speech synthesis work done to date. However, Dr. J. G. Dunn, J. R. Cowan and A. J. Russo of the ITT Defense Communications Division in Nutley, N.J. have attempted to implement a multi-stage filter using metal oxide silicon (MOS) large scale integration techniques. They attempted a multi-processing approach, wherein many arithmetic units are operated simultaneously; however, this technique requires that a very large number of multiplier and adder circuits be implemented on a semiconductor chip. Some discussion of the work done by Dr. Dunn et al. is set forth in "Progress in the Development of Digital Vocoder Employing an Itakura Adaptive Predictor" published in "Telecommunications Conference Records, I.E.E.E. Publ. No. 73" (1973). Replacing the lattice structure of FIG. 2a with various adders and multipliers results in a large and complex semiconductor chip.
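The arithmetic load referred to above grows linearly with both the number of stages and the excitation data rate. The following Python sketch works through that count. The ten-stage filter and the 8,000 samples per second rate are hypothetical figures chosen only for illustration; they are not taken from this disclosure, and the function name is likewise hypothetical.

```python
def lattice_ops_per_second(n_stages, sample_rate_hz):
    """Count arithmetic operations per second for an N-stage lattice filter.

    Each stage needs two multiplications and two additions per sample.
    """
    per_sample = 2 * n_stages
    return {
        "multiplies": per_sample * sample_rate_hz,
        "adds": per_sample * sample_rate_hz,
    }


if __name__ == "__main__":
    # Hypothetical figures: 10 stages at 8,000 samples per second.
    print(lattice_ops_per_second(10, 8000))
```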
It is one object of this invention, therefore, to implement a lattice type filter for generating complex waveforms, such as human speech, on a single semiconductor chip.
It is another object of this invention that the filter components be implemented with MOS devices.
It is still another object of this invention that the resulting MOS filter be of smaller size than that heretofore known in the prior art.
The foregoing objects are achieved as is now described. The digital filter includes a multiplier, one input of which receives the filter coefficients from a memory. The output of the multiplier is applied to one input of an adder/subtractor, whose output is applied to a short delay circuit. The output of the short delay circuit is applied to a long delay circuit. The short and long delay circuits preferably comprise shift registers of short and long lengths, respectively. The output of the long delay circuit is coupled to a latch memory via a switch. The other input to the multiplier is selectively coupled to the output of the adder/subtractor, the output of the short delay circuit or the output of the latch memory. The other input to the adder/subtractor is selectively coupled to the output of the latch memory, the output of the long delay circuit or the output of the adder/subtractor. The multiplier is preferably an array multiplier. The output of the filter is provided at the output of the latch memory, and the input is coupled either to the adder/subtractor or to the multiplier, depending on which of the two disclosed embodiments is used.