1. Field of the Invention
This invention relates to structures and methods for simultaneously multiplying and adding a plurality of signals. This specification describes such a multiplier/adder circuit in the context of artificially synthesizing human speech.
2. Description of the Prior Art
Multiplier/adder circuits are known in the prior art. A typical multiplier/adder circuit of the prior art is a relatively complicated structure requiring the use of a substantial amount of semiconductor material in its fabrication. One particular use for such circuits is in the synthesis of speech utilizing linear predictive coding techniques. A number of techniques exist for synthesizing speech. One technique for synthesizing speech is the phoneme based system. The phoneme based system is based on the principle that most languages can be described in terms of a set of distinctive sounds, or phonemes. For American English, there are approximately 42 phonemes, as shown in FIG. 1. The 42 phonemes for American English are broken down into four broad classes (vowels, diphthongs, semi-vowels, and consonants), and these four broad phoneme classes are broken down into subclasses as shown in FIG. 1. A simplified block diagram for a phoneme based speech synthesis circuit is shown in FIG. 2. The digital representation of each of the phonemes is stored in phoneme memory 1. Speech memory 7 contains the address locations of the phonemes contained in phoneme memory 1, such that phonemes are selected in sequence from phoneme memory 1, thus providing a phoneme string corresponding to the speech to be synthesized. Address locations stored in speech memory 7 are applied via address bus 10 to phoneme memory 1, thus providing an output phoneme string from phoneme memory 1 to digital-to-analog converter 17 through phoneme bus 9. Digital-to-analog converter 17 then converts the digital representation of the phonemes to an analog form which may be applied to other circuitry or a suitable audio transducer (not shown) by output lead 19.
The major disadvantage of the phoneme based speech synthesis system is that the synthesized speech is robotlike, of very poor quality, difficult to understand, and unpleasant and tiring to listen to. An improvement on the phoneme based system utilizes 600 sub-phonemes, thus resulting in better quality than the pure phoneme based system, although the quality of a sub-phoneme based system is still relatively poor.
Another method of artificially synthesizing speech is to simply pulse code modulate a speech signal, and store the pulse code modulated representation in a memory. Such a scheme is shown in the block diagram of FIG. 3. An audio input signal is applied via audio input 19 to pulse code modulation encoder 20. The digital representation of the audio input signal is input to memory 21 from PCM encoder 20 via bus 23. When the speech stored in memory 21 is desired to be synthesized, appropriate addressing circuitry (not shown) causes the digital representation of the speech stored in memory 21 to be output to PCM decoder 22 via output bus 24. PCM decoder 22 then converts this digital representation back into an analog speech signal available at audio output 25.
One disadvantage with using a pulse code modulation scheme, as shown in FIG. 3, for synthesizing speech is that an enormous memory 21 is required for even a modest amount of speech synthesis. For example, assuming a sampling rate of 5 kilohertz, and utilizing 8-bit digital bytes, the bit rate of the pulse code modulation speech synthesis system of FIG. 3 would be 40 kilobits per second. Thus, for 25 seconds of synthesized speech, a rather modest amount, memory 21 must be capable of storing 1,000,000 bits. This large amount of memory required makes pulse code modulation speech synthesis systems impractical for most uses.
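The storage arithmetic above can be checked with a short sketch. The function name and parameter names below are hypothetical and chosen only for illustration; the sampling rate, sample width, and duration are the figures given in the example above.

```python
def pcm_storage(sample_rate_hz, bits_per_sample, duration_s):
    """Return (bit rate in bits per second, total bits stored).

    Hypothetical helper: one fixed-width sample is stored per sampling
    instant, with no compression, as in the PCM scheme of FIG. 3.
    """
    bit_rate = sample_rate_hz * bits_per_sample
    total_bits = bit_rate * duration_s
    return bit_rate, total_bits


# Figures from the example: 5 kHz sampling, 8-bit samples, 25 seconds.
bit_rate, total_bits = pcm_storage(5000, 8, 25)
print(bit_rate, total_bits)
```

With these inputs the bit rate is 40,000 bits per second and the memory requirement is 1,000,000 bits, matching the figures stated above.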
Another method of speech synthesis is called differential pulse code modulation (DPCM) or linear delta modulation. A block diagram of a speech synthesis circuit employing differential pulse code modulation is shown in FIG. 4. This system is identical to the pulse code modulation system of FIG. 3, with the exception that pulse code modulation encoder 20 is replaced with differential pulse code modulation encoder 20a, and pulse code modulation decoder 22 is replaced with differential pulse code modulation decoder 22a. A pulse code modulation encoder converts an audio input sample to a digital representation of the magnitude of the sample voltage. Similarly, a pulse code modulation decoder takes a digital representation and converts it to an analog voltage level. A differential pulse code modulation encoder, on the other hand, converts the amplitude difference between the present sample and the immediately preceding sample to a digital representation. This digital representation of the amplitude difference between the present sample and the immediately preceding sample is stored in memory 21. A differential pulse code modulation decoder converts the differential pulse code modulated bytes stored in memory 21 to an analog signal available at audio output 25 which replicates the audio input signal applied to differential pulse code modulation encoder 20a via audio input 19.
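The difference between the two encodings can be illustrated with a minimal sketch. It operates on integer sample values rather than analog voltages, assumes the sample preceding the first one is zero, and stores each difference exactly (no quantization of the differences), none of which is specified above. The function names are hypothetical.

```python
def dpcm_encode(samples):
    """Encode each sample as its difference from the preceding sample.

    Assumption: the value preceding the first sample is taken as 0.
    """
    differences = []
    previous = 0
    for sample in samples:
        differences.append(sample - previous)
        previous = sample
    return differences


def dpcm_decode(differences):
    """Rebuild the samples by accumulating the stored differences."""
    samples = []
    previous = 0
    for difference in differences:
        previous += difference
        samples.append(previous)
    return samples


original = [10, 12, 15, 14, 9]
encoded = dpcm_encode(original)    # [10, 2, 3, -1, -5]
decoded = dpcm_decode(encoded)     # [10, 12, 15, 14, 9]
```

Because successive speech samples tend to be close in amplitude, the stored differences are generally smaller in magnitude than the samples themselves, which is what allows them to be represented with fewer bits.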
Adaptive quantization methods utilize non-linear quantization steps during the encoding and decoding process. In analog speech signals, non-uniform quantizers may be used to allow greater precision over small amplitude changes than over large amplitude changes. For an adaptive differential pulse code modulation (ADPCM) speech synthesis system resulting in the same quality speech synthesis as a pulse code modulation method utilizing a 40 kilobit per second bit rate, a bit rate of only 24 kilobits per second is required. Thus, the same 25 seconds worth of speech synthesis will require only 600,000 bits utilizing an ADPCM system, compared with the 1,000,000 bits required by the PCM system.
Yet another method of coding and synthesizing speech is known as linear predictive coding (LPC). This method has become the predominant technique for estimating the basic spectral parameters of speech and vocal tract area functions, and for representing speech for low bit rate transmission or storage. LPC is capable of providing extremely accurate estimates of the speech parameters, and is capable of rapid computation of these estimates. LPC is based on the fact that speech samples can be approximated as a linear combination of past speech samples. By minimizing the sum of the squared differences, over a finite interval, between the actual speech samples and the predicted ones, a unique set of predictor coefficients can be determined. The predictor coefficients serve as the weighting coefficients used in the linear combination. One of the great advantages in using linear predictive coding to artificially synthesize speech is that the bit rate required for reliably synthesizing high quality speech is much lower than with many other methods of speech synthesis. For example, a system utilizing linear predictive coding to synthesize speech having quality equal to or greater than the PCM or ADPCM methods mentioned above requires a bit rate of only 2.4 kilobits per second. Thus, for the same 25 seconds worth of synthesized speech, the LPC method requires only 60,000 bits of storage. This is a ten-fold improvement over the storage requirements of a speech synthesis system utilizing adaptive differential pulse code modulation, and a greater than fifteen-fold improvement over the storage requirements of a speech synthesis system utilizing pulse code modulation. For this reason, linear predictive coding is widely used in speech synthesis systems where a minimization of required memory, and thus cost, is desired.
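The least-squares determination of predictor coefficients described above can be sketched as follows. This is only an illustration under stated assumptions: it forms the normal equations directly over the interval of available samples and solves them by Gauss-Jordan elimination, which is one of several ways to carry out the minimization and is not asserted to be the method used in any particular prior art circuit. The function names and the test signal are hypothetical.

```python
def lpc_coefficients(samples, order):
    """Return a[1..order] minimizing sum_n (s[n] - sum_k a[k]*s[n-k])**2.

    The sum runs over n = order .. len(samples)-1. Assumes the normal
    equations are non-singular for the given samples.
    """
    count = len(samples)

    def phi(i, k):
        # Correlation of the signal delayed by i with the signal delayed by k.
        return sum(samples[n - i] * samples[n - k] for n in range(order, count))

    # Augmented matrix of the normal equations: sum_k a[k]*phi(i,k) = phi(i,0).
    rows = [[phi(i, k) for k in range(1, order + 1)] + [phi(i, 0)]
            for i in range(1, order + 1)]

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(order):
        pivot = max(range(col, order), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [value / pivot_value for value in rows[col]]
        for r in range(order):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][order] for i in range(order)]


def predict(samples, coefficients, n):
    """Predict s[n] as a weighted sum of the preceding samples."""
    return sum(a * samples[n - k] for k, a in enumerate(coefficients, start=1))


# Hypothetical test signal obeying s[n] = s[n-1] - s[n-2] exactly.
signal = [1, 1, 0, -1, -1, 0, 1, 1, 0, -1, -1, 0]
coefficients = lpc_coefficients(signal, order=2)   # approximately [1.0, -1.0]
```

For this artificial signal the prediction error is zero, so the recovered coefficients reproduce the recurrence used to generate it. Real speech is only approximately predictable, and the coefficients then give the best fit in the least-squares sense over the chosen interval.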
Such a speech synthesis integrated circuit device utilizing linear predictive coding is described in U.S. Pat. No. 4,209,836 issued June 24, 1980 to Wiggins, et al. A primary disadvantage of prior art speech synthesis circuits utilizing linear predictive coding, including the Wiggins circuit, is the relatively large area required by the integrated circuit. For example, the integrated circuit device of the Wiggins patent measures approximately 210 mils (0.210 inches) by 214 mils (0.214 inches), thus consuming approximately 45,000 square mils. By integrated circuit standards, this is a very large chip, even though it is fabricated utilizing a P-channel MOS process, which is capable of producing rather compact integrated circuits. Specifically, Wiggins' array multiplier 401, which performs digital multiplications, measures approximately 90 mils by 110 mils, for a total area of approximately 10,000 square mils. Further, Wiggins' digital-to-analog converter 426, which converts the digital output of array multiplier 401, measures approximately 40 mils by 60 mils, thus requiring a chip area of approximately 2,500 square mils. Thus, approximately 1/4 of Wiggins' prior art circuit is consumed by array multiplier 401 and digital-to-analog converter 426. Due to the rather large size of Wiggins' integrated circuit, no on-chip memory is provided by Wiggins to store digital representations of speech to be synthesized. Thus, the Wiggins circuit requires an external memory for this purpose.
Other prior art circuits used, for example, for the artificial synthesis of speech also utilize binary multipliers which require rather large semiconductor chip areas, thus increasing their cost and requiring external components. Such binary multipliers are described, for example, by Bartee in the book entitled "Digital Computer Fundamentals", published by McGraw-Hill, 1972 edition, and by Rabiner and Gold in the book entitled "Theory and Application of Digital Signal Processing", published by Prentice-Hall, 1975.