Phonetically-driven electronic speech synthesizers conventionally include a filter network or model which simulates the characteristics of the human vocal tract. The vocal tract filter network or model receives input signals indicative of vocal and/or fricative sounds in the phoneme to be synthesized, and provides an output to an appropriate speaker or the like. Each available phoneme has associated therewith a number of parameters for effectively controlling poles of the vocal tract filter network or model, as well as controlling amplitude and timing characteristics of input and output signals to or from the vocal tract. To synthesize words or phrases, necessary phoneme parameter signals are fed in turn to the synthesizer electronics.
U.S. Pat. No. 3,836,717 discloses a phonetically-driven speech synthesizer in which a multiplicity of phoneme speech parameters are stored in a read-only-memory (ROM) matrix addressable by six bits of an eight-bit phoneme input code. The remaining two bits of the eight-bit input code control the pitch of synthesized vocal sounds. The selected parameters for each phoneme are fed through resistor ladder networks for conversion to analog signals, and then fed through lowpass filter networks to simulate the dynamic sluggishness of a human vocal tract. Vocal and fricative sounds from separate sound generators are combined and directed to a vocal tract which includes a series of tuned frequency-domain resonant filters for combined amplitude and frequency control as a function of the filtered phoneme parameters. Among the phoneme parameters stored in the ROM matrix are the constants which define the poles of the resonant filter vocal tract, and parameters which operate on vocal and fricative sounds to simulate interaction of successive phonemes.
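The addressing scheme described above for U.S. Pat. No. 3,836,717, in which six bits of an eight-bit input code select a phoneme and the remaining two bits select pitch, may be illustrated by the following minimal sketch. The bit layout (phoneme in the low six bits, pitch in the high two bits) and the name split_input_code are assumptions made for illustration only; the text above does not state which bit positions carry which field.

```python
from typing import Tuple

# ASSUMPTION: the phoneme occupies the low six bits and pitch the high two bits.
# The source text does not specify the bit positions.
PHONEME_BITS = 6
PHONEME_MASK = (1 << PHONEME_BITS) - 1  # 0b00111111


def split_input_code(code: int) -> Tuple[int, int]:
    """Split an eight-bit input code into (phoneme_index, pitch_level)."""
    if not 0 <= code <= 0xFF:
        raise ValueError("input code must fit in eight bits")
    phoneme_index = code & PHONEME_MASK  # 0..63, would address one ROM row
    pitch_level = code >> PHONEME_BITS   # 0..3, would select a pitch setting
    return phoneme_index, pitch_level
```

Under this assumed layout, six bits address up to 64 phoneme entries and two bits allow four pitch settings.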
U.S. Pat. No. 3,908,085 discloses an improvement on the synthesizer of the aforementioned patent, in which the vocal tract comprises series-connected tunable filters which receive duty-cycle control signals as a function of phoneme parameters. U.S. Pat. No. 4,209,844 discloses a speech synthesizer in which a digital time-domain lattice filter network is alternately connected to a vocal or a fricative sound source for receiving digital data indicative of sounds to be uttered. The digital lattice filter network, which is implemented in a custom integrated circuit, performs a series of multiplications and summations on input data under control of filter pole-indicating coefficients which vary between the decimal equivalents of minus 1 and plus 1. Other prior art patents of background interest are U.S. Pat. Nos. 4,128,737, 4,130,730, 4,264,783 and 4,433,210.
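The multiply-and-sum operation of a digital time-domain lattice filter of the general kind attributed above to U.S. Pat. No. 4,209,844 may be illustrated by the following sketch. It uses one common textbook form of the all-pole lattice synthesis filter and is not taken from the cited patent; the function name lattice_synthesize, the sign convention, and the use of floating-point arithmetic are assumptions for illustration. In this form, coefficients with magnitude less than 1 yield a stable filter.

```python
from typing import List, Sequence


def lattice_synthesize(excitation: Sequence[float], k: Sequence[float]) -> List[float]:
    """Run an excitation signal through an all-pole lattice filter.

    k holds the stage coefficients k_1..k_M, each assumed to lie
    strictly between -1 and +1. Textbook form, not the patented circuit.
    """
    order = len(k)
    # b[i] holds the backward-path value of stage i from the previous sample.
    b = [0.0] * (order + 1)
    out = []
    for x in excitation:
        f = x  # forward-path value entering the highest stage
        for i in range(order, 0, -1):
            f = f - k[i - 1] * b[i - 1]         # one multiply and one sum
            b[i] = b[i - 1] + k[i - 1] * f      # one multiply and one sum
        b[0] = f  # the output also feeds the backward path
        out.append(f)
    return out
```

With a single stage and k = [0.5], a unit impulse produces the decaying sequence 1.0, -0.5, 0.25, -0.125, which corresponds to a single pole at -0.5.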
Although speech synthesizers of various constructions have been developed and marketed in accordance with one or more of the above-noted patents, a number of deficiencies remain. For example, speech synthesizers heretofore proposed are generally characterized by substantial bulk and expense, severely limiting the scope of commercial applications. Furthermore, devices heretofore proposed do not simulate human speech as closely as desired in terms of certain types of phonetic sounds (i.e., combined voice/fricative sounds) and certain types of sound transitions between adjacent interacting phonemes.
A general object of the present invention is to provide a speech synthesizer and method of operation which are compact and versatile in design and implementation, which are economical to fabricate and market, which are readily amenable to programming for articulation of differing phoneme strings, and which generate phonetic sounds which closely simulate human speech. A further object of the invention is to provide a speech synthesizer and method of the described character in which parameters such as pitch and speech rate may be varied at will by an operator.