The present invention relates to simplified electronic voice synthesizers capable of producing quality speech and in particular to internal circuits therein for modulating pitch and speech rate.
In general, the present invention relates to voice synthesizers of the type disclosed in U.S. Pat. No. 4,128,737, issued Dec. 5, 1978, entitled "Voice Synthesizer," and in U.S. Pat. No. 4,130,730, issued Dec. 19, 1978, entitled "Voice Synthesizer," both of which have been assigned to the assignee of the present invention. Both of these patents disclosed voice synthesizers that phonetically synthesize human speech in response to a sequence of digital input command words that identify a sequence of phonemes.
U.S. Pat. No. 4,128,737 described a synthesizer capable of producing remarkably realistic sounding speech, which included control circuits responsive to input command words to vary the overall rate and volume of the speech generated, as well as the duration of each phoneme produced. In particular, each input command word consisted of twelve bits, seven of which were dedicated to phoneme selection to define a particular phoneme, pause, or control function, three of which were dedicated to inflection control, that is, varying the fundamental frequency or pitch of the voiced component of the phoneme, and two of which were dedicated to speech rate timing, that is, varying the normal time duration of the production of any given phoneme. With seven bits dedicated to phoneme selection, the synthesizer had the capacity to recognize 2^7 or 128 different phonemes or commands. When these seven bits assumed one particular state, preferably 0000000, special decoder and control circuits within the synthesizer recognized the command word as a special instruction or flag command, rather than a phoneme selection command. The remaining five bits of a flag command word were then decoded and directed to latch circuits which remembered their state. In particular, the two speech rate bits of the flag command word were directed to special flip-flop circuits which remembered the state of the bits until the next flag command was received. The outputs of these flip-flop circuits were then directed to speech rate modulation circuitry, where they caused a relatively large adjustment in the speech rate in comparison to the effect of the speech rate control bits during a phoneme selection command. Two of the inflection control bits of the flag command were similarly directed to other flip-flop circuits and from there to volume modulation circuits that modulated the over-all volume of a series of phonemes.
Thus, through use of a single command word, a computer or other device driving the voice synthesizer could set the over-all volume and rate of the synthesized speech for any desired number of phonemes following thereafter. When a device driving the synthesizer was properly programmed to use flag command words, the synthesizer generated speech that was more natural sounding and much less monotonic than when the flag command words were not used to vary the rate and volume of the synthesized speech. The two speech rate timing bits and associated circuitry within the synthesizer disclosed in U.S. Pat. No. 4,128,737 enabled the external device driving the synthesizer to make minor changes in the normal duration of any given phoneme. These two bits provided four possible time intervals for each phoneme to be produced, one of the intervals being of normal duration, and the other three being minor variations thereof. These externally programmed rate bits enhanced the ability of the synthesizer to generate extremely realistic-sounding speech by allowing the phonemes to be more contextually precise in time duration.
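By way of illustration only, the handling of the twelve-bit command word described above may be sketched in software as follows. The sketch is not taken from the referenced patent, which describes hardware circuits. The ordering of the fields within the word (phoneme bits highest, rate bits lowest), the choice of which two inflection bits are latched as volume on a flag command, and the names split_word, process, and Latches are all assumptions made for the example.

```python
from dataclasses import dataclass

PHONEME_BITS = 7      # 2**7 = 128 phoneme or command codes
INFLECTION_BITS = 3   # per-phoneme pitch control
RATE_BITS = 2         # per-phoneme duration control (four intervals)
FLAG_CODE = 0b0000000  # phoneme field value that marks a flag command


@dataclass
class Latches:
    """Stands in for the flip-flops that hold flag settings between flag commands."""
    overall_rate: int = 0
    overall_volume: int = 0


def split_word(word):
    """Split a 12-bit word into (phoneme, inflection, rate).

    Assumed layout: [phoneme:7][inflection:3][rate:2], most significant first.
    """
    if not 0 <= word < (1 << 12):
        raise ValueError("command word must fit in twelve bits")
    rate = word & 0b11
    inflection = (word >> RATE_BITS) & 0b111
    phoneme = word >> (RATE_BITS + INFLECTION_BITS)
    return phoneme, inflection, rate


def process(word, latches):
    """Latch the settings of a flag command, or describe the phoneme to produce."""
    phoneme, inflection, rate = split_word(word)
    if phoneme == FLAG_CODE:
        latches.overall_rate = rate
        # Assumption: the low two inflection bits carry the volume setting.
        latches.overall_volume = inflection & 0b11
        return None  # a flag command produces no phoneme
    return {
        "phoneme": phoneme,
        "inflection": inflection,
        "rate": rate,
        "overall_rate": latches.overall_rate,
        "overall_volume": latches.overall_volume,
    }
```

In this sketch a single flag word changes the latched settings, and every phoneme word that follows is reported together with those settings until another flag word arrives, which mirrors the behavior attributed above to the flip-flop circuits.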
U.S. Pat. No. 4,130,730 disclosed a speech synthesizer that is simpler in design, smaller in size, and less expensive than the one in the former patent, yet is nonetheless capable of producing quality speech. The simplifications were made in part by using an eight-bit command word to drive the latter synthesizer. Six bits of the command word are devoted to phoneme selection, which limits the maximum number of phonemes that can be synthesized to 2^6 or 64. The remaining two bits of the command word are dedicated to inflection control, which yields a maximum of four inflection control states: one normal state and three variations thereof. Absent from this synthesizer are some of the very features which gave the former synthesizer its sophistication and flexibility: the extra inflection control bit, the two phoneme timing control bits, and the flag command, decode, and control circuitry which enabled the former synthesizer to modulate the overall pitch and speech rate of the synthesized speech. As a result, the speech produced by the latter synthesizer is relatively monotonic and monospeed.
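The capacity figures given above follow directly from the field widths of the eight-bit command word, as the following minimal sketch illustrates. The placement of the phoneme field in the upper six bits and the name split_byte are assumptions made for the example and are not drawn from the referenced patent.

```python
PHONEME_BITS = 6
INFLECTION_BITS = 2

PHONEME_CAPACITY = 1 << PHONEME_BITS        # 2**6 = 64 selectable phonemes
INFLECTION_STATES = 1 << INFLECTION_BITS    # 2**2 = 4 inflection states


def split_byte(word):
    """Split an 8-bit word into (phoneme, inflection).

    Assumed layout: [phoneme:6][inflection:2], most significant first.
    There is no rate field and no flag code in this shorter word.
    """
    if not 0 <= word <= 0xFF:
        raise ValueError("command word must fit in eight bits")
    inflection = word & 0b11
    phoneme = word >> INFLECTION_BITS
    return phoneme, inflection
```

Because every bit of the word is consumed by phoneme selection and inflection, nothing remains with which an external device could adjust timing or issue a flag command.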
The synthesizer disclosed in the U.S. Pat. No. 4,130,730, however, incorporates a number of unique improvements into its circuits which help improve the quality of the synthesized speech in certain other ways in spite of the aforementioned reduction in sophistication and flexibility. For example, additional inflection variations are derived from internal control signals that control phoneme articulation; a glottal waveform that is more representative of those produced by the human glottis is employed; and a white noise generator is used to provide a component part of the excitation energy provided to the vocal tract under the control of the vocal amplitude control signal to produce a "breathier" sound. These improvements were made without significantly increasing the complexity or cost of the synthesizer. However, the problem with the monotonic and monospeed output remained.
The present invention seeks to maintain the tradition of creating simpler and less expensive synthesizers, while simultaneously improving upon the ultimate understandability of synthesized speech. Accordingly, the principal object of the present invention is to provide a relatively uncomplicated and inexpensive voice synthesizer which internally and automatically modulates pitch and speech rate. Another object of the invention is to provide fairly inexpensive circuitry for accomplishing the principal object within the type of synthesizer disclosed in the U.S. Pat. No. 4,130,730. Yet another object of this invention is to provide a method for improving the understandability of phonetically synthesized speech by providing a synthesizer that automatically varies the pitch and speech rate of the synthesized speech without resorting to the use of externally programmed input command bits.
Other objects, features and advantages of the present invention will become apparent from the subsequent description and the appended claims taken in conjunction with the accompanying drawings.