A speech synthesizer, a melody generator, or a device combining melodies with synthesized speech is useful in a variety of commercial equipment.
A conventional melody generator, as typically shown in FIG. 1, includes a START ROM 11, a TEMPO COUNTER 13, a RHYTHM COUNTER 15, an ADDRESS COUNTER 17, a MELODY ROM 19, an ENVELOPE COUNTER 12, a TONE COUNTER 14, a D/A CONVERTER 16, a MIXER 18 and an oscillator (OSC) 10, and generates the accessed melody 181 at the MIXER 18.
In response to different trigger signals TG1, . . . TGn, a corresponding melody in the MELODY ROM 19 is selected. The START ROM 11 stores the tempo and start address of each melody in a data structure shown in FIG. 2. The start address 111 selected by the trigger signal TGn is received by the ADDRESS COUNTER 17, which is clocked by a clock signal CLK and sends an address signal 171 to access the contents of the MELODY ROM 19 incrementally.
The MELODY ROM 19 stores information, such as rhythm, tie and tone, of each note in the synthesis sequence corresponding to the selected melody in a data structure shown in FIG. 3.
The tempo, representing the speed of the melody, is determined when an entry in the START ROM 11 is selected by the TGn signal, while the rhythm of each note, representing the relative duration of the note under the specified tempo, is determined by the value of RHYTHM in the MELODY ROM 19.
The TEMPO COUNTER 13 is pre-set by the tempo signal 113. It receives a basic clock 101 from the OSC 10 and divides the frequency of the basic clock 101 according to the value of the tempo signal 113. The greater the value of the tempo signal 113, the lower the frequency of the system clock 131 output from the TEMPO COUNTER 13. When the frequency of the system clock 131 is low, the frequency of the output signal 151 from the RHYTHM COUNTER 15 is also low and, as a result, the speed (tempo) of the melody output 181 from the MIXER 18 is reduced.
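The inverse relation between the tempo value and the system clock frequency can be illustrated with a minimal sketch. It assumes a simple divide-by-N counter in which the division ratio equals the tempo value; the source states only that a greater tempo value yields a lower frequency, so the exact ratio, the function name system_clock_hz and the 32768 Hz oscillator frequency are all illustrative assumptions.

```python
def system_clock_hz(basic_clock_hz: float, tempo_value: int) -> float:
    """Return the divided clock frequency.

    Assumption: the TEMPO COUNTER acts as a divide-by-N counter whose
    ratio N equals the tempo value. The real ratio is not given.
    """
    if tempo_value < 1:
        raise ValueError("tempo value must be at least 1")
    return basic_clock_hz / tempo_value


BASIC_CLOCK_HZ = 32768.0  # hypothetical OSC frequency, not from the source

# A greater tempo value gives a lower system clock, hence a slower melody.
slow = system_clock_hz(BASIC_CLOCK_HZ, 16)
fast = system_clock_hz(BASIC_CLOCK_HZ, 4)
```

Under these assumptions a tempo value of 16 yields a system clock one quarter as fast as a tempo value of 4.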
The rhythm information 191 is output from the MELODY ROM 19 to pre-set the RHYTHM COUNTER 15. When the relative duration of a note, represented by the value of the rhythm information 191, comes to an end, the output signal 151 of the RHYTHM COUNTER 15 changes state once, which increments the ADDRESS COUNTER 17 by one. Therefore, each consecutive note of a melody is accessed sequentially until END information in the MELODY ROM 19 is reached.
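The sequential access described above may be sketched as a loop. The sketch assumes that the rhythm value counts system clock ticks and that END is represented by a distinct marker entry; the Note record, the END constant and the function play_sequence are hypothetical names introduced only for illustration.

```python
from typing import List, NamedTuple, Optional, Tuple


class Note(NamedTuple):
    rhythm: int  # relative duration, assumed here to be in system clock ticks
    tie: int     # 1 if this note is tied to the next one, else 0
    tone: int    # value that presets the TONE COUNTER


END = None  # hypothetical END marker; the real encoding is not specified


def play_sequence(melody_rom: List[Optional[Note]],
                  start_address: int) -> List[Tuple[int, int]]:
    """Return (address, start_tick) for each note until END is reached."""
    address = start_address      # ADDRESS COUNTER loaded with the start address
    tick = 0
    schedule = []
    while melody_rom[address] is not END:
        note = melody_rom[address]
        schedule.append((address, tick))
        tick += note.rhythm      # RHYTHM COUNTER counts out the note duration
        address += 1             # its output then increments the ADDRESS COUNTER
    return schedule


melody_rom = [Note(4, 0, 10), Note(2, 0, 12), Note(2, 1, 14), END]
schedule = play_sequence(melody_rom, 0)
```

With this placeholder data the three notes start at ticks 0, 4 and 6, and the loop stops when the END entry at address 3 is read.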
The tone information 193 from the MELODY ROM 19 is received by the TONE COUNTER 14, which is clocked by the CLK2 signal and generates the OUT signal shown in FIG. 5. In FIG. 5, each square wave of a given frequency corresponds to one tone value stored in the MELODY ROM 19.
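One way to picture how a tone value produces a square wave is a preset down-counter that toggles its output whenever it expires. This is an assumed counter structure, not one confirmed by the source; the function tone_counter_output is hypothetical.

```python
from typing import List


def tone_counter_output(tone_value: int, clk2_ticks: int) -> List[int]:
    """Return the output level sampled on each CLK2 tick.

    Assumption: a down-counter preset to tone_value toggles the output
    each time it reaches zero, so one full period lasts 2 * tone_value
    CLK2 ticks.
    """
    if tone_value < 1:
        raise ValueError("tone value must be at least 1")
    out = []
    level = 0
    count = tone_value
    for _ in range(clk2_ticks):
        out.append(level)
        count -= 1
        if count == 0:
            level ^= 1           # toggle to form the square wave
            count = tone_value   # reload the preset
    return out


wave = tone_counter_output(2, 8)
```

Under this assumption a smaller tone value gives a shorter period and therefore a higher pitch.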
The TIE information 192 from the MELODY ROM 19 is received by the ENVELOPE COUNTER 12, which is clocked by the CLK1 signal and generates a digital ENV signal. The digital ENV signal is fed to the D/A converter 16, and the output of the D/A converter 16, as shown in FIG. 4, is then mixed with the OUT signal by the MIXER 18 to produce the melody output 181 shown in FIG. 6. In the example of FIG. 4, the third note is tied to the fourth note, as indicated by TIE=1, while each of the other notes is not tied to its immediately following note, as indicated by TIE=0.
Evidently, the circuit of FIG. 1 required to generate a melody is complicated and expensive.
One typical speech synthesizer, as shown in FIG. 7, includes a CONTROL CIRCUIT 71, a ROM 73, a SPEECH GENERATOR 75, a D/A converter 77 and an oscillator 79.
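The effect of the TIE bit on the envelope can be sketched as follows. The sketch assumes that the envelope restarts at a peak level at the start of each note unless the preceding note has TIE=1, and that it decays by one step per CLK1 tick; neither the decay shape nor the peak value is given in the source, and the function envelope_levels is hypothetical. Mixing with the OUT signal is not modeled.

```python
from typing import List


def envelope_levels(ties: List[int], steps_per_note: int,
                    peak: int = 7) -> List[int]:
    """Return the digital ENV level for each CLK1 tick.

    ties[i] == 1 means note i is tied to note i + 1, so note i + 1
    continues the decay instead of restarting at the peak (assumed
    behavior). Linear decay and peak=7 are illustrative choices.
    """
    levels = []
    level = peak
    for i in range(len(ties)):
        if i == 0 or ties[i - 1] == 0:
            level = peak                 # new attack for an untied note
        for _ in range(steps_per_note):
            levels.append(level)
            level = max(level - 1, 0)    # decay, clamped at zero
    return levels


# Third note tied to the fourth, as in the example of FIG. 4.
env = envelope_levels([0, 0, 1, 0], steps_per_note=3)
```

Here the fourth note continues from the level at which the third note ended, while the first three notes each begin at the peak.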
As shown in FIG. 7 and FIG. 8, the ROM 73 has three different segments, START ADDR 731, GO COMMAND 732 and SPEECH DATA 733. The data structures of each segment and the access path are shown in FIG. 8 by 81, 82, and 83 respectively.
The START ADDR 731 has the same function as the START ROM 11 of the melody generator of FIG. 1, and stores attribute information and a start address for each speech code TGn which is input to the CONTROL CIRCUIT 71. The GO COMMAND 732 stores a data attribute, a data length and a data address for each basic speech section accessed in the synthesis sequence corresponding to a speech code. The data attributes within the GO COMMAND 732 may include speech playback frequency, length of bytes and LED control signals in accordance with a well-known conventional approach. In a well-known manner, the value of the playback frequency is used to control the operation speed of the speech generator 75 and thereby control the playback speed of the output 771. The SPEECH DATA 733 stores data representing basic speech (sound) sections for synthesis purposes.
As an example, suppose a speech equation TG: HEAD+2*SOUND1+SOUND2+TAIL is programmed into the ROM 73. The START ADDR 731 stores the start address, assumed here to be 00, for accessing this speech equation TG. Address 00 of the GO COMMAND 732 stores the data attribute, data length and data address for the first sound section HEAD. The following address 01 stores the data attribute, data length and data address for the second sound section SOUND1. The next address 02 stores the data attribute, data length and data address for the third sound section SOUND2, and so on. The SPEECH DATA 733, in turn, stores the data required for synthesizing the sound sections HEAD, SOUND1, SOUND2 and TAIL. Furthermore, the SPEECH DATA 733 may store data representing silence, that is, a section in which no speech is generated.
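The three-segment lookup for the speech equation TG may be sketched as a pair of tables. The source does not state how the factor 2 in 2*SOUND1 or the end of the sequence is encoded, so the repeat and last fields below are assumptions, as are all lengths, addresses, the placeholder data and the function expand.

```python
from typing import Dict, List, NamedTuple, Tuple


class GoCommand(NamedTuple):
    name: str     # label for readability only, not a ROM field
    repeat: int   # assumed to be carried in the data attribute
    length: int   # data length in bytes
    address: int  # data address within SPEECH DATA
    last: bool    # hypothetical end-of-sequence flag


SPEECH_DATA = bytes(range(10))        # placeholder sample bytes
START_ADDR: Dict[str, int] = {"TG": 0}
GO_COMMAND: List[GoCommand] = [
    GoCommand("HEAD", 1, 2, 0, False),
    GoCommand("SOUND1", 2, 3, 2, False),
    GoCommand("SOUND2", 1, 3, 5, False),
    GoCommand("TAIL", 1, 2, 8, True),
]


def expand(code: str) -> Tuple[List[str], bytes]:
    """Walk START ADDR -> GO COMMAND -> SPEECH DATA for one speech code."""
    index = START_ADDR[code]
    sections: List[str] = []
    samples = bytearray()
    while True:
        cmd = GO_COMMAND[index]
        for _ in range(cmd.repeat):
            sections.append(cmd.name)
            samples += SPEECH_DATA[cmd.address:cmd.address + cmd.length]
        if cmd.last:
            break
        index += 1
    return sections, bytes(samples)


sections, samples = expand("TG")
```

With these placeholder tables the expansion plays HEAD, SOUND1 twice, SOUND2 and TAIL in order, fetching each section's bytes from the SPEECH DATA table.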
The output of the D/A converter 77 corresponding to the speech equation TG: HEAD+2*SOUND1+SOUND2+TAIL may have the shape shown in FIG. 9. The HEAD causes the output signal to rise from zero to an intermediate value, which biases the external amplifier transistor into an operating range. When the TAIL is encountered, the output signal decreases to the initial zero state. However, the speech synthesizer of FIG. 7 described above is applicable to the production of synthesized speech only.
The conventional art includes several different approaches to producing melody and speech with a single integrated circuit.
Referring to one conventional approach shown in FIG. 10, a melody circuit 102 and a speech circuit 103 are coupled to each other back-to-back in a single monolithic chip 100. However, the individual circuits operate independently of each other, and therefore no substantial benefit results from this conventional approach. Furthermore, it is difficult, if not impossible, to synchronize the melody circuit 102 with the speech circuit 103 in this configuration.
Referring to another conventional approach shown in FIG. 11, the OSC circuit 114 and the control circuit 112 are common to the speech circuit 115 and the melody circuit 117 in a single monolithic chip 110. No further saving of common circuits is achieved in this configuration, and synchronization between the speech circuit 115 and the melody circuit 117 is still not readily implemented.
Referring to still another conventional approach shown in FIG. 12, the MELODY ROM 120 and the SPEECH ROM 122 are integrated together in a single monolithic chip 118 and are distinguishable by the labels M and S. The advantages of this design reside in the easy synchronization between the melody circuit 125 and the speech circuit 127, and in the interchangeable operation of the melody circuit 125 and the speech circuit 127. However, this configuration does not allow speech and melody to be output at the same time, since both functions use a common DATA ROM including the MELODY ROM 120 and the SPEECH ROM 122. Only one item of melody data or speech data can be accessed at any time.
U.S. Pat. No. 4,613,985 discloses a synthesizer with the function of developing melodies. The synthesizer includes a memory storing the sequence of synthesis for each word and melody, a synthesized word generator providing audible indications of respective speech and a melody generator providing melodies in the form of a synthesized sound. The selected melodies are audibly delivered by fetching their associated sequence of synthesis from the memory.