1. Field of the Invention
The present invention relates to a speech synthesis system in which a plurality of clients and at least one voice synthesizing server are connected to a local area network (LAN).
2. Description of the Related Art
Systems for synthesizing voice from text data at a client's request and transmitting the result to the client have become popular. These systems include a voice synthesizing server and more than one client on a LAN. FIG. 1 shows a basic configuration of such systems. A client 1 includes a text input unit 11, a text sending unit 12, a waveform receiving unit 13, and a voice output unit 15. A voice synthesizing server 2 includes a text receiving unit 21 for receiving text data sent from the text sending unit 12, a pronunciation symbol generating unit 22, an acoustic parameter generating unit 23, a waveform generating unit 24, and a waveform sending unit 26 for sending to the client 1 a voice waveform synthesized by the waveform generating unit 24.
When text data are input from the text input unit 11 of the client 1, the text sending unit 12 sends the text data to the voice synthesizing server 2. The voice synthesizing server 2 receives the text data at the text receiving unit 21, and the pronunciation symbol generating unit 22 converts the text data to pronunciation symbol strings representing how the text is actually pronounced. Then, the acoustic parameter generating unit 23 converts the pronunciation symbol strings to voice-parameters-in-time-series, and the waveform generating unit 24 generates a voice waveform according to the voice-parameters-in-time-series. Finally, the waveform sending unit 26 sends the generated voice waveform to the client 1.
The client 1 receives the voice waveform at the waveform receiving unit 13, and the voice output unit 15 reproduces the voice waveform as voice.
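The server-side stages described above (text, pronunciation symbols, voice-parameters-in-time-series, waveform) can be illustrated with the following minimal sketch. It is not the method of any actual synthesizer: the function names, the letter-per-symbol mapping, the (frequency, duration) parameter format, and the 8 kHz sample rate are all hypothetical placeholders chosen only to show how data flows from one unit to the next.

```python
import math
from typing import List, Tuple

# Hypothetical stand-in for the pronunciation symbol generating unit 22.
# Assumption: each ASCII letter is treated as one "pronunciation symbol".
def generate_pronunciation_symbols(text: str) -> List[str]:
    return [ch for ch in text.lower() if "a" <= ch <= "z"]

# Hypothetical stand-in for the acoustic parameter generating unit 23.
# Assumption: each symbol maps to one (frequency in Hz, duration in s) pair.
def generate_acoustic_parameters(symbols: List[str]) -> List[Tuple[float, float]]:
    return [(200.0 + 10.0 * (ord(s) - ord("a")), 0.05) for s in symbols]

# Hypothetical stand-in for the waveform generating unit 24.
# Produces 16-bit signed samples by rendering each parameter pair as a sine tone.
def generate_waveform(params: List[Tuple[float, float]],
                      sample_rate: int = 8000) -> List[int]:
    samples: List[int] = []
    for freq, duration in params:
        count = int(round(duration * sample_rate))
        for i in range(count):
            value = 0.5 * math.sin(2.0 * math.pi * freq * i / sample_rate)
            samples.append(int(32767 * value))
    return samples

# Chains the three stages in the order in which the server applies them.
def synthesize(text: str, sample_rate: int = 8000) -> List[int]:
    symbols = generate_pronunciation_symbols(text)
    params = generate_acoustic_parameters(symbols)
    return generate_waveform(params, sample_rate)
```

In this sketch, only the final list of samples corresponds to what the waveform sending unit 26 would transmit to the client 1; the intermediate symbol and parameter lists never leave the server.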
The above-described conventional speech synthesis system has a problem in that it places a heavy traffic load on the LAN, because the system transmits voice data (synthesized voice waveforms) directly between the client 1 and the voice synthesizing server 2.
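The scale of this traffic can be illustrated with a rough estimate comparing the size of a text request with the size of the waveform returned for it. The figures used below (8 kHz sampling, 16-bit mono samples, a speaking rate of 15 characters per second) are assumptions chosen for illustration only and do not come from any particular system.

```python
# All default values are illustrative assumptions, not measured figures.
def estimate_waveform_bytes(num_chars: int,
                            chars_per_second: float = 15.0,
                            sample_rate: int = 8000,
                            bytes_per_sample: int = 2) -> int:
    """Estimate the size of an uncompressed waveform for spoken text."""
    duration_seconds = num_chars / chars_per_second
    return int(duration_seconds * sample_rate * bytes_per_sample)

def waveform_to_text_ratio(text: str) -> float:
    """Ratio of estimated waveform bytes to the bytes of the text itself."""
    text_bytes = len(text.encode("utf-8"))
    if text_bytes == 0:
        return 0.0
    return estimate_waveform_bytes(len(text)) / text_bytes
```

Under these assumptions, a 150-character ASCII sentence occupies 150 bytes as text but roughly 160,000 bytes as a waveform, so the reply is on the order of a thousand times larger than the request.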
Additionally, because the conventional speech synthesis systems exchange data of a fixed type between a server and a client regardless of the resources (functions) of either, they have another problem in that the resources of the client 1 are poorly utilized. That is, even if the client 1 has the function of generating pronunciation symbols, in the system shown in FIG. 1 the data sent from the client 1 to the voice synthesizing server 2 are text data only. Thus, the function of the client 1 is not utilized efficiently.
Furthermore, the client 1 may not have a D/A converting function, and a user of such a client 1 cannot reproduce as voice the digital data sent from the voice synthesizing server 2. Therefore, the conventional systems have a further problem in that only clients having the D/A converting function can receive voice data.
Recently, dictionary retrieving systems have become popular, too. These systems comprise, in the above-described local area network, a dictionary retrieving server for storing word data. When a user of the client 1 requests the retrieval of a specific word, the dictionary retrieving server retrieves the meaning and the phonetic symbols of the word, and transmits all of this information to the client. When a word is retrieved, it would be very convenient to obtain the pronunciation of the word as voice together with its meaning. However, no conventional system provides this function.
Schedule managing systems are also commonly used. These systems store schedule data input by a user and inform the user of the data through a message, etc. when the scheduled date arrives. Such systems would be more useful if they output the content of the schedule as voice. However, no conventional system provides such a function.