On the Internet, which has developed remarkably in recent years, a voice has been delivered from a server to a client by compressing the voice into waveform data (for example, a .wav or .au file) and transferring that waveform data.
On the Internet, users tend to avoid downloading a home page that requires a large quantity of data to be transferred. A key to popularizing voice communications is therefore to enable waveform data, which has a large data size, to be transferred as a small quantity of data.
One technology addressing the transfer-rate problems in voice communications described above is disclosed in Japanese Patent Publication No. HEI 5-52520. This publication discloses a technology in which a voice is divided into voice source information and voice route information corresponding to the voice source information. The voice source information and the corresponding voice route information are then synthesized into a voice when desired.
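The division into voice source information and voice route information can be illustrated with a generic source-filter sketch. This is only an assumption for illustration: the data formats used in the cited publication are not described here, so the pulse-train source, the short list of filter coefficients standing in for the route, and the names `pulse_train` and `synthesize` are all hypothetical.

```python
from typing import List


def pulse_train(period: int, length: int) -> List[float]:
    """Hypothetical voice source: one unit pulse every `period` samples."""
    return [1.0 if n % period == 0 else 0.0 for n in range(length)]


def synthesize(source: List[float], route: List[float]) -> List[float]:
    """Combine source and route by convolving the source with the
    route coefficients (a simple FIR filter, assumed for illustration)."""
    out = []
    for n in range(len(source)):
        acc = 0.0
        for k, h in enumerate(route):
            if n - k >= 0:
                acc += h * source[n - k]
        out.append(acc)
    return out


# The source carries timing (pitch); the route shapes each pulse.
source = pulse_train(period=4, length=8)
route = [1.0, 0.5, 0.25]
voice = synthesize(source, route)
```

In this sketch, only the compact source and route descriptions would need to be transferred, and the waveform is rebuilt on the receiving side.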
However, because the Internet is a communication network used by many unspecified persons, a client generally accesses arbitrary voice source information, namely voice-generating information, on a server and fetches it. In this process, the client cannot confirm whether the voice route information it has prepared, namely voice tone information, matches the accessed voice-generating information.
If the speaker who provided the voice tone information is the same as the speaker who provided the voice-generating information, and the conditions under which the two were made are also the same, a voice can be reproduced by voice synthesis without any problem. If the speakers or the conditions differ, however, the synthesized voice may be reproduced inappropriately: because the amplitude is specified as an absolute amplitude level and the voice pitch is specified as an absolute pitch frequency, the amplitude pattern inherent to the voice tone information is not reflected.
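The effect of absolute specification can be shown with a small sketch. Everything in it is assumed for illustration: `VoiceTone` is a hypothetical stand-in that reduces voice tone information to a baseline pitch and amplitude, and `render_relative` is one possible alternative shown only for contrast, not a method stated in this text.

```python
from dataclasses import dataclass
from typing import List, Tuple

Frame = Tuple[float, float]  # (pitch in Hz, amplitude level)


@dataclass
class VoiceTone:
    """Hypothetical voice tone information for one speaker."""
    base_pitch_hz: float
    base_amplitude: float


def render_absolute(pitches_hz: List[float], amplitudes: List[float],
                    tone: VoiceTone) -> List[Frame]:
    # Absolute values are used as given; `tone` is deliberately unused,
    # so nothing inherent to the voice tone information is reflected.
    return list(zip(pitches_hz, amplitudes))


def render_relative(pitch_ratios: List[float], amplitude_ratios: List[float],
                    tone: VoiceTone) -> List[Frame]:
    # Assumed alternative: values are ratios scaled by the tone's baseline.
    return [(tone.base_pitch_hz * p, tone.base_amplitude * a)
            for p, a in zip(pitch_ratios, amplitude_ratios)]


low_voice = VoiceTone(base_pitch_hz=100.0, base_amplitude=0.5)
high_voice = VoiceTone(base_pitch_hz=200.0, base_amplitude=1.0)

# The same absolute specification yields identical frames for both voices.
same = (render_absolute([120.0, 150.0], [0.8, 0.6], low_voice)
        == render_absolute([120.0, 150.0], [0.8, 0.6], high_voice))
```

With the absolute form, both voices receive exactly the pitch and amplitude values of whoever produced the voice-generating information, which is the mismatch described above.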