1. Field of the Invention
This invention relates to a communication method, a voice transmission apparatus and a voice reception apparatus for use in communication through a non-guarantee type network such as the Internet.
2. Description of the Related Art
As the Internet becomes more widespread, a technique has been proposed wherein voice is transmitted in both directions through Internet networks to effect bidirectional communication similar to that of a public telephone network. A technique of this type is called an Internet phone service.
At present, because a number of networks, each including computers and routers, intervene between the communicating parties, the Internet phone service is affected by the load of the server at each node. As a result, it suffers from delay, unnatural interruption of speech, skipping of voice and so forth, and it is usually the case that complete communication cannot be achieved. This arises from the fact that the Internet is a non-guarantee type network (also called a best effort type network), which guarantees neither the time of arrival of information nor its arrival at all.
In order to solve the problem just described, a real time protocol and a reservation protocol which secures a line have been proposed. However, they still fail to guarantee complete communication between parties because a network is fundamentally shared by a large number of communicating parties.
While the transmission capacity of the Internet itself is naturally one cause of the problem described above, the compression of voice data is another. If the compression ratio is raised, the voice quality deteriorates. If the compression ratio is kept low, the voice quality improves, but because a larger bandwidth of the transmission line is consumed, a delay is produced and skipping, jumping or blanking of voice occurs. In either case, a desired voice quality cannot be obtained.
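The trade-off between compression ratio and consumed bandwidth can be illustrated with a minimal sketch. The figures are assumptions chosen for illustration and do not come from the related art: the uncompressed stream is taken to be telephone-grade PCM (8,000 samples per second, 8 bits per sample), and the available link capacity of 28,800 bit/s is hypothetical. The function names are likewise hypothetical.

```python
def compressed_bitrate(raw_bps: float, compression_ratio: float) -> float:
    """Bit rate of a voice stream after compression by the given ratio."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return raw_bps / compression_ratio


def fits_link(stream_bps: float, link_bps: float) -> bool:
    """True when the stream does not exceed the capacity available on the link."""
    return stream_bps <= link_bps


# Assumption: telephone-grade PCM, 8,000 samples/s at 8 bits per sample.
RAW_PCM_BPS = 8000 * 8

# Assumption: hypothetical capacity available to this call, in bit/s.
LINK_BPS = 28_800

if __name__ == "__main__":
    for ratio in (2, 4, 8):
        bps = compressed_bitrate(RAW_PCM_BPS, ratio)
        status = "fits" if fits_link(bps, LINK_BPS) else "exceeds link"
        print(f"ratio {ratio}: {bps:.0f} bit/s ({status})")
```

Under these assumptions, a low ratio leaves the stream larger than the link can carry, which corresponds to the delay and skipping described above, while a high ratio fits the link at the cost of voice quality, which the sketch does not model.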
One technique for sending a large amount of voice information in a narrow band is to convert the input voice of a talker into character data by a speech recognition technique and to transmit the character data to the reception side. Since the information amount of character data is much smaller than that of voice information, the communication delay can be reduced and, in addition, no problem involved in speech recognition arises. A technique of this type is disclosed, for example, in Japanese Patent Laid-Open Application No. Heisei 60-136450 or Japanese Patent Laid-Open Application No. Heisei 61-256848.
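The difference in information amount between character data and voice data can be estimated with a rough calculation. Every figure in the sketch below is an assumption for illustration, not a value taken from the cited documents: a speaking rate of 150 words per minute, an average of 6 characters per word including the following space, 8 bits per character, and a 64,000 bit/s uncompressed voice stream for comparison. The function name is hypothetical.

```python
def text_bitrate(words_per_minute: float,
                 chars_per_word: float,
                 bits_per_char: int) -> float:
    """Approximate bit rate of speech once it is represented as character data."""
    return words_per_minute * chars_per_word * bits_per_char / 60.0


# Assumption: uncompressed telephone-grade voice, 8,000 samples/s at 8 bits.
VOICE_BPS = 64000

# Assumptions: 150 words/min, 6 characters per word, 8-bit character codes.
TEXT_BPS = text_bitrate(150, 6, 8)

if __name__ == "__main__":
    print(f"text:  {TEXT_BPS:.0f} bit/s")
    print(f"voice: {VOICE_BPS} bit/s")
    print(f"voice/text ratio: about {VOICE_BPS / TEXT_BPS:.0f}")
```

With these assumed figures the character stream needs a few hundred times less bandwidth than the voice stream. The estimate ignores packet headers and any tone or prosody information, so it gives only an order of magnitude.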
The former document discloses a system wherein input voice is first recognized and then converted into data in the form of packets of a packet exchange, and the data are communicated between terminals of the packet exchange. The document proposes the system as a countermeasure to improve the processing efficiency of the exchange. Since the disclosed system involves communication within a single packet exchange network, no countermeasure is taken against a long delay or a load variation which occurs in Internet networks, wherein communication is performed through a large number of unknown nodes. Further, as recited in the document, reproduction is performed only with a uniform tone and no attention is paid to natural voice, so that it is difficult to apply the system to flexible and wide-ranging information transmission.
The latter document discloses another system wherein speech recognition is performed by an originating terminal to obtain character codes, the character codes are sent through an exchange to a terminating terminal, and speech synthesis is then performed by the terminating terminal based on the character codes. Since this system also involves communication within a single communication network, similarly to the system of the former document, no countermeasure is taken against a long delay or a load variation which occurs in Internet networks, wherein communication is performed through a large number of unknown nodes. Also, no countermeasure is taken for real time conversion or for conversion into natural voice.
Accordingly, the systems described above have the following problems to be solved.
The first problem is that conversation in which the meaning can be recognized is disturbed by deterioration of voice, unnatural interruption of speech, skipping of voice or the like. These arise from the transmission capacity of the Internet networks themselves, or from unstable communication or a load variation caused by the intervention of unknown communication paths such as servers provided in multiple stages.
The second problem is that, even in a voice transmission system which employs speech recognition, where only transmission using character codes is involved, mechanical voice is reproduced, so that natural conversation cannot be achieved and a problem such as misunderstanding may occur.