This application claims the benefit of Japanese Patent Application No. 2002-349621 filed on Dec. 2, 2002, the contents of which are incorporated by reference herein.
The present invention relates to voice data transmitting and receiving systems and, more particularly, to a voice data transmitting and receiving system capable of preserving the meaning of speech transmitted via a communication path without guaranteed quality of service (QoS), such as the Internet.
The Internet is in common use across borders and all over the world. Electronic commerce and Internet telephone, i.e., Internet protocol (IP) telephone, are attracting attention in addition to such conventional applications as web browsing, electronic mail, and file transfer. This is largely attributable to the rapid advancement not only of networks centered on circuit switching, as in telephone networks, but also of IP networks based on packet switching.
In IP telephone communication, various data, including voice (or FAX) data as well as still images and motion pictures, are converted to IP packets and transferred over an IP-based network. Internet telephone is a voice telephone service based on IP network techniques that utilizes, in part or in full, the same IP network that carries such applications as the World Wide Web.
IP telephone systems fall into the following three types. In the first type, voice messages are exchanged between personal computers that are connected to the Internet by dial-up. This type requires that the same software be installed on both personal computers, which are in turn connected to a server. In the second type, communication is possible only when a call is placed from a personal computer to an ordinary subscriber telephone set (a call in the reverse direction being impossible) or when prearrangements have been made between the two sides. The third type has two variants. In one variant, communication between ordinary subscribers' telephone sets is established by inputting a user ID and a PIN via an Internet telephone gateway located at the point of connection between the Internet and a public telephone exchange. The other variant provides communication between terminals coupled directly to the Internet. These two variants are closest to the existing telephone communication system, and their technical advancement is remarkable.
In the meantime, a system for transmitting a great deal of voice data in a narrow band has been proposed. On the transmission side, an input voice is converted by voice recognition to character data, which are packetized and then transmitted. On the reception side, the received character data are converted back to voice by voice synthesis and output, thereby greatly reducing the transmitted data quantity and avoiding communication delay (see, for instance, Literature 1: Japanese patent laid-open Hei 10-285275). Although this system has the advantage of reducing the transmitted data quantity, it is based on character data transfer. The synthesized voice therefore has a fixed voice quality, which differs from that of the speaker's own voice.
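The scale of the data reduction described above can be illustrated with a rough calculation. The following is only a sketch under stated assumptions: the uncompressed baseline is taken to be telephone-quality PCM at 8 kHz with 8 bits per sample, and the speech rate (15 characters per second) and character size (8 bits) are hypothetical values chosen for illustration. None of these figures appears in Literature 1.

```python
# Rough bit-rate comparison between PCM voice and recognized character data.
# All constants are illustrative assumptions, not values from the cited system.

PCM_SAMPLE_RATE_HZ = 8000      # assumed telephone-quality sampling rate
PCM_BITS_PER_SAMPLE = 8        # assumed sample width
ASSUMED_CHARS_PER_SECOND = 15  # hypothetical speaking rate after recognition
ASSUMED_BITS_PER_CHAR = 8      # hypothetical single-byte character encoding


def pcm_bit_rate(sample_rate_hz=PCM_SAMPLE_RATE_HZ,
                 bits_per_sample=PCM_BITS_PER_SAMPLE):
    """Bits per second needed to carry uncompressed PCM voice."""
    return sample_rate_hz * bits_per_sample


def text_bit_rate(chars_per_second=ASSUMED_CHARS_PER_SECOND,
                  bits_per_char=ASSUMED_BITS_PER_CHAR):
    """Bits per second needed to carry the recognized character data."""
    return chars_per_second * bits_per_char


def reduction_factor():
    """How many times smaller the character stream is than the PCM stream."""
    return pcm_bit_rate() / text_bit_rate()


if __name__ == "__main__":
    print("PCM voice:", pcm_bit_rate(), "bit/s")
    print("Character data:", text_bit_rate(), "bit/s")
    print("Reduction factor: about", round(reduction_factor()))
```

Under these assumptions the character stream is several hundred times smaller than the voice stream, which is consistent with the advantage noted above. The calculation says nothing about the loss of the speaker's own voice quality.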
In IP voice communication via an IP network without guaranteed QoS, such as the Internet or a local network, voice data are usually transmitted and received as Real-time Transport Protocol (RTP) packets carried over the User Datagram Protocol (UDP). RTP is used because importance is attached to the real-time property of data in voice communication and motion picture playback. However, RTP provides no measure against packet loss occurring on the communication path, and lost packets are not retransmitted, which poses voice quality problems such as interruptions of the voice.
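The absence of retransmission described above means that a receiver can only detect a loss, typically from a gap in the 16-bit sequence number carried in each packet header. The following is a minimal sketch, assuming the 12-byte fixed header layout of RFC 3550 with no CSRC entries or header extension, and assuming that packets arrive in order. The function names are hypothetical and do not belong to any particular library.

```python
import struct

# Fixed RTP header: V/P/X/CC byte, M/PT byte, sequence number,
# timestamp, SSRC, all in network byte order (12 bytes in total).
RTP_HEADER = struct.Struct("!BBHII")


def pack_rtp(seq, timestamp, ssrc, payload, payload_type=0):
    """Build an RTP packet (version 2, no padding, extension, or CSRC)."""
    first = 2 << 6                  # version field set to 2
    second = payload_type & 0x7F    # marker bit left clear
    header = RTP_HEADER.pack(first, second, seq & 0xFFFF,
                             timestamp & 0xFFFFFFFF, ssrc)
    return header + payload


def unpack_rtp(packet):
    """Return (seq, timestamp, ssrc, payload) from an RTP packet."""
    first, _second, seq, timestamp, ssrc = RTP_HEADER.unpack_from(packet)
    if first >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    return seq, timestamp, ssrc, packet[RTP_HEADER.size:]


def missing_sequence_numbers(received_seqs):
    """List sequence numbers skipped between consecutive arrivals.

    Assumes in-order arrival; handles wraparound at 65535.
    """
    missing = []
    for prev, cur in zip(received_seqs, received_seqs[1:]):
        gap = (cur - prev) & 0xFFFF
        missing.extend((prev + k) & 0xFFFF for k in range(1, gap))
    return missing


if __name__ == "__main__":
    # Six 20 ms frames of 160 samples each; packets 2 and 3 are lost in transit.
    sent = [pack_rtp(n, n * 160, 0x1234, bytes(160)) for n in range(6)]
    arrived = [p for n, p in enumerate(sent) if n not in (2, 3)]
    seqs = [unpack_rtp(p)[0] for p in arrived]
    print("lost:", missing_sequence_numbers(seqs))
```

The receiver learns which packets are missing but has no way to obtain them again, so each lost packet corresponds to an interval of voice that must be concealed or left silent.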
To cope with these problems, a system has been proposed in which each RTP packet is transmitted together with the data of the preceding and succeeding packets for use in an interpolating process, so that the voice is not interrupted even when a packet is lost. However, in an environment in which data communication other than voice is frequent, voice packet loss is pronounced, and the voice quality deteriorates significantly even with interpolation, sometimes to the point that the meaning of the speech cannot be recognized.
As shown above, real-time voice communication by packet transmission is subject to loss of RTP packets when the communication path environment deteriorates, resulting in lost parts of the voice data. Heretofore, satisfactory communication could be obtained only in good communication environments.
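The redundancy scheme described above, and its limitation under heavy loss, can be sketched as follows. This is an illustration only, not the implementation of the proposed system: the packet layout (a dictionary with `prev`, `cur`, and `next` fields) and the function names are hypothetical, and a lost frame is replaced by a plain copy taken from a neighbouring packet rather than by true interpolation.

```python
def build_packets(frames):
    """Attach copies of the neighbouring frames to every packet.

    Carrying the succeeding frame implies one frame of look-ahead
    delay at the sender.
    """
    packets = []
    for n, frame in enumerate(frames):
        packets.append({
            "seq": n,
            "prev": frames[n - 1] if n > 0 else None,
            "cur": frame,
            "next": frames[n + 1] if n + 1 < len(frames) else None,
        })
    return packets


def reconstruct(received, total):
    """Rebuild the frame sequence, filling gaps from neighbouring packets.

    A frame that cannot be recovered is returned as None.
    """
    by_seq = {p["seq"]: p for p in received}
    frames = []
    for n in range(total):
        if n in by_seq:
            frames.append(by_seq[n]["cur"])
        elif n - 1 in by_seq:
            frames.append(by_seq[n - 1]["next"])   # copy carried by predecessor
        elif n + 1 in by_seq:
            frames.append(by_seq[n + 1]["prev"])   # copy carried by successor
        else:
            frames.append(None)                    # unrecoverable gap
    return frames


if __name__ == "__main__":
    frames = [b"f0", b"f1", b"f2", b"f3", b"f4", b"f5"]
    packets = build_packets(frames)

    # An isolated loss is fully concealed.
    single = [p for p in packets if p["seq"] != 2]
    print(reconstruct(single, len(frames)))

    # A burst of three consecutive losses leaves the middle frame missing.
    burst = [p for p in packets if p["seq"] not in (1, 2, 3)]
    print(reconstruct(burst, len(frames)))
```

In this sketch an isolated loss is recovered completely, whereas a burst of three or more consecutive losses leaves at least one frame with no surviving copy, which corresponds to the deterioration noted above when packet loss is pronounced.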