Today, various types of communications such as electronic mail and WWW (World Wide Web) communications are performed on the Internet by using IP (Internet Protocol) packets (see Non-patent literature 1).
The Internet, widely used today, is a best-effort network, in which delivery of packets is not guaranteed. Therefore, communication that performs retransmission control using TCP (Transmission Control Protocol) (see Non-patent literature 2) is often used to ensure more reliable packet transmission. However, in communications in which real-time nature is essential, such as VoIP (Voice over Internet Protocol) communications, if retransmission control is performed to resend a lost packet when packet loss occurs, the arrival of packets is significantly delayed. The number of packets stored in the receiving buffer must then be set to a large value, which spoils the real-time nature. Therefore, communications such as VoIP communications are typically performed by using UDP (User Datagram Protocol) (see Non-patent literature 3), which does not use retransmission control. However, this poses the problem that packet loss occurs during network congestion and the speech quality is consequently degraded.
One conventional approach to preventing speech quality degradation without resending packets is to send duplicates of the same packet, in a number that depends on the packet loss rate during transmission, to increase the probability that the packet arrives and thereby prevent speech interruptions (see Patent literature 1). However, packet loss occurs most frequently during network congestion, and if an excessive number of duplicated packets is sent in such a state, the increase in the amount of information sent and in the number of sent packets aggravates the network congestion and consequently further increases the number of packet losses. Another problem is that, because duplicated packets are sent constantly while the packet loss rate is high, the network transmission interface is overloaded, resulting in packet transmission delay.
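The trade-off described above can be illustrated with a minimal sketch. It assumes that each copy of a packet is lost independently with the same probability, which is an assumption of this example and not a statement from Patent literature 1; losses during congestion tend to occur in bursts, so the figures are optimistic. The function names `arrival_probability` and `copies_needed` and the `max_copies` limit are hypothetical.

```python
def arrival_probability(loss_rate: float, copies: int) -> float:
    """Probability that at least one of `copies` identical packets arrives.

    Assumes independent losses with the same loss rate for every copy.
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1.0 - loss_rate ** copies


def copies_needed(loss_rate: float, target: float, max_copies: int = 16) -> int:
    """Smallest number of copies whose arrival probability reaches `target`."""
    for copies in range(1, max_copies + 1):
        if arrival_probability(loss_rate, copies) >= target:
            return copies
    raise ValueError("target not reachable within max_copies")


if __name__ == "__main__":
    for loss_rate in (0.05, 0.2, 0.4):
        n = copies_needed(loss_rate, target=0.99)
        # Sending n copies multiplies the offered traffic by n.
        print(f"loss rate {loss_rate:.2f}: {n} copies, traffic x{n}")
```

Under this model the number of copies, and with it the offered traffic, grows as the loss rate rises, which is exactly the situation in which the network is already congested.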
An approach to preventing speech quality degradation due to packet loss without increasing delay is speech data compensation. For example, the method in G.711 Appendix I (see Non-patent literature 4) fills a lost segment by repeating data from the past pitch period. However, this method has the problem that abnormal noise occurs if the lost speech data lies in a region in which the signal changes drastically, such as a speech rising period, because the speech data synthesized from the past data has a power and pitch different from those of the original speech.
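The following is a simplified sketch of the pitch-repetition idea only. It is not the G.711 Appendix I algorithm, which also specifies how the pitch period is estimated and how the repeated waveform is smoothed and attenuated. Here the pitch period is assumed to be known, and the names `conceal_by_pitch_repetition` and `energy` are hypothetical.

```python
def conceal_by_pitch_repetition(history, pitch_period, frame_len):
    """Fill a lost frame by repeating the last pitch period of `history`.

    `history` holds the most recent correctly received samples.
    `pitch_period` is given in samples and is assumed to be known.
    """
    if pitch_period <= 0 or pitch_period > len(history):
        raise ValueError("pitch_period must be in 1..len(history)")
    last_cycle = history[-pitch_period:]
    return [last_cycle[i % pitch_period] for i in range(frame_len)]


def energy(samples):
    """Sum of squared sample values."""
    return sum(x * x for x in samples)


if __name__ == "__main__":
    # Quiet periodic signal followed by a much louder frame (a speech rise).
    history = [1, -1] * 4
    true_next_frame = [4, -4] * 2
    concealed = conceal_by_pitch_repetition(history, pitch_period=2, frame_len=4)
    print(concealed, energy(concealed), energy(true_next_frame))
```

In this toy case the concealed frame keeps the low power of the past data while the true frame is much louder, which illustrates the power mismatch described above for a rising segment.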
Another approach has been proposed in which the sending end assumes that packet loss will occur at the receiving end (see Patent literature 2). The sending end synthesizes a speech waveform by repeating a speech waveform of the pitch length in the current frame and, if the quality of the synthesized speech waveform with respect to the original speech waveform of the next frame is lower than a threshold, sends a compressed speech code of the next frame as a sub-frame code in a packet along with the speech code of the current frame. With this method, when the packet of the current frame is lost at the receiving end, the current frame is synthesized from the waveform of one pitch length in the preceding frame if no sub-frame code is contained in the packets of the preceding and succeeding frames; if a sub-frame code is contained, the code is decoded and used. In either case, a speech waveform with a lower quality than that of the original speech signal is generated. This method has the following problem. The sub-codec information is added to the preceding and succeeding packets, in addition to the coded information carried for the current frame, on condition that the quality of the compensatory waveform is lower than a specified value. Therefore, if three or more consecutive packets are lost, neither the coded information of the current frame nor the sub-codec coded information sent in the preceding and succeeding packets is available, and the quality of the decoded speech is degraded.
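The sending-end decision and the failure case for consecutive losses can be sketched as follows. This is an illustration under stated assumptions and not the implementation of Patent literature 2: the quality measure is not specified in the text above, so a signal-to-noise ratio in dB is assumed here, and the names `snr_db`, `needs_sub_frame_code`, `recovery_source` and the parameter `threshold_db` are hypothetical.

```python
import math


def snr_db(reference, estimate):
    """Signal-to-noise ratio of `estimate` against `reference`, in dB."""
    signal = sum(x * x for x in reference)
    noise = sum((x - y) ** 2 for x, y in zip(reference, estimate))
    if noise == 0:
        return math.inf
    if signal == 0:
        return -math.inf
    return 10.0 * math.log10(signal / noise)


def needs_sub_frame_code(current_frame, next_frame, pitch_period, threshold_db):
    """Sending end: would pitch repetition reproduce the next frame well enough?

    Returns True when the assumed quality measure falls below the threshold,
    meaning a sub-frame code for the next frame should be attached.
    """
    cycle = current_frame[-pitch_period:]
    predicted = [cycle[i % pitch_period] for i in range(len(next_frame))]
    return snr_db(next_frame, predicted) < threshold_db


def recovery_source(frame, lost_packets):
    """Receiving end: what is available to reconstruct `frame`?

    Assumes a sub-frame code for `frame` was attached to both neighbouring
    packets (frame - 1 and frame + 1).
    """
    if frame not in lost_packets:
        return "main"
    if (frame - 1) not in lost_packets or (frame + 1) not in lost_packets:
        return "sub"
    return "none"


if __name__ == "__main__":
    steady = [1.0, -1.0, 1.0, -1.0]
    quiet = [0.1, -0.1, 0.1, -0.1]
    print(needs_sub_frame_code(steady, steady, 2, threshold_db=10.0))
    print(needs_sub_frame_code(quiet, steady, 2, threshold_db=10.0))
    print(recovery_source(5, {5}), recovery_source(5, {4, 5, 6}))
```

For a steady periodic signal the repetition matches the next frame and no sub-frame code is needed, whereas a sudden rise in level triggers one. When packets 4, 5 and 6 are all lost, `recovery_source` returns "none" for frame 5, which corresponds to the problem stated above for three or more consecutive losses.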
Patent literature 1: Japanese Patent Application Laid-Open No. 11-177623
Patent literature 2: Japanese Patent Application Laid-Open No. 2003-249957
Non-patent literature 1: “Internet Protocol”, RFC791, 1981
Non-patent literature 2: “Transmission Control Protocol”, RFC793, 1981
Non-patent literature 3: “User Datagram Protocol”, RFC768, 1980
Non-patent literature 4: ITU-T Recommendation G.711 Appendix I, “A high quality low-complexity algorithm for packet loss concealment with G.711”, pp. 1-18, 1999
Non-patent literature 5: J. Nurminen, A. Heikkinen & J. Saarinen, “Objective evaluation of methods for quantization of variable-dimension spectral vectors in WI speech coding”, in Proc. Eurospeech 2001, Aalborg, Denmark, September 2001, pp. 1969-1972