With the emergence of 3rd generation (3G) mobile communication systems and the rapid development of networks based on the Internet Protocol (IP), video communication is increasingly becoming one of the main communication services. Examples include two-party or multiparty video communication services such as video phones, video conferences, and mobile terminal multimedia services.
During video/audio transport, in order to reduce the amount of data transported, an original video/audio sequence is compressed according to a certain coding algorithm (for example, H.263, H.264, or G.729), so as to obtain a code stream in which the amount of data differs from frame to frame. To adapt the compressed code stream to network transport, it is packaged and fragmented according to a certain packaging protocol; on an IP network, this packaging and fragmentation process is called packing. FIG. 1 shows a typical video/audio transport process on an IP network.
After being coded and packed, one frame of an original picture is divided into a plurality of Real-time Transport Protocol (RTP) packets for transmission. To improve transport efficiency, video packets are usually carried over the User Datagram Protocol (UDP), which requires no handshake or acknowledgement and is therefore quite efficient. However, UDP does not guarantee delivery, so video data packets may easily be lost, in which case the receiving terminal cannot decode a complete picture.
To reduce the effect of data packet loss on the video picture, many methods have been proposed in recent years, which can be grouped into the following types.
(1) Compensation method based on time and space: the method is performed on the receiving terminal. When detecting packet loss, the receiving terminal identifies the picture (voice) regions that cannot be decoded because of the loss, and then compensates for the lost regions by using motion compensation, linear interpolation, and other methods that exploit the temporal and spatial correlation of the voice and the picture.
(2) Packet compensation method based on Forward Error Correction (FEC): in this method, an FEC algorithm is adopted on the transmitting terminal. A verifying computation is performed on the video/audio code stream to generate verifying data, and then the video/audio data and the verifying data are transferred to the receiving terminal. After detecting the loss of video/audio data packets, the receiving terminal recovers the lost data packets from the verifying data by applying the same verifying rule as the transmitting terminal, so as to finally recover the complete picture and voice.
(3) Lost packet retransmission method: after detecting packet loss, the receiving terminal notifies the transmitting terminal, and the transmitting terminal retransmits the lost packets.
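The FEC-based method of item (2) may be illustrated by the following minimal sketch. It assumes a simple XOR parity as the verifying rule, which the text above does not specify; the function names `xor_parity` and `recover_lost` are hypothetical and serve only for illustration. With one verifying packet, at most one lost data packet of the protected group can be rebuilt.

```python
from functools import reduce


def xor_parity(packets: list[bytes]) -> bytes:
    """XOR all packets byte by byte, zero-padding shorter ones (assumed rule)."""
    size = max(len(p) for p in packets)
    padded = [p.ljust(size, b"\x00") for p in packets]
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*padded))


def recover_lost(received: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing packet from the survivors and the parity."""
    return xor_parity(received + [parity])


# Transmitting terminal: three data packets and one verifying packet.
data = [b"frame-part-1", b"frame-part-2", b"frame-part-3"]
parity = xor_parity(data)

# Receiving terminal: the second data packet was lost in transit.
recovered = recover_lost([data[0], data[2]], parity)
```

In this sketch the recovered payload is padded to the longest packet length; a real scheme would also need to carry the original length so that the padding can be removed.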
In a current packet verifying and transporting system adopting the FEC method, the receiving terminal must be able to distinguish the data packets from the verifying packets, and to identify the data packets from which a given verifying packet was generated. To this end, the RTP packing format of the verifying packet is specified as follows.
A packet header of a verifying packet and a load of the verifying packet are put into a load of the RTP packet, so as to form an FEC packet (that is, an RTP-based verifying packet), as shown in FIG. 2. The length of the packet header of the verifying packet is 12 bytes, and its format is as shown in FIG. 3. The packet header includes a sequence number (SN) base domain adapted to denote a packet SN, a length recovering domain, an E domain, a payload type (PT) recovering domain, a Mask domain, and a timestamp (TS) recovering domain.
In the packet header of the verifying packet, the value of the SN base domain must be set to the minimal packet SN among the data packets corresponding to the verifying packet. For example, if the verifying packet is generated from the data packets with the packet SNs of 12, 14, and 18, the value of the SN base domain must be set to 12. The length of the Mask domain is 24 bits; if the ith bit is set to 1, the data packet with the SN of N+i is associated with the verifying packet, that is, the data packet with the SN of N+i is among the data packets corresponding to the verifying packet. Here, N is the value of the SN base domain, the least significant bit (LSB) corresponds to i=0, and the most significant bit (MSB) corresponds to i=23, so one verifying packet is generated from 24 data packets at most. For the above case, in the Mask domain of the packet header of the verifying packet, the 0th bit, the 2nd bit, and the 6th bit are set to 1, indicating that the data packets corresponding to the verifying packet are those with the packet SNs of 12 (12+0), 14 (12+2), and 18 (12+6).
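The relation between the SN base domain, the Mask domain, and the protected data packets may be expressed as in the following sketch. The function names `build_mask` and `protected_sns` are hypothetical, and wraparound of the 16-bit packet SN is ignored for simplicity.

```python
MASK_BITS = 24  # length of the Mask domain


def build_mask(sns: list[int]) -> tuple[int, int]:
    """Return (SN base, Mask) for the data packets protected by one verifying packet."""
    base = min(sns)  # SN base is the minimal packet SN
    mask = 0
    for sn in sns:
        offset = sn - base
        if not 0 <= offset < MASK_BITS:
            raise ValueError(f"SN {sn} is outside the 24-packet window of base {base}")
        mask |= 1 << offset  # bit i set  <=>  packet N+i is protected
    return base, mask


def protected_sns(base: int, mask: int) -> list[int]:
    """Recover the packet SNs associated with a verifying packet."""
    return [base + i for i in range(MASK_BITS) if (mask >> i) & 1]


base, mask = build_mask([12, 14, 18])   # bits 0, 2, and 6 are set
sns = protected_sns(base, mask)
```

For the example in the text, bits 0, 2, and 6 give a Mask value of 1 + 4 + 64 = 69, and decoding that value against an SN base of 12 yields the packet SNs 12, 14, and 18 again.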
However, the inventors of the present invention find that the corresponding relation between the verifying packet and the data packets is currently determined according to the SN base domain (that is, the packet SN) and the Mask domain in the verifying packet. In multipoint video communication, a relay device, for example, a multipoint control unit (MCU), usually modifies the packet SN to ensure the continuity of the packet SN during site switching. As a result, the receiving terminal cannot correctly recover the corresponding relation between the verifying packet and the data packets, which leads to verification errors. To prevent this problem, it is necessary to perform the recovering operation at the receiving end of the MCU and to perform the verification again at the transmitting end of the MCU, as shown in FIG. 4. Consequently, the load of the MCU is increased, and a matching dependency on the MCU is generated.
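The mismatch described above may be illustrated by the following toy sketch. It assumes that the relay device rewrites only the SN in the RTP header of each data packet, here by a constant offset, and leaves the SN base domain and the Mask domain inside the FEC payload untouched. The actual renumbering behavior of an MCU is not specified in the text, and all names and values are hypothetical.

```python
def protected_sns(base: int, mask: int) -> list[int]:
    """Packet SNs that a verifying packet claims to protect (24-bit Mask)."""
    return [base + i for i in range(24) if (mask >> i) & 1]


# Transmitting terminal: data packets 12, 14, and 18 protected by one verifying packet.
original_sns = [12, 14, 18]
sn_base, mask = 12, 0b1000101

# Relay device (assumed behavior): renumber RTP header SNs, FEC payload unchanged.
SN_OFFSET = 100  # hypothetical offset applied during site switching
forwarded_sns = [sn + SN_OFFSET for sn in original_sns]

# Receiving terminal: look up the packets named by the unchanged SN base and Mask.
expected = protected_sns(sn_base, mask)
matched = [sn for sn in expected if sn in forwarded_sns]
```

Under these assumptions, `expected` still names the packet SNs 12, 14, and 18, while the forwarded data packets carry 112, 114, and 118, so `matched` is empty and the verifying packet can no longer be associated with its data packets.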