1. Field of the Invention
This invention relates generally to video communication over a packet-based network, and relates more particularly to a system and method for improving the quality of video communication.
2. Description of the Background Art
Video communication over packet-based networks such as Internet protocol (IP) networks uses a variety of communication protocols. For example, H.323 is one member of a family of real-time communication protocols approved by the International Telecommunication Union (ITU). Each protocol in the family addresses a different underlying network architecture, such as, but not limited to, a circuit switched network, a local area network (LAN) with quality of service (QoS), and a LAN without QoS. H.323 defines how audiovisual conferencing data is transmitted across packet-based networks. H.323 is not an individual protocol, but rather a vertically-integrated suite of protocols that defines every component of a conferencing network: endpoints, gateways, gatekeepers, multi-point control units (MCUs), and other feature servers. For example, H.323 uses a call setup protocol (Q.931), a call signaling protocol (H.225), and a protocol for exchanging terminal capabilities and creating media channels (H.245).
Another example of a video communication protocol in IP networks is the Session Initiation Protocol (SIP). SIP is a signaling protocol for Internet conferencing, telephony, event notification, and instant messaging. The protocol handles call setup, routing, and authentication, and provides other control features between endpoints within IP networks. SIP is more or less equivalent to the Q.931 and H.225 components of H.323.
Packet-based networks, using SIP or H.323 protocols, for example, do not offer an end-to-end connection with a guarantee that all packets will reach their destination. However, packet-based networks may implement additional protocols to reduce packet loss rates. For example, e-mail applications use transmission control protocol/Internet protocol (TCP/IP). TCP/IP is a connection-based protocol that verifies arrival of packets at a destination node. If a packet does not arrive at the destination node, TCP/IP may send a request to the source node for retransmission of the lost packet. User datagram protocol/Internet protocol (UDP/IP) is a connectionless protocol that runs on top of IP networks. Unlike TCP/IP, however, UDP/IP provides very few error recovery services. UDP/IP is used primarily for broadcasting messages over a network without first establishing an endpoint-to-endpoint connection, and lost packets are typically handled by endpoint applications.
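By way of illustration only, the following Python sketch shows one way an endpoint application might notice lost packets when the transport itself offers no recovery. It assumes each packet carries a monotonically increasing sequence number; the function name and the absence of wraparound handling are simplifying assumptions and are not taken from any particular protocol described above.

```python
def missing_sequence_numbers(received, first, last):
    """Return the sequence numbers in [first, last] that never arrived.

    received: iterable of sequence numbers seen by the endpoint (any order).
    first, last: hypothetical bounds of the window being checked.
    """
    seen = set(received)
    return [seq for seq in range(first, last + 1) if seq not in seen]


# Example: packets 2 and 5 were dropped somewhere in the network.
arrived = [0, 1, 3, 4, 6]
lost = missing_sequence_numbers(arrived, 0, 6)
loss_rate = len(lost) / 7  # fraction of the 7 expected packets that were lost
```

An application using such a check would still have to decide what to do about each gap, since a connectionless transport will not retransmit on its own.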
Since video communication occurs in real time, any delay in video data propagation time over an IP network may reduce the quality of the video communication received and processed by the endpoints. A handshake protocol like TCP/IP increases delays, particularly when dealing with lost packets, and degrades the video communication. Therefore, video communication using H.323, SIP, or similar protocols will normally also use UDP/IP. However, since UDP/IP does not have an internal mechanism for lost packet recovery, some of the packets may never reach their destination. Typically, video communication using UDP/IP loses 0-10% of transmitted packets.
In addition to the various communication protocols, video communication uses various video compression standards such as H.261 and H.263, among others. Two types of frames are defined under these compression standards: non-referential frames such as Intra (I) frames, and referential frames (typically referred to as non-Intra frames), such as Inter frames, B frames, PB frames, and P frames, among others. A referential frame generally includes an image difference between a current frame and a previous known frame. A frame is composed of an array of Macro Blocks (MBs). Each MB is composed of an array of pixels, typically a 16×16 pixel array. The endpoints encode, decode, and send video communications over the IP network as packets, and each packet includes an integer number of MBs. Further, a packet may include a whole frame or part of a frame.
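The relationship between frames, MBs, and packets may be illustrated by the following minimal Python sketch. The frame size of 352×288 pixels (the CIF resolution) and the value of 33 MBs per packet are illustrative assumptions only, as are the function names; actual packetization depends on the encoder and the payload format in use.

```python
import math

MB_SIZE = 16  # macroblock edge in pixels, matching the typical 16x16 array


def macroblocks_per_frame(width, height, mb_size=MB_SIZE):
    """Return (columns, rows, total) of macroblocks needed to cover a frame."""
    cols = math.ceil(width / mb_size)
    rows = math.ceil(height / mb_size)
    return cols, rows, cols * rows


def packetize(total_mbs, mbs_per_packet):
    """Split MB indices 0..total_mbs-1 into packets of whole MBs only."""
    return [
        list(range(start, min(start + mbs_per_packet, total_mbs)))
        for start in range(0, total_mbs, mbs_per_packet)
    ]


cols, rows, total = macroblocks_per_frame(352, 288)  # assumed CIF frame
packets = packetize(total, 33)  # 33 MBs per packet is an arbitrary choice
```

Under these assumptions a frame is carried in several packets, so the loss of a single packet removes only part of a frame; choosing a packet size equal to the MB count of the frame would instead place a whole frame in one packet.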
Since endpoint video encoding/decoding is based on referential information between non-Intra frames, loss of a packet might damage the video quality for all frames subsequent to the lost packet until a new Intra frame arrives at a destination endpoint, also referred to as a destination node. Loss of packets may also cause loss of video synchronization at the destination node. The usual way a video decoder at the destination node handles lost or damaged packets is by signaling a source node to send an entire Intra frame if a packet has been lost, or signaling the source node to send a part of the Intra frame if the packet has been damaged. A corrupted portion of the damaged packet can be reconstructed using an associated part of the Intra frame. The signaling may be done outside of the video communication band as defined in the H.225 signaling protocol, for example. While waiting for the next Intra frame, the endpoints generally freeze the display. Furthermore, frequent Intra frame transmission itself disrupts video quality. Consequently, frequent loss of video synchronization in an endpoint-to-endpoint connection causes the video images received at the endpoints to be poor in quality.
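The propagation of damage from a lost packet up to the next Intra frame may be sketched as follows. This Python example is a simplified model under stated assumptions: each frame is carried in a single packet, every referential frame depends on the frame before it, and an intact Intra frame fully restores the picture. The function name and the frame-type string are hypothetical.

```python
def damaged_frames(frame_types, lost_frames):
    """Return indices of frames that cannot be decoded correctly.

    frame_types: string with 'I' for Intra and 'P' for referential frames.
    lost_frames: set of frame indices whose packet never arrived.
    """
    damaged = []
    corrupted = False
    for index, frame_type in enumerate(frame_types):
        if index in lost_frames:
            corrupted = True  # this frame is missing at the decoder
        elif frame_type == "I":
            corrupted = False  # an intact Intra frame resets the reference
        if corrupted:
            damaged.append(index)
    return damaged


sequence = "IPPPPIPPP"
after_p_loss = damaged_frames(sequence, {2})  # a referential frame is lost
after_i_loss = damaged_frames(sequence, {5})  # the second Intra frame is lost
```

In this model the loss of one referential frame damages every frame up to, but not including, the next Intra frame, while the loss of an Intra frame damages every frame that follows until another Intra frame arrives.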
Video quality may be further degraded when a network uses an MCU node to allow two or more endpoints to participate in a multimedia communication session. The MCU node increases the complexity of handling lost packets over the network.
There are few known ways in the audiovisual communication art by which a network node handles missing packets in a manner that does not degrade transmitted audiovisual images. Particularly, there are few methods by which a source node generates redundant information to enable a destination node to replace or rebuild missing or damaged packets. In addition, these prior art methods require either that both the source node and destination node communicate via a handshake procedure, or that the destination node be specifically modified to handle missing packets. Therefore, there is a need for a system and a method to better handle missing or damaged packets while maintaining high quality video images.