1. Field of the Invention
The present invention relates to a data processing apparatus and method and a network system such as a teleconferencing system.
2. Description of the Related Art
Teleconferencing systems in which two or more people at different locations confer over a network commonly employ a device referred to as a multipoint control unit (MCU).
The main functions of the MCU, besides call control, include: a computational processing function that receives and, if necessary, decodes the voice, video, and data signals arriving from the terminals of the conference participants; a selective combining function that selects and combines the incoming signals, including mixing different voice signals and spatially mixing different video signals; and functions that code the selected and combined signals and distribute the resulting voice, video, and data signals to the conference participants. The coding and decoding functions are carried out by software or hardware devices referred to as codecs.
Transmitting and receiving video over a network requires video codecs that can compress and decompress the video data. The video codecs at both ends must be compatible with regard to coding system, frame rate, image size, and bit rate. By assigning a separate video codec for each participating terminal, and matching the assigned codec to the type of codec used by the terminal, an MCU with multiple video codecs can absorb the incompatibilities between different terminals, so that all types of terminals can connect to the conference and exchange video signals, as described in U.S. Pat. No. 6,584,077 to Polomski.
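The per-terminal assignment described above can be sketched as follows. This is a minimal illustrative sketch, not code from any actual MCU; the class and field names (`Capabilities`, `PerTerminalMCU`, and so on) are assumptions introduced here for clarity.

```python
# Hypothetical sketch of per-terminal codec assignment in a Polomski-style
# MCU: every connecting terminal is given its own encoder, matched to the
# capabilities the terminal announces at call setup.
from dataclasses import dataclass

@dataclass(frozen=True)
class Capabilities:
    coding_system: str   # e.g. "H.261", "H.264"
    frame_rate: int      # frames per second
    image_size: str      # e.g. "QCIF", "720p"
    bit_rate_kbps: int

class PerTerminalMCU:
    def __init__(self):
        self.encoders = {}   # terminal id -> codec parameters matched to it

    def connect(self, terminal_id, caps: Capabilities):
        # One encoder per terminal: the MCU absorbs incompatibilities
        # between terminals, but its coding load grows with every
        # additional connection.
        self.encoders[terminal_id] = caps

    def encoder_count(self):
        return len(self.encoders)

mcu = PerTerminalMCU()
mcu.connect("A", Capabilities("H.264", 30, "720p", 1024))
mcu.connect("B", Capabilities("H.261", 15, "QCIF", 128))
print(mcu.encoder_count())  # one encoder per connected terminal
```

The key property is visible in `encoder_count`: the number of encoders, and hence the MCU's processing load, rises linearly with the number of connected terminals.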
This type of MCU has the merit of sending each terminal video data that have been coded with the coding system, frame rate, image size, and bit rate optimal for that terminal, and the further merit of minimizing the transmission of intraframes. An intraframe is a frame that is coded with reference only to itself. Because intraframes have a much lower compression ratio than interframes, which are coded with reference to other frames, the transmission of an intraframe has a negative effect either on the network or on the receiving terminal. The negative effect on the network occurs if the intraframe is coded at full resolution, causing a momentary jump in the amount of data traffic the network must carry. The negative effect on the terminal occurs if the intraframe is coded at reduced resolution, forcing the terminal user to accept a low-quality image. Despite these negative effects, a terminal must occasionally request the transmission of an intraframe when the loss of a data packet in transit on the network has made it impossible to decode an interframe. When this happens, an MCU that uses a separate codec for each connected terminal has to transmit an intraframe to only the requesting terminal, so the extent of the negative effect is minimized.
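The traffic spike caused by a full-resolution intraframe can be illustrated with simple arithmetic. All figures below are assumed for illustration (the source gives no concrete sizes); what matters is the ratio, since an intraframe carries no interframe prediction and therefore compresses far less well.

```python
# Illustrative arithmetic only: assumed coded-frame sizes showing why one
# full-resolution intraframe causes a momentary jump in network traffic.
interframe_bits = 20_000            # assumed typical coded interframe size
intra_ratio = 10                    # assumed intraframe/interframe size ratio
intraframe_bits = interframe_bits * intra_ratio

frame_rate = 30                     # frames per second
# Steady-state bit rate when every frame is an interframe:
steady_bps = interframe_bits * frame_rate
# Bit rate over a one-second window in which one frame is an intraframe:
spike_bps = intraframe_bits + interframe_bits * (frame_rate - 1)
print(steady_bps, spike_bps)  # 600000 780000: a 30% jump from one frame
```

Under these assumed figures, replacing a single interframe with an intraframe raises the one-second traffic by 30%, which is the momentary load jump the text describes; coding the intraframe at reduced resolution avoids the jump but degrades the displayed image instead.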
An MCU that uses a separate codec for each terminal, however, can only allow a limited number of terminals to connect to a conference, because each additional connected terminal requires the MCU to operate an additional codec and handle the load of its coding processing. Particularly when the server on which the MCU is implemented is a general-purpose personal computer, the number of terminals that can participate in a conference is severely limited.
An alternative system, described by Nakano et al. in 'Daikibo bideo kaigi ni okeru eizo no hinshitsu kaizen no kento' (A study of video quality improvement for large scale video conferences), IEICE Technical Report, Vol. 107, No. 229, CQ2007-66, 2007, pp. 165-170, uses only one video codec per conference and transmits the video output of that codec to all participants in the conference. The transmitted video signal has been coded in such a way that it is receivable by all participating terminals. With this system, when additional participants enter a conference, the MCU does not have to provide additional codecs for them, so a conference with a large number of participants can be carried out with a relatively small computational load.
The usage of communication bandwidth can also be greatly reduced by multicasting the video data from the MCU to the connected terminals, but because it is not possible to send each connected terminal optimally coded data, several problems arise that do not occur in Polomski's system.
One problem is that the video data must be coded by methods compatible with the participating terminal having the most limited capabilities. Participants whose terminals support the latest video coding systems, and are capable of receiving and producing large high-definition video images transmitted at high bit rates, must therefore content themselves with the smaller images of poorer quality that can be transmitted at a lower bit rate using an older coding system. Another problem is that when a particular terminal requests an intraframe from the MCU to recover video data that were lost because of a dropped packet, the MCU must send the intraframe to all terminals. When this happens, all connected terminals experience the negative effects of intraframe transmission, including a temporary increase in network bandwidth usage or the reproduction of a low-resolution video frame, while only one terminal experiences the positive effect of the recovery of a lost image.
The frequency with which intraframes must be transmitted increases with the number of connected terminals. When a conference has several dozen participants, if an intraframe is transmitted in response to every request for an intraframe from a terminal, then even if the network has a comparatively low packet loss rate, substantially continuous intraframe traffic places the network under a constant heavy load, or forces every terminal to accept images of consistently low resolution.
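The scaling of request frequency with conference size can be made concrete with a back-of-the-envelope calculation. The loss figures below are assumptions chosen only to illustrate the trend; the point is that requests arrive in proportion to the number of terminals, since each terminal loses packets independently.

```python
# Back-of-the-envelope sketch (all figures assumed) of why intraframe
# requests become nearly continuous in a large conference: each terminal
# loses packets independently, and each loss event triggers a request.
terminals = 40                     # "several dozen participants"
loss_events_per_terminal = 0.05    # assumed loss events per second, per terminal
requests_per_second = terminals * loss_events_per_terminal
seconds_between_requests = 1 / requests_per_second
print(requests_per_second, seconds_between_requests)  # 2.0 requests/s, 0.5 s apart
```

Even with one terminal experiencing a loss event only once every twenty seconds, forty terminals together generate an intraframe request every half second, so a single-codec MCU that honors every request is transmitting intraframes substantially continuously.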
To avoid this problem, Nakano et al. propose to have the MCU transmit only some of the requested intraframes, using a fixed method to select the requests to be honored, but if this practice is followed, on the average a terminal that experiences packet loss must wait longer before recovering a reproducible video image. Terminals with high packet loss rates will tend to lose their video image with annoying frequency.
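One plausible form of the fixed selection rule mentioned above is a simple rate limit: the MCU honors at most one intraframe request per fixed interval and drops the rest. The sketch below is an assumption for illustration, not the specific method of Nakano et al.; the interval value is likewise assumed.

```python
# A minimal sketch of a fixed selection rule: honor at most one intraframe
# request per fixed interval, drop the rest. Terminals whose requests are
# dropped must wait, which lengthens average image-recovery time.
class IntraframeThrottle:
    def __init__(self, min_interval_s=2.0):
        self.min_interval_s = min_interval_s
        self.last_sent = float("-inf")

    def request(self, now_s):
        """Return True if the MCU should honor this request."""
        if now_s - self.last_sent >= self.min_interval_s:
            self.last_sent = now_s
            return True
        return False  # request dropped; terminal waits for a later intraframe

t = IntraframeThrottle(min_interval_s=2.0)
print([t.request(ts) for ts in (0.0, 0.5, 1.9, 2.5, 4.4)])
# → [True, False, False, True, False]
```

The dropped requests at 0.5 s, 1.9 s, and 4.4 s are exactly the cases the text describes: a terminal that experiences packet loss during the blocked interval must continue displaying a broken image until the next honored intraframe, and terminals with high loss rates are blocked most often.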
Given the goal of providing video teleconferencing service with high video quality, the two systems described above both have defects that become increasingly apparent as the number of conference participants increases.
The system proposed by Polomski places too much of a processing load on the MCU. If implemented on a general-purpose personal computer, the MCU is limited to conferences with comparatively few participants. If specialized coding processors are used the number of participants can be increased, but the specialized processors are expensive. Moreover, since the MCU transmits a separate data stream for each terminal, a large conference may overload the network. If network bandwidth is limited, the bit rate must then be reduced, lowering the quality of the transmitted video image.
The system proposed by Nakano et al. forces all terminals to accept a least common denominator of video service. In a large conference, the least common denominator is likely to be lower than the level of service expected by the majority of the conference participants. Moreover, the large number of intraframe requests that tend to occur in a large conference forces all terminals to suffer the negative effects of frequent intraframe transmissions, or forces terminals experiencing high packet loss rates to accept frequent loss of their video image.
Similar problems can also occur in large voice teleconferences.