1. Field of the Invention
This invention relates generally to multipoint conferencing, for example, multipoint videoconferencing between multiple participants. More particularly, the invention relates to multipoint videoconferencing in which the computer processing used to implement various conferencing functions is distributed among the conferencing terminals used by the conference participants.
2. Description of the Related Art
As the result of continuous advances in technology, videoconferencing is becoming an increasingly popular means of communication. The development and standardization of advanced coding/decoding and compression/decompression schemes have facilitated the communication of ever larger amounts of information over communications links of limited capacity. Technological advances in the communication links themselves and increases in the sheer number of links have further increased the effective number of communications links that are available to carry videoconferencing information. Advances in the basic components used in videoconferencing systems, such as computers, cameras, video displays, audio speakers and microphones, have resulted in the availability of better quality components at lower prices.
These advances translate to more powerful videoconferencing systems available at lower prices. Video and audio quality has improved, as has the capability to combine the basic videoconferencing functionality with other functionalities, such as presentation software, desktop publishing applications, and networking. As a result, videoconferencing systems have progressed from being expensive novelty systems, which were used infrequently, to moderately priced systems which were more often used but still located in a dedicated facility shared by many users, to relatively inexpensive systems, which one day may be as ubiquitous as the telephone is today.
Current videoconferencing systems may be operated in a point-to-point mode between users of videoconferencing terminals or may include a number of clients connected to each other by a centralized multipoint control unit (MCU). The clients are the local systems used by the participants in the videoconference. Much, if not all, of the videoconferencing functionality is typically implemented by the MCU. As a result, the MCU is often a complex and expensive piece of equipment. For example, current MCUs may implement half of the signal processing and all of the decision making required to implement a videoconference. The MCU may be required to decode audio signals received from each client, mix the received audio signals together to generate appropriate audio signals to be transmitted to each client, and then re-encode and retransmit the appropriate mixed signal to each client. The MCU may perform analogous functions on the video signals generated by each client. Furthermore, the MCU typically also determines, according to some predefined method, which video signals (or which combinations of video signals) should be sent to which clients. As the number of clients increases, the functions required of the MCU increase correspondingly, quickly making the MCU prohibitively complex and expensive.
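By way of illustration only, the audio mixing step attributed to the MCU can be sketched as follows. The sketch assumes that each client receives the sum of every other client's decoded audio (a "mix-minus" policy, which the text above does not specify) and that the signals have already been decoded to equal-length lists of samples. The function name mcu_mix is hypothetical.

```python
def mcu_mix(decoded):
    """Return the mixed signal an MCU might send back to each client.

    decoded maps a client id to that client's already-decoded samples.
    Assumption: every client hears the sum of all the *other* clients.
    """
    # Sum all clients sample by sample.
    total = [sum(column) for column in zip(*decoded.values())]
    # Subtract each client's own contribution from the total.
    return {
        client: [t - own for t, own in zip(total, samples)]
        for client, samples in decoded.items()
    }


mixes = mcu_mix({"a": [1, 2], "b": [10, 20], "c": [100, 200]})
print(mixes["a"])  # [110, 220]
```

Under this assumption the MCU performs one decode and one re-encode per client, so its workload grows with every client that joins the conference.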
Note that in the above scenario, the audio signals and possibly also video signals experience both tandem encoding and multisource encoding, each of which reduces the quality of the signals. Tandem encoding means that a signal has been repeatedly and sequentially encoded and decoded. Here, the source client encodes its audio signal, which is decoded and mixed by the MCU. The MCU then encodes the mixed signal (i.e., tandem encoding), which is decoded by the destination client. Since the encoding algorithms used typically are lossy (i.e., the recovered signal is not a perfect replica of the original signal), each time a signal is encoded and decoded the quality of the signal decreases. Tandem encoding introduces significant delay and impairs natural communication. Multisource encoding of the audio signal is the encoding of an audio signal produced by more than one source (i.e., a mixed signal). Because the encoding algorithms typically are optimized for single source signals, encoding a signal from more than one source also results in a lower quality signal.
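The cumulative loss caused by tandem encoding can be illustrated with a toy example. The sketch below is not a real audio codec: it assumes a lossy codec modeled as a uniform quantizer with a hypothetical step size STEP, and it assumes mixing by sample averaging. It compares a mix built from signals that were each encoded and decoded once against the same mix after one further encode/decode pass.

```python
import math

STEP = 4  # hypothetical quantizer step; a larger step is lossier


def codec_roundtrip(samples, step=STEP):
    """Encode then decode: snap each sample to the nearest multiple of step."""
    return [math.floor(s / step + 0.5) * step for s in samples]


def mix(x, y):
    """Average two signals sample by sample (assumed mixing rule)."""
    return [(p + q) / 2 for p, q in zip(x, y)]


def total_error(signal, reference):
    """Sum of absolute sample differences between two signals."""
    return sum(abs(s - r) for s, r in zip(signal, reference))


a = [3, 9, 14, 22]
b = [6, 1, 17, 5]
ideal = mix(a, b)  # mix of the uncoded originals

# Each source is encoded and decoded exactly once, then mixed.
single_pass = mix(codec_roundtrip(a), codec_roundtrip(b))
# MCU path: the decoded mix is encoded and decoded a second time.
tandem = codec_roundtrip(single_pass)

print(total_error(single_pass, ideal), total_error(tandem, ideal))  # 3.5 7.5
```

For these sample values the second encode/decode pass roughly doubles the error relative to the ideal mix. The exact figures depend entirely on the assumed quantizer and inputs.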
If the MCU is also responsible for determining which signals are to be mixed and sent to which clients, it typically must receive audio and video signals from all of the clients and then determine what to do with these signals. For example, with respect to video which is transmitted in packets, the MCU is continuously receiving video packets. Once the MCU has received these video packets, it must determine which of these video packets to forward to destination clients. This determination in its own right may be computationally intensive. The video packets that are not mixed or forwarded are simply discarded by the MCU and need not have been transmitted to the MCU in the first place. The transmission of video packets not required by the MCU creates unnecessary traffic on the network, thus wasting valuable bandwidth and requiring unnecessary encoding work in the sending client.
Another approach to videoconferencing is to have each client indiscriminately broadcast its audio and video signals to all other clients. Because each signal goes directly to all other clients without any intermediate decoding, mixing or re-encoding, this method eliminates the need for tandem encoding and multisource encoding. However, continuously encoding, sending, receiving, and decoding video and audio signals from all clients to all other clients taxes both the network and the clients, particularly as the number of clients increases.
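The scaling problem of indiscriminate broadcast can be made concrete with a simple count. The sketch below assumes unicast delivery, in which every client sends a separate copy of its signal to each other client, and compares this with an MCU topology having one uplink and one downlink per client. Both function names are hypothetical, and a multicast-capable network would change the figures.

```python
def broadcast_streams(clients):
    """Directed streams when every client sends to every other client."""
    return clients * (clients - 1)


def mcu_streams(clients):
    """Directed streams with one uplink and one downlink per client."""
    return 2 * clients


for n in (3, 4, 10):
    print(n, broadcast_streams(n), mcu_streams(n))
```

Under these assumptions the broadcast approach grows quadratically with the number of clients (90 streams for 10 clients), while each client must also decode the signals of all of the other clients.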
In view of the foregoing discussion, there is a need for a videoconferencing system that reduces or eliminates the cost of the MCU function. There is also a need for a videoconferencing system that does not require tandem encoding and/or multi-source encoding. There is a further need for a videoconferencing system that does not indiscriminately send video packets, but rather sends video packets only (or preferably) when they are to be received and utilized by other clients. There is also a need for a videoconferencing system that can accommodate an increased number of clients on a given network.
The present invention comprises a videoconferencing system, method and apparatus that may take advantage of distributed processing techniques. In a preferred embodiment, the system includes two or more apparatus connected via a network in order to transfer video and audio signals. A sending client can apply various internal and/or external factors to a decision algorithm. Based on this algorithm, the sender decides whether to generate and send video and/or audio signals. Because the decision to send signals is made locally at the sender, all signals generated and sent, preferably, will be received and utilized by at least one receiving client. Each sender encodes the signal it generates before it sends the signal. This signal is then decoded at the receiver, preferably without any intermediate decoding or encoding.
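A sender-side decision of this kind might be sketched as follows. The particular factors shown (local voice activity, a local mute setting, and the number of receivers that currently want the signal) are hypothetical examples of internal and external factors; the text above does not enumerate them, and the names SenderState and should_send are likewise illustrative.

```python
from dataclasses import dataclass


@dataclass
class SenderState:
    speaking: bool             # internal factor (hypothetical): local voice activity
    muted_by_user: bool        # internal factor (hypothetical): local mute setting
    interested_receivers: int  # external factor (hypothetical): receivers wanting this signal


def should_send(state):
    """Decide locally whether to generate and send a signal at all."""
    if state.muted_by_user:
        return False
    if state.interested_receivers == 0:
        # Nobody would use the signal, so do not spend effort encoding it.
        return False
    return state.speaking


print(should_send(SenderState(speaking=True, muted_by_user=False, interested_receivers=2)))
```

Because the check runs before any encoding takes place, a signal that no receiver would use costs neither encoding work nor network bandwidth.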
Each sending client may include multiple encoders and memory modules such that the client can store a copy of the signals sent to each receiver. As a result, the sending client can compare the current image with the immediately preceding image that was sent to any particular receiving client. If there was no change, then no new image need be sent; otherwise only a difference signal is sent, thus increasing the overall efficiency of the videoconferencing system.
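The per-receiver comparison described above can be sketched as follows. The sketch assumes that an image is represented as a flat list of integer pixel values of fixed length, and that the first image sent to a given receiver is sent in full; both are assumptions made for illustration. The class name DifferenceSender is hypothetical.

```python
class DifferenceSender:
    """Keeps, per receiver, a copy of the last image sent to that receiver."""

    def __init__(self):
        self.last_sent = {}  # receiver id -> copy of the last image sent

    def prepare(self, receiver, image):
        """Return what to send to this receiver, or None if nothing changed."""
        previous = self.last_sent.get(receiver)
        if previous == image:
            return None  # no change, so no new image need be sent
        self.last_sent[receiver] = list(image)
        if previous is None:
            # Assumption: the first image for a receiver is sent in full.
            return ("full", list(image))
        # Otherwise send only the per-pixel difference.
        return ("diff", [cur - old for cur, old in zip(image, previous)])


sender = DifferenceSender()
print(sender.prepare("r1", [1, 2, 3]))  # ('full', [1, 2, 3])
print(sender.prepare("r1", [1, 2, 3]))  # None
print(sender.prepare("r1", [1, 5, 3]))  # ('diff', [0, 3, 0])
```

Keeping a separate stored copy for each receiver allows receivers that joined at different times, or that received different images, to each be sent a correct difference signal.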
Similarly, each receiving client may include multiple decoders and memory modules, such that the receiving client can store a copy of the signals, if any, received from each sender. As a result, the receiving client can, either manually under command from a user or based upon an automated decision algorithm, display the audio and/or video signals from various senders. Preferably, of course, in order to conserve bandwidth, each receiving client sends a command instruction to each sending client instructing it not to send any signals if the same will not be displayed and played at the receiving client.
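One possible automated decision at the receiving client is sketched below. It assumes, purely for illustration, that the receiver displays the senders with the highest recent audio level, up to a fixed number of windows, and that it issues a "send" or "stop" command to each sender accordingly. The function names, the audio-level criterion, and the command strings are all hypothetical.

```python
def choose_displayed(audio_levels, max_windows):
    """Pick the loudest senders, up to max_windows (assumed selection rule)."""
    # Sort by descending level; ties are broken by sender id for determinism.
    ranked = sorted(audio_levels, key=lambda sender: (-audio_levels[sender], sender))
    return set(ranked[:max_windows])


def plan_commands(all_senders, displayed):
    """Tell each sender whether its signal will be used at this receiver."""
    return {s: ("send" if s in displayed else "stop") for s in all_senders}


levels = {"a": 0.1, "b": 0.9, "c": 0.5}
shown = choose_displayed(levels, max_windows=2)
print(plan_commands(levels, shown))  # {'a': 'stop', 'b': 'send', 'c': 'send'}
```

A manual selection by the user could replace choose_displayed without changing how the commands are derived.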