1. Field
The present invention relates generally to hardware architectures and associated methods for a multipoint control unit.
2. Related Art
Video conferencing, and the hardware associated with it, falls broadly into two camps. In the first camp, “conferencing” occurs between only two participants, and the participants are connected directly to one another through some form of data network. In this form of network, only two endpoints are involved, and true conferencing occurs only if multiple participants are present at one of the two endpoint sites. Examples of this type of conferencing are, at the low-technology end, PC-enabled endpoints interconnected using software such as NetMeeting® or Skype® and, at the higher end, dedicated endpoint hardware interconnected, for example, via ISDN links.
In the second camp, video conferencing allows more than two endpoints to interact with one another. This is achieved by providing at least one centralized coordinating point, a so-called “multipoint control unit” (MCU), which receives video and audio streams from the endpoints, combines these in a desired way and re-transmits the combined composite video/audio stream to the participants. Typically, the conference view transmitted to the endpoints is the same for each endpoint. The composition may change over time but is the same for all the participants.
The provision of only a single composition is a significant problem because each participant must receive a conference stream tailored so that it is acceptable to the least capable endpoint in the conference. In this situation, many endpoints are not used to their full capacity and may experience degraded images and audio as a result.
More recently, MCUs such as the Codian MCU 4200® series have been designed to allow a unique view to be created for each participant. This allows the full capabilities of each endpoint to be utilized and also allows different compositions for different participants so that, for example, the emphasis given to a particular participant in the conference may differ from one user to another. However, the processing of video data in real time is a highly processor-intensive task. It also involves the movement of large quantities of data. This is particularly so once the data has been decompressed in order to perform high quality processing. Thus, processing power and bandwidth constraints are a significant bottleneck in the creation of high quality video conferencing MCUs which allow multiple views of the conference to be produced.
FIG. 1 shows a typical prior art MCU architecture. The exemplary architecture has a plurality of digital signal processors 2 such as the Texas Instruments TMS series, which are interconnected via a Time Division Multiplexed (TDM) bus 4. A controller and network interface 6 is also connected to the TDM bus. Each DSP 2 is allocated one or more time-slots on the TDM bus. It will be appreciated that the TDM bus is a significant bottleneck. Whilst increased processing power for the MCU may be achieved by adding more powerful DSPs or additional DSPs, all the data flowing between DSPs and between the network 8 and the DSPs must fit into a finite number of time slots on the TDM bus 4. Thus, this form of architecture generally scales poorly and cannot accommodate the processing requirements of per-participant compositions.
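The scaling limit described above can be illustrated with a minimal sketch. The slot count and per-slot capacity below are hypothetical figures chosen only for the arithmetic; they are not taken from any particular TDM bus. The sketch shows that the aggregate capacity of the bus stays fixed while the share available to each DSP shrinks as DSPs are added.

```python
# Hypothetical figures for illustration only; neither value comes from the text.
TOTAL_SLOTS = 32   # time slots per TDM frame (assumed)
SLOT_KBPS = 64     # capacity carried by one time slot, in kbit/s (assumed)


def aggregate_kbps(num_dsps: int) -> int:
    """Total capacity of the TDM bus; it does not depend on the DSP count."""
    return TOTAL_SLOTS * SLOT_KBPS


def per_dsp_kbps(num_dsps: int) -> float:
    """Even share of the fixed bus capacity available to each DSP."""
    return aggregate_kbps(num_dsps) / num_dsps


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, aggregate_kbps(n), per_dsp_kbps(n))
```

Under these assumed figures, doubling the number of DSPs halves the bus share of each DSP while the total stays constant, which is the sense in which the architecture scales poorly.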
FIG. 2 shows an alternative prior art configuration. In this example, a plurality of DSPs 2-1 are each connected to a Peripheral Component Interconnect (PCI) bus 10-1. Similarly, a plurality of DSPs 2-2, 2-3 and 2-4 are connected to respective PCI buses 10-2, 10-3 and 10-4. The PCI buses 10-2, 10-3 and 10-4 are in turn connected via buffers 12 to a further PCI bus 14. A significant advantage of this architecture over that shown in FIG. 1 is that the DSPs in group 2-1 may communicate amongst one another with the only bottleneck being the PCI bus 10-1. This is true also for the groups 2-2, 2-3 and 2-4. However, should a DSP in group 2-1 wish to communicate with a DSP in, for example, group 2-3, the PCI bus 14 must be utilized. Thus, although this architecture is a significant improvement on that shown in FIG. 1 in terms of scalability and the ability to effectively use a plurality of DSPs, the PCI bus 14 must still be used for certain combinations of inter-DSP communication and thus may become a performance-limiting factor for the MCU architecture.
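The routing behaviour of the FIG. 2 arrangement can be sketched as follows. This is a simplified model, not a description of any actual implementation: the function names are hypothetical, the buffers 12 are not modelled, and it is assumed that every local PCI bus reaches the PCI bus 14.

```python
from typing import List

SHARED_BUS = "PCI bus 14"  # the further bus joining the DSP groups


def local_bus(group: int) -> str:
    """Name of the local PCI bus serving DSP group 2-<group>."""
    return f"PCI bus 10-{group}"


def buses_used(src_group: int, dst_group: int) -> List[str]:
    """Buses a transfer occupies between two DSP groups (simplified model)."""
    if src_group == dst_group:
        # Traffic within a group stays on that group's own local bus.
        return [local_bus(src_group)]
    # Traffic between groups must also cross the shared bus.
    return [local_bus(src_group), SHARED_BUS, local_bus(dst_group)]


if __name__ == "__main__":
    print(buses_used(1, 1))
    print(buses_used(1, 3))
```

In this model every transfer between different groups occupies the shared bus, so that bus carries the sum of all cross-group traffic regardless of how many groups are added.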
Attempts have been made to offload processing from DSPs. For example, IDT produces a “Pre-processing switch” (PPS), under part number IDT 70K2000, for use with DSPs. The PPS carries out predetermined functions on data before delivery to a processor such as a DSP or FPGA. The processing applied is determined by the address range on the switch to which packets are sent. The chip is designed for use in, e.g., 3G mobile telephony, to offload basic tasks which would normally be carried out inefficiently by the DSP. U.S. Pat. No. 6,883,084 also proposes the use of path processing; however, in that case it is proposed as an alternative to a von Neumann-type sequential processor.
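The general idea of selecting a processing function by destination address range can be sketched as below. The address map, the three operations and the function name are all hypothetical and chosen only for illustration; they do not describe the address map or functions of the IDT part.

```python
from bisect import bisect_right

# Hypothetical address map: (range start, operation applied to the payload).
# Each range runs from its start up to the start of the next entry.
RANGES = [
    (0x0000, lambda p: p),                            # pass through unchanged
    (0x1000, lambda p: p[::-1]),                      # reverse the byte order
    (0x2000, lambda p: bytes(b ^ 0xFF for b in p)),   # invert every bit
]
STARTS = [start for start, _ in RANGES]


def preprocess(address: int, payload: bytes) -> bytes:
    """Apply the operation whose address range contains `address`."""
    if address < STARTS[0]:
        raise ValueError("address below the mapped ranges")
    index = bisect_right(STARTS, address) - 1
    return RANGES[index][1](payload)


if __name__ == "__main__":
    print(preprocess(0x0010, b"\x01\x02"))
    print(preprocess(0x1004, b"\x01\x02"))
```

The sender chooses the operation simply by choosing where to send the packet, so no per-packet command needs to be interpreted by the downstream processor.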