A multimedia conferencing system is a remote communication system that supports bidirectional transmission of voice and video. With such a system, users at different locations can communicate by voice and video in real time, with an effect similar to that of face-to-face communication.
Standards organizations such as the International Telecommunication Union (ITU), the Internet Engineering Task Force (IETF) and the 3rd Generation Partnership Project (3GPP) are in charge of formulating video conferencing standards. The ITU has formulated several video communication standards, including ITU-T H.320, ITU-T H.323 and ITU-T H.324. ITU-T H.320 is for multimedia communication over narrow-band circuit-switched networks, ITU-T H.323 is for multimedia communication over IP networks, and ITU-T H.324 is for multimedia communication over very low bit-rate networks such as the Public Switched Telephone Network (PSTN) and mobile networks. The IETF is in charge of formulating the Session Initiation Protocol (SIP) and a multimedia conferencing standard based on that protocol. 3GPP is in charge of formulating the standards of the IP Multimedia Subsystem (IMS), and has also formulated an IMS-based multimedia conferencing standard on the basis of the IETF standard. Besides these organizations, other organizations have contributed directly or indirectly to video conferencing standards. To meet the needs of product design and development, some enterprises develop proprietary products and communication specifications for internal use, or make proprietary extensions to existing standards. Video conferencing products can comply with one or more open standards or enterprise proprietary standards.
Classified by the interoperating interfaces of its equipment, a video conferencing system typically consists of entities or devices such as a terminal, a Multipoint Control Unit (MCU), a gateway and a call controller.
The terminal is the equipment used by a user, and one system typically includes multiple terminals. A terminal typically consists of a core codec and external input/output devices. The codec is in charge of pre-processing, encoding, decoding and post-processing of voice and video signals, as well as network communication, user control and other processing. The input devices include devices such as a microphone and a camera, and the output devices include devices such as a sound system and a TV. In the sending direction, the terminal collects the user's voice and video signals, pre-processes and compression-encodes them, and encapsulates the encoded signals into data packets that are transmitted to the far ends over the network. In the receiving direction, the terminal receives data packets from the far ends over the network, de-encapsulates them, decodes the valid data obtained, and plays the decoded data to the user after post-processing.
The MCU is used to implement multi-party conferencing communication. When a multipoint conference is held, a many-to-one connection is established between the MCU and the multiple terminals joining the conference, and the terminals exchange audio/video signals through the MCU. The MCU is in charge of switching and mixing the media streams. For voice, the MCU outputs a mixed voice media stream for each terminal; the mixing is performed by superposing the several input voice media streams with the highest volume. For video, the MCU may transmit to a given terminal the single-frame video stream of another terminal; if the MCU supports a multi-frame function, it may also combine the videos of multiple terminals into one multi-frame image and transmit that image to one or more terminals.
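The voice mixing described above can be illustrated with a minimal Python sketch. It is not taken from any standard or product. It assumes that each terminal supplies one frame of 16-bit PCM samples, that loudness is measured as RMS, and that a listener's own voice is left out of the mix it receives (a common practice that the description above does not state). The names `rms`, `mix_for_terminals` and `max_sources` are hypothetical.

```python
import math


def rms(frame):
    """Root-mean-square level of one frame of PCM samples (assumed loudness measure)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def mix_for_terminals(frames, max_sources=3):
    """Return one mixed frame per terminal.

    frames: dict mapping terminal id -> list of int samples, all of equal length.
    max_sources: how many of the loudest inputs are superposed (hypothetical parameter).
    """
    if not frames:
        return {}
    # Select the inputs with the highest volume.
    loudest = sorted(frames, key=lambda t: rms(frames[t]), reverse=True)[:max_sources]
    length = len(next(iter(frames.values())))
    mixed = {}
    for listener in frames:
        # Assumption: a terminal does not receive its own voice back.
        sources = [t for t in loudest if t != listener]
        out = []
        for i in range(length):
            total = sum(frames[t][i] for t in sources)
            # Clip the superposed sample to the 16-bit PCM range.
            out.append(max(-32768, min(32767, total)))
        mixed[listener] = out
    return mixed


example = mix_for_terminals(
    {"A": [1000, -1000], "B": [200, 200], "C": [10, 10], "D": [1, 1]},
    max_sources=2,
)
```

In this example terminals A and B are the two loudest inputs, so C and D receive the superposition of both, while A receives only B and B receives only A.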
The call controller is used to select the route of a call. For example, the Gatekeeper entity defined in the H.323 standard and the Proxy entity defined in the SIP standard are in charge of implementing the call routing function.
Gateway devices are used to implement conversion between different network protocols and media formats, so that devices using them can interwork.
The information exchanged between video conferencing devices includes call control instructions and one or more audio streams, video streams and text message streams. For the meanings of the various media streams, their encoding and decoding, and the management of their transmission, refer to the related ITU-T H.323 and IETF SIP standards.
In practical applications within an enterprise, the video conferencing system is usually a scarce resource and should therefore be managed through reservations. A conference reservation is constrained in two respects: time and resources. For example, a person reserving a conference wants a multi-party video conference to be held from 9:00 to 11:00 the next morning, using conference room 101 (containing terminal 1001) at the Shenzhen Headquarters, conference room 201 (containing terminal 2001) at the Beijing Branch and conference room 301 (containing terminal 3001) at the Nanjing Branch Office. Since multiple parties are involved, MCU equipment resources typically need to be occupied as well. If both the terminals and the MCU resources are idle during this period, the conference reservation system reserves these resources for the conference and returns a reservation success result. If another user then tries to reserve a conference during the same period and plans to occupy conference room 101, the reservation system returns a reservation failure result, since conference room 101 is already occupied.
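The reservation check in this example can be sketched in Python as follows. This is a simplified illustration rather than the logic of any actual reservation system: the class `ReservationSystem`, the resource identifiers and the date are hypothetical, and the MCU is modeled as a single exclusive resource `"mcu-1"`, whereas a real MCU would usually be tracked by port capacity.

```python
from datetime import datetime


class ReservationSystem:
    """Grants a reservation only if every requested resource is idle for the whole period."""

    def __init__(self):
        self._bookings = []  # list of (start, end, frozenset of resource ids)

    def reserve(self, start, end, resources):
        if start >= end:
            raise ValueError("start must be earlier than end")
        wanted = frozenset(resources)
        for b_start, b_end, b_resources in self._bookings:
            # Half-open intervals [start, end) overlap when each starts before the other ends.
            overlaps = start < b_end and b_start < end
            if overlaps and wanted & b_resources:
                return False  # reservation failure: at least one resource is occupied
        self._bookings.append((start, end, wanted))
        return True  # reservation success


system = ReservationSystem()
nine = datetime(2024, 1, 2, 9, 0)     # hypothetical date
eleven = datetime(2024, 1, 2, 11, 0)
one_pm = datetime(2024, 1, 2, 13, 0)

# Three-site conference that also occupies the (single, hypothetical) MCU.
first = system.reserve(nine, eleven, ["room-101", "room-201", "room-301", "mcu-1"])
# Another user asks for conference room 101 during the same period.
second = system.reserve(nine, eleven, ["room-101"])
# The same room is free again once the first conference has ended.
third = system.reserve(eleven, one_pm, ["room-101"])
```

Here `first` succeeds, `second` fails because conference room 101 is already occupied from 9:00 to 11:00, and `third` succeeds because the half-open intervals do not overlap.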
In an existing conference reservation system, a user is typically required to input the start time of a conference, the end time of the conference and a list of the terminals that are to join the conference, and the system calculates a reservation result from the parameters input by the user and the current occupation of system resources. When the system includes many terminals and conference services are busy, this approach has obvious disadvantages. One or more of the terminals specified in the user's list are likely to be occupied during the specified period, which results in a reservation failure. The user then has to try repeatedly, selecting different terminals or different periods of time. Existing reservation techniques therefore have the disadvantage that conference reservation is inconvenient.