A video conference system is generally defined as a software and hardware system in which two or more users in different locations exchange information such as audio, video, and data files using audio capture devices, camera devices, audio output devices, display devices, and a communications network, so as to implement instant interaction and communication. Depending on how they are implemented, current video conference systems generally fall into two categories: personal computer (PC)-based software video conference systems, and hardware video conference systems that are based on a digital signal processor (DSP) and embedded software.
An existing PC-based unified communications (UC) system, such as Microsoft Lync or Skype, generally uses a universal serial bus (USB) camera of a PC host to capture a video image, a USB microphone to capture audio, a display to show the video image, and a loudspeaker to play the audio signal. Video communication by means of UC software is convenient, supports a variety of services such as instant messaging, and can work together with PC office software (for example, Microsoft Office).
A video conference terminal based on a hardware scheme generally uses an independent camera. The captured video signal is input into the conference terminal through an interface such as a digital visual interface (DVI) or a high-definition multimedia interface (HDMI), and an audio signal is captured using an independent microphone or microphone array. The conference terminal generally uses a platform such as a DSP, a field programmable gate array (FPGA), or an application-specific integrated circuit (ASIC) to perform audio and video processing and encoding/decoding. It encodes the locally captured image and the locally picked-up audio signal and sends them to a remote end. It also outputs the decoded image and audio signal from the remote end, through a video interface and an audio interface, to the display device for display and to the speaker for playback, respectively.
Based on research on the foregoing two implementation solutions, the inventor finds the following problems in the prior art:
The PC-based UC video conference system generally uses a USB camera, but the USB camera has a small image sensor and a small lens, and either lacks image signal processing (ISP) or has a limited processing capability. Therefore, the image quality of the camera is inferior. Due to the bandwidth limitation of the USB interface, the image resolution and the frame rate of the video image are relatively low. Due to the limited computing capability of a PC, a high resolution and a high frame rate cannot be achieved; for example, encoding and decoding of 1080p60 video cannot be implemented. In addition, the encoding and decoding quality is unsatisfactory. Further, the PC-based UC video conference system has a complicated software scheme, is difficult to deploy and maintain, and is vulnerable to computer viruses and malicious software, resulting in poor security.
A video conference system based on a hardware scheme generally uses a remote control as a human-computer interaction interface. An operation interface is displayed on a television set, and a video image is generally displayed on a full screen after a call is set up successfully. However, because it uses a dedicated hardware platform, the system has poor scalability and provides few types of services; it can hardly provide services other than audio and video communication. The system cannot collaborate well with PC office software and the like, and is relatively costly.
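The USB bandwidth limitation noted above for the PC-based system can be illustrated with a rough calculation. The following is only a sketch under stated assumptions that do not come from the text: it assumes a USB 2.0 high-speed link (nominal signaling rate of 480 Mbit/s) and uncompressed YUV 4:2:2 video at 16 bits per pixel. The function name `raw_video_bps` and the chosen formats are illustrative.

```python
# Nominal USB 2.0 high-speed signaling rate.
# Assumption: the text does not name a USB version.
USB2_NOMINAL_BPS = 480_000_000


def raw_video_bps(width, height, fps, bits_per_pixel=16):
    """Return the uncompressed video bit rate in bits per second.

    16 bits per pixel corresponds to YUV 4:2:2 (assumed pixel format).
    """
    return width * height * fps * bits_per_pixel


# Illustrative formats; only 1080p60 is mentioned in the text.
formats = {
    "640x480 @ 30 fps": (640, 480, 30),
    "1280x720 @ 30 fps": (1280, 720, 30),
    "1920x1080 @ 60 fps": (1920, 1080, 60),
}

for name, (w, h, fps) in formats.items():
    bps = raw_video_bps(w, h, fps)
    verdict = "fits within" if bps <= USB2_NOMINAL_BPS else "exceeds"
    print(f"{name}: {bps / 1e6:.1f} Mbit/s, {verdict} the nominal 480 Mbit/s")
```

Under these assumptions, raw 1080p60 video needs roughly 2 Gbit/s, about four times the nominal link rate. Usable throughput is lower than the nominal rate because of protocol overhead, so the practical margin is smaller than the numbers suggest.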