Currently, communication technologies are widely applied in the field of image transmission. An important application in this field is videoconferencing, which includes ordinary videoconference communication that involves multiple terminals concurrently, and continuous presence videoconferencing, in which video information of multiple terminals is displayed on the screen of one terminal concurrently.
In the ordinary videoconference application, a videoconference system may include terminals that have different processing capabilities, for example, a high definition terminal (generally 720P, which indicates 1280×720 pixels with progressive scanning, or higher definition), a standard definition terminal (generally 4CIF, which indicates 704×576 pixels), and a general terminal (generally CIF, which indicates 352×288 pixels). When terminals of different processing capabilities attend a conference simultaneously, the image transmission capabilities of the terminals need to be coordinated so that the screen of every terminal can display pictures appropriately.
One method for coordinating the image transmission capabilities of all terminals is as follows: A Multipoint Control Unit (MCU) that controls the terminals in the videoconference system receives the resolution capability information of each terminal and applies the common highest capability of all terminals to the conference; the terminals in the videoconference then encode and decode images according to the negotiated common highest capability. However, because all terminals in the videoconference system employ the negotiated common highest capability for convening the conference, if even one low-resolution terminal exists, only low-resolution images are presented, even when the images are transmitted between high-resolution terminals.
Another method for coordinating the image transmission capabilities of all terminals is as follows: The MCU performs adaptation and transcoding for the code streams. Specifically, the MCU decodes a received high-spatial-resolution code stream, downscales the decoded image into a low-resolution image, encodes the low-resolution image to obtain the code stream of the low-resolution image, and sends the code stream of the low-resolution image to the terminal that needs to display the image at low resolution. However, this method involves decoding the code stream and downscaling every high-resolution image, and then encoding the low-resolution image, thus leading to complicated calculation and low efficiency.
As shown in FIG. 1, a videoconference system in the prior art includes: terminal 1 to terminal N, and an MCU for connecting the terminals. The MCU may be stand-alone, or embedded into a terminal.
A process of convening a conference with a videoconference system shown in FIG. 1 includes the following steps:
Step 1: The MCU in the system determines the common highest conference capability of all terminals in the system, and sends the determined common highest conference capability to every terminal in the system.
Step 2: After learning the common highest conference capability, each terminal in the system encodes the image according to the common highest conference capability, and sends the code stream.
Step 3: The MCU in the system receives the code stream sent by the terminal, and transmits the code stream to the terminals that need to receive the code stream.
This method has the following drawback: As long as one low-resolution terminal exists, only a low-resolution image is presented, even if the image is transmitted between high-resolution terminals.
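The negotiation in Step 1 and the drawback it causes can be sketched as follows. This is a minimal illustration only, not the signaling of any particular system: the function names, the capability table, and the assumption that each terminal can handle every format at or below its declared maximum (ranked by pixel count) are all hypothetical.

```python
from typing import Dict, Tuple

# Formats named in the text, as (width, height) in pixels.
RESOLUTIONS: Dict[str, Tuple[int, int]] = {
    "CIF": (352, 288),
    "4CIF": (704, 576),
    "720P": (1280, 720),
}


def pixel_count(fmt: str) -> int:
    """Rank a format by its number of pixels (an assumed ordering)."""
    width, height = RESOLUTIONS[fmt]
    return width * height


def common_highest_capability(terminal_caps: Dict[str, str]) -> str:
    """Return the highest format that every terminal can handle.

    Assumes each terminal supports all formats up to its declared
    maximum, so the common highest capability is the smallest maximum.
    """
    return min(terminal_caps.values(), key=pixel_count)


# Two high definition terminals and one general terminal in one conference.
caps = {"T1": "720P", "T2": "720P", "T3": "CIF"}
conference_format = common_highest_capability(caps)
```

With these inputs the conference format is CIF, so T1 and T2 exchange CIF pictures with each other even though both support 720P.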
Another process of convening a conference with a videoconference system shown in FIG. 1 includes the following steps:
Step 1: An MCU in the system records the conference capability of every terminal in the system.
Step 2: When the MCU finds that the capability of the terminal receiving the code stream does not match the capability of the terminal sending the code stream, the MCU decodes the code stream sent by the sending terminal, encodes the decoded image according to the capability of the receiving terminal, and transmits the resulting code stream to the receiving terminal.
This method has the following drawback: decoding and encoding are required. Therefore, calculation is complicated and efficiency is low.
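The per-receiver decision in Step 2 can be sketched as follows. The function name and the action labels are hypothetical, and the actual decoding and encoding are not modeled; the sketch only shows where the MCU would have to transcode, assuming capabilities are compared as simple format names.

```python
from typing import Dict


def plan_forwarding(sender_cap: str, receiver_caps: Dict[str, str]) -> Dict[str, str]:
    """Decide, for each receiving terminal, what the MCU does with the stream.

    A receiver whose recorded capability matches the sender's gets the
    code stream unchanged; any other receiver requires a decode followed
    by an encode at the receiver's capability.
    """
    plan: Dict[str, str] = {}
    for terminal, cap in receiver_caps.items():
        if cap == sender_cap:
            plan[terminal] = "forward"
        else:
            plan[terminal] = f"transcode to {cap}"
    return plan


# A 720P sender with one matching and one lower-capability receiver.
plan = plan_forwarding("720P", {"T2": "720P", "T3": "CIF"})
```

Every "transcode" entry in the plan stands for a full decode and re-encode of each picture at the MCU, which is the source of the computational cost noted above.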
A third process of convening a conference with a videoconference system shown in FIG. 1 includes the following steps:
An MCU in the system receives a code stream of a large picture, and forwards the code stream of the large picture to the terminal that displays only the large picture. For the terminal that needs to display multiple pictures that include this picture, namely, for the terminal that needs to display multiple small pictures, the MCU decodes the code stream, downscales the decoded image to the size of a subpicture, combines the downscaled image with other subpictures into a large picture, encodes the combined large picture, and sends the code stream of the combined large picture to the terminal that needs to display the image.
This method has the following drawback: each subpicture needs to be decoded and downscaled and the combined large picture needs to be encoded. Therefore, calculation is complicated, and efficiency is low.
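The downscale-and-combine stage of this process can be sketched as follows. This is a simplified illustration under stated assumptions: a decoded picture is represented as a plain list of sample rows, downscaling is done by dropping every second sample without filtering, four subpictures are tiled in a 2×2 layout, and the decoding and encoding steps on either side are omitted. All names are hypothetical.

```python
from typing import List

Frame = List[List[int]]  # rows of samples; a stand-in for a decoded picture


def downscale_by_2(frame: Frame) -> Frame:
    """Halve width and height by keeping every second sample in each direction."""
    return [row[::2] for row in frame[::2]]


def combine_2x2(subpictures: List[Frame]) -> Frame:
    """Tile four equally sized subpictures into one large picture."""
    top_left, top_right, bottom_left, bottom_right = subpictures
    top = [left + right for left, right in zip(top_left, top_right)]
    bottom = [left + right for left, right in zip(bottom_left, bottom_right)]
    return top + bottom


# Four decoded 4x4 pictures, each filled with its terminal index.
decoded = [[[index] * 4 for _ in range(4)] for index in range(4)]
combined = combine_2x2([downscale_by_2(picture) for picture in decoded])
```

The combined picture has the same size as one original picture, and the MCU would still have to encode it before sending, in addition to having decoded and downscaled each of the four inputs.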