Existing telepresence systems are generally constructed from multiple terminals bound to multiple external devices (for example, multiple displays and multiple cameras). In this arrangement, the image picked up by each camera is coded by the corresponding terminal and then transmitted to a remote terminal. After receiving the remote code streams, the receiving terminal decodes them and outputs the decoded image on the corresponding display. In this way, a life-size display may be implemented and the problem of eye-to-eye communication may be addressed to some extent.
In the telepresence system, captions may need to be shown on a display. Currently, captions are displayed in one of the following two modes:
First, caption transmission is implemented in the mode shown in FIG. 1. The encoding end treats the captions as image content, directly superposes them onto the image picked up by the camera, and then codes the superposed image. The captions are therefore part of the coded image itself, and the decoding end only needs to decode the received code streams for display.
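The first mode can be illustrated with a minimal sketch. It is not taken from FIG. 1: the image is assumed to be a small grayscale pixel grid, and `superpose`, `encode`, and `decode` are hypothetical helpers, with `encode` and `decode` standing in for a real video codec by copying raw pixels.

```python
from typing import List, Optional

Image = List[List[int]]                # grayscale pixels, 0-255
Overlay = List[List[Optional[int]]]    # None marks a transparent pixel


def superpose(image: Image, caption: Overlay, top: int, left: int) -> Image:
    """Return a copy of `image` with `caption` drawn at (top, left)."""
    out = [row[:] for row in image]
    for r, cap_row in enumerate(caption):
        for c, value in enumerate(cap_row):
            y, x = top + r, left + c
            # Skip transparent pixels and anything outside the image.
            if value is not None and 0 <= y < len(out) and 0 <= x < len(out[y]):
                out[y][x] = value
    return out


def encode(image: Image) -> bytes:
    """Toy stand-in for a video encoder: one width byte, then raw pixels.

    Assumption: image width is at most 255 so it fits in one byte.
    """
    width = len(image[0])
    return bytes([width]) + bytes(p for row in image for p in row)


def decode(stream: bytes) -> Image:
    """Inverse of the toy `encode` above."""
    width, pixels = stream[0], stream[1:]
    return [list(pixels[i:i + width]) for i in range(0, len(pixels), width)]


# Encoding end: the caption is burned into the image before coding.
camera_image = [[0] * 4 for _ in range(3)]
caption = [[255, None, 255]]
stream = encode(superpose(camera_image, caption, top=2, left=0))

# Decoding end: decoding alone yields an image that already holds the caption.
displayed = decode(stream)
```

Because the caption pixels are part of the coded image in this sketch, the decoding end has no separate caption data to reposition or remove.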
Second, caption transmission is implemented in the mode shown in FIG. 2. The encoding end treats the caption information as separate content and transmits both the caption information and the image picked up by the camera to the decoding end. The decoding end decodes the video code streams, superposes the received caption information onto the decoded image, and displays the superposed image.
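The second mode can be sketched in the same spirit. Again this is not taken from FIG. 2: `CaptionInfo`, `encode`, `decode`, and `render_caption` are hypothetical names, the codec is a raw-pixel stand-in, and the renderer simply marks one pixel per caption character instead of drawing glyphs.

```python
from dataclasses import dataclass
from typing import List

Image = List[List[int]]    # grayscale pixels, 0-255


@dataclass
class CaptionInfo:
    """Hypothetical caption description sent separately from the video."""
    text: str
    row: int
    col: int


def encode(image: Image) -> bytes:
    """Toy stand-in for a video encoder: one width byte, then raw pixels.

    Assumption: image width is at most 255 so it fits in one byte.
    """
    width = len(image[0])
    return bytes([width]) + bytes(p for row in image for p in row)


def decode(stream: bytes) -> Image:
    """Inverse of the toy `encode` above."""
    width, pixels = stream[0], stream[1:]
    return [list(pixels[i:i + width]) for i in range(0, len(pixels), width)]


def render_caption(image: Image, info: CaptionInfo) -> Image:
    """Toy renderer: set one pixel to 255 per caption character."""
    out = [row[:] for row in image]
    for i in range(len(info.text)):
        x = info.col + i
        if 0 <= info.row < len(out) and 0 <= x < len(out[info.row]):
            out[info.row][x] = 255
    return out


# Encoding end: the image is coded without captions; caption data travels separately.
camera_image = [[0] * 4 for _ in range(3)]
stream = encode(camera_image)
info = CaptionInfo(text="Hi", row=2, col=1)

# Decoding end: decode first, then superpose the caption onto the decoded image.
decoded = decode(stream)
displayed = render_caption(decoded, info)
```

In this sketch the decoded image stays caption-free and the caption is applied only at display time, which is the practical difference from the first mode.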
In implementing the above caption display, the prior art has at least the following problem: both modes are applicable only to a telepresence system having a single terminal and a single image, so multi-screen coordinated processing of captions in the telepresence system cannot be implemented. To display captions in a coordinated manner across multiple displays, each display must be set manually and the content to be displayed on each display must be adjusted. Therefore, an overall coordination function for caption display cannot be implemented.