Telepresence is a teleconferencing technology that has emerged in recent years and integrates video communication with an immersive communication experience. It is characterized by life-size images, ultra-high definition and low latency, and aims at an effect close to real face-to-face communication. Its implementation involves a plurality of aspects such as networks, communications, conference environments and functional applications, and it ultimately presents the conference participants with an integrated, realistic communication experience combined with business applications.
As telepresence technology is promoted and applied more widely, how to achieve interoperability between the telepresence products of different manufacturers has become a problem that urgently needs to be solved. Cisco has been gradually promoting the Telepresence Interoperability Protocol (TIP), which is used by its own telepresence products, as an interoperability protocol recognized across international telepresence products.
In TIP, one complete call between two telepresence devices is divided into two phases. The first phase is the call establishment phase, i.e. the normal calling process between two devices that need to perform media communication, for example the establishment of an application-layer Session Initiation Protocol (SIP) call or of an H.323 call; the completion of this phase marks the opening of the media channels of the two parties. The second phase is the TIP negotiation phase, in which the TIP capability negotiation, the negotiation of media multiplexing parameters and the like are completed. After both phases are completed, the two communication parties can start normal media communication and can hear and see each other. The media capability used by both parties at this point is the media capability obtained through the TIP capability negotiation.
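The two-phase sequence described above can be sketched as a small state machine. This is a minimal illustration only, not an implementation of TIP: the names `CallPhase`, `TelepresenceCall`, `establish_call` and `negotiate_tip` are hypothetical, and modelling the capability negotiation as a plain set intersection is an assumption made for the example.

```python
from enum import Enum, auto


class CallPhase(Enum):
    IDLE = auto()
    CALL_ESTABLISHED = auto()  # phase 1 complete: media channels are open
    TIP_NEGOTIATED = auto()    # phase 2 complete: media communication may start


class TelepresenceCall:
    """Toy model of the two-phase call; all names here are hypothetical."""

    def __init__(self):
        self.phase = CallPhase.IDLE
        self.signaling = None
        self.media_capability = None

    def establish_call(self, signaling):
        # Phase 1: ordinary call setup, for example over SIP or H.323.
        if self.phase is not CallPhase.IDLE:
            raise RuntimeError("call already established")
        self.signaling = signaling
        self.phase = CallPhase.CALL_ESTABLISHED

    def negotiate_tip(self, local_caps, remote_caps):
        # Phase 2: only permitted once the media channels are open.
        if self.phase is not CallPhase.CALL_ESTABLISHED:
            raise RuntimeError("TIP negotiation requires an established call")
        # Assumption: the negotiated result is what both sides have in common.
        common = sorted(set(local_caps) & set(remote_caps))
        if not common:
            raise RuntimeError("no common media capability")
        self.media_capability = common
        self.phase = CallPhase.TIP_NEGOTIATED

    def can_send_media(self):
        # Media flows only after both phases have completed.
        return self.phase is CallPhase.TIP_NEGOTIATED
```

In this sketch, calling `negotiate_tip` before `establish_call` raises an error, which mirrors the ordering of the two phases; `can_send_media` returns `True` only after both calls have succeeded.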
However, the existing TIP can describe only a few fixed types of audio and video capability. There is only one audio type, AAC_LD; the main video has two types of capability, selected according to the rate; and the auxiliary video has only one type of capability, with different frame rates selected according to the rate. These provisions result in poor extensibility. When a new audio or video coding/decoding technology appears and the content of TIP is not updated in time, that is, when TIP does not support the new coding/decoding technology, the telepresence system cannot apply that technology.
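The extensibility limitation can be illustrated with a fixed capability table. This is a hedged sketch rather than the actual TIP data model: apart from `AAC_LD`, every identifier below (`FIXED_CAPABILITIES`, `MAIN_VIDEO_TYPE_1`, `MAIN_VIDEO_TYPE_2`, `AUX_VIDEO_TYPE_1`, `NEW_AUDIO_CODEC`, `can_describe`, `describable_subset`) is a hypothetical placeholder introduced for the example.

```python
# Fixed table: one audio type, two main-video types, one auxiliary-video type.
# Only "AAC_LD" comes from the text; the other names are placeholders.
FIXED_CAPABILITIES = {
    "audio": frozenset({"AAC_LD"}),
    "main_video": frozenset({"MAIN_VIDEO_TYPE_1", "MAIN_VIDEO_TYPE_2"}),
    "aux_video": frozenset({"AUX_VIDEO_TYPE_1"}),
}


def can_describe(media_kind, capability):
    """Return True if the fixed table has an entry for this capability."""
    return capability in FIXED_CAPABILITIES.get(media_kind, frozenset())


def describable_subset(media_kind, device_capabilities):
    """Keep only the device capabilities that the fixed table can express."""
    return [c for c in device_capabilities if can_describe(media_kind, c)]
```

A device that supports both `AAC_LD` and a hypothetical `NEW_AUDIO_CODEC` can advertise only `AAC_LD` through such a table, so the new codec is never offered in the negotiation even when both endpoints implement it.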
No effective solution to the above-mentioned problem has yet been proposed.