Three-dimensional (3D) video, or 3D TV, has gained increasing momentum in recent years. A number of standardization bodies (the International Telecommunication Union, ITU; the European Broadcasting Union, EBU; the Society of Motion Picture and Television Engineers, SMPTE; the Moving Picture Experts Group, MPEG; and Digital Video Broadcasting, DVB, a suite of internationally accepted open standards for digital television) and other international groups (e.g. the Digital TV Group, DTG, and the Society of Cable Telecommunications Engineers, SCTE) are working toward standards for 3D TV or video. Quite a few broadcasters have launched or are planning to launch public stereoscopic 3D TV broadcasting.
Several 3D video coding formats have been proposed in the 3D video community so far. Examples include, but are not limited to: stereoscopic 3D, Video plus Depth (V+D), Multiview Video (MVV), Multiview Video plus Depth (MVD), Layered Depth Video (LDV), Depth Enhanced Video, frame-packed side-by-side, frame-packed top-bottom, full-resolution-per-view formats, frame interleaving, row interleaving, Multiview Video Coding (MVC), the multiview extension of High Efficiency Video Coding (MV-HEVC) and 3D-HEVC, where the last three are multi-view and 3D codecs developed or under development by the joint JCT-VC and JCT-3V standardization groups in MPEG/VCEG.
Apart from broadcast television and cinema, 3D video is also being considered for other video services such as video conferencing and mobile video calls. 3D video conferencing may be enabled in many different forms. To this end, 3D equipment such as stereo cameras and 3D displays has been deployed. 3D video, or a 3D experience, commonly refers to giving a viewer a feeling of depth in the scene or, in other words, a feeling of being in the scene. In technical terms, this may generally be achieved both by the type of capture equipment (i.e. the cameras) and by the type of rendering equipment (i.e. the display) deployed in the system.
A commonly used way to negotiate media capabilities (i.e., the ways by which a media stream may be sent and/or received by a client) between the clients before setting up a video conference call is to use the Session Description Protocol (SDP).
SDP negotiation is specified in IETF RFC 3264, “An Offer/Answer Model with the Session Description Protocol (SDP),” June 2002. The specification defines a mechanism by which two clients can use SDP to arrive at a common view of a multimedia session between them. In the model, one client offers the other client(s) a description of the desired session from its perspective, and the other client(s) answer with the desired session from their perspective. The offer/answer model can be implemented using protocols such as the Session Initiation Protocol (SIP).
In general terms, the session description protocol (SDP) is a protocol used for describing multimedia sessions for the purposes of session announcement, session invitation, and other forms of multimedia session initiation. SDP messages may be used for negotiating media capabilities during setup of a video conferencing session.
In general terms, an SDP session is described in an SDP message by a series of fields, one per line, where the form of each field is as follows:
        <character>=<value>
where <character> is a single case-sensitive character and <value> is structured text whose format depends on the type of field. A typical SDP offer for sending audio and video may be formulated as follows:
        v=0
        o=C-A 2890844526 2890844526 IN IP4 host.anywhere.com
        s=
        c=IN IP4 host.anywhere.com
        t=0 0
        m=audio 49170 RTP/AVP 0
        a=rtpmap:0 PCMU/8000
        a=sendrecv
        m=video 51372 RTP/AVP 100
        a=rtpmap:100 H264/90000
        a=fmtp:100 profile-level-id=42c01f;packetization-mode=1
        a=sendrecv
where v is the version number of SDP (shall be 0), o is the originator (in the present example a client with identity “C-A”) and session identifier, s is the session name, c is the connection information, t is a timing description, m is the media name and transport address, and a is a media attribute line. A media description starts with an “m=” field and is terminated by either the next “m=” field or the end of the session description. The words in the “m=” field value describe, in order, the media type, transport port, transport protocol and media format description. When RTP (Real-time Transport Protocol) is used, as in the above example, the media format description contains the RTP payload type number(s). If dynamic payload type numbers are used, the “a=rtpmap” attribute is used to map a media encoding name to the payload type number. Further, the “a=fmtp” attribute may be used to specify format parameters.
The last attribute value in the media descriptions in the above example describes the direction of the current media description. Possible directions are sendonly, recvonly, sendrecv, or inactive.
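The field structure described above can be illustrated with a short Python sketch. It is only a minimal illustration, not part of any SDP specification: the names `MediaDescription` and `parse_sdp` are hypothetical, only the `<character>=<value>` form is checked, and a media description without an explicit direction attribute is assumed here to default to sendrecv.

```python
from dataclasses import dataclass, field

DIRECTIONS = ("sendonly", "recvonly", "sendrecv", "inactive")


@dataclass
class MediaDescription:
    """One media description: an m-line plus the fields that follow it."""
    media: str      # media type, e.g. "audio" or "video"
    port: int       # transport port
    proto: str      # transport protocol, e.g. "RTP/AVP"
    formats: list   # media format description (RTP payload type numbers)
    fields: list = field(default_factory=list)  # (character, value) pairs

    @property
    def direction(self):
        # The direction is carried as a plain "a=" attribute value.
        for key, value in self.fields:
            if key == "a" and value in DIRECTIONS:
                return value
        return "sendrecv"  # assumption: default when no direction is given


def parse_sdp(text):
    """Split an SDP message into session-level fields and media descriptions."""
    session, media = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, sep, value = line.partition("=")
        if sep != "=" or len(key) != 1:
            raise ValueError(f"not a <character>=<value> field: {line!r}")
        if key == "m":
            # A new "m=" field starts a media description.
            mtype, port, proto, *formats = value.split()
            port = int(port.split("/")[0])  # ignore an optional "/<count>"
            media.append(MediaDescription(mtype, port, proto, formats))
        elif media:
            media[-1].fields.append((key, value))
        else:
            session.append((key, value))
    return session, media


OFFER = """\
v=0
o=C-A 2890844526 2890844526 IN IP4 host.anywhere.com
s=
c=IN IP4 host.anywhere.com
t=0 0
m=audio 49170 RTP/AVP 0
a=rtpmap:0 PCMU/8000
a=sendrecv
m=video 51372 RTP/AVP 100
a=rtpmap:100 H264/90000
a=fmtp:100 profile-level-id=42c01f;packetization-mode=1
a=sendrecv
"""

if __name__ == "__main__":
    session, media = parse_sdp(OFFER)
    for m in media:
        print(m.media, m.port, m.proto, m.formats, m.direction)
```

Keeping the media descriptions in a list preserves their order, which matters because m-lines are later addressed by their absolute position in the message.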
The document IETF Internet-Draft “Signal 3D format,” October 2012, has introduced signalling of 3D formats in SDP. The 3D signalling is carried out by adding a new attribute to the SDP:
        a=3dFormat:<Format Type> <Component Type>
where Format Type is one of the following 3D video formats: “FP” (Frame Packing), “SC” (Simulcast) and “2DA” (2D+auxiliary, e.g. depth maps), and Component Type is one of the view components “D” (Depth map), “L” (Left), “R” (Right), “SbS” (Side by side) and “TaB” (Top and Bottom).
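As a rough illustration of how such an attribute line might be read, the following Python sketch checks the two values against the lists given above. The function name `parse_3d_format` is hypothetical, and the sketch assumes that Format Type and Component Type are separated by whitespace; the exact separator and any restrictions on which combinations are valid are not stated here and would have to be taken from the Internet-Draft itself.

```python
# Values as listed in the text above.
FORMAT_TYPES = {
    "FP": "Frame Packing",
    "SC": "Simulcast",
    "2DA": "2D+auxiliary",
}
COMPONENT_TYPES = {
    "D": "Depth map",
    "L": "Left",
    "R": "Right",
    "SbS": "Side by side",
    "TaB": "Top and Bottom",
}

PREFIX = "a=3dFormat:"


def parse_3d_format(line):
    """Return (format_type, component_type) from an a=3dFormat line.

    Assumption: the two values are separated by whitespace.
    """
    line = line.strip()
    if not line.startswith(PREFIX):
        raise ValueError(f"not a 3dFormat attribute: {line!r}")
    parts = line[len(PREFIX):].split()
    if len(parts) != 2:
        raise ValueError(f"expected two values, got {parts!r}")
    format_type, component_type = parts
    if format_type not in FORMAT_TYPES:
        raise ValueError(f"unknown Format Type: {format_type!r}")
    if component_type not in COMPONENT_TYPES:
        raise ValueError(f"unknown Component Type: {component_type!r}")
    return format_type, component_type


if __name__ == "__main__":
    fmt, comp = parse_3d_format("a=3dFormat:FP SbS")
    print(FORMAT_TYPES[fmt], "/", COMPONENT_TYPES[comp])
```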
If the 3D representation is carried in multiple streams, a grouping mechanism is used to group the 3D views.
Commonly, if multiple media streams of the same media type are present in an SDP offer, it would mean that the offering client wishes to send (and/or receive) multiple media streams of that media type at the same time.
A typical SDP media negotiation between two clients may commonly involve the following signalling, which may define an offer/answer model. A first client (the offering client) initiates the SDP negotiation by creating an SDP offer from its media capabilities. The SDP offer is sent to a second client (the answering client). The second client reverses the direction of each media description, filters the received SDP and keeps the lines that match its own media capabilities, thus generating an SDP answer. If a media description for a media stream is not supported, it should be disabled by the second client setting the port number of the corresponding m-line to 0 (zero) and removing the attributes in the media description. In this respect, an m-line is a line in the SDP message comprising an “m=” field. Thus, according to the above example, one example of an m-line is:
        m=audio 49170 RTP/AVP 0
An m-line may not be deleted from the SDP message. This is to allow accessing of the m-lines at a certain absolute position. The SDP answer is sent to the first client. If the SDP answer contains an m-line with a non-zero port the media session can be started.
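The answer generation described above can be sketched in Python as follows. This is a simplified illustration under several assumptions, not a complete RFC 3264 implementation: the function names and the `supported` capability model (a mapping from media type to a set of supported encoding names) are hypothetical, every payload type is assumed to have an “a=rtpmap” line, and the session-level fields and port numbers are copied from the offer, whereas a real answering client would insert its own origin, address and ports.

```python
# Direction as seen from the answering client.
REVERSE = {
    "sendonly": "recvonly",
    "recvonly": "sendonly",
    "sendrecv": "sendrecv",
    "inactive": "inactive",
}


def split_sdp(sdp):
    """Return (session_lines, media_blocks); each block starts with its m-line."""
    session, media = [], []
    for raw in sdp.strip().splitlines():
        line = raw.strip()
        if line.startswith("m="):
            media.append([line])
        elif media:
            media[-1].append(line)
        else:
            session.append(line)
    return session, media


def build_answer(offer, supported):
    """Filter an offer against `supported` (hypothetical capability model)."""
    session, media = split_sdp(offer)
    answer = list(session)  # simplification: session-level fields are copied
    for block in media:
        mtype, port, proto, *fmts = block[0][2:].split()
        # Map payload type number -> encoding name using the rtpmap lines.
        rtpmap = {}
        for line in block[1:]:
            if line.startswith("a=rtpmap:"):
                pt, encoding = line[len("a=rtpmap:"):].split(None, 1)
                rtpmap[pt] = encoding.split("/")[0]
        kept = [pt for pt in fmts if rtpmap.get(pt) in supported.get(mtype, set())]
        if not kept:
            # Unsupported stream: port 0, attributes removed, m-line stays
            # in place so that m-lines keep their absolute position.
            answer.append(f"m={mtype} 0 {proto} {' '.join(fmts)}")
            continue
        answer.append(f"m={mtype} {port} {proto} {' '.join(kept)}")
        for line in block[1:]:
            if line.startswith("a="):
                attr = line[2:]
                if attr in REVERSE:
                    answer.append("a=" + REVERSE[attr])
                    continue
                name, _, rest = attr.partition(":")
                # Drop rtpmap/fmtp lines of payload types that were filtered out.
                if name in ("rtpmap", "fmtp") and rest.partition(" ")[0] not in kept:
                    continue
            answer.append(line)
    return "\n".join(answer)


OFFER = """\
v=0
o=C-A 2890844526 2890844526 IN IP4 host.anywhere.com
s=
c=IN IP4 host.anywhere.com
t=0 0
m=audio 49170 RTP/AVP 0
a=rtpmap:0 PCMU/8000
a=sendonly
m=video 51372 RTP/AVP 100
a=rtpmap:100 H264/90000
a=fmtp:100 profile-level-id=42c01f;packetization-mode=1
a=sendrecv
"""

if __name__ == "__main__":
    # An answering client that supports PCMU audio but no video.
    print(build_answer(OFFER, {"audio": {"PCMU"}}))
```

In this usage example the audio stream is accepted with its direction reversed, while the video m-line is kept in its original position with port 0 and without attributes.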
Although it is possible to continue the negotiation if a media format acceptable for both clients is not found during this process, for example by the offering client creating and sending a further SDP offer with a further media description, conference systems in practice do not always implement this.
Further, although it is possible to signal a number of different 3D video formats using the mechanism proposed in the document IETF Internet-Draft “Signal 3D format,” October 2012, this document does not specify how to proceed in case asymmetric configurations are desired.
Hence, there is still a need for an improved signalling of media capabilities.