The mechanisms for transmitting multimedia streams over data networks are commonly referred to as voice over IP mechanisms. A well-known example for transmitting audio streams over a data network is referred to specifically by the acronym VoIP (for Voice over IP). Although applicable in particular to voice and to telephone applications, the same mechanisms can be used for any other multimedia stream (specifically for video). The terminology must therefore be understood in a broad sense.
In a manner similar to that employed in the context of a conventional switched telephone network, a multimedia session over IP can be subdivided into two mechanisms: a signaling mechanism and a mechanism for transmitting the multimedia streams.
The purpose of the signaling mechanism is specifically to let the parties to the session negotiate the conditions under which the multimedia streams can be transmitted.
An example of a signaling protocol is the session initiation protocol (SIP) as defined in request for comments (RFC) 2543 of the Internet engineering task force (IETF).
In that protocol, terminals can interchange messages in order to create, monitor, or terminate a multimedia session. The messages pass via signaling routers, conventionally referred to as “proxies”.
Each network terminal is associated with a signaling router or proxy that is in charge of a portion of the network.
FIG. 1 shows four terminals T1, T2, T3, and T4. These terminals are associated with four signaling routers P1, P2, P3, and P4, respectively.
When terminal T1 seeks to initiate a multimedia session with terminal T2, it sends a message m1 to its own signaling router P1. That signaling router P1 forwards the message (possibly after modifying it) to signaling router P2 using a conventional shortest-path technique. The signaling router P2 then forwards it to terminal T2. 
The message m1 contains the information that is needed for enabling the multimedia stream to be set up. By way of example, the information may include the number of the port to be used on the terminal T1.
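By way of illustration only, the content of a message such as m1 can be sketched as a simplified SIP INVITE whose session description announces the audio port of the calling terminal. The addresses, the Call-ID, the port value, and the helper names `build_invite` and `audio_port` are assumptions made for the example; the message is deliberately minimal and is not a complete or validated SIP message.

```python
def build_invite(caller, callee, caller_host, rtp_port, call_id):
    """Build a minimal SIP INVITE whose SDP body advertises the caller's audio port."""
    # Session description: the "m=" line carries the port the caller will use.
    sdp = "\r\n".join([
        "v=0",
        f"o=- 0 0 IN IP4 {caller_host}",
        "s=-",
        f"c=IN IP4 {caller_host}",
        "t=0 0",
        f"m=audio {rtp_port} RTP/AVP 0",
    ]) + "\r\n"
    headers = [
        f"INVITE sip:{callee} SIP/2.0",
        f"Via: SIP/2.0/UDP {caller_host}",
        f"From: <sip:{caller}>",
        f"To: <sip:{callee}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        "Content-Type: application/sdp",
        f"Content-Length: {len(sdp.encode())}",
    ]
    # A blank line separates the headers from the body.
    return "\r\n".join(headers) + "\r\n\r\n" + sdp


def audio_port(message):
    """Extract the audio port from the SDP body of a message, or None if absent."""
    body = message.split("\r\n\r\n", 1)[1]
    for line in body.split("\r\n"):
        if line.startswith("m=audio "):
            return int(line.split()[1])
    return None


# Hypothetical values for terminal T1 inviting terminal T2.
m1 = build_invite("t1@example.com", "t2@example.com", "192.0.2.1", 49170, "abc123")
print(audio_port(m1))  # 49170
```

The receiving terminal reads the port from the `m=audio` line and replies with its own description, after which both ends know where to send the stream.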
After terminal T2 has replied by means of another message, the multimedia stream can be set up.
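The path followed by a signaling message in this model can be sketched as follows. The table `PROXY_OF` and the function `signaling_path` are hypothetical names introduced for the example, and the sketch assumes that each message passes only through the router of the sender and the router of the recipient.

```python
# Hypothetical association of terminals with signaling routers.
PROXY_OF = {"T1": "P1", "T2": "P2", "T3": "P3"}


def signaling_path(src, dst):
    """Return the hops a signaling message traverses from src to dst."""
    path = [src, PROXY_OF[src]]
    # The message crosses to the recipient's router only if it differs
    # from the sender's router.
    if PROXY_OF[dst] != PROXY_OF[src]:
        path.append(PROXY_OF[dst])
    path.append(dst)
    return path


print(" -> ".join(signaling_path("T1", "T2")))  # T1 -> P1 -> P2 -> T2
```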
That model nevertheless raises a problem for multimedia sessions that have at least three parties, commonly referred to as “conferences”, and in particular for conferences involving more than three parties.
Returning to the example shown in FIG. 1, terminal T2 seeks to invite terminal T3 to join the same multimedia session so as to set up a three-party conference. In the same manner as before, terminal T2 sends a message m2 to the signaling router P2 with which it is associated. The router forwards the message to the signaling router P3 which in turn forwards it to terminal T3.
In the same manner, terminal T1 invites terminal T4 to join the same multimedia session, by sending a message m3 to signaling router P1 which forwards it to signaling router P4 associated with terminal T4.
Nevertheless, at no time can all three messages (m1 and m3 coming from terminal T1 and m2 coming from terminal T2) be correlated by any one of the signaling routers concerned.
It follows that the network is unaware that a four-party conference is under way. So far as the network is concerned, there are at least two distinct multimedia sessions in progress, and the fact that together they constitute a four-party conference is known only to the terminals.
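The lack of correlation can be illustrated by listing which messages each signaling router handles. The sketch below assumes, as in the example above, that a message passes only through the router of its sender and the router of its recipient; the names `PROXY_OF`, `MESSAGES`, and `messages_seen_by_router` are hypothetical.

```python
# Hypothetical association of terminals with signaling routers.
PROXY_OF = {"T1": "P1", "T2": "P2", "T3": "P3", "T4": "P4"}

# The three invitations of the example: message name -> (sender, recipient).
MESSAGES = {"m1": ("T1", "T2"), "m2": ("T2", "T3"), "m3": ("T1", "T4")}


def messages_seen_by_router():
    """Map each signaling router to the set of messages it forwards."""
    seen = {router: set() for router in PROXY_OF.values()}
    for name, (src, dst) in MESSAGES.items():
        seen[PROXY_OF[src]].add(name)
        seen[PROXY_OF[dst]].add(name)
    return seen


views = messages_seen_by_router()
for router in sorted(views):
    print(router, sorted(views[router]))
# P1 ['m1', 'm3']
# P2 ['m1', 'm2']
# P3 ['m2']
# P4 ['m3']

# No single router handles all three messages.
print(any(view == set(MESSAGES) for view in views.values()))  # False
```

Under these assumptions, router P1 sees m1 and m3 and router P2 sees m1 and m2, but no router sees m1, m2, and m3 together, so none of them can recognize the conference as a whole.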
Unfortunately, certain special mechanisms need to be implemented once a multimedia session involves more than two parties. In particular, for the audio stream, it can be necessary to make use of a conference bridge for mixing streams. For example, in order to enable terminal T3 to hear simultaneously the audio signals coming from terminals T1, T2, and T4 (assuming that the users of these three terminals are all speaking at once), it is necessary to mix the signals.
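A minimal sketch of such mixing is given below. The text does not specify how a conference bridge mixes streams; the sketch assumes 16-bit PCM samples that are summed and clipped, which is one common approach among others. The names `mix` and `mix_for` and the sample values are hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def mix(frames):
    """Mix equal-length frames of 16-bit PCM samples by summing and clipping."""
    mixed = []
    for samples in zip(*frames):
        total = sum(samples)
        # Clip to the 16-bit range to avoid overflow when several parties speak.
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed


def mix_for(listener, frames_by_terminal):
    """Mix the frames of every terminal except the listener's own."""
    others = [f for t, f in frames_by_terminal.items() if t != listener]
    return mix(others)


# Hypothetical three-sample frames from each terminal.
frames = {
    "T1": [1000, -2000, 30000],
    "T2": [500, 500, 5000],
    "T3": [7, 7, 7],
    "T4": [-100, 100, 1000],
}
print(mix_for("T3", frames))  # [1400, -1400, 32767]
```

The bridge must know every party to the conference in order to build a separate mix for each listener.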
To ensure that special mechanisms can be implemented by the network, it is necessary for some element of the network to be aware that such a conference is in progress.