Communication devices or communication endpoints that display video conferences traditionally receive a single resultant stream of video from a central controller, referred to as a multipoint control unit (MCU). Generally, the communication endpoints can neither identify the other participants within that video stream nor manipulate the video stream, and information about the video display layout is not available to the communication endpoints. The MCU typically composes a display from several “panes” of video from different communication endpoints and sends this composite display as a mixed video stream. Unfortunately, in this environment, a participant cannot individually control the display of the video conference. Instead, each participant receives a standard layout, regardless of whether that layout is suitable for that participant.