Voice over internet protocol (“VoIP”) communication systems allow the user of a device to make calls across a communication network. To use VoIP, the user must install and execute client software on their device. The client software provides the VoIP connections as well as other functions such as registration and authentication. Advantageously, in addition to voice communication, the client may also provide video calling and instant messaging (“IM”). With video calling, the callers are able to view video images (i.e. moving images) of the other party in addition to receiving voice information. This enables a much more natural communication between the parties, as facial expressions are also communicated, thereby making video calls more comparable to a face-to-face conversation.
A video call comprising multiple users may be referred to as a “video conference”. In a conventional video conference, each participant (i.e. user) is able to view the video images of one or more of the other participants (users) in the video conference. For example, as a default setting, each user may be presented with the video images of all of the other users in the video conference. These may be displayed, for example, using a grid, with each video image occupying a different location on the grid. Alternatively, each user may be presented with one or more video images corresponding to users that have been detected as speaking users. That is, the detection of audio from a speaker may determine which of the video images of the other users are selected for display at a particular user's user terminal. Typically, in a video conference, one user speaks at a time, and so this may result in a single video image of that user being displayed to each of the non-speaking users.
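By way of illustration only, the speaker-based selection described above may be sketched as follows. This is a minimal sketch under stated assumptions and not an implementation of any particular client: the function name `select_displayed_users`, the normalised audio levels, and the `SPEECH_THRESHOLD` value are all hypothetical, as is the choice to fall back to showing all other users when no speaker is detected.

```python
from typing import Dict, List

# Hypothetical normalised audio level (0.0 to 1.0) at or above which a user
# is treated as a speaking user. The value is an assumption for illustration.
SPEECH_THRESHOLD = 0.1


def select_displayed_users(audio_levels: Dict[str, float], viewer: str,
                           threshold: float = SPEECH_THRESHOLD) -> List[str]:
    """Return the users whose video images are shown at the viewer's terminal."""
    # A user is never presented with their own video image in this sketch.
    others = sorted(user for user in audio_levels if user != viewer)
    # Users detected as speaking, based on their audio level.
    speakers = [user for user in others if audio_levels[user] >= threshold]
    # Assumed fallback: with no detected speaker, show all other users (grid).
    return speakers if speakers else others


levels = {"alice": 0.6, "bob": 0.02, "carol": 0.0}
print(select_displayed_users(levels, "bob"))    # ['alice']
print(select_displayed_users(levels, "alice"))  # ['bob', 'carol']
```

In this sketch a non-speaking user such as "bob" is presented only with the video image of the single speaking user, whereas the speaking user, for whom no other speaker is detected, is presented with the video images of all other users.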