Particular embodiments generally relate to video conferencing.
Video conferences may include multiple locations, of which only a subset can be displayed at once during the conference. A conference system may use loudness when deciding which locations to display on a number of display screens. For example, the top N (e.g., three) loudest locations may be displayed on three screens. This algorithm generally works well because users expect to see the locations where people are talking the loudest. However, the algorithm does not work for people who communicate using non-audible methods, such as sign language or other gestures. These people cannot effectively cause their location to be displayed in the conference. Also, using the top N loudest locations algorithm may encourage users to try to speak louder than others, causing people to raise their voices continually during the conference.
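For illustration only, the loudness-based selection described above may be sketched as follows. This is a minimal sketch, not an implementation of any particular conference system; the function name, the location names, and the loudness values are hypothetical and are assumed solely for the example.

```python
import heapq


def select_loudest_locations(loudness_by_location, n=3):
    """Return the n loudest locations, loudest first.

    loudness_by_location maps a location name to a measured loudness
    value (hypothetical units; any comparable number works).
    """
    return heapq.nlargest(n, loudness_by_location, key=loudness_by_location.get)


# Hypothetical measurements. "Denver" stands in for a location whose
# participants communicate by sign language and so produce no audio.
levels = {
    "Austin": 62.0,
    "Berlin": 48.5,
    "Chennai": 70.2,
    "Denver": 0.0,
    "Oslo": 55.1,
}

selected = select_loudest_locations(levels, n=3)
print(selected)
```

In this sketch the silent location is never selected, regardless of how actively its participants are gesturing, which is the shortcoming noted above.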