Switched conferencing is a promising technique for distributing real-time media flows between two or more endpoint devices. Such media may include, for example, audio, video, text (e.g., via a messenger function), combinations thereof, or the like. In prior techniques, a centralized conference server may be employed to collect and aggregate the various media flows before sending the aggregated media to the destinations (e.g., by merging the audio from multiple participants speaking at the same time). In switched conferencing, however, the conference server makes no alterations to the media itself and simply forwards the media to the conference participants. Because the functions of the conference server are simplified in this way, switched conferencing is well suited to implementation in a cloud-based environment.
While switched conferencing significantly simplifies the infrastructure needed to conduct a real-time conference between distributed devices, privacy and security remain a concern, particularly in the context of cloud-based implementations. Notably, shifting to a cloud-based conferencing system typically means that an organization no longer has access to the physical conference server and that the conference server is no longer located behind the organization's firewalls. To ensure both privacy and security, switched conferencing approaches use two secrets that are shared among the conference participants and kept hidden from the conference server and from outsiders: (1) media encryption keys used to encrypt and decrypt the media data payloads, and (2) hash keys used for authentication. However, because every conference participant knows both of these secrets, any participant can produce media that authenticates correctly, which gives rise to the possibility of one conference participant impersonating a conference speaker during the conference.
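The impersonation concern can be illustrated with a minimal sketch. It assumes, purely for illustration, that the shared hash key is used to compute an HMAC-SHA256 tag over a claimed sender identifier and the media payload; the text above does not specify the authentication algorithm or packet layout. The participant names, the key value, and the helper functions `tag_packet` and `verify_packet` are hypothetical, and payload encryption is omitted for brevity.

```python
import hashlib
import hmac


def tag_packet(hash_key: bytes, claimed_sender: str, payload: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the claimed sender ID and the payload.

    Assumption: HMAC-SHA256 stands in for whatever keyed hash a real
    deployment would use with the shared hash key.
    """
    message = claimed_sender.encode("utf-8") + b"\x00" + payload
    return hmac.new(hash_key, message, hashlib.sha256).digest()


def verify_packet(hash_key: bytes, claimed_sender: str,
                  payload: bytes, tag: bytes) -> bool:
    """Return True if the tag matches the claimed sender and payload."""
    expected = tag_packet(hash_key, claimed_sender, payload)
    return hmac.compare_digest(expected, tag)


# Hypothetical hash key known to every conference participant,
# but not to the conference server or to outsiders.
shared_hash_key = b"conference-wide-hash-key"

# Alice, the real speaker, sends a packet.
genuine_payload = b"media from alice"
genuine_tag = tag_packet(shared_hash_key, "alice", genuine_payload)

# Mallory, another legitimate participant, holds the same key and
# can therefore tag a packet that claims to come from Alice.
forged_payload = b"media from mallory"
forged_tag = tag_packet(shared_hash_key, "alice", forged_payload)

# A receiver checks both packets with the same shared key.
genuine_ok = verify_packet(shared_hash_key, "alice", genuine_payload, genuine_tag)
forged_ok = verify_packet(shared_hash_key, "alice", forged_payload, forged_tag)

# An outsider without the shared key cannot produce a valid tag.
outsider_tag = tag_packet(b"outsider-guess", "alice", forged_payload)
outsider_ok = verify_packet(shared_hash_key, "alice", forged_payload, outsider_tag)

print(genuine_ok, forged_ok, outsider_ok)
```

In this sketch the receiver accepts both the genuine and the forged packet, because a tag computed with a group-wide key proves only that the sender is some participant, not which one. The outsider's packet is rejected, which is the property the shared hash key does provide.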