In face-to-face communications, people make eye contact with others and observe the body language and facial expressions of others. Experiments show, for example, that mouth motion starts about 0.4 s to about 1.2 s prior to speech, and thus facial expressions (e.g., mouth motion) can signal to others that a person may be about to speak. In addition, in face-to-face communications, the delay between the time a person communicates a message (e.g., says or does something) and the time another person receives (e.g., hears or sees) the communicated message is so small that it is substantially non-existent and inconsequential. These features of face-to-face communications aid in turn-taking during a conversation.
However, face-to-face communications involving, for example, people who have to travel to meet face-to-face are more and more often being replaced by teleconferencing (e.g., videoconferencing, audio/video-chatting using audio/video-enabled messenger tools). In some cases, people who live and/or work in the same area but in different buildings opt to communicate via teleconferencing systems to avoid, for example, having to physically travel to the other building. Teleconferencing has become popular in many environments (e.g., educational, business, and personal environments) because, for example, a teleconference may eliminate the need for one or more of the conference members to travel to another location. However, some features of face-to-face communications, such as the ability to make eye contact, the ability to observe facial expressions, and substantially immediate receipt (i.e., with an unnoticeable delay) of a communicated message, are generally not available, or are hindered, during teleconferencing.
Teleconferencing systems generally capture images and/or sounds at one site, encode them into a standard format, and transmit the encoded data over a network connection to another site, which decodes the encoded images and/or sounds and outputs the decoded result. Although progress is being made in developing faster systems and more efficient ways to use available bandwidth and/or to increase bandwidth, the encoding/decoding and transmission of the data generally cause a delay which impacts the teleconference. Even in teleconferences including video transmission, where images of one conference site are captured and transmitted to the other conference site so that facial expressions may be observed via the video images, transmission delays and/or poor data quality generally prevent members from making eye contact and/or from sensing, based on facial expressions, that another member is about to speak. As a result, such cues generally do not assist the members in turn-taking (i.e., taking turns communicating and listening) during their teleconference-based communications. Nevertheless, rather than requiring, for example, a person to travel to another location to participate in a face-to-face communication with another person, the people involved generally agree to deal with the “side-effects” of teleconferencing in order to eliminate the need for travel.
One common “side-effect” resulting from the lack and/or suppression of the above-described features of face-to-face communication in teleconferences is a collision (i.e., a state where a local conference member and a remote conference member begin to communicate at substantially the same time). Although collisions occur during face-to-face conferences, they occur much less often in face-to-face communications than they do during teleconferences. Further, when collisions occur during face-to-face communications, because there is substantially no delay (i.e., such a small delay that it is unnoticeable) between the time one person talks and the time others hear and/or see the communicated message, the collisions are generally overcome quickly and easily.
In contrast, in teleconferences, repeated collisions may occur before the situation is resolved because the lack of, or reduction in, the ability to observe facial expressions and/or make eye contact is greatly exacerbated by the delays resulting from the need to encode/decode the data and to transmit the data over the network. More specifically, collisions tend to occur in teleconferences because, for example, conference members tend to forget that there is a delay in transmission. Thus, out of habits based on face-to-face communications, a first conference member tends to break a period of silence and begin talking again, only to be interrupted by the receipt of the other party's response to the initial communication (that response having been on its way to, but not yet received by, the first conference member). In other instances, as a result of the silence, for example, conference members may be uncertain as to whether their initial communication was received and/or understood by the other party and thus may instinctively begin to repeat their message before realizing that another member had responded to their communication (i.e., a collision occurs).
In other instances, in an attempt to prevent such collisions, conference members may patiently wait for a communication from the other party when, in fact, the other party had not communicated anything. Further, once a collision occurs, each party may simultaneously refrain from communicating to allow the other party to finish their communication and then, upon realizing that there is a mutual silence, both parties may begin communicating again substantially simultaneously, before realizing that another collision has occurred. Such collisions during teleconferences are time-consuming and distracting.
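The delay-induced collision described above can be illustrated with a simplified timing sketch. The sketch is not taken from any particular teleconferencing system; the function names, the symmetric one-way delay, the remote member's reaction time, and the first member's "patience" (the silence tolerated before resuming) are all assumptions introduced only for illustration. Under these assumptions, a collision occurs when the first member resumes talking after the remote member has begun replying but before that reply has arrived, which is a window whose width is twice the one-way delay.

```python
def collision_window_s(one_way_delay_s):
    """Width of the interval in which resuming speech causes a collision.

    Assumes a symmetric one-way delay (encode + transmit + decode).
    """
    return 2 * one_way_delay_s


def collides(one_way_delay_s, reaction_time_s, patience_s):
    """Return True if the first member's resumed speech collides with the reply.

    All times are hypothetical and measured in seconds from the moment the
    first member stops talking:
      - the remote member starts replying at one_way_delay_s + reaction_time_s;
      - the reply reaches the first member one further delay later;
      - the first member resumes talking at patience_s, which the remote
        member hears at patience_s + one_way_delay_s.
    """
    # The remote member has started replying before hearing the resumption.
    remote_already_replying = reaction_time_s <= patience_s
    # The first member resumes before the reply has arrived.
    reply_not_yet_heard = patience_s < reaction_time_s + collision_window_s(one_way_delay_s)
    return remote_already_replying and reply_not_yet_heard


if __name__ == "__main__":
    reaction_time_s = 0.5  # hypothetical reaction time of the remote member
    patience_s = 1.0       # hypothetical silence tolerated by the first member
    for one_way_delay_s in (0.0, 0.1, 0.3, 0.5):
        result = collides(one_way_delay_s, reaction_time_s, patience_s)
        print(f"one-way delay {one_way_delay_s:.1f} s -> collision: {result}")
```

With no delay, as in face-to-face communication, the window is empty in this model and no collision of this kind is predicted; as the delay grows, a wider range of ordinary pauses falls inside the window. The model ignores the visual cues discussed above and therefore understates how rarely face-to-face collisions persist.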