Telephone calls can result in misunderstandings simply because they lack the visual cues that accompany in-person, or face-to-face, communications. Dry humor or sarcasm, for example, may be misconstrued as ignorance. Video conferencing, in which the participants in a conversation can view each other's faces while they are talking, can reduce these problems and can create a communication exchange that closely approximates in-person communications. However, certain types of communications, such as those in which participants are discussing an object that is in the presence of one of the participants, are still better conducted in person than via a video conference, because it can be difficult for the participant who is remote from the object to convey to the other participant exactly which part of the object the remote participant is referring to. The participant who is in the presence of the object may simply, for example, touch the part of the object that is relevant to that participant's comments, and this action may then be seen via video conference by the other (remote) participant. The remote participant, however, may find it impossible to similarly convey exactly which portion of the object that participant desires to discuss. Accordingly, there is a need for a mechanism by which a participant at one location can easily identify, to a participant at a second location, a particular aspect or feature of an object that is in the presence of the participant at the second location.