With recent increases in computing capacity and transmission bandwidth, video conferencing is quickly becoming a simple and effective means of communication and collaboration. Many corporations and individuals use video conferencing and visual collaboration systems to hold low-cost face-to-face meetings between colleagues and friends at various locations. To enhance communication at those meetings, some video conferencing or collaboration systems permit computer-generated images or presentations to be simultaneously broadcast to participants in a primary feed, either in a pop-up window, in a picture-in-picture format, or as an alternate switchable display on the video monitors. More recently, enhancements for video conferencing over the Internet permit distant participants to manipulate objects in the primary feed, such as computer-generated documents, spreadsheets, or drawings displayed in the separate pop-up window.
Despite these enhancements, significant limitations remain when viewing a shared document. For example, when one or more participants in a video conference are viewed along with a primary feed, the pictures are generally displayed in a side-by-side format or in separate frames of an overall viewing area (i.e., a picture-in-picture format). As a result, when trying to walk through a document with a participant who is closely focused on the text, a speaker cannot simply gesture and say “look at this paragraph here,” because a participant who is looking at the document cannot see where the speaker is pointing. In the context of a shared desktop, the speaker could use a mouse to move a cursor to a certain area, or could select or highlight the paragraph. However, this causes sudden changes in the document, which can be very jarring to a conference participant. Further, the participant will likely miss the facial expressions of the speaker because the participant is focused on the document. As a result, current systems and methods implementing such concepts fail to provide a user with an optimal viewing experience.