The present invention relates generally to communication devices. More specifically, the invention relates to interactive devices presenting real-time audio and video images for facilitating remote collaboration between users.
The nature of the modern workplace has reached a point where two or more collaborators can work independently yet simultaneously on the same creation while being physically separated from one another. This is often required when the two or more collaborators are at different worksites, separated by small or even great distances. Such a situation will generally be referred to as "remote collaboration", and because some system is usually employed to facilitate such remote collaboration, each collaborator will generally be referred to as a "user" of such a system.
We have observed that collaboration, in general, often involves creating and referring to text and graphics. The created text and graphics generally must be shared by each collaborator. This sharing may be facilitated by a system which provides what will be called hereinafter a "shared workspace." Working without a shared workspace can limit collaboration by: (1) limiting the common understanding about a thing, task, etc., being referred to, (2) limiting the ability of one collaborator to visually add to or comment on the work of another collaborator, (3) causing significant delays in the stream of communication between collaborators, etc. Furthermore, we have observed that collaborators in general need to have an adequate awareness of each other in order to interact effectively. We have found that when collaborators work together in a shared workspace, the collaborators need to see not only the textual and graphic marks that are made, but they must also pay attention to the process of making, referring to, and using those marks, and must be contemporaneously aware of the markings and the person making those markings. Seeing this process helps the collaborators interpret the marks, retain the information presented, mediate their interaction, etc.
There exist in the art a number of devices or systems allowing or facilitating various degrees of use of a shared workspace. Some allow remote users only to view the workspace. For example, U.S. Pat. No. 3,755,623, issued to Cassagne, discloses a video system that presents an image of a document or the like to which the user may refer, the image including a portion of the user, such as one or both hands, gesturing in relation to the document. Other users are presented with a view of the document. However, only the participant who has the document can gesturally refer to it; the other collaborators cannot enact their gestures with respect to the document.
Another reference presenting images of both a user and a document or the like, and allowing some degree of interactive use of the document, is U.S. Pat. No. 4,400,724 issued to Fields. This reference discloses a video teleconferencing system wherein a number of conferees can interactively communicate via a plurality of interconnected monitors and cameras. A document (paper or otherwise) is imaged by a video camera suspended above a "target area" where the document is located. Likewise, gestures relating to the document made within the target area are imaged by the same video camera. The images are presented on the monitors of the other conferees. The other conferees may modify the image by marking over their own target areas or refer to the image by gesturing in the appropriate locations in their own target areas. The composite of the document's image, modifications, and gestures is distributed in real time for viewing by the conferees on their monitors, which are physically separate from their target areas.
Also disclosed in the art are certain systems designed to capture and transmit less than the entire image of a user to a remote location. For example, the work of Myron Krueger on "Videoplace", as described in Artificial Reality, Addison-Wesley (1982), demonstrates a system which captures the outline of a user and allows computational use of the data about the outline, such as drawing with a fingertip in free space, etc.
We have determined that a number of disadvantages exist in present devices or systems of the type discussed above. First, the devices or systems are of unique or unusual configuration and operation. This requires special skill and training for users, such as learning to draw in one location while viewing the consequences of that action in another. What is desired is a device or system which is simple to operate and similar to existing devices or systems so that the need for training is minimized or obviated.
Second, in general in the art there are no devices having writing surfaces which coincide with the viewing surfaces for all users. This disparity between input (writing) location and output (viewing) location can make it unnatural or difficult to collaboratively construct and use a drawing. What is desired is a system in which the writing and viewing surfaces coincide to allow shared marking directly on the presented image.
Third, many of the devices or systems previously known provide a relatively small workspace. We have determined that the relatively small screens used, combined with the line width of the markers used in the systems, the limited resolution of the transmission and display of the marks by video, and the scale of the user's hands in relation to the available screen space, all limit the amount of work that can be accomplished before effectively filling up the screen. What is therefore desired is a system of the type described above having a larger workspace than previously available.
Fourth, the ergonomics of the devices or systems presently available impede efficient, comfortable, and accurate use. We have determined that users must position themselves close to an upwardly facing, horizontally oriented work surface while at the same time preventing their heads from blocking the overhead video camera's view of the drawing surface. To avoid blocking the camera's view, users must keep their heads off to one side of the drawing space. What is therefore desired is a system of the type described above having ergonomics which facilitate, rather than impede, the comfortable and accurate use of such a device or system.
Fifth, the disclosed devices or systems which capture the outline of a user require computational technology to produce an image of the user. This computational technology is costly in processing time and in the size and cost of the required device. Further, this type of computational environment also has the problem that the input, interacting in "space", and the output, displayed on a monitor, are always separated (similar to the second problem discussed above). Finally, computational technology based on image features and gross outlines removes resources (e.g., shading, perspective, occlusion, etc.) present in video images which provide a valuable source of information to a viewer, such as proximity, depth perception, etc. What is therefore desired is a system of the type described above which minimizes or obviates the need for computational technology, allows input and output on the same work surface, and captures and presents for use information about three-dimensional activity of the user.
Sixth, those devices or systems that transmit full video signals require high bandwidth transmission between the users' sites. This may be prohibitively expensive, and may yield unsatisfactory images when the signals are transmitted according to developing standard communication protocols such as ISDN, or used with common data compression schemes, etc. What is therefore desired is a system of the type described above having a more limited transmission bandwidth requirement than previously available. The above disadvantages, together with a number of additional disadvantages of the existing and disclosed devices or systems, have led to the present invention.