The present invention relates to controlling a device, particularly a computing device, through hand drawn markings on a whiteboard or blackboard device. More specifically, the invention relates to image analysis techniques for interpreting marks for purposes of controlling devices.
In collaborative working environments, several users frequently wish to view and manipulate displayed information simultaneously. Whiteboards and Blackboards (hereafter referred to as "Boards") are widely used to maintain hand drawn textual and graphic images on a "wall-size" surface. The Board medium offers certain properties that facilitate a variety of interactive work practices: markings are large enough to be viewed by several people; markings can be edited by erasing and redrawing; the surface is immobile, so does not get lost, crumpled, torn, or blown by wind; the surface is readily erased, is completely reusable, and (practically) does not wear out. However, one drawback to using a Board is that information is not easily transferred to other media. Thus, it is not currently possible to hold a conversation with someone while maintaining a record of the conversation in text and graphics on a Board and then quickly, easily, and automatically transfer the record to paper or other portable and storable medium.
Existing methods for accomplishing this task are cumbersome, time-consuming, and inconvenient. One can simply transcribe by hand, onto paper, any or all of the text and graphics residing on the Board. This can be time-consuming, and suffers from errors due to mistakes in human reading and writing. Or, one can photograph the Board with a camera. This requires having a camera at hand, introduces the delay of developing the film, can be expensive if an "instant" camera is used, and is subject to poor quality rendition due to improper focus and exposure. A camera further usually produces an image of greatly reduced size that can be difficult to read.
Alternatively, "wall-size" sheets of paper, such as poster pads, lead to a relatively permanent and portable record of what was written, but these sheets of paper are large and cumbersome, and do not permit erasure during image creation.
A copy-board device provides a writing surface which can be transcribed into paper hardcopy, but these are currently conceived as conspicuous portable whiteboards that displace rather than leverage existing, built-in Boards.
The solutions discussed above further do not aid in transferring the image from the Board into an electronically usable form.
Concurrently filed U.S. patent application Ser. No. (Attorney Docket No. D/94266) offers motivations and specific technical details for a device to transcribe marks on a Board into electronic form. In summary, a video camera is mounted on a pan/tilt head. High resolution tiles are obtained by zooming in the camera on patches of the image. These are later pieced together to form a full size high resolution composite image. Perspective distortion, effects of uneven lighting, and tile overlap are handled by image processing operations.
Such a transcription device is useful because an electronic version of a Board image provides variety and flexibility in the further use of the image data. For example, an electronic image can be hardcopied, transmitted by fax, stored to a file, transferred to an electronic workstation, or projected onto a screen. Moreover, prior to any of these operations the image itself may be processed, for example to select out just a region of the image, to select just certain colors, to enhance or rectify the line work, to reorder items in a list, and so forth.
The wealth of operations made available by the fundamental ability to transcribe a Board image raises the issue of control: How is the user to specify operations to be done, and when?
Since the Board transcription and image processing operations are computer-based, one possibility is for users to retire to their computers in order to control these functions. This solution is undesirable for several reasons. First, it forces users to break the cadence of their work at the Board in order to address a computer console. Second, either a console must be provided at the Board location, or else users must travel some indeterminate distance to where one is available. Third, many Board users are likely to be unfamiliar and/or uncomfortable either with computers in general, or else with the particular keyboard and mouse commands necessary to operate the program.
A second type of user interface consists of a dedicated control panel mounted adjacent to the Board. If the control panel consists of labeled buttons, these can be associated with a modest set of possible operations such as directing the transcribed bitmap to one of a handful of printers or file directories. Greater flexibility would be obtained by including a keyboard with the control panel, but this begins to present a daunting edifice to novice users. Nonetheless, for some incarnations of a Board transcription device a dedicated control panel is probably appropriate.
In the system of the present invention, however, a third alternative exists which is in several ways ideally suited for seamless creation, capture, and electronically mediated use of images originating on a whiteboard. The user interface is to consist of marks drawn by the user on the Board itself. For example, in the simplest case the user might draw a special "button" symbol, and then draw a check mark inside when the button is to be "pressed." The system, knowledgeable about buttons, would act upon the data based on the button press. Enhanced functionality may be achieved by annotating the button symbol with further directives, or by specifying different kinds of button for different operations.
The previously described interface does not eliminate the need for users to possess knowledge of how to communicate with the system in this diagrammatic fashion, but this knowledge may be more accessible and more easily assimilated by many Board users than any control panel or computer console based interface.
A diagrammatic interface exploits the user's existing skills in drawing on the Board, and furthermore can offer extremely simple access to basic functionality and introduce incrementally the greater diagrammatic complexity required to fine tune greater functionality. Finally, a diagrammatic interface consisting of marks drawn on the Board itself is best suited to providing spatial control directives, such as to extract a region of the image.
A diagrammatic user interface to a Board transcription system hinges on the ability of a computer program successfully to interpret marks on the Board. An additional advantage of the present invention includes tolerance to variability and spurious marks made by a human user.
Relatively crude yet effective basic functionality can be had by the application of very simple image processing techniques. The "BrightBoard" system, described in "Controlling Computers by Video" by Quentin Stafford-Fraser of EuroPARC, employs a video camera pointed at a fixed position on the board. User-experts execute a graphical program at a computer console to denote special regions of the board to serve as "sensitive locations," which are typically the interiors of buttons drawn on the board. Functionality is associated with sensitive regions at setup time. Then, in operation, a simple routine runs continuously to measure the net pixel lightness of the sensitive region, which is assumed to cross a threshold when a dark enough mark is made within it.
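The lightness test described above can be illustrated with a minimal sketch. It is not taken from the BrightBoard system; the grayscale image representation, the region format, the function names, and the threshold value of 200 are all assumptions chosen only to make the idea concrete.

```python
from typing import List, Tuple

# Assumed representation: rows of grayscale values, 0 = black, 255 = white.
Image = List[List[int]]
# Assumed region format: (top, left, bottom, right), bottom/right exclusive.
Region = Tuple[int, int, int, int]


def mean_lightness(image: Image, region: Region) -> float:
    """Average pixel lightness inside a pre-configured sensitive region."""
    top, left, bottom, right = region
    pixels = [image[r][c] for r in range(top, bottom) for c in range(left, right)]
    if not pixels:
        raise ValueError("sensitive region is empty")
    return sum(pixels) / len(pixels)


def is_pressed(image: Image, region: Region, threshold: float = 200.0) -> bool:
    """Report a press when the region is darker than the (hypothetical) threshold."""
    return mean_lightness(image, region) < threshold


if __name__ == "__main__":
    blank = [[255] * 8 for _ in range(8)]      # clean white board
    region = (2, 2, 6, 6)                      # 4 x 4 sensitive region
    marked = [row[:] for row in blank]
    for i in range(2, 6):                      # a dark diagonal stroke in the region
        marked[i][i] = 0
    print(is_pressed(blank, region), is_pressed(marked, region))
```

Because the decision rests on a single average over a fixed region, any uniform darkening of that region lowers the average in the same way a drawn mark does.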
The image analysis techniques used in the BrightBoard system provide a basic level of control via marks on the Board, depending on the number of sensitive regions one wishes to define. However, there are several drawbacks. First, the sensitive regions must be set up in advance and this Board space reserved for the system until the setup configuration is modified. Second, the functionality associated with a simple button toggle is limited. Third, the detection of button presses by a change in lightness in sensitive regions is susceptible to false positives caused by shadows or changes in room lighting, and to false negatives due to thinly drawn and lightly colored marks.
There is therefore a need for greater sophistication in the image analysis supporting a diagrammatic user interface to a whiteboard transcription system. The power of the interface can be greatly enhanced by the application of computer vision techniques in the geometric analysis of the marks on the Board. Users should be able to draw buttons on the Board anywhere and at any time; they should be able to press buttons by making "X" or check marks in them; and they should be able to make more complex diagrams to control symbolic and geometric functionality, such as specifying which printer to hardcopy to, or the region of a Board to be extracted.
Diagrammatic user interface interpretation is a difficult computer vision problem because of the wide range of variation found in meaningful hand drawn commands. Symbols and text can occur in any location and at any spatial scale (size), ostensibly straight lines seldom are truly straight, supposedly continuous lines have spurious gaps and branches, and the conventions of formal geometry (such as that a square consists of two sets of parallel line segments meeting at four 90 degree corners) are seldom obeyed. Existing computer vision and document image analysis techniques perform inadequately on the Board diagrammatic user interface analysis task under normal operational conditions in which users are not likely to be especially careful about the precision and accuracy of their diagrammatic command drawings. In addition, because of the real time nature of this task, the image analysis techniques must be inherently efficient in complexity (such as avoiding combinatoric complexity in the number of marks on the board).
The present invention describes a novel application of computer vision techniques supporting interpretation of hand drawn commands under an open-ended class of diagrammatic user interface designs. The components of these interface designs may include special hand drawn symbols, and curvilinear connectives. Accordingly, the techniques of this invention support recognition of hand drawn command symbols, and tracing of curvilinear connectives. The techniques offered herein further lead to a greater degree of robustness in the machine interpretation of hand drawn diagrams than has previously been demonstrated in the art.
The present invention provides a method for controlling devices by interpreting hand drawn marks on a scanned surface. The method includes determining a hand drawn symbolic token representing an action, determining when the symbolic token is selected, and performing the action when it is selected. The method further provides for indicating a spatial area of the scanned surface associated with the symbolic token, and for performing the action with respect to that spatial area.
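The control flow of the method summarized above can be sketched as follows. This is an illustrative outline only: it assumes that image analysis has already produced a list of recognized tokens, and the `Token` structure, the `run_selected` function, the action names, and the area format are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Assumed area format: (top, left, bottom, right) in scanned-surface coordinates.
Area = Tuple[int, int, int, int]


@dataclass
class Token:
    """A hand drawn symbolic token as it might be reported by image analysis."""
    action: str                  # name of the action the token represents
    selected: bool               # True when a check mark or "X" was found in it
    area: Optional[Area] = None  # spatial area associated with the token, if any


def run_selected(tokens: List[Token],
                 handlers: Dict[str, Callable[[Optional[Area]], str]]) -> List[str]:
    """Perform the action of every selected token, passing along its area."""
    results = []
    for token in tokens:
        if not token.selected:
            continue                      # button drawn but not pressed
        handler = handlers.get(token.action)
        if handler is None:
            continue                      # unknown action: ignore the token
        results.append(handler(token.area))
    return results


if __name__ == "__main__":
    handlers = {
        "print": lambda area: f"print region {area}",
        "save": lambda area: f"save region {area}",
    }
    tokens = [
        Token("print", selected=True, area=(0, 0, 100, 200)),
        Token("save", selected=False),
    ]
    print(run_selected(tokens, handlers))
```

Keeping token recognition separate from action dispatch means that new kinds of button can be supported by registering another handler, without changing how selection is detected.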
One object of this invention is to provide human users with the ability to control computing devices and transcription devices through hand drawn images on a Board. These images may be drawn at any location on the Board, at any time, and provide a method for selecting control operations from the Board. Symbolic tokens are based upon geometric symbols that may be easily hand drawn and detected by spatial and symbolic analysis procedures.
A further object of the invention is to provide recognition of control diagrams in hand drawn images, and to provide the Board user with a method for controlling symbolic and geometric functionality.
The following description, the drawings and the claims further set forth these and other objects, features and advantages of the invention.