Technical Field
The disclosed embodiments relate in general to systems and methods for enabling user interaction with computing devices and, more specifically, to systems and methods for enabling fine-grained user interaction with projector-camera and/or display-camera systems using a smartphone or other mobile device.
Description of the Related Art
For projection surfaces and display panels in meeting rooms and public areas that lack input capability, it would be useful to enable interaction in a convenient and low-cost way. For coarse-grained interaction, hand gesture input technologies have been successfully developed. Examples include systems for game playing, such as the Kinect system for the Xbox, well known to persons of ordinary skill in the art, as well as a system and method described in Wachs, J., Stern, H., Edan, Y., Gillam, M., Handler, J., Feied, C., Smith, M., A gesture-based tool for sterile browsing of radiology images, J. American Medical Informatics Assoc., 15(3), 321-323 (2008).
U.S. patent application Ser. No. 13/865,990 entitled “SYSTEMS AND METHODS FOR IMPLEMENTING AND USING GESTURE BASED USER INTERFACE WIDGETS WITH CAMERA INPUT” describes a system for detection of gestures for discrete actions with button widgets and gestures for continuous (but not fine-grained) actions for panning and zooming actions with a viewport widget.
As would be appreciated by persons of skill in the art, for fine-grained user interaction, such as annotating a slide image or writing on a whiteboard canvas, it can be more advantageous to track a simpler object than the hand or fingers of the user. Because people are accustomed to traditional input devices such as the mouse or stylus, it can feel more comfortable to “write” using a device rather than a finger. Input devices also have a standard set of action events (e.g., pen down, pen dragging, and pen up) that are somewhat complicated to implement with finger gestures. Detecting touch is also a hard problem in a typical camera-projection display setup, even with a depth camera, because the camera can only see the back of the finger, and the physical finger must therefore be approximated using a geometric model, as described, for example, in Wilson, A. D., Using a depth camera as a touch sensor, Proc. ITS '10, pp. 69-72. Furthermore, fingers are computationally expensive to track, and finger tracking may not scale well for supporting multiple users interacting with a large display surface.
On the other hand, for tracking an object, there exist commercial and research systems, including, without limitation, Vicon Motion Systems, Flock of Birds Tracker, as well as iLamps and Ubiquitous Coded Light, described in Raskar, R., Baar, J. v., Beardsley, P., Willwacher, T., Rao, S. and Forlines, C., iLamps: geometrically aware and self-configuring projectors, Proc. SIGGRAPH '03, pp. 809-818. However, these object tracking systems are expensive and/or complex to build, and they require special hardware. Another practical problem that all of the above tracking solutions face is that, when input devices are placed in public meeting rooms or spaces, they can often become misplaced or lost.
Therefore, the conventional systems and methods for enabling user interaction with projection surfaces and display panels are either too imprecise, too expensive, or insufficiently reliable. Thus, new and improved systems and methods are needed that would enable fine-grained user interaction with projector-camera and/or display-camera systems.