This disclosure relates generally to touch detection. More particularly, but not by way of limitation, this disclosure relates to techniques for camera-based touch detection on arbitrary surfaces.
Detecting when and where a user's finger touches a real environmental surface can enable intuitive interactions between the user, the environment, and a hardware system (e.g., a computer or gaming system). Using cameras for touch detection has many advantages over methods that rely on sensors embedded in a surface (e.g., capacitive sensors). Further, some modern digital devices such as head-mounted devices (HMDs) and smartphones are equipped with vision sensors, including depth cameras. Current depth-based touch detection approaches use depth cameras to provide distance measurements between the camera and the finger and between the camera and the environmental surface. One approach requires a fixed depth camera setup and cannot be applied to dynamic scenes. Another approach first identifies the finger, segments the finger, and then flood fills neighboring pixels from the center of the fingertip so that when sufficient pixels are so filled, a touch is detected. However, because this approach does not even consider normalizing pixel depth data, it can be quite error prone. In still another approach, finger touches are determined based on a pre-computed reference frame, an analysis of the hand's contour, and the fitting of depth curves. Each of these approaches requires predefined thresholds to distinguish touch and no-touch conditions. They also suffer from large hover distances (i.e., a touch may be indicated when the finger hovers 10 millimeters or less above the surface, thereby introducing a large number of false-positive touch detections).
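By way of illustration only, the flood-fill approach described above may be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of any particular prior system: the function name `flood_fill_touch`, the 4-connected neighborhood, and the parameters `depth_tol_mm` and `min_filled` are hypothetical and chosen for illustration. The sketch also shows how such an approach depends on predefined thresholds.

```python
from collections import deque


def flood_fill_touch(depth, seed, depth_tol_mm=5.0, min_filled=20):
    """Flood fill a depth map outward from a fingertip seed pixel.

    depth        -- 2D list of per-pixel depths in millimeters (assumed layout)
    seed         -- (row, col) of the fingertip center (assumed to be known)
    depth_tol_mm -- hypothetical fixed tolerance between neighboring pixels
    min_filled   -- hypothetical fixed pixel count that signals a touch

    Returns (touch_detected, filled_pixel_count).
    """
    rows, cols = len(depth), len(depth[0])
    filled = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        # Visit the 4-connected neighbors of the current pixel.
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if (nr, nc) in filled:
                continue
            # Fill a neighbor only if its depth is close to the current pixel.
            # When the finger rests on the surface, the fill spills from the
            # finger onto the surrounding surface pixels.
            if abs(depth[nr][nc] - depth[r][c]) <= depth_tol_mm:
                filled.add((nr, nc))
                queue.append((nr, nc))
    return len(filled) >= min_filled, len(filled)
```

In this sketch, a fingertip whose depth is within the tolerance of the surface causes the fill to spread across the surface and a touch is reported, whereas a fingertip well above the surface confines the fill to the finger pixels. Because both parameters are fixed in advance and the raw depth values are not normalized, a finger hovering within the tolerance would also be reported as a touch.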