1. Field of the Invention
The present invention relates to video cameras, and in particular to dynamically controlling a cursor on a screen when a video camera is used as a pointing device.
2. Description of the Related Art
A video camera can be used as a pointing device for a computer system. To accomplish this, the computer system displays an image on a computer screen of a computer display or projects an image onto a projection screen. The camera is pointed toward the computer screen and controls a screen cursor, which is a moving marker or pointer that indicates a position on the screen. This setup can be used for computer screen presentations in front of groups of people, for example, when the user of the camera gives a presentation.
To detect the cursor of the pointing device in the frame captured by the camera, some current computer systems identify several interest points between consecutive frames, estimate the affine transformation between them, warp one frame to the other using this transformation, and then detect the cursor as the area of difference between the two frames. An affine transformation is a transformation of coordinates that is equivalent to a linear transformation followed by a translation. In addition to being central processing unit (CPU) intensive, these systems break down if the screen shows dynamic content, for example video, animation, and windows being dragged. One solution would be to use a camera with a very high frame rate, higher than the refresh rate of which the screen is capable. With current screens refreshing at 70 Hz or more, however, this solution is very expensive. This solution would also require a lot of bandwidth if the processing is done on the computer that acts as a controller for the screen and cursor.
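The frame-differencing approach described above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of any particular system: frames are grayscale images stored as lists of rows, exactly three interest points have already been matched between the frames (a practical system would match many points and fit the transformation robustly), warping uses nearest-neighbor sampling, and the difference threshold is fixed. All function names are hypothetical.

```python
def solve3(a, b):
    """Solve the 3x3 linear system a * x = b using Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    d = det(a)
    if abs(d) < 1e-12:
        raise ValueError("interest points are collinear")
    solution = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]  # replace one column with the right-hand side
        solution.append(det(m) / d)
    return solution


def estimate_affine(cur_pts, prev_pts):
    """Affine map taking three current-frame points to previous-frame points.

    Returns ((a, b, tx), (c, d, ty)) with x' = a*x + b*y + tx and
    y' = c*x + d*y + ty: a linear part followed by a translation.
    """
    a = [[float(x), float(y), 1.0] for x, y in cur_pts]
    row_x = solve3(a, [float(p[0]) for p in prev_pts])
    row_y = solve3(a, [float(p[1]) for p in prev_pts])
    return row_x, row_y


def warp_previous(prev, affine, width, height):
    """Resample the previous frame into the current frame's coordinates.

    Pixels whose source falls outside the previous frame become None.
    """
    (a, b, tx), (c, d, ty) = affine
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            sx = int(round(a * x + b * y + tx))
            sy = int(round(c * x + d * y + ty))
            inside = 0 <= sy < len(prev) and 0 <= sx < len(prev[0])
            row.append(prev[sy][sx] if inside else None)
        out.append(row)
    return out


def find_cursor(prev, cur, cur_pts, prev_pts, threshold=50):
    """Bounding box (xmin, ymin, xmax, ymax) of the changed area, or None."""
    affine = estimate_affine(cur_pts, prev_pts)
    warped = warp_previous(prev, affine, len(cur[0]), len(cur))
    changed = [(x, y)
               for y, row in enumerate(cur)
               for x, value in enumerate(row)
               if warped[y][x] is not None
               and abs(value - warped[y][x]) > threshold]
    if not changed:
        return None
    xs = [p[0] for p in changed]
    ys = [p[1] for p in changed]
    return min(xs), min(ys), max(xs), max(ys)
```

Because every pixel that differs after warping is attributed to the cursor, any other change in the screen content between the two frames would be reported as well, which is consistent with the breakdown on dynamic content noted above.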
More importantly, these current computer systems are likely to lose track of their cursors. As users move farther away from the screen, the size of the cursor, as viewed through the camera, decreases. In addition, these systems do not provide ways to estimate the distance of the user from the screen, do not provide user interfaces for picking the pointers or re-initializing them after the camera view points away from the screen, and do not estimate the yaw, pitch and roll of the camera, all of which are useful for estimating the location of the user with respect to the screen and for predicting how the cursor should move. Further, these systems do not allow multiple users. Most likely, these systems would scale poorly in terms of tracking efficiency, CPU, and bandwidth requirements. Further, these systems do not allow users to move the pointer across multiple screens, beyond the trivial case of multiple screens implemented as a single extended desktop.
Regarding tracking a laser pointer, current systems provide a fixed camera that looks at the entire screen and tries to detect the location of a bright laser pointer. This type of tracking requires installing the camera in a fixed location and calibrating it to the system or providing a self-calibrating system. Finding where to mount the camera to prevent occlusion, or possible obstructions in the camera view, can be inconvenient. Multiple pointers can be problematic to detect when they are of the same color. Security can be an issue, as anyone in the room can control the pointer. For example, in a conference room with five hundred people, anyone could shine a laser pointer onto the screen from the back of the room.
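The fixed-camera laser tracking described above can be illustrated with a short sketch. It is a simplification under stated assumptions rather than the method of any existing system: the camera frame is a grayscale image stored as a list of rows, the laser spot is taken to be every pixel above a fixed brightness threshold, and the calibration is assumed to be a per-axis scale and offset, which holds only if the camera views the screen head-on. The function names and the calibration format are hypothetical.

```python
def detect_laser_spot(frame, threshold=240):
    """Return the centroid (x, y) of pixels brighter than threshold, or None."""
    count = sum_x = sum_y = 0
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            if value > threshold:
                count += 1
                sum_x += x
                sum_y += y
    if count == 0:
        return None
    return sum_x / count, sum_y / count


def camera_to_screen(point, calibration):
    """Map camera coordinates to screen coordinates.

    `calibration` is ((scale_x, offset_x), (scale_y, offset_y)), an assumed
    format standing in for whatever calibration the installed camera uses.
    """
    (scale_x, offset_x), (scale_y, offset_y) = calibration
    return point[0] * scale_x + offset_x, point[1] * scale_y + offset_y
```

If two pointers of the same color are visible at once, this sketch returns a single centroid lying between the two spots, which illustrates why multiple pointers are difficult to tell apart.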
Alternative pointing devices are less desirable for various reasons. Touch screens are inconvenient, and sometimes impossible to use, with large screens because users may be unable to reach the top of the screen. Touch screens are also expensive. Multiple-user input is expensive and sometimes awkward because all users need to come to the screen to touch it. An example of multiple-user touch screen use is a brainstorming session around a vehicle design displayed on the touch screen, or any other application that accommodates several users. Unlike a touch screen, a pointing device such as a mouse does not provide direct interaction. A mouse provides an indirect interaction because the mouse, as a separate device, is used to move the pointer. Further, pointing with a mouse becomes slower when screens are large. Pointing devices such as light pens and light guns rely on the use of scan-based, but not progressive, cathode ray tube (CRT) screens, and they do not function with liquid crystal displays (LCDs) or projection screens.
It would be beneficial to provide an improved system for controlling a cursor on a screen when using a video camera as a pointing device.