At present, the main input devices for computers are the keyboard and a pointing device commonly called a “mouse”. For computer gaming purposes, additional devices such as joysticks and gaming consoles with specialized buttons are available to give the user more input choices in the game's virtual environment. However, these are two-dimensional input devices used for entering 3D information. More recently, “computer interface apparels” such as gaming gloves, actually worn by gamers, have been used to input their choice of movements into computer games.
Despite this plethora of input devices, no satisfactory device exists to accept the desired movements in sword fighting, or racket, club or bat-swinging movements for certain computer games. Other possible applications include symphony-conducting baton movements or the waving of a “magic wand” to perform magical tricks in games. The combined use of cursor control keys and the rolling of trackballs does not satisfactorily convey the user's intended movements for the virtual object or character he controls in some computer games.
In addition, other applications beyond the realm of computer games, such as computer-aided design and computer-aided manufacturing (CAD/CAM) work, require three-dimensional (3D) input when modifying 3D renditions of designs. Current input devices do not make such applications easy to use.
In recent years, a plethora of technologies has been developed and marketed to achieve such 3D, six-degrees-of-freedom (6-DOF) tracking. These include magnetic field-based trackers, ultrasonic mice, accelerometer/gyro-based trackers, and technologies based on pull-strings and mechanical linkages. However, all these technologies require expensive components and are difficult to set up and operate, and some require extensive preparation of the operating environment to avoid interference. They are thus not generally available for the consumer market.
Besides these technologies, there have been attempts to use machine-vision based technology to achieve 3D/6-DOF tracking using low-cost, commonly available digital cameras that can input real-time images to computers. Such technologies generally require placing a predefined cluster of light sources on the object to be tracked, to serve as tracking points. From the projected image of these tracking points in the camera, the provided algorithms perform a 3D reconstruction process to recover the position and orientation of the object being tracked. Many of these machine-vision based techniques involve using at least two cameras. An example can be found in U.S. Pat. No. 6,720,949, which uses stereo photogrammetry. However, the high cost of using multiple cameras could be avoided if the configuration of the tracking points is properly designed such that the image captured by only one camera provides unambiguous information about the pose of the object being tracked.
An example of such an input device is described in U.S. Pat. No. 5,889,505, entitled “Vision-Based Six-Degrees-of-Freedom Computer Input Device” and issued to Toyama et al. The position and orientation of this input device are determined by tracking a physical object suspended by cables as it is moved by a user. The tracking mechanism requires either an initialization in which the tracked object is first imaged in a “home position”, or a comparison of current data to previously stored data. The Z coordinate is measured by computing how far apart the pixels of the tracked object are from its centroid. Thus, this method includes all the pixels of the tracked object in its Z computation. Another problem with this approach is that it computes orientation by tracking two reference points that have different distinguishing characteristics. In other words, these reference points must be visually distinguishable. Yet another drawback of this approach is that it does not provide absolute values for the rotation and translation parameters, but only values that are proportional to the actual quantities. These values must then be scaled before being used to control applications.
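The idea of estimating depth from how far an object's pixels lie from their centroid can be illustrated with a minimal sketch. It is not taken from U.S. Pat. No. 5,889,505; it assumes a simple pinhole camera, under which the apparent size of an object is inversely proportional to its distance, and the function names (`square_blob`, `mean_spread`, `relative_depth`) are hypothetical. Note that the result is only a ratio relative to a "home position" image, which mirrors the drawback that the values obtained are proportional to, rather than equal to, the actual quantities.

```python
import math

def square_blob(side):
    """Synthetic stand-in for a segmented object: a side x side block of pixels."""
    return [(x, y) for x in range(side) for y in range(side)]

def centroid(pixels):
    """Mean position of all object pixels."""
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    return cx, cy

def mean_spread(pixels):
    """Average distance of every object pixel from the centroid."""
    cx, cy = centroid(pixels)
    return sum(math.hypot(x - cx, y - cy) for x, y in pixels) / len(pixels)

def relative_depth(pixels, home_spread):
    """Depth as a multiple of the home-position depth (assumed pinhole model).

    Apparent size scales as 1/Z, so Z / Z_home = home_spread / current_spread.
    The value is unitless and must still be scaled by the application.
    """
    return home_spread / mean_spread(pixels)

if __name__ == "__main__":
    home = square_blob(20)   # object imaged at the home position
    far = square_blob(10)    # same object appearing half as large
    print(relative_depth(far, mean_spread(home)))
```

In this synthetic case the object that appears half as large is reported at roughly twice the home-position depth. Because every object pixel enters the sum, the estimate depends on the whole segmented silhouette rather than on a few tracking points.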
U.S. Pat. No. 5,856,844, issued to Batterman et al. and entitled “Method and Apparatus for Determining Position and Orientation,” describes a method for determining the six degrees of freedom of a head-mounted display and a handle to which an optically-modulated target is attached. The target is marked with squares on its surface, and by tracking the perspective views of these squares, six degrees of freedom are computed. A problem with this approach is that it requires a special orientation mark in the optically-modulated target in order to identify the ordering of the squares. Another problem is that this approach determines rotation angles directly, and is therefore unduly prone to noise-related distortions.
Techniques described in U.S. Pat. No. 5,227,985, issued to DeMenthon and entitled “Computer Vision System for Position Monitoring in Three Dimensions Using Non-Coplanar Light Sources Attached to a Monitored Object,” and U.S. Pat. No. 5,297,061, issued to DeMenthon et al. and entitled “Three Dimensional Pointing Device Monitored by Computer Vision,” determine the position and orientation of an object by utilizing a set of non-coplanar light sources mounted on the object. A problem with this approach is that the use of non-coplanar light sources makes the device more difficult to manufacture and therefore more costly. Another problem is that the light sources used in this approach are of different sizes, in order to correctly identify the ordering of the light sources in the corresponding image, which adds complexity to the device. This requirement would also become problematic if the tracker is positioned very close to the camera, especially if a wide-angle lens is used, as the perspective projection would render the projected images of the light sources out of proportion, making the comparison of their actual sizes very difficult. The use of large light sources would also reduce the resolution of the tracking and increase the occurrence of overlapping light sources. Moreover, this approach is capable of only hemispherical tracking: unless it is assumed that the device need only operate over one half of the full sphere of tracking, the algorithm would not be able to resolve the ambiguity that any given projection of the light sources can result from at least two very distinct states of the tracker. This limitation can also be easily deduced from the preferred embodiment described in U.S. Pat. No. 5,297,061, which involves using the tips of optical fibers as the light sources. Such an arrangement would allow the light sources to be observable by the camera only if they are oriented towards it. Its function is thus limited to being a pointing device, as spelt out in the invention's title.
U.S. Pat. No. 4,672,562, issued to Egli et al. and entitled “Method and Apparatus for Determining Location and Orientation of Objects,” describes an input device comprising an orthogonally-related target array. The points are arranged in a very specific configuration such that the fourth target point forms a common intersection point of first, second and third line projections passing separately through the first three points and intersecting the fourth point. In addition, these line projections must form three right angles at the fourth target point. Such constraints are generally undesirable in that they can render the device difficult to manufacture and use. It is also computationally expensive to reconstruct the 3D/six-DOF information online using the vector replicas approach. The device is also limited to hemispherical tracking even if the tracked object is transparent, as the configuration would not allow the algorithm to distinguish whether the tracked surface is facing towards or away from the camera.
More recently, U.S. Pat. No. 6,417,836, issued to Kumar et al. and entitled “Computer Input Device Having Six Degrees of Freedom For Controlling Movement of a Three-Dimensional Object”, describes an input device comprising a handle with a plate attached to an upper portion thereof. Associated with an upper planar portion of the plate is a set of at least five principal lighting sources. This device is capable of providing only hemispherical tracking, as the light sources will not be visible to the camera if the upper planar portion of the plate is oriented away from it. The computation involved in the 3D reconstruction process is also expensive, as it involves much online equation-solving to find the three-dimensional positions of all the lighting sources before the orientations can be found. Moreover, the device's design is not ergonomic for most gaming purposes and cannot be handled as a stick-like manipulator, which is the common shape of most virtual arsenals used in electronic games.
In view of the above, a need clearly exists for an intuitive, ergonomic and low-cost input device that enables users to represent and input 3D movements into computers and gaming devices, while also avoiding the problems associated with the conventional approaches, particularly limited hemispherical tracking, lack of ergonomics as a gaming device, non-coplanar configuration, and the complexity of the online computations required in the reconstruction process.