Not applicable
Not applicable
1. Field of the Invention
The invention relates to simple input devices for computers, well suited for use with 3-D graphically intensive activities, and operating by optically sensing object or human positions and/or orientations. The invention, in many preferred embodiments, uses real-time stereo photogrammetry with single or multiple TV cameras whose output is analyzed and used as input to a personal computer.
2. Description of Related Art
The closest known references to the stereo photogrammetric imaging of datums employed by several preferred embodiments of the invention are thought to exist in the fields of flight simulation, robotics, animation and biomechanical studies. Some early prior art references in these fields are:
U.S. patents
Pugh USP#
Birk U.S. Pat. No. 4,416,924
Pinckney U.S. Pat. No. 4,219,847
U.S. Pat. No. 4,672,564 by Egli et al, filed Nov. 15, 1984
Pryor U.S. Pat. No. 5,506,682, robot vision using targets
Pryor, Method for Automatically Handling, Assembling and Working on Objects U.S. Pat. No. 4,654,949
Pryor, U.S. Pat. No. 5,148,591, Vision target based assembly
In what is called "virtual reality", a number of other devices have appeared for human instruction to a computer. Examples are head trackers, magnetic pickups on the human and the like, which have their counterpart in the invention herein.
References from this field having similar goals to some aspects of the invention herein are:
U.S. Pat. No. 5,297,061 by Dementhon et al
U.S. Pat. No. 5,388,059 also by Dementhon, et al
U.S. Pat. No. 5,168,531: Real-time recognition of pointing information from video, by Sigel
U.S. Pat. No. 5,617,312 Computer system that enters control information by means of video camera by Iura et al, filed Nov. 18, 1994
U.S. Pat. No. 5,616,078: Motion-controlled video entertainment system, by Oh; Ketsu
U.S. Pat. No. 5,594,469: Hand gesture machine control system, by Freeman, et al.
U.S. Pat. No. 5,454,043: Dynamic and static hand gesture recognition through low-level image analysis, by Freeman
U.S. Pat. No. 5,581,276: 3D human interface apparatus using motion recognition based on dynamic image processing, by Cipolla et al.
U.S. Pat. No. 4,843,568: Real time perception of and response to the actions of an unencumbered participant/user, by Krueger, et al
Iura and Sigel disclose means for using a video camera to look at an operator's body or finger and input control information to a computer. Their disclosure is generally limited to two-dimensional inputs in an xy plane, such as would be traveled by a conventionally used mouse.
Dementhon discloses the use of objects equipped with 4 LEDs detected with a single video camera to provide a 6 degree of freedom solution of object position and orientation. He downplays the use of retroreflector targets for this task.
Cipolla et al discuss processing and recognition of movement sequence gesture inputs detected with a single video camera, whereby objects or parts of humans equipped with four reflective targets or LEDs are moved through space, and a sequence of images of the objects is taken and processed. The targets can be colored to aid discrimination.
Pryor, one of the inventors, in several previous applications has described single and dual (stereo) camera systems utilizing natural features of objects or special targets including retroreflectors for determination of position and orientation of objects in real time suitable for computer input, in up to 6 degrees of freedom.
Pinckney has described a single camera method for using and detecting 4 reflective targets to determine position and orientation of an object in 6 degrees of freedom. A paper by Dr. H. F. L. Pinckney entitled Theory and Development of an on line 30 Hz video photogrammetry system for real-time 3 dimensional control presented at the Symposium of Commission V Photogrammetry for Industry, Stockholm, August 1978, together with many of the references referred to therein gives many of the underlying equations of solution of photogrammetry particularly with a single camera. Another reference relating to use of two or more cameras, is Development of Stereo Vision for Industrial Inspection, Dr. S. F. El-Hakim, Proceedings of the Instrument Society of America (ISA) Symposium, Calgary Alta, Apr. 3-5, 1989. This paper too has several useful references to the photogrammetry art.
Generally speaking, while several prior art references have provided pieces of the puzzle, none has disclosed a workable system capable of widespread use, the variety and scope of embodiments herein, nor the breadth and novelty of applications made possible with electro-optical determination of object position and/or orientation.
In this invention, many embodiments may operate with natural features, colored targets, self-illuminated targets such as LEDs, or with retroreflective targets. Generally the latter two give the best results from the point of view of speed and reliability of detection, which is of major importance to widespread dissemination of the technology.
However, of these two, only the retroreflector is both low cost and totally unobtrusive to the user. Despite certain problems in using it, it is the preferred type of target for general use, at least for detection in more than 3 degrees of freedom. Even in only two degrees, where standard "blob" type image processing might reasonably be used to find one's finger, for example (cf. U.S. Pat. No. 5,168,531 by Sigel), use of simple glass bead based, or molded plastic corner cube based, retroreflectors allows much higher frequency response (e.g. 30 Hz, 60 Hz, or even higher detection rates) from the multiple incidence angles needed in normal environments, with lower cost computers and under a wider variety of conditions, and is more reliable as well (at least with today's PC processing power).
Numerous 3D input apparatus exist today. As direct computer input for screen manipulation, the most common is the "Mouse", which is manipulated in x and y and, through various artifices in the computer program driving the display, provides some control in the z-axis. In 3 dimensions (3D), however, this is indirect, time consuming, artificial, and requires considerable training to do well. Similar comments relate to joysticks, which in their original function were designed for input of two angles.
In the computer game world as well, the mouse, joystick and other 2D devices prevail today.
The disclosed invention is optically based, and generally uses unobtrusive specialized datums on, or incorporated within, an object whose 3D position and/or orientation is desired to be inputted to a computer. Typically such datums are viewed with a single TV camera, or two TV cameras forming a stereo pair. A preferred location for the camera(s) is proximate the computer display, looking outward therefrom, or to the top or side of the human work or play space.
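As an illustration of how a stereo pair can yield the 3D position of a datum, the following is a minimal sketch only. It assumes an idealized, rectified pair of identical pinhole cameras with parallel optical axes, which the text above does not specify. The function name triangulate_rectified and the parameters focal_px, baseline, cx and cy are hypothetical, introduced for this example.

```python
from typing import Tuple


def triangulate_rectified(
    x_left: float,
    x_right: float,
    y: float,
    focal_px: float,
    baseline: float,
    cx: float = 0.0,
    cy: float = 0.0,
) -> Tuple[float, float, float]:
    """Estimate (X, Y, Z) of a datum in the left camera's coordinate frame.

    Assumptions (not from the source text): both cameras are rectified,
    share the focal length focal_px (in pixels) and principal point
    (cx, cy), and are separated horizontally by baseline. The result is
    in the same length unit as baseline.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    z = focal_px * baseline / disparity      # depth from similar triangles
    x = (x_left - cx) * z / focal_px         # back-project the image column
    y_out = (y - cy) * z / focal_px          # back-project the image row
    return (x, y_out, z)


# Hypothetical numbers: 800 px focal length, 0.25 m baseline.
print(triangulate_rectified(340.0, 300.0, 260.0, 800.0, 0.25, cx=320.0, cy=240.0))
```

A practical system would also need camera calibration and lens distortion correction, which this sketch omits.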
While many aspects of the invention can be used without specialized datums (e.g. a retroreflective tape on one's finger, versus use of the natural finger image itself), these specialized datums have been found to work more reliably, and at lowest cost, using technology which can be capable of wide dissemination in the next few years. This is very important commercially. Even where only two-dimensional position is desired, such as the x, y location of a finger tip, this is still the case.
For degrees of freedom beyond 3, we feel such specialized datum based technology is the only practical method today. Retroreflective glass bead tape, or beading, such as composed of Scotchlite 7615 by 3M Co., provides a point, line, or other desirably shaped datum which can be easily attached to any object desired, and which has high brightness and contrast to surroundings such as parts of a human, clothes, a room etc., when illuminated with incident light along the optical axis of the viewing optics such as that of a TV camera. This in turn allows cameras to be used in normal environments, with integration times fast enough to capture the common motions desired, and allows datums to be distinguished easily, which greatly reduces computer processing time and cost.
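The processing advantage of a high-contrast datum can be illustrated with a minimal sketch, which is not the detection method of the invention. It assumes a grayscale frame held as a list of rows of 0 to 255 intensities and a single bright datum in view. The function name find_bright_datum and the threshold value of 200 are hypothetical.

```python
from typing import List, Optional, Tuple


def find_bright_datum(
    frame: List[List[int]], threshold: int = 200
) -> Optional[Tuple[float, float]]:
    """Return the intensity-weighted (x, y) centroid of all pixels at or
    above threshold, or None if no pixel qualifies.

    Assumption: only one datum is bright enough to pass the threshold,
    as might be the case for a retroreflector lit along the camera axis.
    """
    total = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            if value >= threshold:
                total += value
                sum_x += x * value
                sum_y += y * value
    if total == 0:
        return None
    return (sum_x / total, sum_y / total)


# A tiny synthetic frame: dim background with one 2x2 bright patch.
frame = [
    [10, 12, 11, 9, 10],
    [11, 250, 250, 10, 12],
    [10, 250, 250, 11, 10],
    [12, 10, 11, 10, 9],
]
print(find_bright_datum(frame))  # (1.5, 1.5)
```

Because the datum stands far above the background, a single pass with a fixed threshold suffices here. Several datums in view would call for an added labeling step to separate them.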
Retroreflective or other datums are often distinguished by color or shape as well as brightness. Other suitable target datums can be distinguished just on color, shape or pattern, but do not have the brightness advantage offered by the retroreflector. Suitable retroreflectors can alternatively be glass, plastic or retroreflective glass bead paints, and can be other forms of retroreflectors than beads, such as corner cubes, but the beaded type is most useful. Shapes of datums found to be useful have been, for example, dots, rings, lines, edge outlines, triangles, and combinations of the foregoing.
It is a goal of this invention to provide a means for data entry that has the following key attributes among others:
Full 3D (up to 6 degrees of freedom, e.g. x, y, z, roll, pitch, yaw) real time dynamic input using artifacts, aliases, portions of the human body, or combinations thereof
Very low cost, due also to ability to share cost with other computer input functions such as document reading, picture telephony, etc.
Generic versatility: can be used for many purposes, and saves as well on learning new and different systems for those purposes.
Unobtrusive to the user
Fast response, suitable for high speed gaming as well as desk use.
Compatible as input to large screen displays, including wall projections
Unique ability to create physically real "Alias" or "surrogate" objects
Unique ability to provide realistic tactile feel of objects in hand or against other objects, without adding cost
A unique ability to enable "Physical" and "Natural" experience. It makes using computers fun, and allows the very young to participate. And it radically improves the ability to use 3D graphics and CAD systems with little or no training.
An ability to aid the old and handicapped in new and useful ways.
An ability to provide meaningful teaching and other experiences capable of reaching wide audiences at low cost
An ability to give life to a child's imagination through the medium of known objects and software, without requiring high cost toys, and providing unique learning experiences.
What is also unique about the invention here disclosed is that it unites all of the worlds above, and more besides, providing the ability to have a common system that serves all purposes well, at lowest possible cost and complexity.
The invention has a unique ability to combine what amounts to 3D icons (physical artifacts) with static or dynamic gestures or movement sequences. This opens up, among other things, a whole new way for people, particularly children, beginners and those with poor motor or other skills to interact with the computer. By manipulating a set of simple tools and objects that have targets appropriately attached, a novice computer user can control complex 2D and 3D computer programs with the expertise of a child playing with toys!
The invention also acts as an important teaching aid, especially for small children and the disabled, who have undeveloped motor skills. Such persons can, with the invention, become computer literate far faster than those using conventional input devices such as a mouse. The ability of the invention to use any desired portion of a human body, or an object under the user's command, provides a massive capability for control, which can be changed at will. In addition, the invention allows one to avoid carpal tunnel syndrome and other effects of using keyboards and mice. One need only move through the air, so to speak, or work with ergonomically advantageous artifacts.
The system can be calibrated for each individual to magnify even the smallest motion, to compensate for handicaps or to enhance user comfort or other benefits (e.g. when trying to work in a cramped space on an airplane). If desired, unwanted motions can be filtered or removed using the invention. In this case a higher number of camera images than would normally be necessary is typically taken, and effects in some frames are averaged, filtered or removed altogether.
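The magnification and averaging described above can be sketched as follows. This is one possible illustration only: it uses a simple moving average over recent frames, whereas the text leaves the choice of filter open. The class name MotionConditioner and its gain and window parameters are hypothetical, and positions are assumed to be measured relative to a reference origin chosen at calibration.

```python
from collections import deque
from typing import Deque, Tuple


class MotionConditioner:
    """Average the last few measured positions, then scale the result.

    gain   : per-user magnification of motion (assumed set at calibration).
    window : number of recent frames averaged to suppress unwanted motion.
    """

    def __init__(self, gain: float = 1.0, window: int = 5) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        self.gain = gain
        self.samples: Deque[Tuple[float, float]] = deque(maxlen=window)

    def update(self, x: float, y: float) -> Tuple[float, float]:
        """Add one frame's measured position and return the conditioned output."""
        self.samples.append((x, y))
        n = len(self.samples)
        mean_x = sum(s[0] for s in self.samples) / n
        mean_y = sum(s[1] for s in self.samples) / n
        return (self.gain * mean_x, self.gain * mean_y)


# Small hand motions magnified four times, averaged over three frames.
conditioner = MotionConditioner(gain=4.0, window=3)
for measured in [(1.0, 1.0), (2.0, 1.0), (3.0, 1.0), (4.0, 1.0)]:
    print(conditioner.update(*measured))
```

A longer window suppresses more tremor at the cost of added lag, which is one reason a higher frame rate than otherwise necessary is useful when filtering.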
The invention also provides for high resolution of object position and orientation at high speed and at very low or nearly insignificant cost. And it provides for smooth input functions without the jerkiness of mechanical devices such as a sticking mouse of the conventional variety.
In addition, the invention can be used to aid learning in very young children and infants by relating gestures of hands and other bodily portions or objects (such as rattles or toys held by the child), to music and/or visual experiences via computer generated graphics or real imagery called from a memory such as DVD disks or the like.
The invention is particularly valuable for expanding the value of life-size, near life-size, or at least large screen (e.g. greater than 42 inches diagonal) TV displays. Since the projection can now be of this size at affordable cost, the invention allows an also affordable means of relating in a lifelike way to the objects on the screen: to play with them, to modify them, and otherwise interrelate using one's natural actions and the naturally appearing screen size, which can also be in 3D using stereo display techniques of whatever type is desired.