There are various applications in which a machine is to be controlled by information derived from a visual scene. For example, in the field of robotics, it may be desirable for a robotic arm to be able to select an item from a bin. If the robot is provided with a visual capability (e.g. a camera system), the robot must be able to process the information derived from the camera in such a way that the arm can be moved in three dimensions to recognize and/or engage a selected item. There are many other applications where it is desired to obtain three-dimensional visual information in a form which can be used by a machine, such as a computer, for control, identification or detection purposes (among others). Non-robotic uses for such information include computer-assisted manufacturing and/or design systems, security systems, collision avoidance systems, guidance systems and range finders. There are innumerable other situations where reliable three-dimensional visual information can be used advantageously.
The prior art has encountered substantial problems in attempting to provide machine-usable information corresponding to a three-dimensional field. In practice, adjacent and/or interleaved information in the visual field may represent objects that are widely displaced in range, so that separation of information from different objects becomes quite difficult. If it were possible to determine the range of each object in the visual field, it would then be possible to greatly simplify the separation of objects. To date, the complexity of computations in existing systems has precluded real-time operation in all but very simple vision fields. Auto-focus systems for cameras are not applicable as a solution to this problem.
A number of different approaches for obtaining three-dimensional ranging data are being or have been investigated as generally noted below.
A binocular vision system employs two separate camera units having correlated outputs so that the relative displacement of image coincidence may be located or identified. An enormous amount of computation is required so that the system is quite slow and costly. Many other problems such as unreliable ranging of repetitive patterns also exist.
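The computational burden of binocular correlation can be illustrated with a minimal sketch. It is not taken from any particular prior art system. It assumes two rectified cameras whose image rows are aligned, a sum-of-absolute-differences match cost, and the standard pinhole triangulation relation (range = focal length x baseline / disparity). The function names, the sample rows, and the focal length and baseline values are all hypothetical.

```python
def find_disparity(left_row, right_row, x, half_window, max_disparity):
    """Search for the shift d (in pixels) at which a window centered at x in
    the left row best matches a window centered at x - d in the right row.
    The match cost is the sum of absolute differences (an assumed choice)."""
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        if x - d - half_window < 0:
            break  # the shifted window would fall off the start of the row
        cost = sum(
            abs(left_row[x + k] - right_row[x - d + k])
            for k in range(-half_window, half_window + 1)
        )
        if cost < best_cost:  # ties keep the smallest shift
            best_d, best_cost = d, cost
    return best_d


def range_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Pinhole triangulation: range = focal length * baseline / disparity."""
    if disparity_px <= 0:
        return float("inf")  # no measurable shift: treat as very distant
    return focal_length_px * baseline_m / disparity_px


# Hypothetical one-row example: the same intensity bump appears three
# pixels further left in the right camera's row.
left_row = [0, 0, 0, 0, 0, 0, 10, 50, 90, 50, 10, 0, 0, 0]
right_row = [0, 0, 0, 10, 50, 90, 50, 10, 0, 0, 0, 0, 0, 0]

disparity = find_disparity(left_row, right_row, x=8, half_window=2, max_disparity=5)
distance_m = range_from_disparity(focal_length_px=800, baseline_m=0.1,
                                  disparity_px=disparity)
```

Even in this reduced form, every pixel requires a window comparison at every candidate shift, which suggests why a full-frame search is slow. A periodic intensity pattern would also produce several shifts with identical cost, which is one way to see the repetitive-pattern problem noted above.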
Structured lighting may be employed to project grids or the like on a subject from different angles while one or more cameras record the images. In such a system, images are post-processed to obtain depth information and, while less analysis is required than in the above-noted binocular correlation, moving objects cannot be processed and the system is impractical for substantial vision field volumes or under normal illumination. Difficulty is also encountered in illuminating concave surfaces of objects in a vision field.
Attempts have also been made to scan a pencil beam of light, such as a laser beam, over a vision field. The difficulties of this system are substantially the same as those of the structured lighting system noted above, except that normal lighting may be employed; however, such a system is very slow.
The present invention provides three-dimensional (3D) visual recognition which is faster and less costly than that available with known prior art techniques.
In addition, as compared to the known prior art, the present invention has a better response to complex target images and moving targets, without critical mechanical alignment requirements.
An important feature of the invention resides in the capability of providing "adaptive" (programmable) control of various optical and electronic parameters to aid image processing. Some of these adaptively controlled functions may include focus, focal length, effective aperture (depth of field), electronic filtering, optical filtering, spatial-temporal light modulation, pattern recognition and memory plane loading. Data produced in accordance with the invention can be analyzed for characteristics such as shape, color, texture, plane orientation, geometric motion, and range, among others.