Conventional robot control techniques process sensor data using classical mechanics, kinematics, and closed loop control. The result is then used to generate robot motor commands that position the robot or manipulator for further action. Robot control through conventional techniques requires frequent evaluation of trigonometric functions, which can be burdensome on the computers controlling the robot. In addition, closed loop control generally requires position sensors such as resolvers or encoders; these sensors usually require recalibration when the environment or the hardware changes. Finally, conventional robot control requires a target position to be mapped into a realizable set of servo commands.
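As an illustration of the trigonometric load mentioned above, the following is a minimal sketch of closed-form inverse kinematics for a planar two-link arm. The arm geometry, the link lengths `l1` and `l2`, and the function names are assumptions chosen for illustration; the source does not describe any particular manipulator.

```python
import math


def two_link_ik(x, y, l1, l2):
    """Return (shoulder, elbow) angles in radians that place the tip of a
    planar two-link arm at (x, y). Hypothetical helper for illustration."""
    d2 = x * x + y * y
    # Law of cosines gives the elbow angle.
    cos_elbow = (d2 - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if not -1.0 <= cos_elbow <= 1.0:
        raise ValueError("target is out of reach")
    elbow = math.acos(cos_elbow)
    # Shoulder angle: direction to the target minus the offset caused by the bent elbow.
    shoulder = math.atan2(y, x) - math.atan2(
        l2 * math.sin(elbow), l1 + l2 * math.cos(elbow)
    )
    return shoulder, elbow


def two_link_fk(shoulder, elbow, l1, l2):
    """Forward kinematics, used here only to check the inverse solution."""
    x = l1 * math.cos(shoulder) + l2 * math.cos(shoulder + elbow)
    y = l1 * math.sin(shoulder) + l2 * math.sin(shoulder + elbow)
    return x, y
```

Even in this small case, each target position costs one arccosine, two arctangents, a sine, and a cosine, and a real controller would repeat the computation every control cycle.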
An alternative robot control concept for processing sensor data and generating commands, based on research in neural nets, offers the advantage of faster processing owing to simpler data representations and simpler command mechanisms that do not need position sensors. These concepts result in human-like hierarchical spatial representations of sensory data in machine memory. The techniques attempt to mimic the operation of the human brain in robot control. The resulting techniques allow open loop control of sensors and actuators.
Previous work has taken two forms. The first consists of computational models of the human saccadic system and its associated spatial representations. The purpose of these models was to verify neuroscientific theories about brain function by reproducing experimental data. Many details of these models are not needed to build a robust system for robot control. Dropping the requirement to reproduce experimental data allows these models to be distilled down to their essentials, resulting in fast, simple, and robust spatial coordinate systems that can be used for invariant internal representations of external objects. Additionally, these representations can be used to drive eye and head movements for accurate foveation. The representation of a target is the set of commands necessary to center the target in a particular frame of reference.
The second form of previous work that relates to the present apparatus and method is robot control based on learning mappings of sensory data to motor/joint spaces. This is open loop control in that the generated motor commands do not depend on motor or joint position measurements. The advantage is that the robot processor does not have to calculate the commands to achieve pointing based on target position. The most successful work of this form uses methods for learning inverse kinematics for mapping pixel positions in binocular camera input to the changes in neck and eye joints necessary to foveate a target.
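The idea of learning a mapping from pixel positions to joint changes can be sketched as follows. This is a deliberately simplified illustration, not the method of the cited work: it assumes a single camera, a linear relationship between a target's pixel offset and the pan/tilt change that foveates it, and a batch least-squares fit in place of the online learning methods used in the literature. The names `fit_linear_map` and `predict` are hypothetical.

```python
def fit_linear_map(pixels, joint_deltas):
    """Least-squares fit of a 2x2 matrix M so that, for each sample,
    (d_pan, d_tilt) is approximately (px, py) times M.
    Solves the normal equations M = (X^T X)^-1 X^T Y in closed form."""
    sxx = sxy = syy = 0.0
    b = [[0.0, 0.0], [0.0, 0.0]]  # b = X^T Y
    for (px, py), (d_pan, d_tilt) in zip(pixels, joint_deltas):
        sxx += px * px
        sxy += px * py
        syy += py * py
        b[0][0] += px * d_pan
        b[0][1] += px * d_tilt
        b[1][0] += py * d_pan
        b[1][1] += py * d_tilt
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("samples do not span both pixel axes")
    # Inverse of the symmetric 2x2 matrix X^T X.
    inv = [[syy / det, -sxy / det], [-sxy / det, sxx / det]]
    return [
        [sum(inv[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def predict(m, pixel):
    """Open loop command: joint change computed from the pixel offset alone."""
    px, py = pixel
    return (px * m[0][0] + py * m[1][0], px * m[0][1] + py * m[1][1])
```

Once the map is fitted, `predict` needs only multiplications and additions and never reads a joint position sensor, which is the open loop property described above.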
One characteristic of the previous work is a disadvantage for robot control. The purpose of the previous work was to develop computational models that reliably recreated the behavior of real biological systems. The same characteristics that make a model successful can be a hindrance to a robotic active vision system. First, the models are based on neural networks and therefore assume efficient, massively distributed computational resources. Actual robots may be better controlled by a small number of centralized processors. Second, the models contain various auxiliary modules that correspond to specific brain regions. The modules contribute to the computations in the same way that the corresponding brain regions do in biology. While necessary for a realistic model, these modules add unnecessary complexity to a robotic control system. Finally, the methods used in these models for learning are based on adaptive neural learning. Much faster and more robust “non-bio-inspired” methods exist in the machine learning literature.
An issue that can complicate these models is the fact that eye muscles and neck muscles move at speeds that differ by an order of magnitude or more (Tweed 1997). For the present apparatus and method to control servo-based robotic systems, the existing techniques must be adjusted to accommodate the fact that a robot's servo “muscles” can all move at roughly the same speed.
Additional prior work is related to other “biologically-inspired” methods for robotic active vision control. These methods typically learn how to accurately foveate a visible target with inverse kinematics. They employ state-of-the-art online learning methods to learn a mapping of eye pixel coordinates to motor commands (Cameron 1996; Shibata et al. 2001; Vijayakumar et al. 2002). Although these methods are effective at learning to saccade quickly, they do not translate to an invariant target representation.
Other prior work has dealt with body-centered, movement-invariant representations for a robotic working memory (Peters et al. 2001; Edsinger 2007). The limitations of this work stem from its use of a “flat”, single point of view representation instead of a multi-leveled hierarchy: all target information is stored at the body-centered level. Because all targets are stored in a single coordinate representation, the system must perform many redundant translations in order to perform computations in other coordinate representations. This can slow down reaction time and introduce errors. Equally important, different control objectives may be achieved more easily in one control frame than in others. By limiting target representations to a single body-centered level, the system lacks the ability to easily perform computations or reasoning in the most advantageous coordinate representation.
The previous work in computational neuroscience has developed detailed computational models for the brain's spatial representation hierarchy (Greve et al. 1993; Grossberg et al. 1993; Guenther et al. 1994). These models imitate the way the brain combines stimulations on individual eyes into an “ego-centric” representation, which is then mapped into an eye-invariant, head-centered coordinate system (Greve et al. 1993; Grossberg et al. 1993). Likewise, further models describe how the brain maps head-centered representations into a head-invariant, body-centered coordinate system (Guenther et al. 1994).
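The hierarchy described above can be illustrated with a toy sketch. It assumes that eye and head orientations are pure pan/tilt angles about a common center, so that the levels compose by simple addition; the cited models and any real robot head are more involved. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Angles:
    """A pan/tilt pair in degrees."""
    pan: float
    tilt: float

    def __add__(self, other):
        return Angles(self.pan + other.pan, self.tilt + other.tilt)


def eye_to_head(retinal_offset, eye_position):
    """Head-centered coordinate: the eye position that would center the
    target. Unchanged when the eye moves, under the additive assumption."""
    return retinal_offset + eye_position


def head_to_body(head_centered, head_position):
    """Body-centered coordinate: unchanged when the head moves."""
    return head_centered + head_position
```

For example, a target seen 15 degrees to the right of the fovea with the eye panned 5 degrees has the same head-centered pan (20 degrees) as a target seen at the fovea with the eye panned 20 degrees, which is the eye-invariance property described above.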
The work of Grossberg et al. describes the spatial representations necessary, but not in a way that can be implemented efficiently on a real robotic system. Schaal's work on learning inverse kinematics can control reactive eye and head movements, but lacks the ability to preserve information about the target in an invariant representation. Finally, the work of Peters provides an invariant representation, but one that is not amenable to tasks, like eye movements, that must take place in different coordinate systems.
There is a need for target representations and a control methodology that allow simpler control methods, without position sensors or complicated closed loop control.