The world is full of data, objects and actions, called referents, that have, or can usefully be interpreted as having, relative spatial locations. Awareness of these relative locations is important to the correct operation of many systems.
A referent may be a sensor, such as an image sensor, audio sensor, pressure sensor, and generally any sensor which provides a signal. A referent may be a point in the world at which such a sensor is aimed. A referent may be any point that is related to an action. A referent may be a source of data.
The relative spatial location of a referent can be represented by a locator embedded in an n-dimensional spatial representation.
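As a minimal sketch only (the class and field names below are illustrative assumptions, not taken from the source), a locator can be thought of as a repositionable coordinate record, in contrast to a fixed array index:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Locator:
    """Illustrative locator: a referent's relative spatial location in an
    n-dimensional spatial representation. The coordinates are ordinary
    mutable state, so they can be repositioned as spatial information is
    inferred, rather than being frozen a priori."""
    coords: List[float]   # one value per spatial dimension
    signal: float = 0.0   # latest value of the associated referent's signal

loc = Locator(coords=[0.5, -1.2])  # a locator in a 2-D representation
loc.coords = [0.6, -1.1]           # repositioned as the estimate improves
print(loc.coords)
```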
A referent may be a source or filter of information, called a signal, such that this signal may be used to effect positioning of the associated locator in a spatial representation.
For example, image processing and machine vision systems generally utilise an array of light-sensitive sensors (e.g. CCDs). In this case, the referents are sources in a perceived scene which activate individual light sensors. Locators are the pixels that attempt to represent the relative positions of these referents. The changing value of each pixel is the signal of its associated light sensor. In order to make intelligent interpretations of the signal from each individual light sensor, it is necessary for the system to be aware of the relative spatial positions of the light sensors. If the system is not aware of the relative spatial positions of the sensors, it may be unable to produce an organised image. Present systems are therefore given, as a priori information, all the relative spatial positions of the sensors, in notionally orthogonal matrices implicit in the standards established for handling photographic, video and other sense data. In the example of a CCD array, the de-multiplexed signals from the camera are arranged in a two-dimensional orthogonal matrix, each signal being assigned a unique position in a two-dimensional space. These matrices are often arrays or lists of individual pixels. In order to read the signal of a sensor, its position is given, in either coordinate or list-position form, as an argument to an image function, which then returns the value. The position is thus fixed and indexical; the sensor's signal is a secondary characteristic in the traditional formulation.
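The conventional arrangement described above can be sketched minimally as follows (the array contents and function name are illustrative, not from any particular standard): the matrix fixes every sensor's relative position a priori, and the signal is merely the value stored at that fixed position.

```python
# Illustrative two-dimensional orthogonal matrix of de-multiplexed
# sensor signals; each sensor's relative position is fixed a priori
# by its (row, column) index.
image = [
    [0, 10, 20],
    [30, 40, 50],
    [60, 70, 80],
]

def read_signal(row, col):
    """The position is the primary, indexical key; the sensor's signal
    is simply the value returned for that fixed position."""
    return image[row][col]

print(read_signal(1, 2))  # 50
```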
The provision of a priori spatial information in image processing and other sensor systems is such an inherent requirement that it is almost taken for granted by most systems. Little thought has been given to whether it would be advantageous or possible to provide the spatial information in any other way apart from pre-programming the system.
There are, however, as the present applicants have realised, significant problems involved with the requirement to provide the system with a priori information on the spatial positions of the sensors. With systems, such as robots, which incorporate sensor sub-systems such as machine vision sub-systems, it is desirable that the system be able to operate as independently as possible. The problem is that the accuracy and consistency of a priori spatial information cannot be relied upon. For example, the relative spatial positions of the sensors may change so that the system becomes de-calibrated; spatial information about the sensor array may be only an approximation of the spatial arrangement of the referents and, owing for example to lens distortions, inaccurate in parts of the visual field; visual apparatus parts may be replaced with new parts of slightly different specifications; designs may be altered. Consequently, regular servicing and calibration is required to maintain reliability, but this is expensive and often difficult because the system may be remote, hard to access, or extremely small and delicate.
Even where a system is easy to access it is not desirable that regular servicing be required. The more independent the system, generally the less costly it is to operate.
Further, there are also situations where the a priori spatial information may not be available. Consider a large number of sensors randomly distributed across a space which is required to be imaged by the sensors: a number of sensors dropped onto the sea floor or another planet's surface, for example. Because the sensors are randomly positioned, a priori relative spatial information is not available.
These problems apply not just to vision sensors in image processing and machine vision systems, but to any referents where knowledge of spatial position is necessary for operation of a system.
The requirement of man-made sensor systems for such a priori information can be contrasted with biological systems which deal very well with physical space without the requirement for a priori information. Biological vision systems, for example, are able to build a picture of the physical space that they are viewing, without having to make any assumptions about the positional arrangement of the biological vision sensors (e.g. retinal cells).
Essentially, for independent operation of any system, it is preferable to avoid the dependence on a priori spatial information. Preferably, the system should be able to derive spatial information via its own sensor sub-system. There is therefore a need for an improved apparatus and method for estimating the relative spatial positions of referents.
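As an illustrative sketch only (the sensor layout, stimulus model and correlation measure below are all assumptions, not taken from the source), such inference is possible in principle because nearby sensors tend to produce similar signals: the relative spatial order of sensors can be recovered from the signals themselves, without any a priori position information.

```python
import math

# Assumed setup: five sensors at unknown 1-D positions sample a smooth
# stimulus that drifts across the field, so nearby sensors produce
# similar signal traces over time.
positions = [0.0, 1.0, 2.0, 3.0, 4.0]   # unknown to the system itself

def stimulus(x, t):
    return math.exp(-((x - t) ** 2))    # smooth travelling bump

T = 200
signals = [[stimulus(p, t * 0.05) for t in range(T)] for p in positions]

def corr(a, b):
    # Pearson correlation of two signal traces
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    da = math.sqrt(sum((x - ma) ** 2 for x in a))
    db = math.sqrt(sum((y - mb) ** 2 for y in b))
    return num / (da * db)

# Correlation falls off with true spatial separation, so ranking the
# sensors by correlation with sensor 0 recovers their spatial order.
order = sorted(range(len(positions)),
               key=lambda i: -corr(signals[0], signals[i]))
print(order)  # [0, 1, 2, 3, 4]
```

This recovers only relative order under the stated assumptions; it is a sketch of the kind of self-derived spatial inference meant, not the method of any particular system.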
Further, the present applicants have realised that it is not only sensor systems for which spatial information is useful. They have also appreciated that there are other referents which can be usefully represented by locators positioned in space. One significant example of referents which can be usefully represented in this way is motor commands for moving systems which incorporate spatial sensor sub-systems. The present applicants propose the use of spatial representations for motor command referents in earlier filed provisional application number PR6616, filed on 27 Jul. 2001, from which the present application claims priority.
Systems which include a motor sub-system are many and include robots used in industry (e.g. production line robots); robots used in remote areas (e.g. for exploration under sea or off-planet); vision systems used in commercial, research or military applications (e.g. to track moving objects, such as missiles, intruders, animal or human research subjects, contaminants in liquids, motor traffic); surveillance systems used for security purposes (e.g. unmanned cameras in pre-determined locations); low cost robots used for toys and other systems. Such systems usually have a motor sub-system which is arranged to cause motion of the sensor sub-system.
There are two absolute requirements for the satisfactory performance of a system incorporating a motor sub-system and sensor sub-system. The requirements are:
    1. The sensor sub-system must include a plurality of sensors. As discussed above, in order for the system to build up a representation of the environment being sensed by the sensor sub-system, it must be given or able to infer the spatial positions of the sensors relative to each other.
    2. The motor sub-system must include one or more motor commands, each of which will effect motion of the sensor sub-system. In order for the system to properly control motion, it must be given or able to infer the effect of motor commands on the sensor sub-system. Without this, it cannot anticipate what any particular motor command will do, and therefore cannot select intelligently from the plurality of available motor commands.
The first requirement for sensor sub-systems has already been discussed above.
With regard to the second requirement (effect of motor commands on the sensor sub-system), in conventional systems, a priori knowledge of the effect of motor commands on the sensor sub-system or motion of system-controlled components is incorporated into the system. The system is therefore pre-programmed with knowledge of what a particular motor command will do.
For example, some machine vision systems include an array of sensors such that a visual field is divided into a portion of relatively low resolution at the periphery, and a portion of relatively high resolution at a central area known as the “fovea”. The entire visual field can be used to rapidly detect visually interesting points that the system can then turn to look at for subsequent high resolution processing by using motor commands to place any point of interest at the fovea. Such machine vision systems are very useful, for example, for systems which require the capability to track moving objects such as missiles, intruders, animal or human research subjects, contaminants in liquids, and motor traffic.
In order to ensure that a machine vision system operates correctly, the motor sub-system (utilised to move the sensors so that an object of interest can be resolved in the high-resolution area) must be correctly calibrated to the sensor sub-system. If the goal of the vision system is to centre its fovea on an object of interest, then, given the location of the object relative to the fovea, the system must select the particular motor command such that, after that motor command is performed, the object of interest is resolved in the fovea of the vision sub-system.
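A minimal sketch of such a priori calibration follows, assuming a hypothetical table of pre-programmed command displacements in pixels (the command names and values are illustrative, not from the source): the system picks the command whose pre-programmed effect best cancels the object's offset from the fovea.

```python
# Hypothetical a priori calibration table: each motor command is
# pre-programmed with the sensor-field displacement it produces (pixels).
motor_commands = {
    "pan_left_small":  (-10, 0),
    "pan_right_small": (10, 0),
    "tilt_up_small":   (0, -10),
    "tilt_down_small": (0, 10),
    "pan_right_large": (40, 0),
}

def select_command(offset):
    """Pick the command whose pre-programmed displacement best matches the
    object's offset from the fovea. Selection relies entirely on the a
    priori table above: if the table decalibrates, selection degrades."""
    ox, oy = offset
    return min(motor_commands,
               key=lambda c: (motor_commands[c][0] - ox) ** 2 +
                             (motor_commands[c][1] - oy) ** 2)

print(select_command((12, 1)))  # pan_right_small
```

The fragility the passage describes is visible in the sketch: if wear and tear changes the displacement a command actually produces, the table, and hence every selection, is silently wrong.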
Presently, calibration is carried out with a priori knowledge of the effect of the motor commands on the motor sub-system and sensor sub-system. A problem arises, however, in that the calibration may be incorrect or, more importantly, that over time (considering wear and tear of mechanical parts) the system may decalibrate. Further, prior calibration in an attempt to provide a reliable system may be very expensive (e.g. in providing the best mechanical parts, fine-tolerance parts, etc.).
For the same reasons as were given in the case of sensor systems, regular maintenance, servicing, and recalibration of motor systems is undesirable.
There is a need for a system which includes a motor sub-system and a sensor sub-system, which enables automatic continuous calibration of the motor sub-system to the sensor sub-system.