In an increasing number of applications, there is a need to entrust computers with the task of making decisions automatically, in particular with the task of recognizing data events extracted from sensor data. Examples of such sensor data are 1-D signals such as speech, seismic, sonar, and electrocardiographic waveforms, or 2-D signals such as text and images provided by imaging sensors (e.g. TV, laser radar, X-ray or NMR scanners). Examples of extracted events are words (from a speech waveform), characters (from text), object silhouettes (from images), or any other entity that is of potential interest. Note that herein a distinction is made between the act of extracting an event from data and that of recognizing the event, or more exactly the entity from which the event is produced. Of particular concern is the problem of recognizing an extracted and decomposed event by matching the event and its parts to reference models.
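The idea of recognizing an extracted event by matching it to reference models can be sketched as follows. This is a minimal illustration only: the function names (`distance`, `match_event`), the use of plain feature vectors, and the Euclidean distance as the similarity measure are all assumptions for the sake of example, not part of any particular system described herein.

```python
# Minimal sketch: recognize an extracted event by matching its feature
# vector against stored reference models. All names and the choice of
# Euclidean distance are illustrative assumptions.
import math

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_event(event_features, reference_models):
    """Return the label of the reference model closest to the event.

    event_features   -- feature vector extracted from the sensor data
    reference_models -- mapping of label -> stored feature vector
    """
    return min(reference_models,
               key=lambda label: distance(event_features, reference_models[label]))

# Hypothetical two-dimensional feature vectors for two target classes.
models = {"tank": [1.0, 0.0], "apc": [0.0, 1.0]}
print(match_event([0.9, 0.2], models))  # nearest reference model: tank
```

In practice the matching would operate on the event and each of its decomposed parts, and the distance measure would be tailored to the sensor modality; the nearest-neighbor rule above merely illustrates the matching step.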
Following is additional background on event recognition by way of a description of one of its application areas, namely computer vision. In general, computer vision is the field of computer science and artificial intelligence concerned with the design of computer programs capable of automatically interpreting images acquired by imaging sensors, which can be highly specialized. Computer vision is a well-established and rapidly growing discipline. Practical applications of computer vision include industrial inspection, autonomous navigation, and automatic target recognition (ATR).
In what follows, the ATR application is further discussed for purposes of illustration. Generally, the role of an ATR system is to detect and recognize military targets in imagery provided by one or more sensors. Examples of such sensors are CO₂ or GaAs laser radars, real- or synthetic-aperture millimeter-wave radars, forward-looking infrared imagers in any one of a number of bands, and video cameras.
An ATR system may also have access to other information such as terrain maps, navigational data, suspected target locations and target types (based on the constraints of the terrain and the scenario of interest), and meteorological data. Ideally, the output of an ATR system should primarily consist of a prioritized list of detected and recognized targets, their locations, orientations, and other important attributes.
The Department of Defense is very interested in ATR systems for a number of applications. For example, in the tactical arena, there is interest in sensor-carrying platforms that would locate targets, such as tanks, howitzers (self-propelled guns) and armored personnel carriers (APCs), in a limited geographical area where these objects may appear in large numbers. These platforms can be either manned or unmanned, and ground-based or airborne. ATR systems can function either entirely automatically, from raw data to reports or commands to a weapons system, or as data screeners that considerably reduce the flow of data coming from the sensors. In the strategic case, airborne platforms would look for mobile missile launchers in a large geographical area, a problem akin to looking for a needle in a haystack. Here also, the ATR systems could be used as simple data screeners or as elements of a fully automatic reconnaissance or weapon system. Obviously, fully automated ATR systems will be a crucial component of so-called "smart" weapons, i.e. autonomous ground or air vehicles that would be launched from stand-off positions to search for and engage specific targets.
Building a comprehensive ATR system is difficult for many reasons. First, a target's appearance in an image changes when the target's orientation with respect to the sensor's line of sight is modified. Second, a target's appearance is also affected by camouflage, obscuring objects, the time of day or night, and/or weather conditions. Third, a target's appearance differs from one sensor to another (e.g., from a laser radar to a forward-looking infrared imager) and across imaging modalities of the same sensor (e.g., range and intensity in the case of a laser radar). Finally, even for a given type of image, the target's appearance will change with the characteristics of the sensor, typically with the angular resolution and, in the case of ranging sensors, with the range precision.
The current tendency in ATR-sensor-system design is to mount multiple imaging sensors on the same platform, all aimed in the same direction, either boresighted or perfectly registered by sharing the same optics. Even for a given sensor, multiple imaging modalities are possible (e.g., intensity, range, and Doppler in the case of a laser radar). Therefore, both a single sensor and a complex sensor configuration will provide a large variety of images of whatever lies in the field of view of the sensors. Of course, each type of image calls for a specialized set of ATR algorithms to extract and process the information specific to that type. One of the challenges of the multi-sensor ATR problem is that of fusing the information extracted from each image type. This fusion can take place at several levels, ranging from the pixel level to the reasoning level. How to select the best information fusion strategy is an emerging area of research both in computer vision and in other fields.
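One of the fusion levels mentioned above, decision-level fusion, can be sketched as follows. The sensor names, the score values, and the choice of simple score averaging are hypothetical assumptions for illustration; real systems may instead fuse at the pixel, feature, or reasoning level, and with far more elaborate combination rules.

```python
# Illustrative sketch of decision-level fusion: each sensor's ATR
# algorithm independently produces class scores, and the scores are
# combined here by averaging. Sensor names and scores are hypothetical.
def fuse_decisions(per_sensor_scores):
    """Average class scores across sensors and return the winning class.

    per_sensor_scores -- list of dicts mapping class label -> score
    """
    totals = {}
    for scores in per_sensor_scores:
        for label, score in scores.items():
            totals[label] = totals.get(label, 0.0) + score
    n = len(per_sensor_scores)
    averaged = {label: total / n for label, total in totals.items()}
    return max(averaged, key=averaged.get)

ladar = {"tank": 0.6, "apc": 0.4}  # scores from laser-radar imagery
flir  = {"tank": 0.3, "apc": 0.7}  # scores from infrared imagery
video = {"tank": 0.8, "apc": 0.2}  # scores from a video camera
print(fuse_decisions([ladar, flir, video]))  # averaged winner: tank
```

Averaging independent per-sensor decisions is only one point in the design space; fusing earlier (at the pixel or feature level) can exploit correlations between modalities that are lost once each sensor has committed to a classification.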