1. Technical Field
The present invention relates to the fusion of information from multiple sources for use in decision making.
2. Description of the Related Art
Complex decision-making and prediction problems must deal with and analyze multiple sources of data. Such problems have become more prevalent due to increases in both the number and the types of sensors. For instance, in surveillance and threat scenarios, the environment may be monitored by multiple cameras covering different views of the same environment, sound sensors, motion sensors, and possibly other sensors (such as RFID). One expects to improve the performance of decision-making, detection, and prediction tasks by using observations from these multiple sources of information (about the same system), although this is technically difficult. Such problems occur in many different fields. In vegetation monitoring, for example, diverse sources of information include remotely sensed satellite images from different sensors and at different resolutions, as well as ground images used for verification; in bioinformatics, detection of a disease can be based on information from protein sequences, DNA sequences, results of biomedical tests, symptoms described by the patient, etc. This invention relates to methods for robust decision making from multiple sources of data when data from one or more sources is missing. The proposed methods exploit the correlations between data sources that were observable when the information was available, that is, when the decision-making models were constructed.
In principle, an analysis based on a multiplicity of data sources is appealing. Combining data from multiple sources can lead to better decision making by amplifying the database and building support for true indicators that may be weak or not evident in parts of the data. Data combinations can, for example, suppress noise in data and thereby reduce false alarms. However, real cases are rarely so simple. In reality, ambiguity often dominates corroboration, multisource data is incomparable, and conflicting indicators observed in multiple sources make it hard to arrive at a consensus decision. Without the aid of reliable methods that can deal with complex heterogeneous data, the advantages of multisource data do not translate to better decisions. So, in order to leverage multisource data, advanced decision-making algorithms are needed.
Most current methods using multisource data focus on integrating data from multiple platforms, each representing individual sources. This is a definite and important first step in making sense of diverse data. Its scope appears to be limited to rendering data to a unified structure and format for enabling analysts to access the data and convert it to information for making decisions.
This can work well for simple cases where data from various sources is supplementary in nature and differs mostly in format or is not linked. It does not address most real cases where data from multiple sources is complementary in nature and differs in both type and format, such as data collected from video, image, speech, satellite surveillance, and other sensors. However, the end goal of multisource data analysis is to convert data to information in a form that facilitates decision making. Furthermore, it is often difficult for experts to sort through and analyze such a diversity and volume of multisource data from multiple sensors. Thus, making use of such data requires advanced decision-making methods that can deal with the various dimensions of complexity involved with multisource data.
A comprehensive understanding and inference about complex systems, whether natural or artificial, can, however, be made through observations by multiple types of sensors. Each such sensor brings out a different perspective of the underlying complex system. Due to the diverse nature of observations (such as monitoring a system by images in multiple spectra, parameter values, operational conditions, etc.), data from such sensors is often diverse and incomparable. As a consequence, data from each sensor needs to be analyzed in its own right.
On the other hand, since the overall data relates to a single system, a sensitive, accurate, and robust analysis must consider each data source (sensor) in the context of data generated by other sources. To illustrate the point, again consider the example of recognizing a terrorist threat. Such an endeavor requires an analysis of communication data (telephone records, email exchanges, etc.), financial transactions, travel information, and social and background information about involved individuals and groups; these data sources are very diverse and incomparable. As another example, landcover mapping might involve analyzing data obtained from radar, known relief features, satellite imagery in various spectra, etc. Such examples arise in almost all fields and share a common feature: although the signal in any single source of data may be weak and noisy, one can build more reliable decisions by simultaneously considering data from multiple sources, so that data from one source is analyzed in the context of data from other sources.
All currently known multisource information fusion methods rely on techniques developed for single-source data and can be divided into two types based on the stage at which they combine information from different data sources.
The first category, information fusion (shown schematically in FIG. 1), is actually just data aggregation/concatenation, and it involves applying single-source data analysis methods to the combined/stacked data from multiple sources. In doing so, however, one combines incomparable data and loses the semantics of individual data sources. A more serious disadvantage is the inability to learn accurate models due to the curse of dimensionality, which is magnified as a result of concatenating data from multiple sources. This limits the ability to create decision models that will generalize well to unseen data encountered when the system is deployed.
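By way of illustration only, the concatenation approach of this first category can be sketched as follows. The source names, feature values, and the helper `concatenate_sources` are hypothetical and are not taken from any particular prior-art system; the sketch merely shows that stacking yields a single vector whose length is the sum of the per-source dimensions and in which the identity of each source is no longer represented.

```python
from typing import Dict, List


def concatenate_sources(sample: Dict[str, List[float]], order: List[str]) -> List[float]:
    """Stack per-source feature vectors into one long vector.

    Hypothetical helper for illustration; a single-source analysis
    method would then be applied to the returned vector.
    """
    fused: List[float] = []
    for name in order:
        # Features are appended in a fixed order; after this step the
        # vector no longer records which source each value came from.
        fused.extend(sample[name])
    return fused


# Hypothetical observation from three sources of different dimensionality.
sample = {
    "camera": [0.2, 0.7, 0.1, 0.9],
    "sound": [0.05, 0.3],
    "motion": [1.0],
}
order = ["camera", "sound", "motion"]

fused = concatenate_sources(sample, order)
# The fused dimensionality is the sum of the per-source dimensionalities.
fused_dim = len(fused)
```

In this sketch, a sample lacking any one of the listed sources cannot be stacked at all (the dictionary lookup fails), which is one simple way to see why concatenation offers no provision for partially available data.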
The second category of methods, shown schematically in FIG. 2, can be called decision fusion methods, which involve applying single-source data analysis methods individually to each of the multiple data sources followed by a fusion of results from each of the sources. This second category of methods completely ignores information from other data sources while analyzing data from any one source and, thus, it fails to take advantage of multiple data sources.
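By way of illustration only, a decision fusion scheme of this second category can be sketched as follows. The per-source scorer `mean_score`, the threshold of 0.5, and the majority-vote rule are assumptions chosen for simplicity; actual systems may use any single-source model and any combination rule. The point of the sketch is that each source is scored without reference to the others, and only the resulting decisions are combined.

```python
from typing import Callable, Dict, List

Scorer = Callable[[List[float]], float]


def mean_score(features: List[float]) -> float:
    """Hypothetical single-source model: the mean feature value as a score."""
    return sum(features) / len(features)


def decision_fusion(
    sample: Dict[str, List[float]],
    scorers: Dict[str, Scorer],
    threshold: float = 0.5,
) -> bool:
    """Score each source independently, then fuse by strict majority vote."""
    votes = {
        # Each scorer sees only its own source's data.
        name: scorer(sample[name]) >= threshold
        for name, scorer in scorers.items()
    }
    positives = sum(votes.values())
    return positives * 2 > len(votes)


# Hypothetical observation from three sources.
sample = {
    "camera": [0.2, 0.7, 0.1, 0.9],
    "sound": [0.05, 0.3],
    "motion": [1.0],
}
scorers = {name: mean_score for name in sample}

alarm = decision_fusion(sample, scorers)
```

With these illustrative values, only the motion source votes positive, so the majority vote is negative even though the motion reading is strong; no per-source scorer can use the other sources to reinterpret its own data.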
These two approaches to multisource information fusion are thus technically deficient, as they are unable to create accurate decision models. Moreover, none of these methods is capable of handling a situation in which only partial information is available, i.e., in which data from one or more of the sources of information (e.g., sensors) is not present when the decision is made.