(1) Technical Field
The present invention relates to techniques for determining and assessing the quality of missing information for decision-making. More specifically, the present invention relates to techniques for feature discovery and information source discrimination to assist in efficiently and cost-effectively gathering information for decision-making processes (e.g., classification).
(2) Discussion
Typically, classifiers are queried with a complete input description and respond by predicting a class membership (e.g., query: “furry”, “alive”, “has a heart”; response: “mammal”). This framework is passive in nature. That is, the classifier behaves as if it has no control over what information it receives.
In contrast, the majority of real-world classification operations involve extensive decision-making and active information gathering. For example, a doctor trying to diagnose a patient must decide which tests to perform based on the expected costs and benefits of the tests. The doctor is not given a static and complete featural description of a patient's state. Instead, the doctor must actively gather information. Furthermore, the doctor cannot gather every possible piece of information about the patient. Cost issues rule out this possibility.
The same general principle applies to any situation where a partial information set exists and the gathering of further information has the potential to become cost-prohibitive. Examples of such systems and their underlying cost-basis include radar systems for classifying objects, where energy expended, radar antenna allocation, risk of source detection, and time are example cost factors; medical diagnosis, as previously mentioned, where financial cost, risk to patient well-being, and time are example cost factors; and investment/economic recommendation systems, where financial cost and time are example cost factors.
In addition to the above situations, scenarios involving spatially distributed networks of inexpensive, small, and smart nodes with multiple onboard sensors are an important class of emerging networked systems for a variety of defense and commercial applications. Since a network of sensors often has to operate efficiently in adverse environments using limited battery power and resources, it is important that these sensors process information hierarchically and share information such that a decision is made progressively. It would be desirable to address this problem by activating only those nodes that can provide relevant information to aid in progressive decisions. However, techniques developed to date for feature selection are generally static in nature in that they select a subset of features from a larger set and perform classification operations thereon without being able to assess and verify the cost/benefit of the information provided.
Thus, a need exists for a system that aids in classification tasks in which the available information is incomplete and where it is desirable that the system gather further information efficiently and in a cost-beneficial way to aid in optimum classification/decision-making. It would be desirable that such a system perform an accurate cost/benefit analysis of possible information sources in order to determine the next information to gather, so as to augment a set of partial information and achieve a desired classification accuracy level.
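By way of illustration only, the kind of cost/benefit analysis described above may be sketched as follows. The sketch is not taken from the present disclosure; it assumes, hypothetically, that benefit is measured as the expected reduction in class entropy (the mutual information between class and feature) and that each information source carries a single scalar cost. All function names, source names, probabilities, and costs are invented for the example.

```python
import math


def entropy(probs):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_information_gain(prior, likelihood):
    """Expected reduction in class entropy from observing one feature.

    prior[c]         : P(class = c)
    likelihood[c][v] : P(feature = v | class = c)
    """
    n_classes = len(prior)
    n_values = len(likelihood[0])
    gain = entropy(prior)
    for v in range(n_values):
        # Probability of observing value v, marginalised over the classes.
        p_v = sum(prior[c] * likelihood[c][v] for c in range(n_classes))
        if p_v == 0:
            continue
        # Class posterior after observing value v (Bayes' rule).
        posterior = [prior[c] * likelihood[c][v] / p_v for c in range(n_classes)]
        gain -= p_v * entropy(posterior)
    return gain


def next_source(prior, sources):
    """Return the name of the source with the highest expected gain per unit cost."""
    return max(
        sources,
        key=lambda name: expected_information_gain(prior, sources[name]["likelihood"])
        / sources[name]["cost"],
    )


# Hypothetical two-class problem with two candidate information sources.
prior = [0.5, 0.5]
sources = {
    # Noisy but inexpensive test.
    "cheap_test": {"likelihood": [[0.8, 0.2], [0.2, 0.8]], "cost": 1.0},
    # Perfectly discriminating but costly test.
    "expensive_test": {"likelihood": [[1.0, 0.0], [0.0, 1.0]], "cost": 10.0},
}
best = next_source(prior, sources)
```

In this hypothetical case the noisy test yields less information in absolute terms than the perfect test, yet is selected because its information per unit cost is higher. A gain-per-cost ratio is only one possible criterion; others, such as gain minus weighted cost, could equally be assumed.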
The following references are provided as additional general information regarding the field of the invention.
1. R. Battiti, “Using mutual information for selecting features in supervised neural net learning,” IEEE Trans. on Neural Networks, vol. 5, no. 4, July 1994, pp. 537-550.
2. S. C. A. Thomopoulos, “Sensor selectivity and intelligent data fusion,” Proc. of the IEEE MIT'94, Oct. 2-5, 1994, Las Vegas, Nev., pp. 529-537.
3. J. Manyika and H. Durrant-Whyte, “Data fusion and sensor management: An information theoretic approach,” Prentice Hall, 1994.
4. J. N. Kapur, “Measures of information and their applications,” John Wiley, Eastern Limited, 1994.
5. T. Pan, “Entropic thresholding: A new approach,” Signal processing, Vol. 2, 1981, pp. 210-239.
6. A. Papoulis, “Probability, Random variables and Stochastic Processes,” Second edition, McGraw Hill 1984, pp. 500-567.
7. G. A. Darbellay and I. Vajda, “Estimation of the information by an adaptive partitioning of the observation space,” IEEE Transactions on Information Theory, vol. 45, no. 4, May 1999, pp. 1315-1321.
8. L. R. Rabiner and B.-H. Juang, “Fundamentals of Speech Recognition,” Prentice Hall, 1993, Chapter 6.
9. H.-P. Bernhard and G. A. Darbellay, “Performance analysis of the mutual information function for nonlinear and linear signal processing,” Proc. of ICASSP '99, vol. 3, 1999, pp. 1297-1300.
10. W. H. Press, S. A. Teukolsky, W. T. Vetterling and B. P. Flannery, “Numerical Recipes in C,” Cambridge University Press, 1992, pp. 632-635.