Extracting information from data is a fundamental problem underlying many applications. In the health sciences, for example, one may wish to extract from censuses and surveys causal relations between habits and health. A medical doctor may need to diagnose the health of a patient based on clinical data, blood tests and, more recently, genetic information. Pharmaceutical companies investigate the effects on health of drugs in various dosages and combinations. Financial companies assess, based on the available data, the probability that many credit lines default within the same time window. Market analysts attempt to quantify the effect that advertising campaigns have on sales. Weather forecasters seek to extract from present and past observations the likely state of the weather in the near future. Climate scientists are pressed to estimate long-term trends from multi-year observations of quantities such as sea-surface temperature and the concentration of CO2 in the atmosphere. Clearly, the list of applications is long.
In many of these applications, the fundamental “data problem” can be posed as follows: a set of m joint observations of n variables is provided, and one seeks to estimate the probability that a function of these variables falls within a certain range in a new observation. Thus, the financial analyst dealing in credit derivatives seeks the probability of joint default; the medical doctor, the likelihood that some reported symptoms and measurements are associated with a certain disease; the weather forecaster, the likelihood that the pattern of today's measurements anticipates tomorrow's rain; and so on. There thus remains a need for estimating such likelihoods from the available data.
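To make the problem statement concrete, the following minimal sketch estimates such a probability by simple frequency counting: the fraction of the m observations for which the function of interest falls in the target range. This is only a naive baseline for illustration, not a method described in this text; the function name `empirical_probability`, the toy data, and the threshold 0.75 are all hypothetical.

```python
import math
from typing import Callable, Sequence


def empirical_probability(
    observations: Sequence[Sequence[float]],
    f: Callable[[Sequence[float]], float],
    low: float,
    high: float,
) -> float:
    """Return the fraction of observations x with low <= f(x) <= high.

    Each observation is one joint measurement of the n variables.
    This is a plain frequency estimate (an illustrative assumption),
    which can be unreliable when m is small or the event is rare.
    """
    if not observations:
        raise ValueError("need at least one observation")
    hits = sum(1 for x in observations if low <= f(x) <= high)
    return hits / len(observations)


# Hypothetical toy data: m = 5 joint observations of n = 2 variables,
# e.g. stress scores of two credit lines observed in the same time window.
data = [(0.2, 0.1), (0.9, 0.8), (0.4, 0.7), (0.95, 0.9), (0.1, 0.3)]

# Function of the variables: the smaller of the two scores. Both lines are
# treated as "in default" when this minimum exceeds the assumed threshold.
p_joint = empirical_probability(data, min, 0.75, math.inf)
print(p_joint)
```

With only m observations, a count of this kind cannot resolve probabilities much finer than 1/m, which is one reason more careful estimation is needed for rare joint events.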