The present invention relates to the field of problem solving. Specifically, the present invention relates to computational systems and methods that actively seek multiple, distinct solutions to a given objective function.
A standard approach to problems in science and engineering is to pose the problem as a mathematical optimization problem, and to use an optimization method for finding the best solution. The usual method is to formulate an objective function that measures the quality of possible solutions. In some cases, the objective function may be solved analytically; in many other cases, a computational algorithm is required to find the best solution. Many practical problems are intractable in the sense that there is no feasible way to find the optimal solution. In these cases, one must settle for sub-optimal solutions. Many important problems share the additional characteristic that finding a single solution may not be sufficient.
An example of such a problem relates to protein folding. Recent developments have suggested that multiple alternative solutions to the protein folding problem may be of interest for some proteins. The proteins involved in protein conformational disorders, such as Alzheimer's disease, Huntington's disease, cystic fibrosis, and amyotrophic lateral sclerosis, share the ability to adopt multiple different stable conformations. What is needed is an algorithm that reports all likely protein conformations. Many other problems in computational biology have multiple solutions of interest. For example, in molecular docking problems, there may be several suitable docking sites for a given pair of molecules. Identifying a set of distinct binding sites would permit a more thorough analysis of possible in vivo effects of a given drug candidate, as well as the selection of candidates based on the total number of possible binding modalities. In phylogenetic analysis, it has been shown that many sequence data sets have multiple optimal maximum-likelihood trees. Such data sets support two or more significantly different phylogenies. Finding a set of distinct near-optimal trees would enable the biologist to integrate the analysis based on molecular data with other forms of information (e.g., morphology, habitat, etc.), providing a more comprehensive analysis of all the available data.
The need to identify multiple solutions extends beyond problems in computational biology. Examples could be drawn from nearly any area of science or engineering. For example, in design optimization, an engineer may be interested in finding all equally good designs. A knowledge analyst may be interested in finding all sufficiently “interesting” patterns in a large data set. If “solutions” to the objective function correspond to failure modes, then finding all solutions is often a critical requirement, as the next two examples illustrate.
A test engineer may need to identify all failure modes for a complex device such as an airplane or submarine. Given a suitably realistic simulation of the device and its environment, it is possible to design intelligent test systems that systematically explore the effects of environmental conditions and potential combinations of system faults. Casting this as an optimization problem requires an objective function that measures both the probability of the environmental conditions and the severity of the resulting system failure. In a previous study, an intelligent test method based on a genetic algorithm (GA) was developed to find potential failure modes for autonomous vehicles (see A. C. Schultz, J. J. Grefenstette, and K. A. De Jong, Test and evaluation by genetic algorithms, IEEE Expert, 8(5):9-14, 1993). A standard GA was able to identify unforeseen failure modes, but was not designed to completely explore the space of system failures. An approach that could identify all distinct failure modes would significantly improve the process of ensuring the reliability of the complex systems being tested.
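By way of illustration only, an objective function of the kind described above may be sketched as follows. This is a minimal sketch under stated assumptions, not the method of the cited study: the `Scenario` type, the normalization of probability and severity to the range 0 to 1, the choice of a simple product to combine them, and the example scenarios and their values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Hypothetical test scenario: one simulated run of the device."""
    name: str
    probability: float  # assumed likelihood of the environmental conditions, in [0, 1]
    severity: float     # assumed severity of the resulting failure, in [0, 1]


def failure_score(scenario: Scenario) -> float:
    """Assumed objective: the product of probability and severity.

    A scenario scores highly only if its conditions are reasonably likely
    AND the resulting failure is severe. Other combinations (for example,
    a weighted sum) would be equally valid choices.
    """
    return scenario.probability * scenario.severity


def rank_scenarios(scenarios):
    """Return the scenarios ordered from highest to lowest objective value."""
    return sorted(scenarios, key=failure_score, reverse=True)


# Illustrative values only; none of these figures come from a real test.
scenarios = [
    Scenario("sensor dropout in calm water", 0.30, 0.20),
    Scenario("actuator fault in strong current", 0.05, 0.90),
    Scenario("nominal operation", 0.60, 0.00),
]
ranked = rank_scenarios(scenarios)
```

Under these assumptions, a search procedure that reports only the top-ranked scenario would leave the second, qualitatively different failure mode unexamined, which is the limitation the surrounding text describes.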
A similar approach may be applied to problems in homeland defense. For example, suppose one is trying to find the best way to protect a given resource, say, a city's transportation infrastructure, from terrorist attack. One approach is to play “devil's advocate” and find the most effective way that a potential terrorist might disrupt the transportation network, and then to design appropriate protective measures. However, this approach might leave the “second-best” terrorist attack plan undefended. Of course, one could simply repeat the devil's advocate analysis, but a more systematic approach would try to identify all possible attack routes before starting to plan the best defense.
As these examples suggest, what is needed are broadly applicable computational systems and methods designed to actively seek multiple, distinct solutions to a given objective function.