Many real-world optimization problems have enormously large potential solution sets. Random searching or enumeration of the entire search space of such sets is not practical. As a result, efforts have been made to develop optimization methods for solving the problems efficiently. To date, however, known optimization methods have substantial limitations.
One class of optimization methods that has shown some promise is the so-called genetic optimization method or algorithm. This method evolves a population of potential solutions to a given problem. Genetic optimization methods are described in detail in “Adaptation in Natural and Artificial Systems,” J. Holland, University of Michigan Press, Ann Arbor, Mich. (1975), and “Genetic Algorithms in Search, Optimization, and Machine Learning,” D. Goldberg, Addison-Wesley Publishing, Reading, Mass. (1989), both of which are incorporated herein by reference. Genetic optimization methods are generally useful for manipulating a large number of promising partial solutions. The first population of solutions may be generated at random. Using a measure of solution quality, usually expressed in the form of one or more functions, better solutions are selected from the first population. The selected solutions are then subjected to the mutation and crossover operators in order to create a second population of new solutions (the offspring population) that fully or partially replaces the original (parent) population. The process repeats until the termination criteria (e.g., convergence of the population to a single solution) are met.
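The generational loop described above can be sketched in a few lines of Python. This is a minimal illustration only, not the method of either cited reference: the bit-string encoding, the `one_max` quality measure, binary tournament selection, one-point crossover, full replacement of the parent population, and all parameter values are assumptions chosen for brevity.

```python
import random

def one_max(bits):
    """Assumed quality measure for illustration: the number of 1 bits."""
    return sum(bits)

def tournament(population, fitness, rng, size=2):
    """Select the best of `size` solutions drawn at random."""
    return max(rng.sample(population, size), key=fitness)

def crossover(a, b, rng):
    """One-point crossover: prefix of one parent, suffix of the other."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(bits, rng, rate):
    """Flip each bit independently with probability `rate`."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]

def genetic_search(n_bits=20, pop_size=40, generations=60, rate=0.01, seed=0):
    rng = random.Random(seed)
    # First population is generated at random.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Termination criterion: the population has converged to a singleton.
        if len({tuple(s) for s in population}) == 1:
            break
        offspring = []
        for _ in range(pop_size):
            p1 = tournament(population, one_max, rng)
            p2 = tournament(population, one_max, rng)
            offspring.append(mutate(crossover(p1, p2, rng), rng, rate))
        # Offspring fully replace the parent population in this sketch.
        population = offspring
    return max(population, key=one_max)
```

Calling `genetic_search()` returns the best bit string in the final population. Partial replacement, mentioned above as an alternative, would instead keep some parents alongside the offspring.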
While genetic optimization methods may be useful for application to some problems, they have proven less useful for others. Many real-world problems, for example, can be decomposed into sub-problems of less difficulty and solved quickly, accurately, and reliably, by propagating and combining partial solutions corresponding to the different sub-problems with operators of genetic optimization methods. The application of traditional genetic optimization methods to decomposable problems, however, has met with limited success.
Traditional genetic optimization methods have been impractical for use with decomposable problems, and particularly for complex decomposable problems, for a number of reasons. For example, conventional genetic optimization methods are not capable of “learning” how to properly combine sub-solutions via crossover, and their crossover operators are not expressive enough to apply to the decomposed problem. Decomposition is generally expressed on a single level only, with crossover operating only on very near neighbors, thereby limiting its usefulness.
As a result, the application of traditional optimization methods to decomposable problems has typically required accurate and detailed design of the problem decomposition before the method is applied. High levels of effort are therefore required for solution design, adding cost and time to the solution. Further, error rates remain high when sufficient information is not available to encode the problem decomposition. These disadvantages are particularly acute when addressing problems of appreciable difficulty and/or complexity, such as hierarchically decomposable problems where dependencies, independencies, and other relationships may exist across multiple levels. For more information regarding the class of problems categorized as hierarchical, reference is made to “The Sciences of the Artificial,” by Herbert Simon, The MIT Press, Cambridge, Mass. (1981), herein incorporated by reference.
As a result of these disadvantages, methods have been proposed to limit the need to precisely pre-code the problem decomposition. In particular, efforts have been made to develop genetic optimization methods that “learn” a problem as it is encountered through “linkage learning,” that is, the discovery of relationships between variables. A few classes of such methods have been proposed. One approach is based on introducing additional operators into the genetic optimization method to evolve the representation of the solutions in addition to the solutions themselves. This practice has met with limited success. Among other difficulties, it has been discovered that in such methods the influence driving the optimization toward a good representation is of much lower magnitude than the influence driving the optimization toward high-quality solutions. Consequently, premature convergence may occur before a proper representation of the global optimum is learned.
A second proposed approach is based on performing perturbations to a single position or multiple positions and recording the statistics of the resulting change in the quality of each solution. The gathered information is then analyzed to create groups of variables that seem to be correlated. Crossover is modified to agree with the discovered relationships. Among other problems, however, these methods tend to be inefficient due to the number of perturbations required. Cost and required run times are thereby increased.
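One simple way to picture such a perturbation scheme is a pairwise non-additivity check, sketched below. This is an assumed concrete form, not one specified in the text: the helper names, the `trap_pairs` test function, and the rule "two variables are linked if flipping both changes quality by something other than the sum of the two single-flip changes" are all hypothetical choices made for illustration.

```python
from itertools import combinations

def trap_pairs(bits):
    """Hypothetical test function: each adjacent pair (0,1), (2,3), ...
    scores 2 for 11, 1 for 00, and 0 otherwise, so paired bits interact."""
    score = 0
    for k in range(0, len(bits), 2):
        pair = (bits[k], bits[k + 1])
        score += {(1, 1): 2, (0, 0): 1}.get(pair, 0)
    return score

def flip(bits, *positions):
    """Return a copy of `bits` with the given positions inverted."""
    out = list(bits)
    for p in positions:
        out[p] ^= 1
    return out

def detect_linkage(fitness, solution, tol=1e-9):
    """Perturb one and two positions, record the change in quality, and
    report the pairs whose joint effect is not the sum of single effects."""
    n = len(solution)
    base = fitness(solution)
    single = [fitness(flip(solution, i)) - base for i in range(n)]
    evaluations = 1 + n
    linked = set()
    for i, j in combinations(range(n), 2):
        joint = fitness(flip(solution, i, j)) - base
        evaluations += 1
        if abs(joint - (single[i] + single[j])) > tol:
            linked.add((i, j))
    return linked, evaluations
```

Probing the all-zero solution of six variables with `detect_linkage(trap_pairs, [0] * 6)` recovers the three linked pairs at a cost of 22 quality evaluations. The pairwise loop grows quadratically with the number of variables, and a practical method would repeat it over many solutions, which illustrates the inefficiency noted above.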
A third approach is based on probabilistic model building during genetic optimization to learn the problem structure. An example of such a proposed method is the so-called Bayesian optimization method or algorithm. The Bayesian optimization method is described in detail in “Linkage problem, distribution estimation, and Bayesian networks,” by Pelikan, Goldberg, and Cantu-Paz, IlliGAL Report No. 98013, Urbana Ill., University of Illinois at Urbana-Champaign, Illinois Genetic Algorithms Laboratory (1998) (“the Pelikan reference”), incorporated herein by reference. The pseudo-code of the Bayesian optimization method is:
1) An initial solution set is generated at random.
2) A promising set of solutions is then selected from the initial solution set.
3) A Bayesian network is then constructed to model the promising solutions and subsequently guide the further search.
4) A metric that measures the quality of networks, together with a search algorithm, can be used to search over the networks in order to maximize or minimize the value of the metric.
5) New strings are generated according to the joint distribution encoded by the constructed network.
6) The new strings are added into the old population, replacing some of the old ones.
7) If completion criteria are not met, the process repeats itself using the partially replaced initial population.
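The seven steps above can be illustrated with a compact Python sketch. It is not the implementation of the Pelikan reference: the BIC-style scoring metric, the greedy edge-addition search, the `max_parents` limit, truncation selection of the better half, and every function name are assumptions made to keep the example short and runnable on binary strings.

```python
import math
import random
from collections import Counter

def local_score(data, child, parents):
    """Assumed metric: BIC-style score of one binary variable given parents."""
    joint = Counter((tuple(r[p] for p in parents), r[child]) for r in data)
    marg = Counter(tuple(r[p] for p in parents) for r in data)
    loglik = sum(c * math.log(c / marg[ctx]) for (ctx, _), c in joint.items())
    return loglik - 0.5 * math.log(len(data)) * (2 ** len(parents))

def reaches(parents, start, target):
    """True if `target` is `start` or one of its ancestors."""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False

def learn_network(data, max_parents=2):
    """Steps 3-4: greedily add the edge that most improves the metric."""
    n = len(data[0])
    parents = [[] for _ in range(n)]
    scores = [local_score(data, v, []) for v in range(n)]
    while True:
        best_gain, best_edge = 1e-9, None
        for c in range(n):
            if len(parents[c]) >= max_parents:
                continue
            for p in range(n):
                # Skip self-loops, duplicates, and edges that close a cycle.
                if p == c or p in parents[c] or reaches(parents, p, c):
                    continue
                gain = local_score(data, c, parents[c] + [p]) - scores[c]
                if gain > best_gain:
                    best_gain, best_edge = gain, (p, c)
        if best_edge is None:
            return parents
        p, c = best_edge
        parents[c].append(p)
        scores[c] += best_gain

def sample(data, parents, rng):
    """Step 5: draw one string, each variable conditioned on its parents."""
    values = [None] * len(parents)
    remaining = set(range(len(parents)))
    while remaining:
        for v in sorted(remaining):
            if all(values[p] is not None for p in parents[v]):
                ctx = tuple(values[p] for p in parents[v])
                seen = [r[v] for r in data
                        if tuple(r[p] for p in parents[v]) == ctx]
                # Fall back to a fair coin if the context never occurred.
                values[v] = rng.choice(seen) if seen else rng.randint(0, 1)
                remaining.discard(v)
                break
    return values

def boa_sketch(fitness, n_bits=12, pop_size=100, generations=15, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]                      # step 1
    for _ in range(generations):                                 # step 7
        population.sort(key=fitness, reverse=True)
        promising = population[:pop_size // 2]                   # step 2
        network = learn_network(promising)                       # steps 3-4
        new = [sample(promising, network, rng)
               for _ in range(pop_size - len(promising))]        # step 5
        population = promising + new                             # step 6
    return max(population, key=fitness)
```

A call such as `boa_sketch(sum)` runs the loop on a simple bit-counting quality measure. Because the network here is a single flat graph over the variables, the sketch also reflects the single-level character of the learned structure.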
While these proposed methods may offer some advantage over previous methods, many disadvantages with known methods remain. For example, known methods such as the Bayesian optimization method tend to be limited in their ability to learn the problem structure at hand. The learning of the problem, in fact, is often limited to learning relationships that exist only on a single level. Thus, while such methods may be useful for solving relatively simple problems that can be described by relations on a single level, they have proven much less practical for more complex problems, such as hierarchically decomposable functions of appreciable complexity. For such problems, known methods such as the Bayesian optimization method do not scale up well, may converge too early or too late, may converge at less than an optimal solution set, and/or may crash.
In addition, known methods such as the Bayesian optimization method are disadvantageous in their inability to determine multiple solutions to a problem, or to address problems that have symmetry in their solutions. Indeed, by their genetic and evolutionary nature, most known optimization methods tend to focus on one promising solution above all others and continue to evolve it. Such tendencies are disadvantageous when addressing problems having multiple solutions that are difficult to accurately differentiate using only a fitness function. Further, for complex problems that may be decomposed on multiple levels, it may not be possible to determine which of a variety of sub-problem solutions are preferable until a higher level solution is investigated. In such cases, most known optimization methods are inadequate. These shortcomings are particularly acute for problems that have symmetry or multiple optima, where known methods such as the Bayesian optimization method will tend to eliminate all but a single search area early in the iterative solution process.
Unresolved problems in the art therefore exist.