There has been extensive research in the field of computer programming to automatically design or evolve computer programs operable to solve problems posed as design specifications, or to solve the closely related problems of automatically designing or evolving complex systems, such as circuits, that satisfy posed requirements. Methods such as genetic programming, evolutionary programming, and other variants may be applied to such problems. In such methods, a user would supply a set of algorithm specifications that determine the form the algorithm is to take, such as an instruction set, initial programs written using the instruction set, an environment for the program, a fitness measure, major and minor parameters for the particular evolutionary algorithm chosen, decisions on whether to incorporate certain methods, and a termination condition.
Then an iterative process may be followed to search through the space of possible programs looking for a program that may satisfy the design constraints entered by a user. Typically, this occurs in a number of steps. First, if initial programs were not supplied, an initial set of programs is created by a random process. Then the programs in the initial set (or “population”) are typically executed in the environment; their fitness is assessed; new programs are produced out of the old ones (by methods such as crossover and mutation); these new programs are added to the population; and some existing less fit programs may be eliminated. Such a process is iterated until the termination condition is satisfied. It has been observed that such processes (the details of which depend on the algorithm specifications) tend to produce fitter programs over time, sometimes resulting in a program that satisfies the design constraints.
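The iterative process described above may be illustrated by the following minimal sketch. It is not drawn from any particular genetic programming system: the four-letter instruction set, the target string standing in for the design specification, the position-matching fitness measure, and the parameter values are all hypothetical choices made only to keep the example small.

```python
import random

# Hypothetical stand-ins for the user-supplied algorithm specifications.
INSTRUCTION_SET = "ABCD"        # assumed toy instruction set
TARGET = "ABCDABCDABCD"         # assumed toy design specification


def fitness(program):
    """Fitness measure: number of positions agreeing with the toy specification."""
    return sum(a == b for a, b in zip(program, TARGET))


def random_program(rng):
    """Create one initial program by a random process."""
    return "".join(rng.choice(INSTRUCTION_SET) for _ in TARGET)


def crossover(a, b, rng):
    """Single-point crossover: a prefix of one parent joined to a suffix of the other."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(program, rng, rate=0.1):
    """Replace each instruction with a random one with probability `rate`."""
    return "".join(
        rng.choice(INSTRUCTION_SET) if rng.random() < rate else c
        for c in program
    )


def evolve(pop_size=50, max_generations=200, seed=0):
    """Iterate assess -> eliminate -> reproduce until the termination condition holds."""
    rng = random.Random(seed)
    population = [random_program(rng) for _ in range(pop_size)]
    for generation in range(max_generations):
        # Assess fitness and rank the population, fittest first.
        population.sort(key=fitness, reverse=True)
        if fitness(population[0]) == len(TARGET):   # termination condition
            return population[0], generation
        # Eliminate the less fit half; the survivors are kept unchanged.
        survivors = population[: pop_size // 2]
        # Produce new programs out of the old ones and add them to the population.
        children = [
            mutate(crossover(rng.choice(survivors), rng.choice(survivors), rng), rng)
            for _ in range(pop_size - len(survivors))
        ]
        population = survivors + children
    population.sort(key=fitness, reverse=True)
    return population[0], max_generations


if __name__ == "__main__":
    best, generations = evolve()
    print(best, fitness(best), generations)
```

Because the surviving half of the population is carried forward unchanged, the best fitness in this sketch never decreases from one iteration to the next. Whether the termination condition is reached within the iteration limit depends on the random seed and on the parameters chosen.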
Although there has been extensive research in this field, the economic impact of such methods to date may be considered disappointing. This may be for the following reasons: first, the search space is too large, and second, the fitness landscape is too irregular.
The search space is the space of programs that can be built over the instruction set respecting any architectural or typing constraints that are imposed. The methods are attempting to evolve through this space by finding increasingly fitter programs until they find one satisfying the design conditions. But the number of programs that can be built is typically truly vast. For example, if there are ten possible instructions in the instruction set (and often there are hundreds), the number of possible programs that are only 100 instructions long (and this would be a fairly short program) would be 10^100. These methods may thus be seen as attempting to find a needle in an unimaginably large haystack. Moreover, it is often the case that a modification of a program may create another program whose fitness is largely uncorrelated with the first, which makes it hard to evolve smoothly.
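The size of the search space in the example above can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, programs are counted as unconstrained sequences of exactly the stated length, and the evaluation rate of one billion candidate programs per second is an assumption chosen merely to give a sense of scale.

```python
def search_space_size(num_instructions, program_length):
    """Count the instruction sequences of exactly `program_length` instructions."""
    return num_instructions ** program_length


def years_to_enumerate(size, evals_per_second):
    """Whole years needed to evaluate every candidate at a fixed, assumed rate."""
    seconds_per_year = 365 * 24 * 3600
    return size // (evals_per_second * seconds_per_year)


if __name__ == "__main__":
    size = search_space_size(10, 100)        # ten instructions, length 100
    print(len(str(size)) - 1)                # exponent of the power of ten
    # Assumed rate: one billion evaluations per second.
    print(years_to_enumerate(size, 10 ** 9))
```

Even at the assumed rate, exhaustive enumeration would take more than 10^80 years, which is why such methods must rely on guided search rather than enumeration.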
Biological evolution used a seemingly related process to design creatures, possibly ultimately resulting in human intelligence. But evolution used 4 billion years. Published estimates indicate that something in the neighborhood of 10^35 creatures have lived and died, each potentially contributing to evolution. Each of these may be considered analogous to a candidate program in a run of a genetic or evolutionary programming algorithm. However, it is not common for genetic programming experiments to evaluate as many as 10^8 candidate programs, and it is hard to foresee any computers within the next 20 years that would allow use of as many as 10^20. The number of candidates that can be considered in evolutionary or genetic programming runs drops sharply when the evaluation of fitness is complex or requires interaction with real-world processes outside of a computer simulation, to the point where considering 10^4 candidates may become prohibitively expensive or time-consuming for many problems of practical interest. For those fitness evaluation procedures that require human interaction (which might be useful or necessary for many practical problems, such as a recommender system or a user interface that evolves to fit the preferences of an individual user), the number of candidates that may reasonably be considered can drop into double or even single digits.
Restrictions on the architecture of programs that can be evolved in genetic or evolutionary programming are often undesirable because, given a particular set of restrictions, it may be difficult to know a priori whether a program solving the problem even exists in the search space. Thus many of the methods are directed toward increasing the flexibility of the programs that can be discovered, to ensure that some program that would solve the problem is in principle discoverable. But expanding the flexibility of the programs that can be discovered further enlarges the search space, which may make it even harder to find a solution.
Moreover, the methods may have missed a critical feature that may have greatly aided biological evolution in designing such powerful creatures. Such methods may deal with one environment, one fitness measure, and one termination condition at a time. They are proposed, and applied, as a means to deal with one problem at a time. They do not extract, store, and utilize data that enables them to perform better as evolution methods on later, different problems. But evolution faced a long series of different problems, having different environments, fitness measures, and data. It may have made cumulative discoveries that facilitated later progress, including facilitating its ability to rapidly evolve to solve new and different problems. For example, evolution may have discovered gene networks of Hox genes, facilitating the construction of certain kinds of body types. Then evolution may have been free to experiment with higher level constructions, such as adding legs, or lengthening legs, or rearranging body plans in a macroscopic fashion. That is, once evolution had created subroutines for performing certain kinds of constructions, experimentation with rearrangements of relatively large, meaningful parts may have been facilitated, corresponding to a faster, more meaningful search of program space.
The same genetic circuitry that evolved for one reason in one creature, facing one environment and set of problems and data, was later rearranged and slightly modified to solve new problems in new environments in other creatures. Often modules produced for solving one problem were re-utilized, in modified fashion, to solve other entirely different ones.
Genetic programming methods produce hierarchic programs monolithically from one environment. This greatly limits them, because the search space for very complex problems is far too vast to be searched in this way; in practice, genetic programming can solve only relatively small problems.
The basic problem is that new program discovery may inherently only be possible for programs of a certain small size, because for larger problems the search spaces become too big, and the computational complexity of finding successful programs too large. To solve deep problems, it may be necessary to make cumulative progress, discovering a series of powerful modules that solve intermediate problems, and building a solution to a deep problem on these modules.