1. Field of the Invention
The present invention relates to the field of process and solution improvement, especially solution improvement and facilitation by way of computer-based analysis, such as by genetic algorithms and/or evolutionary algorithms.
2. Background of the Art
Genetic Algorithms (GA)
The study of genetic algorithms originated with John Holland in the mid-1970s. His original genetic algorithm is approximately the same as the “Simple Genetic Algorithm” now found in the literature. Since this GA is used as a starting point for almost all new work, it is worth describing it in detail.
A simple GA represents solutions using strings of bits. These bits may encode integers, real numbers, sets, or whatever else is appropriate to the problem. Advocates argue that the use of bit-strings as a universal representation allows for a uniform set of simple operators, and simplifies the task of analyzing GA properties theoretically. Detractors argue that bitwise operators are often not appropriate for particular problems, and that giving up performance is too high a price to pay for analytic ease. Today, most practical GA systems use problem-specific representations (integers to represent integers, character strings to represent sets, and so on), and customize operations for these representations.
The operators provided by the simple GA were 1-point crossover, mutation, and inversion. These were inspired directly by natural systems. Today, inversion has largely been dropped, and several different forms of crossover and mutation are used.
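The two operators that remain in common use can be sketched on bit strings as follows. This is a minimal illustration only, not a reproduction of Holland's code; the function names, the list-of-integers encoding, and the mutation rate are assumptions made for the example.

```python
import random

def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length bit strings at a random interior cut."""
    assert len(parent_a) == len(parent_b) >= 2
    cut = rng.randint(1, len(parent_a) - 1)  # interior cut point: 1 .. len-1
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b

def bit_flip_mutation(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]

rng = random.Random(0)          # seeded only so the example is repeatable
zeros = [0] * 8
ones = [1] * 8
child_1, child_2 = one_point_crossover(zeros, ones, rng)
mutant = bit_flip_mutation(child_1, 0.1, rng)  # 0.1 is an illustrative rate
```

Because the cut always falls strictly inside the string, each child receives at least one gene from each parent, and the two children are complementary rearrangements of the same parental material.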
Selection in the simple GA was based directly on fitness: given a population of individuals, the probability of a particular individual passing its genes into the next generation was directly proportional to its fitness. Various ranking and selection schemes are now used instead of raw fitness in order to limit genetic drift, i.e., to make good genes less likely to disappear because of a bad accident.
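Fitness-proportional selection and one simple rank-based alternative can be sketched as follows. The sketch assumes non-negative fitness values, and the linear ranking shown (worst individual receives weight 1, best receives weight M) is only one of many ranking schemes; the function names are hypothetical.

```python
import random

def roulette_select(population, weights, rng):
    """Pick one individual with probability proportional to its weight."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    pick = rng.random() * total
    running = 0.0
    for individual, weight in zip(population, weights):
        running += weight
        if pick < running:
            return individual
    return population[-1]  # guard against floating-point round-off

def rank_weights(fitnesses):
    """Replace raw fitness by rank: worst gets 1, best gets len(fitnesses)."""
    order = sorted(range(len(fitnesses)), key=lambda i: fitnesses[i])
    weights = [0] * len(fitnesses)
    for rank, index in enumerate(order, start=1):
        weights[index] = rank
    return weights

rng = random.Random(3)
population = ["a", "b"]
raw_fitness = [1.0, 3.0]
draws = [roulette_select(population, raw_fitness, rng) for _ in range(10000)]
share_of_b = draws.count("b") / len(draws)   # expected to be near 3/4
```

With raw fitness, a single individual whose fitness dwarfs the rest can take over the population quickly; selecting on `rank_weights(raw_fitness)` instead bounds the advantage of the best individual regardless of how large its raw fitness is.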
As the previous paragraph hinted, simple GA systems use generational update schemes. These are like the life cycles of many plant and insect species: each generation produces the next and then dies off, so that an individual in generation g never has a chance to breed with one in generation g+1. As we shall see, continuous update schemes are also possible, in which children are gradually mixed into a single, continuously-evolving, population.
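The difference between the two update schemes can be sketched as follows. The individuals here are plain numbers, and the `breed` and `fitness` functions are hypothetical stand-ins chosen for brevity (the toy `breed` picks parents uniformly at random, without fitness bias); replacing the worst member is one common continuous-update policy, assumed here for illustration.

```python
import random

def generational_step(population, breed, rng):
    """Build a complete new generation; the parents are then discarded."""
    return [breed(population, rng) for _ in population]

def steady_state_step(population, fitness, breed, rng):
    """Mix one child into the existing population, replacing the worst member."""
    child = breed(population, rng)
    worst = min(range(len(population)), key=lambda i: fitness(population[i]))
    updated = list(population)
    updated[worst] = child
    return updated

def fitness(x):
    return -(x - 3.0) ** 2          # toy objective with its peak at x = 3

def breed(population, rng):
    a, b = rng.sample(population, 2)             # two distinct parents
    return (a + b) / 2.0 + rng.gauss(0.0, 0.1)   # blend plus small noise

rng = random.Random(7)
parents = [0.0, 1.0, 2.0, 5.0]
next_generation = generational_step(parents, breed, rng)
mixed = steady_state_step(parents, fitness, breed, rng)
```

In the generational step none of the parents survive, whereas in the continuous step all but one of them remain available to breed with the new child.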
The simple GA also used a single population, in which any individual could potentially breed with any other. Starting in the 1980s, many groups (including the Edinburgh group) began experimenting with multiple populations because of the availability of parallel computing hardware. Surprisingly, multiple populations turned out to be better in most cases, even when simulated on conventional machines. The reason is that they permit speciation, a process by which different populations evolve in different directions (i.e., toward different optima). This helps maintain the diversity of the total population. One of the strengths of early improvements on Holland's genetic algorithms was their built-in support for multi-population GAs, and the way in which they handled such things as work distribution automatically.
Finally, the simple GA was a weak method. In the optimization community, this term means that it did not use any knowledge about the evaluation function, such as its derivative, to guide its search. Genetic algorithms are therefore particularly well-suited to problems with discontinuous or poorly-behaved evaluation functions.
GAs begin by randomly generating, or seeding, an initial population of candidate solutions. Each candidate is an individual member of a large population of size M. For the purposes of this discussion, think of each individual as a row vector composed of N elements. In GA parlance, individuals are often referred to as chromosomes, and the vector elements as genes. Each gene will provide storage for, and may be associated with, a specific parameter of the search space. As a simple example with two unknowns, we may think of each individual as a parameter vector V, in which each V contains a point in the x-y plane, V=[x y]. With this example in mind, the entire population may be stored and manipulated as a two-column matrix: the first column represents a population of x-axis values, the second column a population of y-axis values, and each of the M rows of the matrix is a solution vector V of length N=2.
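The seeding step and the M-by-N matrix layout can be sketched as follows. The search bounds, the population size, and the name `seed_population` are assumptions made for the example; the text does not prescribe any of them.

```python
import random

def seed_population(m, bounds, rng):
    """Return M rows, each a chromosome with one uniform random gene per bound."""
    return [[rng.uniform(low, high) for low, high in bounds] for _ in range(m)]

rng = random.Random(42)
bounds = [(-1.0, 1.0), (0.0, 10.0)]          # assumed ranges for x and y
population = seed_population(5, bounds, rng)  # M = 5 rows, N = 2 columns

x_column = [row[0] for row in population]     # the population of x values
y_column = [row[1] for row in population]     # the population of y values
```

Each row of `population` is one solution vector V=[x y], and each column collects the values of one gene across all individuals.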
The individuals (chromosomes) in genetic algorithms are usually constant-length character sequences (vectors V of constant size). In the traditional GA, these vectors are usually sequences of zeros and ones, but in practice may be anything, including a mix of integers and real numbers, or even a mix of numbers and character strings. The way in which data is encoded in the vectors is called the representation scheme. To keep the discussion simple and concrete, the chromosomes in this discussion will be real, continuous parameters with two elements, V=[x y]. Given these two-element chromosomes, the objective is to search for the (x,y) point that maximizes the scalar-valued fitness function z=f(V)=f(x,y). Starting with the initial random population of vectors, a GA then applies a sequence of operations to the population, guided only by the relative fitness of the individuals, and allows the population to evolve over a number of generations. The goal of the evolutionary process is to continually improve the fitness of the best solution vector, as well as the average population fitness, until some termination criterion is satisfied.
Conventional GAs usually apply a sequence of operations to the population based on the relative fitness of the members. The operations typically involve reproduction, in which individuals are randomly selected to survive into the next generation. In reproduction, highly fit individuals are more likely to be selected than unfit members. The idea behind reproduction is to allow relatively fit members to survive and procreate, with the hope that their genes are truly associated with better performance. Note that reproduction is asexual, in that only a single individual is involved.
The next operation, crossover, is a sexual operation involving two (or even more) individuals. In crossover, highly fit individuals are more likely to be selected to mate and produce children than unfit members. In this manner, highly fit vectors are allowed to breed, with the hope that they will produce ever more fit offspring. Although the crossover operation may take many forms, it typically involves splitting each parent chromosome at a randomly-selected point within the interior of the chromosome, and rearranging the fragments so as to produce offspring of the same size as the parents. The children usually differ from each other, and are usually different from the parents. The effect of crossover is to build upon the success of the past, yet still explore new areas of the search space. Crossover is a slice-and-dice operation which efficiently shuffles genetic information from one generation to the next.
The next operation is mutation, in which individuals are slightly changed. In our case, this means that the (x,y) parameters of the chromosome are slightly perturbed with some small probability. Mutation is an asexual operation, and usually plays a relatively minor role in the population. The idea behind mutation is to restore genetic diversity lost during the application of reproduction and crossover, both of which place relentless pressure on the population to converge.
The relentless pressure towards convergence is driven by fitness, and offers a convenient termination criterion. In fact, after many generations of evolution via the repeated application of reproduction, crossover, and mutation, the individuals in the population will often begin to look alike, so to speak. At this point, the GA typically terminates because additional evolution will produce little improvement in fitness. Many termination criteria may be used, of which the simplest is to just stop after some predetermined number of generations.
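The sequence of reproduction, crossover, and mutation described above can be sketched end to end for the two-element chromosome V=[x y]. Everything specific in this sketch is an assumption made for illustration: the fitness function (a single peak at (1, -2)), the parameter values, the shifting of fitness values so that they can serve as selection weights, and the copying of the best individual into each new generation (elitism), which the description above does not require. Termination uses the simplest criterion, a fixed number of generations.

```python
import random

def fitness(v):
    x, y = v
    return -((x - 1.0) ** 2) - (y + 2.0) ** 2   # assumed objective, peak at (1, -2)

def select(population, scores, rng):
    """Fitter individuals are more likely to be chosen (shifted fitness as weight)."""
    low = min(scores)
    weights = [s - low + 1e-9 for s in scores]  # shift so every weight is positive
    return rng.choices(population, weights=weights, k=1)[0]

def evolve(m=60, generations=80, p_cross=0.7, p_mut=0.1, sigma=0.1, seed=1):
    rng = random.Random(seed)
    population = [[rng.uniform(-5, 5), rng.uniform(-5, 5)] for _ in range(m)]
    best = max(population, key=fitness)
    history = [fitness(best)]                   # best fitness per generation
    for _ in range(generations):                # fixed-count termination
        scores = [fitness(v) for v in population]
        next_gen = [list(best)]                 # elitism (an added assumption)
        while len(next_gen) < m:
            a = select(population, scores, rng)
            if rng.random() < p_cross:
                # Crossover: with N=2 the only interior cut is between x and y.
                b = select(population, scores, rng)
                child = [a[0], b[1]]
            else:
                child = list(a)                 # reproduction: a single parent
            # Mutation: perturb each gene slightly with small probability.
            child = [g + rng.gauss(0.0, sigma) if rng.random() < p_mut else g
                     for g in child]
            next_gen.append(child)
        population = next_gen
        best = max(population, key=fitness)
        history.append(fitness(best))
    return best, history

best, history = evolve()
```

Because the best individual is carried forward unchanged, the values in `history` never decrease; without elitism, the best fitness may fluctuate from one generation to the next.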
Since Holland, other researchers have made progress towards the optimization (e.g., minimizing the dollar cost of a set of operations) of large constrained applications consisting of multiple resources to be assigned in a desired sequence to many tasks. This is a very difficult problem. Techniques such as linear programming, other classical mathematical methods, and simple Genetic Algorithms (GA) have been used for relatively simple process optimizations for several applications. The GA so used proceeds from random initial solutions (a population) and uses the GA operations of selection, crossover, mutation, and fitness calculation to increasingly “evolve” towards better populations of solutions. These GA operations work well on certain types of problems, and not so well on others. Of this latter class, large multi-resource/multi-assignment constrained ordered resource assignment problems are of particular difficulty, as they are typically multi-chromosomal and contain both Bin Packing (BP) and Traveling Salesman Problem (TSP) attributes. Even if heuristics are used (e.g., spatially adjacent near neighbors) to prune the search space, constraints can disrupt the new search space and thus nullify any advantage from using ad hoc heuristics.
A method of surmounting GA difficulties has been addressed by U.S. Pat. No. 5,319,781 to G. Syswerda, entitled “Generation of Schedules Using a Genetic Procedure”. Syswerda specified a single, single-valued chromosome consisting of a permutable task list. As Syswerda noted, his patent was based primarily on the method of problem encoding and the use of a deterministic scheduler, deviating from the simple binary encoding of prior art. In Syswerda, each task is assigned its resources in order from the permuted list by a deterministic scheduling method that resolves all constraints. This method works very well if the number of resources per task is constant. It constrains the search space very efficiently. An example of using this method for an application is referenced in U.S. Pat. No. 6,233,493 to Jonathan Cherneff, in which product development sequences could be optimized using the Syswerda method in part. However, if the number of resources per task is variable (e.g., the number of painters assigned to paint a house), then the Syswerda method of using a deterministic method of assigning resources to tasks is insufficient.
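The general idea of a permuted task list decoded by a deterministic scheduler can be sketched as follows. This is not the scheduler of the Syswerda patent; it is a hypothetical greedy decoder written only to illustrate the idea, and the task names, durations, resource count, and the rule of taking the earliest-free resources are all assumptions. Note that the number of resources per task (`per_task`) is a fixed constant, which is the case in which the approach is said to work well.

```python
def schedule(task_order, durations, n_resources, per_task):
    """Assign each task, in list order, to the resources that become free first."""
    free_at = [0.0] * n_resources            # time at which each resource is free
    plan = {}
    for task in task_order:
        # Deterministic choice: earliest-free resources, ties broken by index.
        chosen = sorted(range(n_resources), key=lambda r: (free_at[r], r))[:per_task]
        start = max(free_at[r] for r in chosen)
        end = start + durations[task]
        for r in chosen:
            free_at[r] = end
        plan[task] = (start, end, chosen)
    return plan, max(free_at)                # the plan and its total completion time

durations = {"a": 3.0, "b": 2.0, "c": 1.0}   # hypothetical tasks
_, finish_abc = schedule(["a", "b", "c"], durations, n_resources=2, per_task=1)
_, finish_cba = schedule(["c", "b", "a"], durations, n_resources=2, per_task=1)
```

The chromosome in such a scheme is only the ordering of the tasks. The two orderings above decode to different completion times, so a GA that permutes the list is searching over schedules while the decoder guarantees that every schedule it produces is legal.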
To surmount the difficulty of optimization where applications require a variable number of resources per task, this invention uses a different problem encoding method of dual-valued chromosomes. Because this invention's encoding method necessarily (i.e., mathematically) enlarges the search space, this invention also uses a new method of reproduction named Genetically Adapted Search Agents (GASA). The GASA was invented to adaptively narrow the enlarged search space to speed solution convergence. In addition, to avoid the difficulty of using GA crossover whereby illegal solutions may arise, this invention uses GASA methods to avoid illegal crossover.
The present invention is a method of assigning resources to tasks in such a manner as to minimize the cost of the resources required to execute or accomplish the tasks. A particular assignment of resources to tasks is called a chromosome. This invention uses a new method of using genetic algorithms to progressively improve a set of chromosomes, each of which is a solution to a problem that is to be optimized.