The present invention relates generally to the field of distributed decision problem solving and to problems which require information retrieved from multiple, logically interrelated, distributed databases. More specifically, the invention relates to utilization of co-evolutionary algorithms executing in a distributed information architecture having co-evolutionary agents and mobile software agents for performing the problem solving in a network-efficient manner.
Previous work at the Electronics Agile Manufacturing Research Institute (EAMRI) at Rensselaer Polytechnic Institute has recognized the critical role that information infrastructure plays in design and manufacturing organizations. Research at EAMRI has resulted in the development of a virtual design module, described in U.S. Pat. No. 6,249,714, for using evolutionary agents emanating from a central location to produce optimized designs.
An understanding of evolutionary algorithms and distributed problem solving methods is helpful to the discussion of the invention.
Evolutionary algorithms (EAs) include genetic algorithms, evolutionary programming, evolution strategies and genetic programming. Genetic algorithms are discussed in David E. Goldberg, “Genetic Algorithms In Search, Optimization, and Machine Learning.” Addison-Wesley, Massachusetts 1989 and John H. Holland, “Adaptation in Natural and Artificial Systems: an introductory analysis with applications to biology, control, and artificial intelligence.” The MIT Press, Cambridge, Mass., 3d ed. 1994. L. J. Fogel et al., “Artificial Intelligence Through Simulated Evolution.” John Wiley, New York 1966, provide a good explanation of evolutionary programming. More detail about evolution strategies is found in Thomas Back, “Evolutionary Algorithms in Theory and Practice.” Oxford University Press, New York 1996. J. Koza, “Genetic Programming: On the programming of computers by means of natural selection.” The MIT Press, Cambridge, Mass. 1992 provides information about the title topic.
The principles of each of these related techniques define a general paradigm that is based on a simulation of natural evolution. Evolutionary algorithms perform their search by maintaining at any time t a population $\mathcal{P}(t) = \{P_1(t), P_2(t), P_3(t), \ldots, P_p(t)\}$ of individuals. Genetic operators that model simplified rules of biological evolution are applied to create a new and desirably superior population $\mathcal{P}(t+1)$. The genetic evolution process continues until a sufficiently good population is achieved, or some other termination condition is satisfied.
Each $P_i(t) \in \mathcal{P}(t)$ represents, via an internal data structure, a potential solution to the original problem. Choice of an appropriate data structure for representing solutions is very much an art, rather than a science, due to the plurality of data structures suitable for a given problem. However, choice of an appropriate representation is often a critical step in a successful application of evolutionary algorithms. Effort is required to select a data structure that is compact, minimally superfluous, and can avoid creation of infeasible individuals in the population.
As an example, if the problem domain requires finding an optimal integer vector from the space defined by dissimilarly bounded integer coordinates, it is more appropriate to choose as a representation an integer-set-array instead of a representation capable of generating bit strings. An integer-set-array is an array of bounded sets of integers, while a representation that generates bit strings can create many infeasible individuals, and is certainly longer than a more compact sequence of integers.
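The contrast between the two representations can be illustrated with a minimal Python sketch. The bounds, function names, and the offset fixed-width binary encoding used for comparison are assumptions made for illustration and are not taken from the cited works.

```python
import random

# Hypothetical per-coordinate inclusive bounds; the values are illustrative only.
BOUNDS = [(0, 4), (10, 12), (1, 100)]

def random_individual(bounds, rng):
    """Integer-set-array: one integer per coordinate, drawn from its own bounded set."""
    return [rng.randint(lo, hi) for lo, hi in bounds]

def is_feasible(individual, bounds):
    """An individual is feasible when every coordinate lies within its bounds."""
    return len(individual) == len(bounds) and all(
        lo <= x <= hi for x, (lo, hi) in zip(individual, bounds)
    )

def bits_needed(bounds):
    """Bits per coordinate for an assumed fixed-width binary encoding of (x - lo)."""
    return [max(1, (hi - lo).bit_length()) for lo, hi in bounds]

def infeasible_bit_patterns(bounds):
    """Bit patterns per coordinate that would decode to a value outside the bounds."""
    return [2 ** b - (hi - lo + 1) for b, (lo, hi) in zip(bits_needed(bounds), bounds)]
```

With these assumed bounds, every individual produced by `random_individual` is feasible by construction, whereas the binary encoding admits patterns in each coordinate that decode to out-of-range values and would have to be repaired or rejected.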
Closely linked to the choice of representation of solutions is the choice of a fitness function $\psi: \mathcal{P}(\cdot) \to \mathbb{R}$ that assigns credit to candidate solutions. Individuals in a population are assigned fitness values according to some evaluation criterion. Fitness values measure how well individuals represent solutions to the problem. Highly fit individuals are more likely to create offspring by recombination or mutation operations. Weak individuals are less likely to be selected for reproduction, and so they eventually die out. A mutation operator introduces genetic variations in the population by randomly modifying some of the building blocks of individuals.
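The generational loop described above (evaluate fitness, select fit parents, recombine, mutate, repeat until a termination condition holds) might be sketched as follows. This is one possible arrangement, not the method of any cited reference: binary tournament selection, one-point crossover, elitism, and a fixed generation count as the termination condition are all assumptions chosen for brevity, as are the function and parameter names.

```python
import random

def evolve(fitness, bounds, pop_size=20, generations=50, mutation_rate=0.1, seed=0):
    """Maximize `fitness` over integer vectors whose coordinates respect `bounds`."""
    rng = random.Random(seed)
    population = [[rng.randint(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]

    def select(scored):
        # Binary tournament: the fitter of two distinct random individuals reproduces.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] >= b[0] else b[1]

    best_history = []
    for _ in range(generations):  # assumed termination condition: fixed budget
        scored = [(fitness(ind), ind) for ind in population]
        best_fit, best = max(scored, key=lambda s: s[0])
        best_history.append(best_fit)

        next_pop = [list(best)]  # elitism: the current best survives unchanged
        while len(next_pop) < pop_size:
            p1, p2 = select(scored), select(scored)
            # One-point crossover (recombination) of the two parents.
            cut = rng.randint(1, len(bounds) - 1) if len(bounds) > 1 else 0
            child = p1[:cut] + p2[cut:]
            # Mutation: randomly resample some coordinates within their bounds.
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[i] = rng.randint(lo, hi)
            next_pop.append(child)
        population = next_pop

    return max(population, key=fitness), best_history
```

A caller supplies any function that maps an individual to a number, for example the negated squared distance to a target vector. Because of the elitism step, the recorded best fitness never decreases from one generation to the next.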
Evolutionary algorithms are essentially parallel by design, and at each evolutionary step a breadth search of increasingly optimal subregions of the options space is performed. Evolutionary search is a powerful technique for solving problems, and is applicable to a wide variety of practical problems that are nearly intractable with other, conventional optimization techniques. Practical evolutionary search schemes do not guarantee convergence to the global optimum in a predetermined finite time, but they are often capable of finding very good and consistent approximate solutions. They have, moreover, been shown to converge asymptotically under mild conditions, as described in R. Subbu et al., “Modeling and convergence analysis of distributed coevolutionary algorithms,” Proceedings of the IEEE Congress on Evolutionary Computation, San Diego, Calif. 2000 and R. Subbu, “Network Decision Support based on Distributed Coevolutionary Algorithms,” PhD thesis, Rensselaer Polytechnic Institute, Troy, N.Y. 2000.
Distributed coevolutionary computations are a further innovation of evolutionary algorithm problem solving. Parallel and distributed implementations of evolutionary algorithms typically follow a coarse-grained approach of evolving independent populations on multiple nodes and occasionally migrating individuals between nodes. Alternatively, the fine-grained approach can be used in which individuals are distributed among multiple nodes where they have localized interactions. These approaches are discussed in greater detail in M. Capcarrere et al., “A statistical study of a class of cellular evolutionary algorithms,” Evolutionary Computation vol. 7, issue 3, 1999.
In each of the coarse- and fine-grained methods, each node in the system can potentially directly manipulate variables in all n dimensions. These models have primarily been pursued for the purpose of speeding computations in large-scale problems and for simultaneously alleviating the problem of premature convergence.
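The coarse-grained approach can be sketched as a set of independently evolving populations ("islands") with occasional migration. In the following illustrative Python sketch the islands are stepped sequentially in one process, whereas in a real deployment each would run on its own node. The bit-counting objective, the ring migration topology, the migration interval, and all names are assumptions made for the example.

```python
import random

def one_max(bits):
    """Toy objective (assumed for illustration): number of 1 bits."""
    return sum(bits)

def step_island(pop, rng, mutation_rate=0.05):
    """One generation on a single island: tournament selection plus bit-flip mutation."""
    new_pop = [max(pop, key=one_max)[:]]  # keep a copy of the local best
    while len(new_pop) < len(pop):
        a, b = rng.sample(pop, 2)
        parent = a if one_max(a) >= one_max(b) else b
        child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in parent]
        new_pop.append(child)
    return new_pop

def migrate(islands):
    """Ring migration: a copy of each island's best replaces the worst on the next island."""
    bests = [max(pop, key=one_max)[:] for pop in islands]
    for i, pop in enumerate(islands):
        incoming = bests[(i - 1) % len(islands)]
        worst = min(range(len(pop)), key=lambda j: one_max(pop[j]))
        pop[worst] = incoming

def run_islands(n_islands=4, pop_size=10, length=20, generations=40, interval=5, seed=1):
    rng = random.Random(seed)
    islands = [
        [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
        for _ in range(n_islands)
    ]
    for g in range(1, generations + 1):
        islands = [step_island(pop, rng) for pop in islands]  # independent evolution
        if g % interval == 0:
            migrate(islands)  # occasional exchange of individuals between nodes
    return islands
```

Note that every island here manipulates the full bit string, consistent with the observation that each node in these models can operate on variables in all n dimensions.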
Coevolutionary algorithms are distributed and consist of distinct distributed algorithm components that, considered together, follow various models of cooperation or competition. In this model, different subspaces of the feasible space are explored concurrently by the algorithm components. If a problem is such that the subproblem solved by each algorithm component is independent of the others, that is, the problem is decomposable, then each algorithm component can evolve without regard to the other components. From the perspective of optimization analysis, in such cases, each algorithm component optimizes in a landscape disjoint from the landscapes corresponding to the other algorithm components.
However, many problems exhibit complex interdependencies, and from a coevolutionary perspective, it has been suggested that the effect of changing one of the interdependent subcomponents leads to a deformation or warping of the landscapes associated with each of the other interdependent subcomponents. Kauffman et al. discuss this theory in “Co-evolution to the edge of chaos: Coupled fitness landscapes, poised states, and co-evolutionary avalanches,” Artificial Life II, 1990. Recent studies have increased the interest in application of coevolutionary systems to problem solving.
Husbands et al., “Experiments with an ecosystems model for integrated production planning” Handbook of Evolutionary Computation, Edited by Back et al., Oxford University Press 1997, propose a coevolutionary distributed genetic algorithm for integrated manufacturing planning and scheduling. In this scheme, each species concentrates on identifying a feasible set of process plans for a single component to be manufactured. The species interact because they utilize shared manufacturing resources. The individual fitness functions of each species take into account the need to utilize shared resources, and are based on various manufacturing costs. In order to resolve conflicts between species, an arbitrator species is simultaneously evolved. The fitness of the arbitrator species depends on its ability to resolve conflicts such that manufacturing delays are minimized. The individuals in each species compete internally in order to generate good process plans, and species compete at a higher level for shared manufacturing resources.
Others, especially Potter et al., “Cooperative Coevolution: an Architecture for Evolving Coadapted Subcomponents,” Evolutionary Computation, vol. 8, issue 1, 2000, have proposed a coevolutionary model in which multiple species evolve independently, enter into temporary collaborations with certain members of the other species, and are rewarded based on the success of the collaboration in solving a problem. A collaboration of all species is required to realize a coherent and complete problem solution. In this model, typically the best individual from each species is chosen as the representative that will collaborate with individuals of the other species. Thus, for evaluating the fitness of each individual in a given species, the best representatives from each of the other species are utilized to form the complete solution, following which the solution is evaluated. The fitness is assigned strictly to the individual being evaluated and is not shared with the representatives from the other species that participated in the collaboration. This method, which utilizes only the best individual from each population for fostering across-species collaborations, could be characterized as “greedy”, due to the lack of sharing between representatives. The greedy method is effective in its way because of its simplicity. However, this simple pattern of interaction between species leads to entrapment in local optima. Other collaboration schemes that include random selections of representatives are also possible, and lead to improved results over the “greedy” method.
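The "greedy" collaboration scheme described above can be illustrated with a minimal Python sketch in which each species evolves a single coordinate of the solution vector. This is a simplified reading of the scheme, not the implementation of Potter et al.: truncation selection, Gaussian mutation, the retention of a representative until a better one is found, and all names and parameter values are assumptions.

```python
import random

def cooperative_coevolution(objective, bounds, pop_size=10, generations=30, seed=0):
    """Maximize objective(vector); species s evolves only coordinate s."""
    rng = random.Random(seed)
    species = [[rng.uniform(lo, hi) for _ in range(pop_size)] for lo, hi in bounds]
    reps = [pop[0] for pop in species]  # arbitrary initial representatives
    history = [objective(reps)]

    for _ in range(generations):
        for s, (lo, hi) in enumerate(bounds):
            def fitness(x, s=s):
                # Form a complete solution with the current representatives of the
                # other species; credit goes only to the individual x being evaluated.
                trial = list(reps)
                trial[s] = x
                return objective(trial)

            # Assumed variation scheme: keep the better half, refill by mutation.
            ranked = sorted(species[s], key=fitness, reverse=True)
            survivors = ranked[: max(1, pop_size // 2)]
            children = []
            while len(survivors) + len(children) < pop_size:
                parent = rng.choice(survivors)
                child = parent + rng.gauss(0.0, 0.1 * (hi - lo))
                children.append(min(hi, max(lo, child)))
            species[s] = survivors + children

            # Greedy choice: the best individual becomes the species representative.
            reps[s] = max(species[s] + [reps[s]], key=fitness)
        history.append(objective(reps))
    return reps, history
```

Replacing the `max(...)` representative choice with a random pick from the species would give one of the alternative collaboration schemes mentioned above; the sketch does not attempt to reproduce the comparative results.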
A competitive coevolutionary model applicable to scheduling problems is proposed by F. Seredynski, “Competitive coevolutionary multi-agent systems: The application to mapping and scheduling problems,” Journal of Parallel and Distributed Computing, vol. 47, issue 1, 1997. Seredynski considers game-theoretic models of limited interaction between individuals in competing populations. Individuals in a population have limited interaction with individuals in the neighboring populations, and seek to maximize their fitness based on local evaluations. Seredynski demonstrates the successful emergence of global behaviors achieved purely through local cooperation.
Coevolutionary approaches that follow a clearly adversarial model are based on the biological belief that an adaptive change of a species introduces a new challenge to the competing species. The challenge of the first change then causes a second adaptive change by the competing species, which in turn causes a response in the first species and so on. See, C. D. Rosin, “Coevolutionary Search Among Adversaries,” PhD thesis, Univ. of Calif. San Diego, San Diego, Calif. 1997. In these adversarial systems, the fitness of an individual in a population is based on a competition with members from the other population. Rosin has applied the adversarial coevolutionary model to various problems, including, for instance, the design of drugs that are robust across some drug resistance mutations, and game playing.
Distributed problem solving (DPS) is another problem solving method in which a cooperative solution to a problem is generated by loosely coupled agents operating according to a decentralized computational model. In this case, loosely coupled means that the agents spend the majority of their time computing and working the problem rather than communicating with other agents. The distributed problem solving model does not have a centralized data store, and no one agent has enough information to make a complete decision. Each agent in the system requires assistance from at least one other agent in the decision-making process. Further, agents may be physically and logically distributed over a computing environment. The problem is solved by intelligently combining subproblems solved by agents into an overall solution.
The fundamental areas of interest in distributed problem solving are the decomposition and coordination of computation among a society of agents so that structural demands of the task domain are matched. See, B. Chandrasekaran, “Natural and social system metaphors for distributed problem solving: Introduction to the issue,” IEEE Transactions on Systems, Man, and Cybernetics, SMC-11 (1), 1981.
One of the earliest projects in distributed problem solving is the Contract-Net-Protocol, described in Smith, “The contract net protocol: High-level communication and control in a distributed problem solver,” IEEE Transactions on Computers, C-29 (12), 1980 and Smith et al., “Frameworks for cooperation in distributed problem solving,” IEEE Transactions on Systems, Man, and Cybernetics, SMC-11 (1), 1981. In Contract-Net-Protocol, computing nodes coordinate their activities through contracts. A “manager” node announces a task for which multiple eligible “contractor” nodes respond with bids. Contractor nodes receive pieces of the contract after a negotiation process, and in case a contractor requires assistance with its part of the problem, it assumes the role of a manager and subcontracts its part of the problem to other nodes. The original problem is solved in a top-down manner by a network of contractor nodes. The method for task decomposition is specified a priori, and the Contract-Net framework is best suited to problems that can be hierarchically decomposed into nearly independent subtasks.
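The announce, bid, award, and subcontract cycle can be sketched in Python as follows. This is a toy model rather than the protocol as specified by Smith: the negotiation is reduced to awarding the lowest-cost bid, the task decomposition is given a priori as a fixed acyclic table, and the `Node` class, its fields, and the `announce` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    skills: dict  # task -> cost this node would bid (hypothetical cost model)
    subtasks: dict = field(default_factory=dict)  # task -> subtasks to subcontract

    def bid(self, task):
        """Return a bid for the task, or None if this node is not eligible."""
        return self.skills.get(task)

def announce(manager, task, nodes, log, depth=0):
    """Manager announces a task; eligible contractors bid; the lowest bid wins."""
    bids = [
        (n.bid(task), n.name, n)
        for n in nodes
        if n is not manager and n.bid(task) is not None
    ]
    if not bids:
        log.append((depth, manager.name, task, None))  # no eligible contractor
        return None
    # Simplified negotiation (assumption): award to the lowest cost, ties by name.
    _, _, winner = min(bids, key=lambda b: (b[0], b[1]))
    log.append((depth, manager.name, task, winner.name))
    # The winning contractor acts as manager for the parts it needs help with.
    # The decomposition table is assumed to be acyclic, so the recursion ends.
    for sub in winner.subtasks.get(task, []):
        announce(winner, sub, nodes, log, depth + 1)
    return winner
```

Each log entry records the contract depth, the announcing manager, the task, and the contractor awarded the task, which makes the top-down structure of the resulting contract network visible.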
Others have proposed DPS systems which work effectively despite inconsistencies. See, Lesser et al., “Functionally accurate, cooperative distributed systems,” IEEE Transactions on Systems, Man, and Cybernetics, SMC-11 (1), 1981. In the “functionally accurate, cooperative” approach proposed by Lesser et al., nodes cooperatively exchange and integrate partial and tentative results to construct a complete solution. Nodes make progress in problem solving by using whatever information they can find. This work is motivated by the argument that consistency maintenance at all times is very expensive in practical systems. This model, however, leads to the possibility that agents might propagate and use incorrect partial results leading to unpredictable system performance.
In Lesser, “A retrospective view of FA/C distributed problem solving,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 21, issue 6, 1991, the issue of functionally accurate cooperative distributed problem solving is revisited. Lesser presents some techniques that may reduce the unpredictability of systems that propagate and use incorrect partial results. Lesser proposes three measures: an increase in the sophistication of local control in each agent, so that available information about a local search is used more efficiently; the exchange of meta-level information between agents, so that their local searches can proceed with a more global view; and satisficing control, in which less than optimal, but acceptable, levels of coordination between agents are used.
Another approach provides an architecture for solving distributed search problems using heuristics and constraint satisfaction methods, and applies it to decentralized job-shop scheduling. See, Sycara et al., “Distributed constrained heuristic search,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 21, issue 6, 1991.
One application in particular for distributed problem solving is the design and fabrication of new products. As recently as ten years ago, the designing, testing and manufacturing of products took place within the homogeneous environments of relatively large companies. Now, more often than not, the fabricating, assembling and testing of new products happens in widely distributed and radically different settings. The product designer, component suppliers, and manufacturers are often many miles apart, and sometimes separated by countries and continents. The computing systems used by each group are often different, using different CAD/CAM systems and different computer platforms, from PCs to workstations, to communicate with each other. Despite the best efforts of all involved, there are usually delays in filling orders, and the end-products are more costly, less reliable and slower to make than they need to be.
Although advanced network backbones and technologies for communications exist, new product design and fabrication remains in some ways a manual process. Many times a supposedly optimal product is designed by engineers who then transmit their design to suppliers for cost estimates and manufacturing reviews, which reveal that the design is far from optimal in cost and production time.
Some e-engineering systems are beginning to improve the design and realization processes. Some systems rely upon collaborative design processes by providing shared design files and collaborative design development among distributed engineering participants. Other systems provide for rapid communications and file transfers for design assessment and checking among distributed design contributors including potential customers as well as fabrication services and other supply chain contractors. Still other systems merely provide access to databases with time-critical data needed by design engineers.
Unfortunately, each of these prior art approaches is a limited solution to the inherently complex and coupled problem of making ideal design, supplier and manufacturing choices with respect to multiple criteria including cost and time. Fundamental to this complexity is that each choice or assignment has the potential to affect overall product cost and the time to manufacture and distribute, and therefore assignments cannot be considered independent of one another. This complexity makes it increasingly impractical to make these selections based purely on prior art techniques.
The Virtual Design Environment class of systems automates significant portions of the overall decision task by accessing information available at multiple, logically interrelated, distributed databases and evaluating and consequently optimizing the assignments. These systems rely heavily on the underlying network infrastructure and middle-ware to radically alter the design development and realization process.
The Virtual Design Environment (VDE) described in U.S. Pat. No. 6,249,714 and developed in part by the inventors of this invention is a distributed, heterogeneous information architecture having centralized evolutionary computations. The VDE uses evolutionary agents, modular computer programs that generate and execute queries among distributed computing and database resources and support a global optimization of integrated design-supplier-manufacturer planning decisions.
A prototype VDE has been evaluated using design and supplier data based on a real commercial electronic circuit board product for Pitney Bowes of Stamford, Conn., and data from three commercial manufacturing facilities. The VDE simultaneously selects parts for a design, selects suppliers for the parts and makes manufacturing decisions. Suppliers and manufacturing resources are distributed, and information about parts, suppliers and manufacturing resources is available through network databases. During the course of the evolutionary optimization, the VDE generates virtual designs, or complete integrated planning decisions, that are evaluated against an evaluation function based on cost and time models. These computations require information collected dynamically over the network. As the evolution proceeds, successive generations of virtual designs are created, and the population systematically converges towards promising integrated planning decisions.
The VDE functions in a network environment with distributed information sources that need to be accessed for decision-making. In this network environment, inter-node communication delays are a primary factor in system performance. The prototype VDE is conceptually based on a centralized optimization model, where computation is performed only at one node, while information resident at various network nodes is frequently accessed in order to perform fitness evaluations of the alternative planning decisions. Such a centralized planner that requires information from multiple network-distributed databases is prone to large delays due to remote access latencies.
Thus, there is a need to provide an effective problem solving system which overcomes these delays while providing the functionality of the VDE.