Present discovery strategies for the identification of useful products are intrinsically time-consuming processes. Whilst predictive in silico screening can provide some guide as to the products that are likely to be useful, such methods are not yet fully developed. It is still necessary to prepare a range of products in order to determine which products actually have the appropriate characteristics to make them useful. Thus, typical discovery processes invariably require the preparation of many products, each of which is individually analysed and tested.
The traditional discovery process will generally take the form of a series of batch processes. An initial group of products will be prepared, and a subset of products having promising characteristics will be identified. These products will inform the preparation of a second group of products, with the expectation that further promising products will be identified, some of which will be superior to the originally identified subset of products. Further sets of products may be prepared, and each subsequent preparation is intended to identify products of superior activity. Once a product is identified as having the right combination of features for use, a subsequent scale-up synthesis is undertaken to provide useful quantities of material, for example for further testing or for use. Discovery processes that look to identify improved methods of synthesis are conducted in a similar manner.
It has long been recognised that the discovery process requires improvement. One of the more productive areas of development in recent years has been the way in which the data obtained from the initial lead set of products is used to inform the preparation of later products. Here, researchers are making increasing use of sophisticated data mining and management techniques in order to develop an understanding of product features that are likely to contribute to a desirable activity. Thus, there is a recognition that discovery techniques require robust procedures to administer and schedule the large amounts of experimental data that are generated. Moreover, it is necessary to comprehend and model this organised data, and to provide a global search strategy for identifying potentially useful products (see Corma et al. Chem. Phys. Com. 2002, 3, 939-945).
Genetic algorithm search methods are now increasingly used as a tool to direct the production of useful products. The genetic algorithm is inspired by the natural evolutionary concepts of selection and reproduction. In a discovery process the user sets a specific set of performance criteria for the idealised product he wishes to prepare. A series of test products is prepared and analysed, and each product is assigned a fitness value against the performance criteria. The fitness value forms the basis for the selection made by the algorithm: products whose fitness value satisfies the selection criterion will be selected, whilst the others will be discarded. Reproduction is accomplished by cloning, crossover, and mutation of product inputs in order to generate new and unexpected solutions that meet or exceed the performance criteria set by the user. The genetic algorithm approach to product design has been shown to work well in complex and coupled multivariable systems (see Zhu et al. Appl. Phys. Express 2012, 5, 012102). These methods attempt to balance speed, robustness and versatility in discovery processes (see Pham Comp. Chem. Eng. 2012, 37, 136-142). The use and refinement of genetic algorithms in chemical and engineering processes are now well established.
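The selection and reproduction steps described above can be illustrated with a minimal sketch. It is not taken from any of the cited works. It assumes that each product is represented by a short list of numerical inputs between 0 and 1, and that the fitness value can be computed from a formula; in a real discovery process the fitness value would come from laboratory analysis and testing. The names TARGET, fitness, crossover, mutate, next_generation and run, and all parameter values, are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical target inputs standing in for the user's performance criteria.
TARGET = [0.2, 0.5, 0.3]


def fitness(product):
    """Score a product against the criteria (higher is better, 0 is ideal).

    Assumption: a formula replaces the laboratory analysis of each product.
    """
    return -sum((p - t) ** 2 for p, t in zip(product, TARGET))


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: head of one parent, tail of the other."""
    point = rng.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:]


def mutate(product, rng, rate=0.2, scale=0.1):
    """Perturb each input with probability `rate`, clamped to [0, 1]."""
    return [
        min(1.0, max(0.0, p + rng.gauss(0.0, scale))) if rng.random() < rate else p
        for p in product
    ]


def next_generation(population, rng, n_keep=4):
    """Select the fittest products, clone them, and breed replacements."""
    ranked = sorted(population, key=fitness, reverse=True)
    survivors = ranked[:n_keep]               # selection: the rest are discarded
    children = [list(s) for s in survivors]   # cloning: survivors carry over unchanged
    while len(children) < len(population):
        a, b = rng.sample(survivors, 2)
        children.append(mutate(crossover(a, b, rng), rng))
    return children


def run(generations=30, size=12, seed=0):
    """Evolve a random starting set and record the best fitness per generation."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in TARGET] for _ in range(size)]
    history = []
    for _ in range(generations):
        history.append(max(fitness(p) for p in population))
        population = next_generation(population, rng)
    history.append(max(fitness(p) for p in population))
    return population, history
```

Because the selected products are cloned unchanged into the next generation, the best fitness value recorded by run never decreases from one generation to the next. Each call to next_generation corresponds to one preparation of a new set of products informed by the previous set.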
The use of genetic algorithms to inform future product preparations has undoubtedly assisted the discovery process. However, problems still remain. It is still the case that the researcher uses a batch synthesis approach to develop his product. Here, the researcher may prepare a training set of products, which are subsequently analysed and tested. From this set, the genetic algorithm is provided with the information necessary for the subsequent batch preparation. A typical batch process employing a genetic algorithm is described by Zhu et al. in their preparation of a high-efficiency III-V nitride light-emitting diode (Zhu et al. Appl. Phys. Express 2012, 5, 012102). Another example is the work of Kreutz et al. on the development of methane oxidation catalysts (Kreutz et al. J. Am. Chem. Soc. 2010, 132, 3128-3132). Fundamentally, the batch preparation is still cumbersome and time-consuming, even though it is intelligently directed.
Moreover, genetic algorithms are usually employed in optimisation processes, which may be regarded as a limited form of discovery process. An optimisation procedure takes an original lead product and attempts to improve its properties. The process of optimisation is usually a conservative one: the new products that are produced share many of the structural and compositional parts of the original lead. The optimisation process is rarely permitted to explore product space that is structurally and compositionally diverse.
There remains a need to further rationalise the discovery process, particularly to increase throughput, and to decrease the time from the generation of the first test product through to the generation of a lead product in useful quantities. There is also a need to provide discovery processes that allow the user to explore the true breadth and depth of product space.