The classic “chemical approach” to generating molecules with new functions has been used extensively over the last century in applications ranging from drug discovery to synthetic methodology to materials science. In this approach (FIG. 1, black), researchers synthesize or isolate candidate molecules, assay these candidates for desired properties, determine the structures of active compounds if unknown, formulate structure-activity relationships based on the assay and structural data, and then synthesize a new generation of molecules designed to possess improved properties. While combinatorial chemistry methods (see, for example, A. V. Eliseev and J. M. Lehn. Combinatorial Chemistry In Biology 1999, 243, 159–172; K. W. Kuntz, M. L. Snapper and A. H. Hoveyda. Current Opinion in Chemical Biology 1999, 3, 313–319; D. R. Liu and P. G. Schultz. Angew. Chem. Intl. Ed. Eng. 1999, 38, 36) have increased the throughput of this approach, its fundamental limitations remain unchanged. Several factors limit the effectiveness of the chemical approach to generating molecular function. First, our ability to accurately predict the structural changes that will lead to new function is often inadequate due to subtle conformational rearrangements of molecules, unforeseen solvent interactions, or unknown stereochemical requirements of binding or reaction events. The resulting complexity of structure-activity relationships frequently limits the success of rational ligand or catalyst design, including those efforts conducted in a high-throughput manner. Second, the need to assay or screen, rather than select, each member of a collection of candidates limits the number of molecules that can be searched in each experiment. Finally, the lack of a way to amplify synthetic molecules places requirements on the minimum amount of material that must be produced for characterization, screening, and structure elucidation. 
As a result, it can be difficult to generate libraries of more than roughly 10⁶ different synthetic compounds.
In contrast, Nature generates proteins with new functions using a fundamentally different method that overcomes many of these limitations. In this approach (FIG. 1, gray), a protein with desired properties induces the survival and amplification of the information encoding that protein. This information is diversified through spontaneous mutation and DNA recombination, and then translated into a new generation of candidate proteins using the ribosome. The power of this process is well appreciated (see, F. Arnold Acc. Chem. Res. 1998, 31, 125; F. H. Arnold et al. Curr. Opin. Chem. Biol. 1999, 3, 54–59; J. Minshull et al. Curr. Opin. Chem. Biol. 1999, 3, 284–90) and is evidenced by the fact that proteins and nucleic acids dominate the solutions to many complex chemical problems despite their limited chemical functionality. Clearly, unlike the linear chemical approach described above, the steps used by Nature form a cycle of molecular evolution. Proteins emerging from this process have been directly selected, rather than simply screened, for desired activities. Because the information encoding evolving proteins (DNA) can be amplified, a single protein molecule with desired activity can in theory lead to the survival and propagation of the DNA encoding its structure. The vanishingly small amounts of material needed to participate in a cycle of molecular evolution allow libraries much larger in diversity than those synthesized by chemical approaches to be generated and selected for desired function in small volumes.
Acknowledging the power and efficiency of Nature's approach, researchers have used molecular evolution to generate many proteins and nucleic acids with novel binding or catalytic properties (see, for example, J. Minshull et al. Curr. Opin. Chem. Biol. 1999, 3, 284–90; C. Schmidt-Dannert et al. Trends Biotechnol. 1999, 17, 135–6; D. S. Wilson et al. Annu. Rev. Biochem. 1999, 68, 611–47). Proteins and nucleic acids evolved by researchers have demonstrated value as research tools, diagnostics, industrial reagents, and therapeutics and have greatly expanded our understanding of the molecular interactions that endow proteins and nucleic acids with binding or catalytic properties (see, M. Famulok et al. Curr. Opin. Chem. Biol. 1998, 2, 320–7).
Despite Nature's efficient approach to generating function, Nature's molecular evolution is limited to two types of “natural” molecules, proteins and nucleic acids, because thus far the information in DNA can only be translated into proteins or into other nucleic acids. However, many synthetic molecules of interest do not in general have nucleic acid backbones, and the use of DNA-templated synthesis to translate DNA sequences into synthetic small molecules would be broadly useful only if synthetic molecules other than nucleic acids and nucleic acid analogs could be synthesized in a DNA-templated fashion. An ideal approach to generating functional molecules would merge the most powerful aspects of molecular evolution with the flexibility of synthetic chemistry. Clearly, enabling the evolution of non-natural synthetic small molecules and polymers, similarly to the way Nature evolves biomolecules, would lead to much more effective methods of discovering new synthetic ligands, receptors, and catalysts that are difficult or impossible to generate using rational design.