There is an increasing need to find new molecules which can effectively modulate a wide range of biological processes, for applications in medicine and agriculture. A standard way of searching for novel bioactive chemicals is to screen collections of natural materials, such as fermentation broths or plant extracts, or libraries of synthesized molecules, using assays which can range in complexity from simple binding reactions to elaborate physiological preparations. The screens often provide only leads, which then require further improvement either by empirical methods or by chemical design. The process is time-consuming and costly, but it is unlikely to be totally replaced by rational methods even when they are based on detailed knowledge of the chemical structure of the target molecules. Thus, what we might call "irrational drug design"--the process of selecting the right molecules from large ensembles or repertoires--requires continual improvement both in the generation of repertoires and in the methods of selection.
Recently there have been several developments in using peptides or nucleotides to provide libraries of compounds for lead discovery. The methods were originally developed to speed up the determination of epitopes recognized by monoclonal antibodies. For example, the standard serial process of stepwise search of synthetic peptides now encompasses a variety of highly sophisticated methods in which large arrays of peptides are synthesized in parallel and screened with acceptor molecules labelled with fluorescent or other reporter groups. The sequence of any effective peptide can be decoded from its address in the array. See, for example, Geysen et al., Proc. Natl. Acad. Sci. USA, 81:3998-4002 (1984); Maeji et al., J. Immunol. Met., 146:83-90 (1992); and Fodor et al., Science, 251:767-775 (1991).
In another approach, Lam et al., Nature, 354:82-84 (1991) describe combinatorial libraries of peptides that are synthesized on resin beads such that each resin bead contains about 20 pmoles of the same peptide. The beads are screened with labelled acceptor molecules; those with bound acceptor are located by visual inspection and physically removed, and the peptide is identified by direct sequence analysis. In principle, this method could be used with other chemical entities, but it requires sensitive methods for sequence determination.
A different method of solving the problem of identification in a combinatorial peptide library is used by Houghten et al., Nature, 354:84-86 (1991). For hexapeptides of the 20 natural amino acids, 400 separate libraries are synthesized, each with the first two amino acids fixed and the remaining four positions occupied by all possible combinations. An assay, based on competition for binding or other activity, is then used to find the library with an active peptide. Then twenty new libraries are synthesized and assayed to determine the effective amino acid in the third position, and the process is reiterated in this fashion until the active hexapeptide is defined. This is analogous to the method used in searching a dictionary; the peptide is decoded by construction using a series of sieves or buckets and this makes the search logarithmic.
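The iterative search described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not taken from the cited work: the library is assumed to contain exactly one active hexapeptide, and the assay is replaced by a hypothetical yes/no function (`make_assay`) that reports whether a pool with a given fixed prefix contains the active peptide. The names `make_assay` and `deconvolute` are invented for this sketch.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 natural amino acids


def make_assay(active_peptide):
    """Return a simulated pool assay (hypothetical stand-in for a real assay).

    A pool is identified by its fixed N-terminal prefix; the remaining
    positions are assumed to hold every possible combination. The pool
    scores as active if the hidden active peptide begins with that prefix.
    """
    def assay(prefix):
        return active_peptide.startswith(prefix)
    return assay


def deconvolute(assay, length=6):
    """Identify the active peptide by repeated resynthesis and assay.

    Returns the decoded peptide and the total number of pools assayed.
    """
    # Round 1: 400 pools, each with the first two positions fixed.
    candidates = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]
    assays_run = 0
    while True:
        hits = [pool for pool in candidates if assay(pool)]
        assays_run += len(candidates)  # every pool in the round is assayed
        if not hits:
            raise ValueError("no active pool found in this round")
        prefix = hits[0]  # assumption: exactly one active pool per round
        if len(prefix) == length:
            return prefix, assays_run
        # Next round: 20 pools, each fixing one more position.
        candidates = [prefix + aa for aa in AMINO_ACIDS]


if __name__ == "__main__":
    peptide, n_assays = deconvolute(make_assay("WKYMVM"))
    print(peptide, n_assays)
```

Under these assumptions the search assays 400 pools in the first round and 20 pools in each of the four later rounds, 480 in all, whereas the complete hexapeptide repertoire contains 20^6 = 64,000,000 sequences. A real assay returns graded activities and several pools may score, so an actual search would rank pools rather than rely on a single yes/no answer.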
A very powerful biological method has recently been described in which the library of peptides is presented on the surface of a bacteriophage such that each phage has an individual peptide and contains the DNA sequence specifying it. The library is made by synthesizing a repertoire of random oligonucleotides to generate all combinations, followed by their insertion into a phage vector. Each of the sequences is cloned in one phage and the relevant peptide can be selected by finding those that bind to the particular target. The phages recovered in this way can be amplified and the selection repeated. The sequence of the peptide is decoded by sequencing the DNA. See, for example, Cwirla et al., Proc. Natl. Acad. Sci. USA, 87:6378-6382 (1990); Scott et al., Science, 249:386-390 (1990); and Devlin et al., Science, 249:404-406 (1990).
Another "genetic" method has been described in which the libraries are the synthetic oligonucleotides themselves. Active oligonucleotide molecules are selected by binding to an acceptor and are then amplified by the polymerase chain reaction (PCR). PCR allows serial enrichment, and the structure of the active molecules is then decoded by DNA sequencing of clones generated from the PCR products. The repertoire is limited to nucleotides and the natural pyrimidine and purine bases or those modifications that preserve specific Watson-Crick pairing and can be copied by polymerases.
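The effect of serial enrichment can be illustrated with a small calculation. The sketch below is a toy model, not a description of any published protocol: it assumes that each selection step recovers a fixed fraction of binding molecules and a smaller fixed fraction of non-binding molecules, and that amplification copies all recovered molecules equally so that it does not change their proportions. The function name `enrich` and all numerical values are hypothetical.

```python
def enrich(fraction, binder_recovery, background_recovery, rounds):
    """Return the binder fraction before selection and after each round.

    fraction            -- initial fraction of binders in the repertoire
    binder_recovery     -- assumed fraction of binders recovered per round
    background_recovery -- assumed fraction of non-binders carried over
    rounds              -- number of selection/amplification cycles
    """
    history = [fraction]
    for _ in range(rounds):
        kept_binders = fraction * binder_recovery
        kept_background = (1.0 - fraction) * background_recovery
        # Uniform amplification preserves the ratio, so only the
        # selection step changes the composition of the pool.
        fraction = kept_binders / (kept_binders + kept_background)
        history.append(fraction)
    return history


if __name__ == "__main__":
    # Illustrative values only: one binder per million molecules,
    # 50% binder recovery, 0.05% background carry-over, three rounds.
    for round_number, value in enumerate(enrich(1e-6, 0.5, 0.0005, 3)):
        print(round_number, value)
```

In this model each round multiplies the ratio of binders to non-binders by binder_recovery / background_recovery, which is why a rare active molecule can come to dominate the pool after only a few cycles when that ratio is large.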
The main advantages of the genetic methods reside in the capacity for cloning and amplification of DNA sequences, which allows enrichment by serial selection and provides a facile method for decoding the structure of active molecules. However, the genetic repertoires are restricted to nucleotides and to peptides composed of natural amino acids, and a more extensive chemical repertoire is required to populate the entire universe of binding sites. In contrast, chemical methods can provide limitless repertoires, but they lack the capacity for serial enrichment, and there are difficulties in discovering the structures of selected active molecules.