Recognition and binding of ligands regulate almost all biological processes, such as immune recognition, cell signalling and communication, transcription and translation, intracellular signalling, and catalysis, i.e., enzyme reactions. There is a longstanding interest in the art to identify molecules which can agonize or antagonize the activity of ligands such as hormones, growth factors, and neurotransmitters; which induce B-cell (antibody-mediated) or T-cell (cell-mediated) immunity; which can catalyze chemical reactions; or which can regulate gene expression at the level of transcription or translation.
Of particular interest are protein or peptide ligands. These comprise the majority of hormones, growth factors, neuroactive molecules, and immune epitopes. Furthermore, as discussed infra, most efforts at creating antagonists or agonists of receptor-mediated biological activity, or antibody or T-cell epitopes, have centered on peptides. The development of pharmaceutical agents keyed to the receptor binding sites, however, has been greatly hampered by the difficulty in determining the sequence of the peptide ligands. The sheer number and variety of such peptide sequences have made this an unattainable goal on any basis except by laboriously isolating a specific complex, identifying the location of the epitope, and sequencing that epitope. The problem is further complicated by the fact that the epitope often consists of amino acid residues that are not contiguous in the primary sequence.
Some researchers in the field have attempted to circumvent this time-consuming process by determining the amino acid sequence of a protein based on the nucleotide sequence of its complement. Proteins are large peptides composed of amino acids; each amino acid is encoded by one or more codons of three nucleic acid residues. For example, peptide A, containing the amino acid glutamine, would be encoded by a codon of the three nucleic acid residues: cytosine, adenine and guanine. The complement to this codon would be guanine (which binds to cytosine), thymine (which binds to adenine) and cytosine and it would code for an amino acid in peptide B. According to the complementarity theory, peptide B would bind to peptide A. In particular, Bost and Blalock (1989, Methods in Enzymology 168:16-28) have suggested that any given peptide will bind to another peptide that is encoded by a complementary sequence of nucleic acid residues and, with this information, have predicted the amino acid sequence of a complementary peptide. They have used the sequence to synthesize a peptide and to test its ability to bind.
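The codon arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration of the base-pairing step, not the procedure of Bost and Blalock: the names `PAIRS`, `CODON_TABLE` and `complement` are hypothetical, the codon table is a three-entry excerpt of the standard genetic code, and the direction in which the complementary codon is read is treated here as an open assumption, so both readings are shown.

```python
# Watson-Crick pairing for DNA bases, as used in the glutamine example.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Excerpt of the standard genetic code, limited to the codons used here.
CODON_TABLE = {"CAG": "Gln", "GTC": "Val", "CTG": "Leu"}


def complement(codon: str) -> str:
    """Return the base-by-base complement, in the same left-to-right order."""
    return "".join(PAIRS[base] for base in codon.upper())


codon_a = "CAG"                    # glutamine codon of peptide A
codon_b = complement(codon_a)      # guanine, thymine, cytosine
codon_b_reversed = codon_b[::-1]   # same bases read in the opposite direction

print(codon_a, CODON_TABLE[codon_a])
print(codon_b, CODON_TABLE[codon_b])
print(codon_b_reversed, CODON_TABLE[codon_b_reversed])
```

Which of the two readings of the complementary codon is used changes the predicted residue of peptide B, so any real application would have to fix that convention explicitly.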
This approach did not provide the solution to the problem, however, because the affinity of binding between the complementary peptides was generally very low and required complementary peptides larger than 15 residues. Moreover, this approach requires knowledge of either the amino acid sequence or the nucleic acid sequence of the binding partner of a protein of interest. Furthermore, this approach will not work for epitopes that consist of amino acid residues that are not contiguous in the primary sequence.
Recently, there have been several reports on the preparation of peptide libraries and their use in identifying peptide ligands that can bind to acceptors. One approach uses recombinant bacteriophage to produce large libraries. Using the "phage method" (Scott and Smith, 1990, Science 249:386-390; Cwirla et al., 1990, Proc. Natl. Acad. Sci. 87:6378-6382; Devlin et al., 1990, Science 249:404-406), very large libraries can be constructed (10^6 to 10^8 chemical entities), but the genetic code and the biological system impose severe inherent limitations on the versatility and diversity of the system. A second approach uses primarily chemical methods, of which the Geysen method (Geysen et al., 1986, Molecular Immunology 23:709-715; Geysen et al., 1987, J. Immunol. Methods 102:259-274) and the recent method of Fodor et al. (1991, Science 251:767-773) are examples. The methodology of Geysen et al. provides for the synthesis of a limited number of peptides (10^3 to 10^4) on polyethylene pins in a few days. The method of Fodor et al. utilizes a "light-directed spatially addressable parallel chemical synthesis" technique. This technique is also limited by the relative lack of development of photochemical peptide synthesis methods.
Large scale parallel concurrent peptide synthesis techniques have also been developed. Houghton reported synthesizing hundreds of analogous peptides simultaneously in polypropylene mesh packets (the "tea bag" method) (Houghton, 1985, Proc. Natl. Acad. Sci. U.S.A. 82:5131-5135). Berg et al. (1989, J. Am. Chem. Soc. 111:8024-8026) reported a novel polystyrene-grafted polyethylene film support that is suitable for peptide synthesis in parallel fashion. Both techniques used standard Boc amino acid resin with the standard deprotection, neutralization, coupling and wash protocols of the original solid phase procedure of Merrifield (1963, J. Am. Chem. Soc. 85:2149-2154).
Furka et al. (1988, 14th International Congress of Biochemistry, Volume 5, Abstract FR:013) described a method to produce a mixture of peptides by separately coupling each of three different amino acids, then mixing all of the resin. The procedure described by Furka et al. provides no satisfactory method to isolate a peptide of interest from the plurality of peptides produced.
Although useful, as a practical matter the chemical techniques of Geysen, Fodor, Houghton, Berg and Furka and co-workers allow the synthesis and testing of only hundreds to a few thousand peptides at a time. These techniques are quite limited in light of the millions of possible peptide sequences, one or more of which might correspond to the binding sites between the entities of interest. With 20 known common amino acids, in any sequence of five amino acids, there are 20^5, or 3.2 × 10^6, possible amino acid combinations. None of the procedures enables the synthesis of this many peptides at one time. Further multiplicity results from varying the peptide chain length. Similarly, conventional peptide synthesis, such as that described in Stewart and Young (1984, Solid Phase Peptide Synthesis, Second Edition, Pierce Chemical Co., Rockford, Ill.), does not provide a method for the synthesis of thousands to millions of peptides at a time.
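The counting argument above can be checked with a short calculation. The sketch below assumes only the 20 common amino acids named in the text; the function names `sequences_of_length` and `sequences_up_to_length` are hypothetical and introduced purely for illustration.

```python
AMINO_ACIDS = 20  # the 20 known common amino acids


def sequences_of_length(n: int, alphabet: int = AMINO_ACIDS) -> int:
    """Number of distinct linear sequences of exactly n residues."""
    return alphabet ** n


def sequences_up_to_length(n: int, alphabet: int = AMINO_ACIDS) -> int:
    """Number of distinct sequences of length 1 through n, inclusive."""
    return sum(sequences_of_length(k, alphabet) for k in range(1, n + 1))


pentapeptides = sequences_of_length(5)          # 3,200,000
up_to_pentapeptides = sequences_up_to_length(5)  # 3,368,420

print(f"pentapeptides: {pentapeptides:,}")
print(f"all lengths 1-5: {up_to_pentapeptides:,}")
```

Allowing the chain length to vary adds the shorter sequences to the total, which is the further multiplicity noted above; each additional residue multiplies the count for that length by 20.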
In addition, none of the other conventional peptide synthesis methods provides for the synthesis of a library of peptides bound to a solid phase support that is truly random. A truly random peptide library is one with a good statistical distribution of all the molecular species, such that the library contains approximately equimolar ratios of all individual species of peptides.
The synthesis of a truly random peptide library generally cannot be accomplished by simultaneously adding various amino acids into a single reaction vessel because the coupling rates of the various amino acids differ tremendously during solid phase peptide synthesis (SPPS) (Ragnarsson et al., 1971, Acta Chem. Scand. 25:1487, 1489; Ragnarsson et al., 1974, J. Org. Chem. 39:3837-3842). For example, the coupling rate of Fmoc-glycine to a growing peptide is much faster than that of Fmoc-valine, probably due to steric hindrance from the bulky side chain of valine. If one were to mix all 20 activated eukaryotic L-amino acids with the resin during each cycle of coupling, the most rapidly reacting amino acids would be preferentially incorporated into the peptide, and equimolar ratios of each peptide species would not be obtained. Furthermore, each of the possible nucleophiles will have a different reactivity.
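The effect of unequal coupling rates on library composition can be illustrated with a deliberately simplified model. The sketch below assumes that all activated amino acids are present in equal, large excess, so that the fraction of resin sites taken by each amino acid in a cycle is proportional to its relative rate constant. The rate values are hypothetical and chosen only for illustration; they are not taken from Ragnarsson et al., and the model ignores the differing reactivity of the growing-chain nucleophile.

```python
from itertools import product

# Hypothetical relative coupling rates (illustrative values only).
RELATIVE_RATES = {"Gly": 10.0, "Ala": 5.0, "Val": 1.0}


def incorporation_fractions(rates: dict) -> dict:
    """Fraction of sites acylated by each amino acid in one mixed coupling cycle."""
    total = sum(rates.values())
    return {name: rate / total for name, rate in rates.items()}


def species_abundance(fractions: dict, length: int) -> dict:
    """Expected mole fraction of every peptide species after `length` cycles."""
    abundance = {}
    for residues in product(fractions, repeat=length):
        share = 1.0
        for residue in residues:
            share *= fractions[residue]
        abundance["-".join(residues)] = share
    return abundance


fractions = incorporation_fractions(RELATIVE_RATES)
tripeptides = species_abundance(fractions, length=3)

most = max(tripeptides, key=tripeptides.get)
least = min(tripeptides, key=tripeptides.get)
ratio = tripeptides[most] / tripeptides[least]

print(f"equimolar share would be {1 / len(tripeptides):.4f}")
print(f"{most}: {tripeptides[most]:.4f}, {least}: {tripeptides[least]:.6f}")
print(f"most/least abundant ratio: {ratio:.0f}")
```

Under these assumed rates, a tenfold difference in a single coupling step compounds to a thousandfold difference between the most and least abundant tripeptide, which is why mixed coupling in one vessel does not yield approximately equimolar species.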
In addition, none of the prior peptide synthesis methods provides for the synthesis of a library of greater than 10^5 peptides in which a single peptide species is attached to a single solid phase support. The representation of only one species on a support would greatly enhance current techniques for isolating peptides.
Thus, there is a need in the art for a library of truly random peptide sequences and oligonucleotide sequences, i.e., bio-oligomer sequences, in which a single bio-oligomer species can be readily and quickly isolated from the rest of the library. There is also a need in the art for a method for quickly and inexpensively synthesizing thousands to millions of these truly random bio-oligomer sequences.