The instant invention relates to a method for preparing mixtures of cyclic peptides using novel solid phase methods such that a variety of products are prepared, in groups, possessing diversity in size, length (molecular weight), and structural elements. These are then analyzed for the ability to bind specifically to an antibody, receptor, or other ligate. Such a collection may provide a ligand library containing specific ligands for any ligate even though there are a greater number of conformations available to any one sequence. A peptide library containing known and random peptide sequences can be easily surveyed for strong ligands and provides a powerful new tool for the cell biologist for studying molecular recognition. Moreover, the present invention provides a means for displaying the peptides in these libraries on a selected surface, allowing the library to be surveyed without the need to pick through peptides one at a time. Such a tool provides a means of recognizing a new class of agonists, antagonists, enzyme inhibitors, virus blockers, vaccines, and other pharmaceuticals.
The potential of peptide libraries for generating structurally diverse compounds is impressive. Using only the 20 common amino acids, one can readily generate 400 dipeptides or 6.4 × 10^7 hexapeptides (see Table I following).
TABLE I
Listing of Theoretical Numbers of Cyclic Peptides of N Ring Size Using 20 Common Amino Acids.
______________________________________
Ring Size (N)   Multiples                        Number of peptides*
______________________________________
3               20 × 20 × 20                     8,000
4               20 × 20 × 20 × 20                160,000
5               20 × 20 × 20 × 20 × 20           3,200,000
6               20 × 20 × 20 × 20 × 20 × 20      64,000,000
______________________________________
*Does not account for peptides that are redundant due to symmetry (e.g., cyclo(Ala-Ala-Ala)).
When the nineteen corresponding D-amino acids are included, the number rises to over 1500 dipeptides and over 3.5 billion linear hexapeptides; the magnitude of this latter number can best be appreciated by calculating that if only 1 milligram of each peptide were present, the weight of the 3.5 billion peptides would nevertheless exceed three tons.
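The counts quoted above follow from simple exponentiation, and a short Python sketch makes the arithmetic easy to check. The function names are hypothetical, and the sketch assumes that every position varies independently and that a ton is taken as a metric ton (1 × 10^9 mg).

```python
def library_size(alphabet_size: int, length: int) -> int:
    """Number of linear sequences of a given length over an amino acid alphabet."""
    return alphabet_size ** length


def total_mass_tonnes(n_peptides: int, mg_each: float = 1.0) -> float:
    """Total mass in metric tons if each peptide is present at mg_each milligrams."""
    return n_peptides * mg_each / 1e9  # 1 metric ton = 1e9 mg


# 20 common amino acids; 39 = 20 common + 19 D-isomers (glycine has no D-form).
for length in (2, 3, 4, 5, 6):
    print(length, library_size(20, length), library_size(39, length))

print(total_mass_tonnes(library_size(39, 6)))
```

With 39 building blocks the hexapeptide count is 39^6 = 3,518,743,761, which at 1 mg each comes to roughly 3.5 metric tons.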
One of the first practical applications of multiple peptide synthesis is attributed to Geysen, H. M., et al., Proc. Natl. Acad. Sci. U.S.A., 81, 3998-4002 (1984), and is referred to as the "pin method." This procedure involves the use of Merrifield solid phase peptide synthesis, in which one end of a growing polypeptide chain is covalently linked to an insoluble support. In this case, a comb-like arrangement of dozens of pins is dipped into differentially filled aliquots of amino acids, thereby providing a defined, discrete set of synthetic peptides, with a unique sequence at the head of each pin. Thus a large group of linear peptides could be rapidly synthesized and, if necessary, deprotected and assayed while still attached to the solid support. Alternatively, the peptides could be cleaved from the support and independently characterized and tested.
The second major development in multiple synthesis involved work by Houghten, R. A., Proc. Natl. Acad. Sci. U.S.A., 82, 5131-5135 (1985). In this procedure, different peptides are simultaneously prepared by solid phase methods, but individual peptide sequences are segregated from one another, when required, by being held in solvent-permeable plastic bags (the "tea-bag" method). Selected packets are removed and subjected to their own chemistries (adding different amino acids at one position) and then recombined when common amino acids are used, to minimize the synthetic workload. Thus, a series of 10 linear peptides of sequence A-B-X_n-D-E can be prepared with X_1 through X_10 merely by separating the 10 packets after adding a common amino acid, "D", then recombining the packets after adding X_n to carry out the couplings of B and A.
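The packet bookkeeping in the A-B-X-D-E example can be sketched in Python. This is an illustration only: the function name `tea_bag_series` and the way reaction steps are counted are assumptions, residues are treated as plain labels, and chains are grown from the C-terminus (E first) as in Merrifield synthesis. Counting the first residue E as an ordinary coupling is a simplification, since in practice it is the resin attachment.

```python
def tea_bag_series(variants, shared_c=("E", "D"), shared_n=("B", "A")):
    """Simulate the packet scheme for peptides of sequence A-B-X-D-E.

    Returns (sequences, pooled_steps, separate_steps). sequences maps each
    variant label to its sequence written N-terminus to C-terminus.
    """
    # One packet per variant; each records residues in coupling order (C-terminus first).
    packets = {x: [] for x in variants}
    pooled_steps = 0

    # Packets are pooled: one reaction per shared residue serves every packet.
    for residue in shared_c:
        for chain in packets.values():
            chain.append(residue)
        pooled_steps += 1

    # Packets are separated: each receives its own residue in its own reaction.
    for x, chain in packets.items():
        chain.append(x)
        pooled_steps += 1

    # Packets are recombined for the remaining shared residues.
    for residue in shared_n:
        for chain in packets.values():
            chain.append(residue)
        pooled_steps += 1

    sequences = {x: "-".join(reversed(chain)) for x, chain in packets.items()}
    # Reactions needed if every peptide were made entirely on its own.
    separate_steps = len(packets) * (len(shared_c) + 1 + len(shared_n))
    return sequences, pooled_steps, separate_steps


variants = [f"X{i}" for i in range(1, 11)]
sequences, pooled_steps, separate_steps = tea_bag_series(variants)
print(sequences["X1"], pooled_steps, separate_steps)
```

Under these counting assumptions the ten peptides need 14 reactions when packets are pooled for the shared residues, against 50 when each peptide is made separately.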
Further studies have been conducted by Houghten, Fodor and Hruby which describe the preparation of linear peptide libraries using modified peptide synthesis methods, while the synthesis group used genetic engineering techniques to prepare the peptide mixtures. The common theme of all these approaches is that literally millions of synthetic linear peptides can be prepared, more or less simultaneously, and that these peptide libraries can then be screened, collectively or in smaller groups, against specific antibodies or in bioassays for discovery of new lead compounds for various clinical problems. In this way, diversity of structures, meeting or exceeding the diversity found in natural products from rain forests or ocean dwellers, can potentially provide a rich new source of compounds for analysis by the pharmaceutical industry.
All of the above methods produce linear peptides. Most techniques use the most common twenty amino acids in their examples; however, this is not an inherent limitation. The use of unnatural amino acids (optically active isomers) provides a new challenge in terms of product characterization. Most analytical methods for peptide structural characterization, including amino acid analysis and peptide sequencing, are most useful when applied to compounds containing only the twenty natural amino acids.
However, Kerr et al. (J. Amer. Chem. Soc., 115, 2529-2531 (1993)) used a dimeric sequence formed using the trifunctional amino acid lysine, with a different sequence on each of the two dimeric arms. One arm is composed of at least some unusual amino acids, while the second arm contains a "coded" sequence composed only of common amino acids that may be readily sequenced. If the pairs of sequences in a peptide library are carefully coordinated, this approach combines the greater diversity available from unnatural amino acids with a resident and covalently linked reading code that can be used for sequencing. Obtaining the sequence of the unique natural amino acid arm automatically reveals the sequence of its paired arm, since the molecules are prepared concomitantly.
Peptide libraries generate structurally diverse compounds, but each linear hexapeptide contains at least two rotatable backbone bonds which, in linear form, are quite free to rotate. In a hexapeptide, if only 60° angle increments are considered, one must incorporate another 6^6 (46,656) variables for each peptide, since linear peptides are inherently flexible. This suggests that one is likely to see many false positives as well as false negatives, since conformational as well as structural diversity now plays such a major role.
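The conformational count above can be reproduced with a small Python sketch. The function name is hypothetical, and the sketch assumes, as the 6^6 figure implies, that six independent rotatable angles are counted per hexapeptide and that each is sampled on an evenly spaced grid.

```python
def conformer_count(n_angles: int, step_degrees: int = 60) -> int:
    """Number of grid conformations for n_angles independent rotatable angles."""
    if step_degrees <= 0 or 360 % step_degrees != 0:
        raise ValueError("step_degrees must be a positive divisor of 360")
    states_per_angle = 360 // step_degrees  # 60 degree steps give 6 states
    return states_per_angle ** n_angles


per_peptide = conformer_count(6)
print(per_peptide)
# Combined with the 20**6 linear hexapeptide sequences:
print(20 ** 6 * per_peptide)
```

A finer angular grid or a larger number of counted angles increases this figure rapidly, which is the point the text makes about the flexibility of linear peptides.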
While some recent versions of peptide combinatorial libraries may approach the theoretical maximum (Avogadro's number) of different structures in a molar unit of library product, the flexibility which is common to linear products is a significant limitation compared to the variety of conformational constraints that are found in most potent pharmaceuticals.
In order to synthesize a single defined peptide sequence those skilled in the art generally use the Merrifield method to "grow" peptide chains attached to solid supports. The process of synthesizing these individual peptides has been automated. Commercially available equipment can be used to synthesize peptides of one hundred or more amino acids in length. To obtain peptides of arbitrary length, the resulting peptides can be ligated with each other by using appropriate protective groups on the side chains and by employing techniques permitting the removal of the synthesized peptides from the solid supports without deprotecting them. Thus, the synthesis of individual peptides of arbitrary length is known in the art.
Combinatorial peptide libraries described to date have involved linear sequences. This is the case whether chemical (primarily solid phase peptide synthesis) or biological (combinatorial DNA libraries) techniques have been used for preparing the linear compounds.
Linear peptides are generally flexible molecules with entropic limitations on achieving productive biologically active conformers. For this reason many authors have described the advantages of using various types of conformational and topographical constraints to reduce these degrees of freedom (for example, see V. J. Hruby, Life Sci., 31, 189 (1982) and V. J. Hruby et al., Biochemical J., 268, 249 (1990)). These constraints may involve amide bond replacements, backbone and side chain alkyl substituents (to fix φ, ψ, and χ space), and use of heterocyclic amino acids such as proline (Pro) or tetrahydroisoquinoline carboxylic acid (Tic), among many other possibilities.
All of the above techniques focus on linear peptide mixtures; however, a preferred method of constraining peptides involves cyclization. Cyclic peptides may be prepared in which the ring is formed by oxidation of the naturally occurring cysteine residues yielding a disulfide bridged structure. This technique mimics the most common form of cyclization found among naturally occurring peptides and proteins but does not provide a convenient means of preparing other types of cyclic structures.
In order to prepare cyclic peptides, the most common technique is to employ amino acids with orthogonally protected functional groups such that some are removable selectively in the presence of others. Those skilled in the art can use these techniques to prepare peptides in solution in which the amino terminus is cyclized to the carboxyl terminus to form a ring. A naturally occurring example is the antibiotic gramicidin S. Alternatively, pairs of cysteine residues are oxidized to disulfide bonds to form one or more rings; the familiar naturally occurring cyclic peptide hormone oxytocin is an example of such a structure. Disulfide-bridged cyclic peptides of this kind have been prepared by O'Neil et al., Proteins, 14, 509-515 (1992); however, this example is limited to disulfide-forming cyclic hexapeptides.
An alternate approach to solving the problem of structural diversity is to utilize other means of forming cyclic peptides, including side chain-to-side chain amide bonds or side chain-to-backbone linkages. By combining the use of Merrifield (or related) solid phase methods with various solution cyclization procedures, it has proven possible to prepare many examples of known peptide hormones or their analogs.
If cyclization is of the head-to-tail variety, several advantages accrue. First, the molecule is far more likely to have a reduced number of conformational states available to it. This can often lead to more potent and/or more selective ligands to biological receptors or to tighter binding to antibody molecules. Examples in the literature include a δ-selective opioid analog known as DPDPE, a mini-somatostatin cyclic hexapeptide (Merck), and a potent and selective endothelin antagonist, BQ-123 (Banyu).
A second advantage of head-to-tail cyclic peptides is that the molecule is virtually resistant to two of the three major types of proteolytic enzymes. Neither aminopeptidases nor carboxypeptidases can act on it, since cyclization simultaneously removes both the amino (NH3+) and carboxylate (COO-) termini. The molecule's resistance to endopeptidases is also likely to be affected, but not in any predictable fashion.
Because of the above factors, any lead compounds obtained in a drug screening assay of cyclic peptide libraries are far more likely to resemble the final target peptide-inspired pharmaceuticals. This is especially true if D-amino acids and other unusual amino acids are incorporated into the library pool, giving rise to an even greater diversity of potential target analogs.
Finally, small cyclic compounds (with 4, 5, or 6 amino acid residues) are more likely to possess decreased conformational flexibility than compounds of greater ring size. Nevertheless, the greater structural diversity inherent in the larger rings could provide more examples of lead structures, and such rings are included in the disclosure, even though the preferred embodiment involves compounds containing 4-8 residues in the ring, with 5 and 6 residue rings being preferred in view of their greater ease in forming cyclic structures (see Table II below).
TABLE II
Calculated Combinations of Cyclic Peptides (Head-to-Tail Cyclizations); Based on Asp-Resin
______________________________________
No. of residues                  Linkage     With 19 amino acids     With 38 amino acids
in ring                          to resin    at variant positions    at variant positions
______________________________________
4 (cyclic Asp tetrapeptides)     Asp;Gln     6,859                   109,744
                                 Asn;Gln     27,436                  438,976
5 (cyclic Asp pentapeptides)     Asp;Gln     130,321                 2,085,136
                                 Asn;Gln     521,284                 8,340,544
6 (cyclic Asp hexapeptides)      Asp;Gln     2,476,099               3,010,936,300
                                 Asn;Gln     9,904,396               12,043,745,000
______________________________________
Although the synthesis of a particular peptide may be routine, it is necessarily laborious. This presents a large practical problem in a situation where it is not previously known which of a multiplicity of peptides is, in fact, the preparation desired. While it is theoretically possible to synthesize all possible candidates and test them with whatever assay is relevant (immunoreactivity with specific antibody, interaction with a specific receptor, particular biological activity, etc.), to do so using the foregoing method would be impractical. In general, the search for suitable peptides for a particular purpose has been conducted only in cases where there is some prior knowledge of the most probable successful sequence. Therefore, methods to systematize the synthesis of a multiplicity of peptides for testing in assay systems would have great benefits in efficiency and economy, and permit extrapolation to cases where nothing is known about the desired sequence.
Three such methods have so far been disclosed. One of them, that of Houghten, R. A., Proc. Natl. Acad. Sci. U.S.A., 82, 5131-5135 (1985), is a modification of the above Merrifield method using individual polyethylene bags. In the general Merrifield method, the C-terminal amino acid of the desired peptide is attached to a solid support and the peptide chain is formed by sequentially adding amino acid residues, thus extending the chain toward the N-terminus. The additions are carried out in sequential steps involving deprotection, attachment of the next amino acid residue in protected form, deprotection of the peptide, attachment of the next protected residue, and so forth.
In the Houghten method, individual polyethylene bags containing C-terminal amino acids bound to solid support can be mixed and matched through the sequential attachment procedures so that, for example, twenty bags containing different C-terminal residues attached to the support can be simultaneously deprotected and treated with the same protected amino acid residue to be next attached, and then recovered and treated uniformly or differently, as desired. The result of this is a series of polyethylene bags each containing a different peptide sequence. Although each bag will contain many peptides, all of the peptides in any one bag are the same. The peptides in each bag can then be recovered and individually biologically tested.
An alternative method has been devised by Geysen, H. M., et al., Proc. Natl. Acad. Sci. U.S.A., 81, 3998-4002 (1984). See also WO86/06487 and WO86/00991. This method is a modification of the Merrifield system wherein the C-terminal amino acid residues are bound to solid supports in the form of polyethylene pins and the pins are treated individually or collectively in sequence to attach the remaining amino acid residues. Without removing the peptides from the support, these peptides can then be efficiently and individually assessed for the desired activity, which in the case of the Geysen work is interaction with a given antibody. The Geysen procedure results in considerable gains in efficiency of both the synthesis and testing procedures, while nevertheless producing individual different peptides. It is workable, however, only in instances where the assay can be practically conducted on the pin-type supports used. If solution assay methods are required, the Geysen approach would be impractical.
A third method, described by Huebner and Santi, U.S. Pat. No. 5,182,366, is a procedure in which linear peptide mixtures may be prepared using mixed resins such that any particular resin has attached to it a single unique sequence. This is achieved by a series of splitting and recombining steps applied to resin pools, such that only one amino acid is coupled at a time in each pool; following each completed reaction, the pools are recombined, so that extending the process through n steps yields a large number of the statistically calculable sequences.
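The splitting and recombining of resin pools can be illustrated with a small simulation. This Python sketch is not taken from the cited patent: the function name `split_and_mix`, the bead count, and the three-residue alphabet are all assumptions chosen for illustration, and coupling is treated as perfectly efficient.

```python
import random
from itertools import product


def split_and_mix(n_beads, residues_per_step, seed=0):
    """Return one sequence per bead, written N-terminus to C-terminus."""
    rng = random.Random(seed)
    beads = [[] for _ in range(n_beads)]  # each bead carries one growing chain
    for residues in residues_per_step:
        rng.shuffle(beads)  # recombine and mix all pools
        k = len(residues)
        pools = [beads[i::k] for i in range(k)]  # split into k near-equal pools
        for residue, pool in zip(residues, pools):
            for chain in pool:
                chain.append(residue)  # only one amino acid is coupled per pool
    # Chains grow from the C-terminus, so reverse them for N-to-C notation.
    return ["-".join(reversed(chain)) for chain in beads]


steps = [["Ala", "Gly", "Phe"]] * 3  # three coupling steps, three pools each
library = split_and_mix(2700, steps)
possible = {"-".join(p) for p in product(steps[0], repeat=3)}
print(len(set(library)), "of", len(possible), "possible sequences observed")
```

Because each bead passes through exactly one pool per step, it ends up carrying a single sequence, while the bead population as a whole samples the 3^3 = 27 possible sequences.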
In principle the methods of preparing linear peptide libraries might be extended to the preparation of cyclic peptide mixtures (see Table III below).
TABLE III
Protected Amino Acids Used in Example 3; Total of 1296 Peptides
INITIAL ATTACHMENT: Boc-Asp-OFm (bound to resin)
______________________________________
Cycle 1              Cycle 2          Cycle 3            Cycle 4
______________________________________
Boc-Gly              Boc-Arg          Boc-Lys (ClZ)      Fmoc-Ala
Boc-Tyr (Cl, Bzl)    Boc-Gly          Boc-Ala            Fmoc-Phe
Boc-Val              Boc-Trp (For)    Boc-Thr (Bzl)      Fmoc-Tyr (OBut)
Boc-Leu              Boc-Phe          Boc-Phe            Fmoc-Ile
Boc-Ser (Bzl)        Boc-Met          Boc-Pro            Fmoc-His (Fmoc)
Boc-Glu (OcHx)       Boc-Gln          Boc-Glu (OcHx)     Fmoc-Gly
______________________________________
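The total of 1296 peptides in Table III is the product of six choices in each of four coupling cycles (6 × 6 × 6 × 6). The following Python sketch enumerates the combinations. The residue lists are read from Table III with protecting groups omitted; the `cyclo(...)` naming, the function name, and the assumption that the ring closes through the resin-bound Asp are illustrative and are not taken from Example 3.

```python
from itertools import product

# Residues per coupling cycle, as listed in Table III (protecting groups omitted).
CYCLES = [
    ["Gly", "Tyr", "Val", "Leu", "Ser", "Glu"],  # cycle 1
    ["Arg", "Gly", "Trp", "Phe", "Met", "Gln"],  # cycle 2
    ["Lys", "Ala", "Thr", "Phe", "Pro", "Glu"],  # cycle 3
    ["Ala", "Phe", "Tyr", "Ile", "His", "Gly"],  # cycle 4
]


def enumerate_library(cycles, anchor="Asp"):
    """List every ring, one residue chosen per cycle, closed through the anchor."""
    names = []
    for choice in product(*cycles):
        # Later cycles sit nearer the N-terminus, so reverse the coupling order.
        names.append("cyclo(" + "-".join(reversed(choice)) + "-" + anchor + ")")
    return names


library = enumerate_library(CYCLES)
print(len(library), library[0])
```

Since Asp does not appear in any of the four cycles, it marks a unique position in each ring, so no two of the enumerated pentapeptides are rotations of one another.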
However, most peptide cyclizations are carried out in solution under high dilution conditions in order to avoid dimerization and polymerization. Furthermore, attempting to cyclize mixtures of linear sequences would predictably lead to virtually intractable products with numerous unidentifiable and inseparable components.
Most peptide libraries involve the use of Merrifield solid phase synthesis to ensure synthetic ease and feasibility. This approach is normally not compatible with end-to-end cyclization, since the C-terminal residue is usually the point of attachment, thereby precluding cyclization without first cleaving the peptide. Cleaving the peptide from the solid support creates two new problems: 1) making possible dimeric and oligomeric linear and cyclic structures, and 2) rendering the process of characterizing and retaining groups of identical peptides nearly impossible.
It is known that peptide libraries can provide a novel pool of target compounds. It is also known that there are many advantages to cyclic peptides, but most known methods of preparing peptide libraries are not amenable to the preparation of cyclic peptides. In a paper describing linear peptides (Lam et al., Nature, 354, No. 7 (1991)), the authors mention the possibility of cyclic peptides, but that reference does not teach their preparation. In a paper by O'Neil et al., Proteins, 14, 509-515 (1992), cyclic peptide mixtures are constructed on the surfaces of phages, but the compounds are not proven to be cyclic and the ring is formed by disulfide oxidation of cysteine.
Using peptide synthesis techniques, the present invention describes an innovative method for preparing libraries of cyclic peptides using resin-bound cyclization that is compatible with the presence of other peptides on accompanying beads and even on the same bead. The present invention further demonstrates the feasibility of this technique by synthesizing, both individually and collectively, cyclic peptides and by fully characterizing the products to demonstrate their structural and stereochemical integrity. This invention also solves the problem of dimerization, and it is demonstrated that the resin-bound cyclization produces monomeric cyclic components.