The recognition and binding of ligands regulate almost all biological processes, such as immune recognition, cell signaling and communication, transcription and translation, intracellular signaling, and catalysis, i.e., enzyme reactions. There is a long-standing interest in the art in identifying and synthesizing natural or unnatural ligand molecules which can agonize or antagonize the activity of ligands such as hormones, growth factors, and neurotransmitters; which induce B-cell (antibody-mediated) or T-cell (cell-mediated) immunity; which can catalyze chemical reactions; or which can regulate gene expression at the level of transcription or translation. A large proportion of such ligands are proteins, peptides, and peptidomimetics.
The traditional approach to ligand and drug discovery relies heavily on a mixture of serendipity and hard work. The screening of natural products from animal and plant tissues or from fermentation broths, and the random screening of archived synthetic molecules, have been the most productive avenues for the identification of new lead compounds.
However, recent trends in the search for novel pharmacological agents have focused on the preparation of combinatorial libraries as potential sources of new leads for drug discovery. At the heart of this new field of “combinatorial chemistry” is a collection of differing molecules which can be prepared either non-biosynthetically or biosynthetically and screened for biological activity in a variety of formats. Through the use of non-biosynthetic techniques, e.g., encoding, spatially addressing and/or deconvolution, combinatorial libraries of peptides, peptidomimetics and non-peptide-based molecules can be synthesized by batch processes and, importantly, the molecular identity of individual members of the library can be ascertained in a drug screening format (e.g. Lam et al. (1993) Gene 137, 13-16; Dooley et al. (1994) Science 266, 2019-2022). While non-biosynthetic libraries have the advantage of being unrestricted to biological monomers (such as natural amino acids and nucleotides) and their derivatives, they have the disadvantage of being limited in the number of molecules that may be screened within several weeks: usually 10⁵ to 10⁸ at most, which is too few molecules for favoring the identification of high affinity ligands for a target of interest (Roberts (1999) Curr. Op. Chem. Biol. 3, 268-273; Wilson et al. (2001) PNAS 98, 3750-3755). Biosynthetic libraries, however, often do not suffer from this limitation because there are examples of such libraries that enable 10¹⁵ different peptide, RNA or DNA molecules to be screened within several weeks (Roberts, supra). This is achieved by reiterative selection and amplification of individual biosynthetic library members, often with associated mutagenesis steps (e.g. affinity maturation, mutagenic PCR, or DNA shuffling (Roberts, supra)) in a process analogous to Darwinian evolution, sometimes termed directed evolution.
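To put the library sizes quoted above in perspective, the following minimal sketch compares them with the theoretical sequence space of L-peptides built from the 20 natural amino acids. It is an illustration only: the function names are hypothetical, and it assumes an idealized library in which every member is a distinct sequence, which real libraries do not achieve.

```python
# Illustrative arithmetic only; names are hypothetical and the library is
# assumed to contain no duplicate sequences (an idealization).

ALPHABET_SIZE = 20  # the 20 naturally occurring amino acids


def sequence_space(length, alphabet=ALPHABET_SIZE):
    """Number of distinct linear peptides of the given length."""
    return alphabet ** length


def longest_fully_covered_length(library_size, alphabet=ALPHABET_SIZE):
    """Largest peptide length whose full sequence space fits in the library."""
    length = 0
    while sequence_space(length + 1, alphabet) <= library_size:
        length += 1
    return length


if __name__ == "__main__":
    for label, size in (("non-biosynthetic, upper bound", 10**8),
                        ("biosynthetic", 10**15)):
        n = longest_fully_covered_length(size)
        print(f"{label}: {size:.0e} members can cover every peptide "
              f"up to length {n} ({sequence_space(n):.2e} sequences)")
```

Under these assumptions, a library of 10⁸ members can at best contain every hexapeptide (20⁶ = 6.4 × 10⁷ sequences), whereas 10¹⁵ members can in principle contain every 11-mer (20¹¹ ≈ 2.0 × 10¹⁴ sequences).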
Many prior methods that allowed the isolation of proteins from partially or fully randomized pools did so through an in vivo step. Methods of this sort include monoclonal antibody technology (Milstein, Sci. Amer. 243:66 (1980); and Schultz et al., J. Chem. Engng. News 68:26 (1990)), phage display (Smith, Science 228:1315 (1985); Parmley and Smith, Gene 73:305 (1988); and McCafferty et al., Nature 348:552 (1990)), peptide-lac repressor fusions (Cull et al., PNAS 89:1865 (1992)), and classical genetic selections. Each of these methods relies on a topological link between the protein and the nucleic acid, since only nucleic acids can be replicated. Thus, the information of the protein is retained and can be recovered in readable, nucleic acid form.
Alternative protein selection technologies are performed without in vivo steps. The stalled translation method, often termed ribosome display, is a technique in which selection is for some property of a nascent protein chain that is still complexed with the ribosome and its mRNA (Kawasaki U.S. Pat. No. 5,658,754; Tuerk and Gold, Science 249:505 (1990); Irvine et al., J. Mol. Biol. 222:739 (1991); Korman et al., PNAS 79:1844-1848 (1982); Mattheakis et al., PNAS 91:9022-9026 (1994); Mattheakis et al., Meth. Enzymol. 267:195 (1996); and Hanes and Pluckthun, PNAS 94:4937 (1997)). The mRNA-protein fusion method or mRNA display (Nemoto et al. (1997) FEBS Lett. 414, 405-408; Yanagawa et al. U.S. Pat. No. 6,228,994; Szostak et al. U.S. Pat. Nos. 6,281,344, 6,261,804, 6,258,558, 6,214,553, and 6,207,446; Roberts and Szostak (1997) PNAS 94, 12297-12302) covalently couples the mRNA directly to its protein product via a DNA/puromycin linker. A method for synthesizing “naked” mRNA-peptide fusions that is not compromised by the presence of stop codons is to synthesize peptides in micelles in such a way that they can dissociate from the ribosomes and then rebind to their specific mRNAs (e.g. proteins containing streptavidin sequences will bind to biotinylated mRNA; Doi and Yanagawa (1999) FEBS Lett. 457, 227-230).
The prior art “natural” (L-) peptide library techniques, however, suffer from a number of disadvantages. First, the libraries, which consist almost entirely of chiral monomers (amino acids), lack the enantiomers of the chiral monomers. For example, with L-peptide libraries, while the 20 naturally occurring amino acids provide a wide range of steric, electronic and functional groups, the chirality of the C-alpha carbon effectively limits the three-dimensional shape space which is accessible by the prior art display technology. L-peptide libraries also lack a number of common organic chemistry functional groups which may be helpful for forming non-covalent or covalent complexes with targets (e.g. alkene, alkyl urea, alkyl halide, and ketone), and lack the enormous additional shape diversity achievable with “unnatural” amino acids (either previously synthesized or theoretical). Moreover, as therapeutic agents, peptides with natural L-amino acids are often less preferable than their unnatural enantiomers (D-peptides) or analogs because L-peptides can be limited in use by poor pharmacokinetic profiles due to in vivo processing. For example, L-peptides can be rapidly degraded by proteases after administration to an animal, thus requiring a higher effective dose. Furthermore, pharmaceutical peptides can elicit strong immunogenic responses in patients, further contributing to their rapid clearance and also causing inflammatory reactions that may be toxic. One approach to preventing the degradation of the therapeutic peptide has been to generate non-hydrolyzable peptide analogs such as retro-inverso analogs (cf. Sisto et al. U.S. Pat. No. 4,522,752), retro-enantio analogs (cf. Goissis et al. (1976) J Med Chem 19:1287-90), trans-olefin derivatives (cf. Shue et al. (1987) Tetrahedron Letters 28:3225), and phosphonate derivatives (cf. Loots et al., in Peptides: Chemistry and Biology (Escom Science Publishers, Leiden, 1988), p. 118).
However, in most instances the backbone of the peptide is altered in order to render the peptidomimetic resistant to proteolysis. In doing so, the resulting peptidomimetic can suffer from decreased bioactivity through loss of certain binding contacts between the natural peptide backbone and target receptor, as well as changes in the steric space relative to the peptide due to alteration in dihedral angles and the like. Another problem is that most L-peptides do not cross biological membranes readily because of their hydrophilicity. In contrast, D-peptides and peptides containing other unnatural amino acids (peptidomimetics) such as N-methyl amino acids have increased resistance to proteases, and the peptidomimetic drug Cyclosporin A can cross membranes and is orally available, in part because it contains several N-methyl peptide linkages which are more hydrophobic than natural peptide linkages (Zawadzke and Berg (1992) J Am Chem Soc 114:4002; Walsh et al. (1992) J. Biol. Chem. 267, 13115-13118). Unfortunately, chemically synthesized (non-biosynthetic) peptidomimetic libraries, such as D-peptide libraries (Lam et al., supra; Dooley et al., supra), suffer the limitation of library size discussed above, and methodological tricks to overcome the size limit of peptidomimetic libraries, such as mirror-image phage or ribosome display (Schumacher et al. (1996) Science 271, 1854-1857; Eckert et al. (1999) Cell 99, 103-115; Forster et al. PCT publication WO97/35194), are limited by the onerous requirement of chemically synthesizing an enantiomeric target.
Proteins, peptides and peptidomimetics are currently synthesized in three different ways, each with its own inherent limitations:
1. Synthetic peptide chemistry can be used routinely for the synthesis in high yield and purity of very diverse peptidomimetics of up to about 30 residues in length (Eckert et al., supra).
However, the method is inefficient or impractical for longer products because of inefficient coupling steps, purification problems, and folding difficulties. There are also synthetic restrictions because of the need for compatible protecting groups for all of the reactive side chains in a desired product. Furthermore, synthetic peptidomimetics cannot be genetically encoded for reiterative selection, amplification, and mutation (evolution), limiting the complexity of synthetic peptidomimetic libraries to about 10⁸ molecules, too few for optimal drug discovery.
2. In vivo translation using living cells is widely used for the efficient synthesis and posttranslational modification of short or long proteins from a genetically encoded natural or recombinant DNA sequence.
However, synthesis may be inefficient if the gene product is toxic, and there may be difficult purification and refolding problems, particularly if the protein is expressed in inclusion bodies. Most importantly, the method suffers from the inability to incorporate multiple unnatural amino acids selectively or control the post-translation modification process (e.g. protease-catalysed processing or degradation).
3. In vitro translation with crude cell extracts generally overcomes the toxicity problem (but does not control post-translational modifications), may result in easier purification and folding, and allows the selective incorporation of a single unnatural amino acid per protein using an artificial suppressor tRNA (Noren et al. (1989) Science 244,182-188).
However, the incorporation of an unnatural amino acid by this approach usually suffers from much lower yields than in vivo systems because it relies on inherently inefficient suppressor tRNAs competing with termination factors. Although over one hundred different unnatural amino acids have been incorporated on an individual basis (e.g. Mendel et al. (1995) Annu. Rev. Biophys. Biomol. Struct. 24, 435-462), this strategy has been restricted to selective incorporation of only a single unnatural amino acid per protein at only one of the three termination (nonsense) codons (the UAG codon) because of competition at amino acid (sense) codons from natural amino acids catalysed by the tRNA charging and proofreading activities of the twenty different aminoacyl tRNA synthetases, and because an attempt to use a second termination codon (UGA) failed due to readthrough by the ribosome (Cload et al. (1996) Chem. and Biol. 3, 1033-1038).
Many attempts to incorporate unnatural amino acids selectively at sense codons in a generalizable manner have also failed. For example, in the most commonly used method for unnatural amino acid incorporation, where a high-specific-activity, radioactive-isotope derivative of a natural amino acid is incorporated by in vitro translation to synthesize a radiolabelled protein, it is well known that the specific activity of the radioactive amino acid is always substantially reduced by competition for incorporation by the unlabelled version of the amino acid present in the crude translation system, despite withholding the unlabelled version from the added unlabelled amino acid pool. A similar dilution of the analog is reported by the Promega company for their commercially available kit for incorporation of another reporter group, biotin-labelled lysine (literature accompanying Transcend™ non-radioactive translation systems). Furthermore, filtration of a crude translation extract to remove natural amino acids, followed by supplementation with all of the natural amino acids except lysine and with a lysine tRNA charged with an amino acid analog, resulted in incorporation of lysine analog to lysine at a ratio of only 1:3 to 1:4 (Crowley et al. (1993) Cell 73, 1101-1115). While a low selectivity of amino acid analog incorporation is sufficient for certain applications (Rothschild et al., U.S. Pat. No. 5,643,722), it is clearly incompatible with many applications, such as those requiring the amplification and characterization of genetically encoded specific peptidomimetic sequences. It has proved possible to incorporate two different unnatural amino acids using two different frameshifting suppressor tRNAs (Hohsaka et al. (1999) JACS 121, 12194-12195), and many identical unnatural amino acids have been incorporated using an inhibitor specific for Phe aminoacyl-tRNA synthetase (Baldini et al. (1988) Biochem. 27, 7951-7959).
However, neither of these methods is generalizable in the manner necessary for the incorporation of many different unnatural amino acids into a single peptidomimetic. In order to overcome these restrictions inherent in crude and in vivo translations, an elaborate strategy for expansion of the genetic code based on orthogonal tRNAs and orthogonal unnatural nucleic acid base pairs has been proposed, but development beyond a single in vitro-engineered termination codon (Bain et al. (1992) Nature 356, 537-539) has proved to be too challenging technically (Service (2000) Science 289, 232-235).
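The analog-to-lysine incorporation ratio of 1:3 to 1:4 cited above from Crowley et al. can be restated as a per-site analog fraction, which makes clear why such selectivity is inadequate for genetically encoded peptidomimetics. The sketch below is illustrative only: the function names are hypothetical, and the multi-site estimate assumes that incorporation at each lysine codon is independent, which the cited work does not state.

```python
# Illustrative arithmetic only. The independence of incorporation events at
# different codons is an assumption made here, not a result from the source.


def analog_fraction(analog_parts, natural_parts):
    """Fraction of sites carrying the analog for a given analog:natural ratio."""
    return analog_parts / (analog_parts + natural_parts)


def fully_substituted_fraction(per_site_fraction, n_sites):
    """Fraction of products carrying the analog at every one of n_sites codons,
    assuming each site is filled independently (assumption)."""
    return per_site_fraction ** n_sites


if __name__ == "__main__":
    for natural_parts in (3, 4):
        f = analog_fraction(1, natural_parts)
        for n_sites in (1, 3, 5):
            print(f"ratio 1:{natural_parts}, {n_sites} site(s): "
                  f"{fully_substituted_fraction(f, n_sites):.4%} fully substituted")
```

At a ratio of 1:3 only 25% of single-site products carry the analog, and under the independence assumption fewer than 2% of products with three such sites would be uniformly substituted, so the encoded sequence would rarely correspond to a single defined peptidomimetic.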
We envisioned that this problem potentially may be solved by using a pure in vitro translation system. Competition between unnatural amino acids and natural amino acids or termination factors could potentially be avoided by the omission of certain components such as certain amino acids, tRNAs, aminoacyl tRNA synthetases and/or termination factors. Unfortunately, the minimal requirements for mRNA-dependent polypeptide synthesis have been difficult to define because of the large number of macromolecules involved. Reconstitution of translation from purified components has been achieved for E. coli, but the number of translation factors required remains controversial.
The first purified translation system, constructed by the Weissbach laboratory, efficiently translated four E. coli mRNAs with strong dependencies on high salt-washed ribosomes, initiation factors (partial IF1 dependency), elongation factor Tu (EF-TuH), and groups of aminoacyl-tRNA synthetases, and partial dependencies on met-tRNAifMet formyltransferase and elongation factor G (EF-G), with no dependency on elongation factor Ts (EF-Ts) or termination factors (Kung et al. (1978) Arch. Biochem. and Biophys. 187, 457-463). Because of the difficulties in maintaining so many purified components and in removing trace contaminants, the search for additional general translation factors was facilitated by simplifying the system to di- or tripeptide synthesis from fMet-tRNAifMet and one or two elongator aminoacyl-tRNAs, thereby avoiding the requirement for aminoacyl-tRNA-synthesizing enzymes (Weissbach et al. (1984) Biotechniques 2, 16-22).
When a second group, led by Ganoza, extended the latter simplified approach to longer peptides using in vitro-charged total tRNA and release factors, translation of bacteriophages MS2 and f1 was found to be dependent on three additional factors, termed EF-P, W and rescue (Green et al. (1985) Biochem. Biophys. Res. Com. 126, 792-798; Ganoza et al. (1985) PNAS 82, 1648-1652). The absence of these factors resulted in inefficient processivity. For example, there was a predominance of di-, tri-, tetra- and pentapeptide pausing or premature termination products in hexapeptide synthesis reactions. A possibly related translation factor termed deaD/W2 (several kD bigger than W) and also EF-P have been cloned, are necessary for maximal growth, and are homologous to eukaryotic initiation factors (Aoki et al. (1991) Nucleic Acids Res. 19, 6215-6220; Aoki et al. (1997) J. Biol. Chem. 272, 32254-32259; Lu et al. (1999) Int. J. Biochem. Cell Biol. 31, 215-229).
In apparent conflict with the results of Ganoza, two other groups have reported synthesis of short peptides from aminoacyl-tRNA substrates using purified components without the addition of EF-P, W, W2 or rescue, although these two groups did not directly document the processivity of their systems or the purity of their ribosomes (Stade et al. (1995) Nucleic Acids Res. 23, 2371-2380; Pavlov et al. (1997) EMBO J. 16, 4134-4141). If the discrepancy is real, one can only speculate as to the explanation. For example, because EF-P, W, and rescue can be purified from ribosome preparations (Ganoza et al. (1996) Biochimie 78, 51-61), it is possible that the ribosomes used by the latter two groups, prepared by procedures very different from those used by Ganoza's group, were contaminated with EF-P, W, W2 and/or rescue. This is problematic because contamination with EF-P, W, W2 and/or rescue likely implies contamination with more abundant proteins, such as aminoacyl-tRNA synthetases and termination factors, that could cause unwanted reactions. Alternatively, EF-P, W, W2 and/or rescue may only be required for efficient processivity in Ganoza's system.
The ability to synthesize peptides or proteins from a pure translation system without added EF-P, W (sometimes called W2) and rescue is desirable, if possible, because these proteins are not well understood in terms of function, resulting in difficulty in assaying their activities and therefore following the purification of active protein. Furthermore, there is controversy with respect to the actual size of W (or W2) and whether W and W2 represent derivatives of the same proteins, and the gene for rescue is yet to be cloned.