Protein engineering is a powerful tool for modifying the structural, catalytic, and binding properties of natural proteins and for the de novo design of artificial proteins. Protein engineering relies on an efficient recognition mechanism for incorporating mutant amino acids into the desired protein sequences. Although this process has been very useful for designing new macromolecules with precise control of composition and architecture, a major limitation is that mutagenesis is restricted to the 20 naturally occurring amino acids. It is becoming increasingly clear, however, that incorporation of unnatural amino acids can extend the scope and impact of protein engineering methods.
Non-natural amino acids carrying a wide variety of novel functional groups have been incorporated into recombinant proteins by global, residue-specific replacement. Biosynthetic assimilation of non-canonical amino acids into proteins has been achieved largely by exploiting the capacity of the wild-type synthesis apparatus to utilize analogs of naturally occurring amino acids (Budisa 1995, Eur. J. Biochem. 230: 788-796; Deming 1997, J. Macromol. Sci. Pure Appl. Chem. A34: 2143-2150; Duewel 1997, Biochemistry 36: 3404-3416; van Hest and Tirrell 1998, FEBS Lett. 428 (1-2): 68-70; Sharma et al., 2000, FEBS Lett. 467 (1): 37-40). However, there are situations in which single-site substitution or incorporation of non-natural amino acids is required. Such a methodology would enable the tailoring of amino acid properties in a protein (size, acidity, nucleophilicity, hydrogen-bonding or hydrophobic properties, etc.) to fulfill a specific structural or functional property of interest. The ability to site-specifically incorporate such amino acid analogs into proteins would greatly expand our ability to rationally and systematically manipulate the structures of proteins, both to probe protein function and to create proteins with new properties. For example, the ability to synthesize large quantities of proteins containing heavy atoms would facilitate protein structure determination, and the ability to site-specifically substitute fluorophores or photo-cleavable groups into proteins in living cells would provide powerful tools for studying protein functions in vivo.
In recent years, several laboratories have pursued an expansion in the number of genetically encoded amino acids by using either a nonsense suppressor or a frame-shift suppressor tRNA to incorporate non-canonical amino acids into proteins in response to amber or four-base codons, respectively (Bain et al., J. Am. Chem. Soc. 111: 8013, 1989; Noren et al., Science 244: 182, 1989; Furter, Protein Sci. 7: 419, 1998; Wang et al., Proc. Natl. Acad. Sci. U.S.A. 100: 56, 2003; Hohsaka et al., FEBS Lett. 344: 171, 1994; Kowal and Oliver, Nucleic Acids Res. 25: 4685, 1997). Such methods insert non-canonical amino acids at codon positions that would normally terminate wild-type peptide synthesis (e.g., a stop codon or a frame-shift mutation). These methods have worked well for single-site insertion of novel amino acids. However, their utility in multisite position-specific (versus residue-specific) substitution or incorporation is limited by modest (20-60%) suppression efficiencies (Anderson et al., J. Am. Chem. Soc. 124: 9674, 2002; Bain et al., Nature 356: 537, 1992; Hohsaka et al., Nucleic Acids Res. 29: 3646, 2001). This is partly because too high a stop codon suppression efficiency will interfere with the normal translation termination of some non-targeted proteins in the organism. On the other hand, a low suppression efficiency will likely be insufficient to suppress more than one nonsense or frame-shift mutation site in the target protein, so that synthesizing a full-length target protein becomes increasingly difficult or impractical as the number of incorporated non-canonical amino acids grows.
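The compounding effect of a modest suppression efficiency can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that each suppression event succeeds independently and with the same efficiency, so that the fraction of full-length product is the efficiency raised to the number of suppressed sites. The function name `full_length_fraction` and the independence model are assumptions introduced here; they are not taken from the cited work.

```python
def full_length_fraction(efficiency: float, n_sites: int) -> float:
    """Estimate the fraction of translation events that read through every site.

    Assumption (illustrative only): each suppression event is independent
    and succeeds with the same probability `efficiency`.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    if n_sites < 0:
        raise ValueError("n_sites must be non-negative")
    return efficiency ** n_sites


if __name__ == "__main__":
    # The two bounds of the 20-60% range quoted in the text.
    for efficiency in (0.20, 0.60):
        for n_sites in range(1, 6):
            fraction = full_length_fraction(efficiency, n_sites)
            print(f"efficiency={efficiency:.2f} sites={n_sites} "
                  f"full-length={fraction:.4f}")
```

Under this simplified model, even the upper bound of 60% leaves roughly 22% full-length product at three sites and less than 8% at five sites.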
Efficient multisite incorporation has been accomplished by replacement of natural amino acids in auxotrophic Escherichia coli strains, for example, by using aminoacyl-tRNA synthetases with relaxed substrate specificity or altered editing activity (Wilson and Hatfield, Biochim. Biophys. Acta 781: 205, 1984; Kast and Hennecke, J. Mol. Biol. 222: 99, 1991; Ibba et al., Biochemistry 33: 7107, 1994; Sharma et al., FEBS Lett. 467: 37, 2000; Tang and Tirrell, Biochemistry 41: 10635, 2002; Datta et al., J. Am. Chem. Soc. 124: 5652, 2002; Doring et al., Science 292: 501, 2001). Although this method provides efficient incorporation of analogs at multiple sites, it suffers from the limitation that the novel amino acid must “share” codons with one of the natural amino acids. Thus, for any given codon position where either the natural or the novel amino acid can be inserted, there is relatively little control, beyond a probability of incorporation, over which amino acid is actually inserted. This may be undesirable because, for an engineered enzyme or protein, incorporation of a non-canonical amino acid at an unintended site may unexpectedly compromise the function of the protein, while failure to incorporate the non-canonical amino acid at the designed site defeats the design goal.
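The lack of positional control under codon sharing can be made concrete with a simple probabilistic sketch. It assumes, for illustration only, that the analog is inserted independently at each shared codon position with the same probability; the function names and the binomial model are assumptions introduced here, not a description of any cited experiment.

```python
from math import comb


def analog_count_distribution(p_analog: float, n_positions: int) -> list[float]:
    """Probability of exactly k analog insertions, for k = 0..n_positions.

    Assumption (illustrative only): every shared codon position takes the
    analog independently with the same probability `p_analog`.
    """
    if not 0.0 <= p_analog <= 1.0:
        raise ValueError("p_analog must lie between 0 and 1")
    if n_positions < 0:
        raise ValueError("n_positions must be non-negative")
    return [
        comb(n_positions, k) * p_analog**k * (1.0 - p_analog) ** (n_positions - k)
        for k in range(n_positions + 1)
    ]


def only_intended_site(p_analog: float, n_positions: int) -> float:
    """Probability that the analog lands at one chosen position and nowhere else."""
    if n_positions < 1:
        raise ValueError("n_positions must be at least 1")
    return p_analog * (1.0 - p_analog) ** (n_positions - 1)


if __name__ == "__main__":
    # Hypothetical example: 5 shared codon positions, 90% incorporation each.
    print(analog_count_distribution(0.9, 5))
    print(only_intended_site(0.9, 5))
```

In this model, a high per-position incorporation probability makes a product substituted only at the one designed position very unlikely, which is the limitation described above.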
In general, multisite substitution methods are relatively simple to carry out, but all sites corresponding to a particular natural amino acid throughout the protein are replaced. The extent of incorporation of the natural and non-natural amino acid may also vary. Furthermore, multisite incorporation of analogs often results in toxicity when cells are utilized, which makes it difficult to study the mutant protein in living cells. The present invention overcomes these hurdles by allowing for site-specific mutation of amino acids in proteins.
Certain embodiments disclosed herein provide a new technique, based on breaking the degeneracy of the genetic code, for the incorporation of replacement amino acids, including naturally occurring amino acids or non-standard or non-canonical amino acids, into proteins. Specifically, certain embodiments herein allow for high-fidelity position-specific substitution or incorporation of non-natural amino acids into proteins.
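The notion of degeneracy can be sketched with the standard genetic code, in which most amino acids are encoded by more than one codon. The example below builds the standard codon table, groups synonymous codons, and models the bookkeeping of reassigning one codon of a degenerate pair to a replacement amino acid. The choice of phenylalanine and codon TTT, the placeholder symbol `X`, and the helper names are hypothetical illustrations; this section does not state which codon any embodiment reassigns, and the code does not model the underlying biochemistry.

```python
from collections import defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(table: dict) -> dict:
    """Group codons by the amino acid (or stop signal) they encode."""
    groups = defaultdict(list)
    for codon, aa in table.items():
        groups[aa].append(codon)
    return dict(groups)


def reassign(table: dict, codon: str, new_symbol: str) -> dict:
    """Return a copy of the table with one codon given a new meaning."""
    if codon not in table:
        raise KeyError(f"unknown codon: {codon}")
    updated = dict(table)
    updated[codon] = new_symbol
    return updated


if __name__ == "__main__":
    print(synonymous_codons(CODON_TABLE)["F"])  # the two phenylalanine codons
    # Hypothetical reassignment: 'X' stands for a replacement amino acid.
    broken = reassign(CODON_TABLE, "TTT", "X")
    print(synonymous_codons(broken)["F"], synonymous_codons(broken)["X"])
```

After the reassignment, the two formerly synonymous codons carry distinct meanings, so the placement of the replacement amino acid is determined by which codon appears at each position in the coding sequence.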