The genetic code of every known organism, from bacteria to humans, encodes the same twenty common amino acids. Different combinations of these twenty natural amino acids form proteins that carry out virtually all the complex processes of life, from photosynthesis to signal transduction and the immune response. In order to study and modify protein structure and function, scientists have attempted to manipulate both the genetic code and the amino acid sequence of proteins. However, it has been difficult to remove the constraints imposed by the genetic code that limit proteins to twenty genetically encoded standard building blocks (with the rare exceptions of selenocysteine (see, e.g., A. Bock et al., (1991), Molecular Microbiology 5:515-20) and pyrrolysine (see, e.g., G. Srinivasan, et al., (2002), Science 296:1459-62)).
Some progress has been made to remove these constraints, although this progress has been limited and the ability to rationally control protein structure and function is still in its infancy. For example, chemists have developed methods and strategies to synthesize and manipulate the structures of small molecules (see, e.g., E. J. Corey, & X.-M. Cheng, The Logic of Chemical Synthesis (Wiley-Interscience, N.Y., 1995)). Total synthesis (see, e.g., B. Merrifield, (1986), Science 232:341-7) and semi-synthetic methodologies (see, e.g., D. Y. Jackson et al., (1994) Science 266:243-7; and, P. E. Dawson, & S. B. Kent, (2000), Annual Review of Biochemistry 69:923-60) have made it possible to synthesize peptides and small proteins, but these methodologies have limited utility with proteins over 10 kilodaltons (kDa). Mutagenesis methods, though powerful, are restricted to a limited number of structural changes. In a number of cases, it has been possible to competitively incorporate close structural analogues of common amino acids throughout proteins. See, e.g., R. Furter, (1998), Protein Science 7:419-26; K. Kirshenbaum, et al., (2002), ChemBioChem 3:235-7; and, V. Doring et al., (2001), Science 292:501-4.
In an attempt to expand the ability to manipulate protein structure and function, in vitro methods using chemically acylated orthogonal tRNAs were developed that allowed unnatural amino acids to be selectively incorporated in response to a nonsense codon (see, e.g., J. A. Ellman, et al., (1992), Science 255:197-200). Amino acids with novel structures and physical properties were selectively incorporated into proteins to study protein folding and stability and biomolecular recognition and catalysis. See, e.g., D. Mendel, et al., (1995), Annual Review of Biophysics and Biomolecular Structure 24:435-462; and, V. W. Cornish, et al. (Mar. 31, 1995), Angewandte Chemie-International Edition in English 34:621-633. However, the stoichiometric nature of this process severely limited the amount of protein that could be generated.
Unnatural amino acids have been microinjected into cells. For example, unnatural amino acids were introduced into the nicotinic acetylcholine receptor in Xenopus oocytes (e.g., M. W. Nowak, et al. (1998), In vivo incorporation of unnatural amino acids into ion channels in Xenopus oocyte expression system, Methods Enzymol. 293:504-529) by microinjection of a chemically misacylated Tetrahymena thermophila tRNA (e.g., M. E. Saks, et al. (1996), An engineered Tetrahymena tRNAGln for in vivo incorporation of unnatural amino acids into proteins by nonsense suppression, J. Biol. Chem. 271:23169-23175), and the relevant mRNA. This has allowed detailed biophysical studies of the receptor in oocytes by the introduction of amino acids containing side chains with unique physical or chemical properties. See, e.g., D. A. Dougherty (2000), Unnatural amino acids as probes of protein structure and function, Curr. Opin. Chem. Biol. 4:645-652. Unfortunately, this methodology is limited to proteins in cells that can be microinjected, and, because the relevant tRNA is chemically acylated in vitro and cannot be re-acylated, the yields of protein are very low.
To overcome these limitations, new components were added to the protein biosynthetic machinery of the prokaryote Escherichia coli (E. coli) (e.g., L. Wang, et al., (2001), Science 292:498-500), which allowed genetic encoding of unnatural amino acids in vivo. A number of new amino acids with novel chemical, physical or biological properties, including photoaffinity labels and photoisomerizable amino acids, keto amino acids, and glycosylated amino acids, have been incorporated efficiently and with high fidelity into proteins in E. coli in response to the amber codon, TAG, using this methodology. See, e.g., J. W. Chin et al., (2002), Journal of the American Chemical Society 124:9026-9027; J. W. Chin, & P. G. Schultz, (2002), ChemBioChem 11:1135-1137; J. W. Chin, et al., (2002), PNAS United States of America 99:11020-11024; and, L. Wang, & P. G. Schultz, (2002), Chem. Comm., 1-10. However, the translational machineries of prokaryotes and eukaryotes are not highly conserved; thus, components of the biosynthetic machinery added to E. coli often cannot be used to site-specifically incorporate unnatural amino acids into proteins in eukaryotic cells. For example, the Methanococcus jannaschii tyrosyl-tRNA synthetase/tRNA pair that was used in E. coli is not orthogonal in eukaryotic cells. In addition, the transcription of tRNA in eukaryotes, but not in prokaryotes, is carried out by RNA Polymerase III, and this places restrictions on the primary sequence of the tRNA structural genes that can be transcribed in eukaryotic cells. Moreover, in contrast to prokaryotic cells, tRNAs in eukaryotic cells need to be exported from the nucleus, where they are transcribed, to the cytoplasm, to function in translation. Finally, the eukaryotic 80S ribosome is distinct from the 70S prokaryotic ribosome. Thus, there is a need to develop improved components of the biosynthetic machinery to expand the eukaryotic genetic code.
This invention fulfills these and other needs, as will be apparent upon review of the following disclosure.