1. Technical Field
Engineered polypeptides and chimeric polypeptides having incorporated amino acids which enhance or otherwise modify properties of such polypeptides.
2. Description of Related Art
Genetic engineering allows polypeptide production to be transferred from one organism to another. In doing so, a portion of the production apparatus indigenous to an original host is transplanted into a recipient. Frequently, the original host has evolved certain unique processing pathways in association with polypeptide production which are not contained in or transferred to the recipient. For example, it is well known that mammalian cells incorporate a complex set of post-translational enzyme systems which impart unique characteristics to protein products of the systems. When a gene encoding a protein normally produced by mammalian cells is transferred into a bacterial or yeast cell, the protein may not be subjected to such post-translational modification and may not function as originally intended.
Normally, the process of polypeptide or protein synthesis in living cells involves transcription of DNA into RNA and translation of RNA into protein. Three forms of RNA are involved in protein synthesis: messenger RNA (mRNA) carries genetic information to ribosomes made of ribosomal RNA (rRNA) while transfer RNA (tRNA) links to free amino acids in the cell pool. Amino acid/tRNA complexes line up next to codons of mRNA, with actual recognition and binding being mediated by tRNA. Cells can contain up to twenty amino acids which are combined and incorporated in sequences of varying permutations into proteins. Each amino acid is distinguished from the other nineteen amino acids and charged to tRNA by enzymes known as aminoacyl-tRNA synthetases. As a general rule, amino acid/tRNA complexes are quite specific and normally only a molecule with an exact stereochemical configuration is acted upon by a particular aminoacyl-tRNA synthetase.
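By way of illustration only, the flow of information from DNA to mRNA to polypeptide described above can be sketched in a few lines of Python. The function names, the example sequence, and the five-codon table are assumptions chosen for brevity; the codon assignments shown are those of the standard genetic code, but the table is deliberately incomplete.

```python
# Partial codon table (standard genetic code); only the codons needed here.
CODON_TABLE = {
    "AUG": "Met",
    "GGU": "Gly",
    "CCU": "Pro",
    "GCU": "Ala",
    "UAA": "Stop",
}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA corresponding to a coding (sense) DNA strand (T -> U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> list[str]:
    """Read the mRNA codon by codon until a stop codon or the end is reached."""
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]  # KeyError if codon not in the partial table
        if amino_acid == "Stop":
            break
        residues.append(amino_acid)
    return residues


# Hypothetical 15-nucleotide coding sequence.
peptide = translate(transcribe("ATGGGTCCTGCTTAA"))
print("-".join(peptide))
```

In this sketch the lookup in `CODON_TABLE` stands in for the pairing of a charged tRNA with its codon; the specificity of the aminoacyl-tRNA synthetases is not modeled.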
In many living cells some amino acids are taken up from the surrounding environment and some are synthesized within the cell from precursors, which in turn have been assimilated from outside the cell. In certain instances, a cell is auxotrophic, i.e., it requires a specific growth substance beyond the minimum required for normal metabolism and reproduction which it must obtain from the surrounding environment. Some auxotrophs depend upon the external environment to supply certain amino acids. This feature allows certain amino acid analogs to be incorporated into proteins produced by auxotrophs by taking advantage of relatively rare exceptions to the above rule regarding stereochemical specificity of aminoacyl-tRNA synthetases. For example, proline is such an exception, i.e., the amino acid activating enzymes responsible for the synthesis of prolyl-tRNA complex are not as specific as others. As a consequence certain proline analogs have been incorporated into bacterial, plant, and animal cell systems. See Tan et al., Proline Analogues Inhibit Human Skin Fibroblast Growth and Collagen Production in Culture, Journal of Investigative Dermatology, 80:261-267(1983).
A method of incorporating unnatural amino acids into proteins is described, e.g., in Noren et al., A General Method For Site-Specific Incorporation of Unnatural Amino Acids Into Proteins, Science, Vol. 244, pp. 182-188 (1989) wherein chemically acylated suppressor tRNA is used to insert an amino acid in response to a stop codon substituted for the codon encoding residue of interest. See also, Dougherty et al., Synthesis of a Genetically Engineered Repetitive Polypeptide Containing Periodic Selenomethionine Residues, Macromolecules, Vol. 26, No. 7, pp. 1779-1781 (1993), which describes subjecting an E. coli methionine auxotroph to selenomethionine containing medium and postulates on the basis of experimental data that selenomethionine may completely replace methionine in all proteins produced by the cell.
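The stop-codon suppression approach described above can be pictured with the following minimal Python sketch. It is not the procedure of Noren et al.; it only illustrates the logic of replacing the codon for a residue of interest with a stop codon and then reading that codon as an analog. The choice of TAG as the stop codon, the placeholder name "Xaa" for the inserted analog, the function names, and the example sequence are all assumptions made for illustration.

```python
# Partial DNA codon table (standard genetic code); deliberately incomplete.
TABLE = {"ATG": "Met", "GGT": "Gly", "CCT": "Pro", "GCT": "Ala", "TAA": "Stop"}


def substitute_stop(cds: str, residue_index: int, stop: str = "TAG") -> str:
    """Replace the codon at the 0-based residue_index with a stop codon."""
    cds = cds.upper()
    start = residue_index * 3
    if len(cds) % 3 != 0 or not 0 <= start < len(cds):
        raise ValueError("sequence must be in frame and index must be in range")
    return cds[:start] + stop + cds[start + 3:]


def translate_with_suppressor(cds: str, suppressed: str = "TAG",
                              analog: str = "Xaa") -> list[str]:
    """Translate, reading the suppressed stop codon as the acylated analog."""
    residues = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        if codon == suppressed:
            residues.append(analog)  # suppressor tRNA inserts the analog here
            continue
        amino_acid = TABLE[codon]
        if amino_acid == "Stop":
            break
        residues.append(amino_acid)
    return residues


# Hypothetical gene: replace the proline codon (residue index 2).
mutant = substitute_stop("ATGGGTCCTGCTTAA", 2)
print(mutant, translate_with_suppressor(mutant))
```

The unsuppressed stop codon (TAA in this sketch) still terminates translation, so only the substituted position receives the analog.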
cis-Hydroxy-L-proline has been used to study its effects on collagen by incorporation into eukaryotic cells such as cultured normal skin fibroblasts (see Tan et al., supra) and tendon cells from chick embryos (see, e.g., Uitto et al., Procollagen Polypeptides Containing cis-4-Hydroxy-L-proline are Overglycosylated and Secreted as Nonhelical Pro-γ-Chains, Archives of Biochemistry and Biophysics, 185:1:214-221(1978)). However, investigators found that trans-4-hydroxyproline would not link with proline-specific tRNA of prokaryotic E. coli. See Papas et al., Analysis of the Amino Acid Binding to the Proline Transfer Ribonucleic Acid Synthetase of Escherichia coli, Journal of Biological Chemistry, 245:7:1588-1595(1970). Another unsuccessful attempt to incorporate trans-4-hydroxyproline into prokaryotes is described in Deming et al., In Vitro Incorporation of Proline Analogs into Artificial Proteins, Poly. Mater. Sci. Engin. Proceed., Vol. 71, pp. 673-674 (1994). Deming et al. report surveying the potential for incorporation of certain proline analogs, i.e., L-azetidine-2-carboxylic acid, L-γ-thiaproline, 3,4-dehydroproline and L-trans-4-hydroxyproline, into artificial proteins expressed in E. coli cells. Only L-azetidine-2-carboxylic acid, L-γ-thiaproline and 3,4-dehydroproline are reported as being incorporated into proteins in E. coli cells in vivo.
Extracellular matrix proteins (“EMPs”) are found in spaces around or near cells of multicellular organisms and are typically fibrous proteins of two functional types: mainly structural, e.g., collagen and elastin, and mainly adhesive, e.g., fibronectin and laminin. Collagens are a family of fibrous proteins typically secreted by connective tissue cells. Twenty distinct collagen chains have been identified which assemble to form a total of about ten different collagen molecules. A general discussion of collagen is provided by Alberts, et al., The Cell, Garland Publishing, pp. 802-823 (1989), incorporated herein by reference. Other fibrous or filamentous proteins include Type I IF proteins, e.g., keratins; Type II IF proteins, e.g., vimentin, desmin and glial fibrillary acidic protein; Type III IF proteins, e.g., neurofilament proteins; and Type IV IF proteins, e.g., nuclear laminins.
Type I collagen is the most abundant form of the fibrillar, interstitial collagens and is the main component of the extracellular matrix. Collagen monomers consist of about 1000 amino acid residues in a repeating array of Gly-X-Y triplets. Approximately 35% of the X and Y positions are occupied by proline and trans-4-hydroxyproline. Collagen monomers associate into triple helices which consist of one α2 and two α1 chains. The triple helices associate into fibrils which are oriented into tight bundles. The bundles of collagen fibrils are further organized to form the scaffold for extracellular matrix.
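The Gly-X-Y repeat and the proportion of X and Y positions occupied by proline or hydroxyproline can be checked for any given sequence with a short routine such as the following Python sketch. The function names and the nine-residue example are hypothetical, and the use of the one-letter code "O" for hydroxyproline is an assumption borrowed from common usage in the collagen literature rather than from the standard amino acid alphabet.

```python
def is_gly_x_y(seq: str) -> bool:
    """True if the sequence is a whole number of triplets, each starting with Gly (G)."""
    return (
        len(seq) > 0
        and len(seq) % 3 == 0
        and all(seq[i] == "G" for i in range(0, len(seq), 3))
    )


def imino_fraction(seq: str) -> float:
    """Fraction of X and Y positions occupied by proline (P) or hydroxyproline (O)."""
    xy_residues = [seq[i] for i in range(len(seq)) if i % 3 != 0]
    return sum(residue in "PO" for residue in xy_residues) / len(xy_residues)


# Hypothetical fragment made of the triplets GPO, GPA and GEK.
fragment = "GPOGPAGEK"
print(is_gly_x_y(fragment), imino_fraction(fragment))
```

For a full-length chain of about 1000 residues, the same calculation would be expected to return a value near the approximately 35% figure stated above, although the exact value depends on the chain examined.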
In mammalian cells, post-translational modification of collagen contributes to its ultimate chemical and physical properties and includes proteolytic digestion of pro-regions, hydroxylation of lysine and proline, and glycosylation of hydroxylated lysine. The proteolytic digestion of collagen involves the cleavage of pro regions from the N and C termini. It is known that hydroxylation of proline is essential for the mechanical properties of collagen. Collagen with low levels of 4-hydroxyproline has poor mechanical properties, as highlighted by the sequelae associated with scurvy. 4-hydroxyproline adds stability to the triple helix through hydrogen bonding and through restricting rotation about C—N bonds in the polypeptide backbone. In the absence of a stable structure, naturally occurring cellular enzymes contribute to degrading the collagen polypeptide.
The structural attributes of Type I collagen, along with its generally perceived biocompatibility, make it a desirable surgical implant material. Collagen is purified from bovine skin or tendon and used to fashion a variety of medical devices including hemostats, implantable gels, drug delivery vehicles and bone substitutes. However, when implanted into humans, bovine collagen can cause acute and delayed immune responses.
As a consequence, researchers have attempted to produce human recombinant collagen with all of its structural attributes in commercial quantities through genetic engineering. Unfortunately, production of collagen by commercial mass producers of protein such as E. coli has not been successful. A major problem is the extensive post-translational modification of collagen by enzymes not present in E. coli. Failure of E. coli cells to provide proline hydroxylation of unhydroxylated collagen proline prevents manufacture of structurally sound collagen in commercial quantities.
Another problem in attempting to use E. coli to produce human collagen is that E. coli prefers particular codons in the production of polypeptides. Although the genetic code is identical in both prokaryotic and eukaryotic organisms, the particular codon (of the several possible for most amino acids) that is most commonly utilized can vary widely between prokaryotes and eukaryotes. See, Wada, K.-N., Y. Wada, F. Ishibashi, T. Gojobori and T. Ikemura. Nucleic Acids Res. 20, Supplement: 2111-2118, 1992. Efficient expression of heterologous (e.g., mammalian) genes in prokaryotes such as E. coli can be adversely affected by the presence in the gene of codons infrequently used in E. coli, and expression levels of the heterologous protein often rise when rare codons are replaced by more common ones. See, e.g., Williams, D. P., D. Regier, D. Akiyoshi, F. Genbauffe and J. R. Murphy. Nucleic Acids Res. 16: 10453-10467, 1988 and Höög, J.-O., H. v. Bahr-Lindström, H. Jörnvall and A. Holmgren. Gene. 43: 13-21, 1986. This phenomenon is thought to be related, at least in part, to the observation that a low frequency of occurrence of a particular codon correlates with a low cellular level of the transfer RNA for that codon. See, Ikemura, T. J. Mol. Biol. 158: 573-597, 1982 and Ikemura, T. J. Mol. Biol. 146: 1-21, 1981. Thus, the cellular tRNA level may limit the rate of translation of the codon and therefore influence the overall translation rate of the full-length protein. See, Ikemura, T. J. Mol. Biol. 146: 1-21, 1981; Bonekamp, F. and F. K. Jensen. Nucleic Acids Res. 16: 3013-3024, 1988; Misra, R. and P. Reeves, Eur. J. Biochem. 152: 151-155, 1985; and Post, L. E., G. D. Strycharz, M. Nomura, H. Lewis and P. P. Lewis. Proc. Natl. Acad. Sci. U.S.A. 76: 1697-1701, 1979. In support of this hypothesis is the observation that the genes for abundant E. coli proteins generally exhibit bias towards commonly used codons that represent highly abundant tRNAs. See, Ikemura, T. J. Mol. Biol. 146: 1-21, 1981; Bonekamp, F. and F. K. Jensen. Nucleic Acids Res. 16: 3013-3024, 1988; Misra, R. and P. Reeves, Eur. J. Biochem. 152: 151-155, 1985; and Post, L. E., G. D. Strycharz, M. Nomura, H. Lewis and P. P. Lewis. Proc. Natl. Acad. Sci. U.S.A. 76: 1697-1701, 1979. In addition to codon frequency, the codon context (i.e., the surrounding nucleotides) can also affect expression.
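As an illustration of how the rare-codon content of a heterologous gene might be assessed, the following Python sketch counts the codons of a coding sequence that fall within a set designated as rare for the host. The set shown is an assumption for illustration only and is not taken from the references cited above; in practice it would be derived from a codon usage table for the host strain. The function names and the example sequence are likewise hypothetical.

```python
from collections import Counter

# Illustrative assumption: a small set of codons treated as rare in E. coli.
# Replace with values from an actual codon usage table for the host of interest.
RARE_IN_E_COLI = frozenset({"AGG", "AGA", "CGA", "CTA", "ATA", "CCC"})


def split_codons(cds: str) -> list[str]:
    """Split an in-frame coding sequence into codons."""
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("coding sequence must be a non-empty multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def rare_codon_report(cds: str, rare=RARE_IN_E_COLI):
    """Return (fraction of codons that are rare, count of each rare codon found)."""
    codons = split_codons(cds)
    counts = Counter(codon for codon in codons if codon in rare)
    return sum(counts.values()) / len(codons), dict(counts)


# Hypothetical 5-codon sequence: GGT CCC AGG GGT CCG.
fraction, found = rare_codon_report("GGTCCCAGGGGTCCG")
print(fraction, found)
```

A count of this kind measures codon frequency only; it does not capture codon context, which as noted above can also affect expression.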
Although it would appear that substituting preferred codons for rare codons could be expected to increase expression of heterologous proteins in host organisms, such is not always the case. Indeed, “it has not been possible to formulate general and unambiguous rules to predict whether the content of low-usage codons in a specific gene might adversely affect the efficiency of its expression in E. coli.” See page 524 of S. C. Makrides (1996), Strategies for Achieving High-Level Expression of Genes in Escherichia coli. Microbiological Reviews 60, 512-538. For example, in one case, various gene fusions between yeast α factor and somatomedin C were made that differed only in coding sequence. In these experiments, no correlation was found between codon bias and expression levels in E. coli. Ernst, J. F. and Kawashima, E. (1988), J. Biotechnology, 7, 1-10. In another instance, it was shown that, despite the higher frequency of optimal codons in a synthetic β-globin gene compared to the native sequence, no difference was found in the protein expression from these two constructs when they were placed behind the T7 promoter. Hernan et al. (1992), Biochemistry, 31, 8619-8628. Conversely, there are many examples of proteins with a relatively high percentage of rare codons that are well expressed in E. coli. A table listing some of these examples and a general discussion can be found in Makoff, A. J. et al. (1989), Nucleic Acids Research, 17, 10191-10202. In one case, introduction of non-optimal, rare arginine codons at the 3′ end of a gene actually increased the yield of expressed protein. Gursky, Y. G. and Beabealashvilli, R. Sh. (1994), Gene 148, 15-21.
Failure to provide post-translational modifications such as hydroxylation of proline and the presence in human collagen of rare codons for E. coli may be contributing to the difficulties encountered in the expression of human collagen genes in E. coli. 