1. Field of the Inventive Subject Matter
The inventive subject matter relates to novel compositions, methods, and kits for enhancing the expression, solubility, isolation, and purification of heterologous proteins. Further, the inventive subject matter relates to methods for generating proteins with novel N-terminal amino acids, unlike wild-type proteins, which are always translated from mRNA with methionine as the N-terminal amino acid.
2. Background
Functional genomic studies have been hampered by the inability to uniformly express and purify biologically active proteins in heterologous expression systems. Despite the use of identical transcriptional and translational signals in a given expression vector, expressed protein levels have been observed to vary dramatically. For this reason, several strategies have been developed to express heterologous proteins in bacteria, yeast, mammalian cells, and insect cells as gene-fusions (see Butt, T. R., S. Jonnalagadda, B. P. Monia, E. J. Sternberg, J. A. Marsh, J. M. Stadel, D. J. Ecker, and S. T. Crooke, 1989, Ubiquitin fusion augments the yield of cloned gene products in Escherichia coli, Proc Natl Acad Sci USA 86:2540-4; Ecker, D. J., J. M. Stadel, T. R. Butt, J. A. Marsh, B. P. Monia, D. A. Powers, J. A. Gorman, P. E. Clark, F. Warren, A. Shatzman, et al., 1989, Increasing gene expression in yeast by fusion to ubiquitin, J Biol Chem 264:7715-9; Ikonomou, L., Y. J. Schneider, and S. N. Agathos, 2003, Insect cell culture for industrial production of recombinant proteins, Appl Microbiol Biotechnol 62:1-20; and Kapust, R. B., and D. S. Waugh, 1999, Escherichia coli maltose-binding protein is uncommonly effective at promoting the solubility of polypeptides to which it is fused, Protein Sci 8:1668-74). Yet such strategies have proved ineffective or insufficient in practice because of poor expression levels, poor solubility, low yields, or a combination thereof.
The expression of heterologous genes in bacteria is by far the simplest and least expensive means available for research or commercial purposes. However, some heterologous gene products fail to attain their correct three-dimensional conformation in E. coli, while others become sequestered in large insoluble aggregates or “inclusion bodies” when overproduced (see Georgiou, G., and P. Valax, 1999, Isolating inclusion bodies from bacteria, Methods Enzymol 309:48-58; and Jonasson, P., S. Liljeqvist, P. A. Nygren, and S. Stahl, 2002, Genetic design for facilitated production and recovery of recombinant proteins in Escherichia coli, Biotechnol Appl Biochem 35:91-105). Denaturant-induced solubilization, followed by removal of the denaturant under conditions that favor refolding, is often required to produce a reasonable yield of the recombinant protein. Selection of open reading frames (hereinafter “ORFs”) for structural genomics projects has also shown that only about 20% of the genes expressed in E. coli render proteins that are soluble or correctly folded (see Waldo, G. S., B. M. Standish, J. Berendzen, and T. C. Terwilliger, 1999, Rapid protein-folding assay using green fluorescent protein, Nat Biotechnol 17:691-5). This figure is startlingly low, especially given that most scientists rely on E. coli for initial attempts to express gene products. Several gene fusion systems that produce fusion proteins incorporating putative expression enhancers such as NusA, maltose binding protein (MBP), glutathione-S-transferase (GST), and thioredoxin (Trx) have been developed (see Jonasson, P., S. Liljeqvist, P. A. Nygren, and S. Stahl, 2002, Genetic design for facilitated production and recovery of recombinant proteins in Escherichia coli, Biotechnol Appl Biochem 35:91-105). All of these systems have clear drawbacks, ranging from inefficient expression to inconsistent cleavage from the desired structure.
Thus, there is a need for more effective and efficient protein expression systems. This need is met by the inventive subject matter, which uses novel compositions and methods not heretofore known. The use of the new SUMO fusions and SUMO proteases disclosed herein circumvents the problems of the prior art, and significantly improves upon previously described expression systems based on Saccharomyces cerevisiae Smt3 and Saccharomyces cerevisiae Ulp1 protease.