Bioindustries, in which physiologically active proteins such as insulins, growth hormones, interferons, and enzymes are produced in microorganisms, have grown rapidly with the development of genetic recombination technologies. In recent years, high-speed, high-efficiency protein production has become very important in fields such as structural and functional genomics and in securing target proteins for new drug screening in various post-genome studies. Until now, microorganisms (Escherichia coli, yeast), mammalian cells, etc. have been used for protein production. Among these, the E. coli system, which grows rapidly and is the most thoroughly studied in the microbiological and physiological fields, has been regarded as the most suitable for the high-speed, high-efficiency protein production that is at the heart of functional genomics studies.
Protein production systems using E. coli are highly economical in terms of cost and facilities, but they have one major problem: when eukaryotic proteins are produced in the cytoplasm of the prokaryote E. coli, the majority are not folded correctly into their active forms and are instead produced as inclusion bodies, which are precipitates in the cells. In order to obtain active proteins from the inclusion bodies, the inclusion bodies must be solubilized in a high concentration of guanidine-HCl and then refolded into an active form using methods such as dilution. It has been known that large amounts of time and expense are required to find effective refolding conditions, since the refolding mechanism is not fully understood and the refolding conditions differ for every protein. Because the refolding yield is low, highly expensive apparatuses are required for mass production of a desired protein, and it is difficult or impossible to refold a majority of high molecular weight proteins, which is an obstacle to industrial applications of the proteins. Inclusion bodies form because intermolecular aggregation of protein-folding intermediates occurs during the folding process, even though the active form of the protein is thermodynamically the most stable [Mitraki, A. & King, J. (1989) Bio/Technology 7: 690-697]. Another reason is that disulfide bonds must be correctly formed for the proteins to be biologically active, but they are not correctly formed when the proteins are expressed in the E. coli cytoplasm because of its reducing conditions.
As described above, a method of producing a genetically recombinant protein in an active form succeeds only when the folding and disulfide bonding requirements are satisfied at the same time, and it is therefore difficult to produce a desired protein in most cases. Also, it has been known that high molecular weight antibody proteins, tissue-type plasminogen activators, factor VIII, etc. are produced as inclusion bodies in an E. coli system, and that it is very difficult to obtain these proteins in active forms through the refolding process. In order to solve the above problems, which arise when a recombinant protein is produced as an inclusion body, it is important to express the recombinant protein in E. coli in a soluble form.
Up to now, there have been three methods for expressing a recombinant protein in a soluble form. (i) The first is a method in which a recombinant protein is designed to be secreted into the E. coli periplasm to obtain a soluble form of the protein [Stader, J. A. & Silhavy, T. J. (1970) Methods Enzymol. 165: 166-187], but this method has a low industrial efficiency due to a low expression rate of the protein. (ii) The second is a method in which a soluble form of a recombinant protein is obtained by co-expressing the recombinant protein gene with chaperone genes involved in protein folding, such as GroEL, DnaK or the like [Goloubinoff, P. et al. (1989) Nature 337: 44-47], but this method is not a general means of preventing the formation of inclusion bodies since it is applicable only to specific proteins. (iii) The third is a method in which a soluble protein is obtained by selecting a protein that is expressed in a soluble form in E. coli as a fusion partner and fusing the desired recombinant protein to the carboxyl terminus of the fusion partner. Until now, various proteins have been known as fusion partners, including maltose-binding protein [Kapust, R. B. & Waugh, D. S. (1999) Protein Sci. 8: 1668-1674], NusA [Davis, G. D. et al. (1999) Biotechnol. Bioeng. 65: 382-388], glutathione-S-transferase [Smith, D. B. & Johnson, K. S. (1988) Gene 67: 31-40], thioredoxin [Lavallie, E. R. et al. (1993) Bio/Technology 11: 187-193], Protein-A [Nilsson, B. et al. (1985) Nucleic Acids Res. 13: 1151-1162], an amino terminal domain of translation initiation factor IF2 [Sorensen, H. P. et al. (2003) Protein Expr. Purif. 32: 252-259], lysyl-tRNA synthetase [Choi, Sung-il & Seong, Beak-Lin (1999) Korea Patent No. 10-203919], etc. However, these fusion partners are limited in their applications since they are merely expected to improve protein solubility, not to satisfy the protein folding and disulfide bonding requirements at the same time.
Protein disulfide isomerase (PDI), an enzyme that catalyzes the thiol:disulfide bond exchange reaction, is found at a high concentration in the endoplasmic reticulum of cells. It has been known that, amongst the about 20 protein factors known to date as protein folding regulators in cells, proteins having a catalytic activity, such as PDI, are by no means common [Rothman, J. E. (1989) Cell 59: 591-601]. PDI facilitates the correct formation of disulfide bonds in proteins by means of the thiol:disulfide bond exchange reaction, and serves as a chaperone when it is present in the cells at a high concentration [Puig, A. & Gilbert, H. F. (1994) J. Biol. Chem. 269: 7764-7771].
Therefore, there have been attempts to produce active proteins by using PDI. It was reported that PDI from a thermophilic fungus was fused to the amino terminus of a target protein to secrete the fusion protein from Bacillus brevis [Kajino, T. et al. (2000) Appl. Environ. Microbiol. 66: 638-642], and that DsbC, which is a kind of PDI, was co-expressed with a target protein in the oxidizing cytoplasm of mutant E. coli [Bessette, P. H. et al. (1999) Proc. Natl. Acad. Sci. USA 96: 13703-13708], etc. However, it was revealed that, when the target protein is secreted from the Bacillus strain, it is produced in a very low yield since it is degraded by proteases secreted by the strain. It was also revealed that the target protein has a very low expression rate when it is co-expressed with DsbC in the cytoplasm.
PDI has the problem that active PDI is expressed at a low level, since inactive PDI fragments are produced at the same time due to the presence of ribosome binding sites within the PDI coding sequence. The inventors have found that intact PDI protein, rather than split PDI protein, is stably produced in a soluble form in the cytoplasm if the ribosome binding sites within the PDI coding sequence are removed by means of genetic modification. They have designed a novel fusion protein system that may satisfy all of the desired effects, such as high expression rate, improved solubility, protein folding, and disulfide bonding, by using a genetically engineered PDI as a fusion partner, the genetically engineered PDI being obtained by adding an M6 sequence (KIEEGK (SEQ ID NO: 37)) to the amino terminus of the PDI protein. All recombinant proteins obtained using the fusion protein system with the genetically engineered PDI protein were stably produced at a high expression rate and with a high solubility, and exhibited activity equal to or greater than that of a wild-type PDI protein when the recombinant proteins were cleaved enzymatically.
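As an illustration only, the kind of internal ribosome binding site mentioned above could be located with a simple sequence scan. The following is a minimal sketch, not the inventors' method: it assumes that an internal site looks like a Shine-Dalgarno-like core (`GGAGG`) followed 5 to 9 nucleotides downstream by a start codon in the main reading frame. The function name `find_internal_rbs`, the motif, the spacing window, and the example sequence are all hypothetical choices made for this sketch.

```python
# Illustrative assumptions (not taken from the source text):
# a Shine-Dalgarno-like core motif and the start codons to look for.
SD_CORE = "GGAGG"
START_CODONS = ("ATG", "GTG")


def find_internal_rbs(cds, min_gap=5, max_gap=9):
    """Return (motif_pos, start_pos) pairs for candidate internal sites.

    A candidate is an SD_CORE match followed, min_gap..max_gap nucleotides
    after the motif ends, by a start codon that is in frame with the main
    reading frame (position divisible by 3, assuming cds starts at codon 1).
    """
    cds = cds.upper()
    hits = []
    for i in range(len(cds)):
        if not cds.startswith(SD_CORE, i):
            continue
        motif_end = i + len(SD_CORE)
        for gap in range(min_gap, max_gap + 1):
            j = motif_end + gap
            if j % 3 == 0 and cds[j:j + 3] in START_CODONS:
                hits.append((i, j))
    return hits


# Hypothetical toy sequence: the motif at position 6, then a 7-nt spacer,
# then an in-frame ATG at position 18.
example = "ATGAAA" + "GGAGG" + "TTAACCT" + "ATG" + "GCCTAA"
print(find_internal_rbs(example))  # [(6, 18)]
```

Positions reported by such a scan would only be candidates; removing a site without changing the encoded protein would additionally require choosing synonymous codons, which this sketch does not attempt.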