When a transformant host is made to produce a desired protein by recombinant DNA technology, it is advantageous in many respects for the host to be capable of secretory expression of that protein. For example, where a desired protein expressed directly in the host cell is toxic to the growth and survival of the host, secretory expression of the protein can avoid this toxicity.
In some cases, a desired protein, if accumulated in large amounts in the host cell, may inhibit the growth of the host even when the protein does not show toxicity. Secretory expression can avoid such circumstances as well.
When a desired protein is produced on a commercial scale by recombinant DNA technology in a system in which the protein accumulates intracellularly, it is necessary to disrupt the cells and purify the protein from the disrupted-cell mixture. It is difficult to obtain the desired protein in high purity in this way, since the product is contaminated with many impurities derived from the transformant host.
In contrast, when a desired protein is produced in a secretory expression system, it can be purified from the culture broth, and contamination with host-derived impurities can accordingly be minimized. This is a great advantage.
Many proteins undergo modifications such as sugar chain addition, disulfide bond formation, activation by limited hydrolysis of inactive precursor proteins and phosphorylation or carboxylation of specific amino acids. These modifications are common to various cells, and among these, sugar chain addition and disulfide bond formation take place in the process of secretion.
Most naturally secreted proteins have sugar chain(s) or intra- or intermolecular disulfide bond(s). When such a protein is produced and accumulated within the host cells, the disulfide bond(s) either fail to form or form incorrectly, causing denaturation and insolubilization of the protein (GB Patent 0092182). Production of a desired protein by secretory expression is therefore expected to give the protein, with its sugar chain(s) or disulfide bond(s), in a form closer in function and structure to the natural protein than is obtained in a system in which the protein accumulates intracellularly.
Some findings are available concerning the properties of signal peptides, which are essential for secretory protein expression. Their amino acid sequences have the following characteristic features. Basic amino acids are relatively abundant near the N terminus, while polar amino acids are relatively abundant on the C-terminal side, near the site cleaved by signal peptidase. A stretch of hydrophobic amino acids lies in the middle. The basic amino acids near the N terminus are thought to interact with phospholipids on the inner surface of the cell membrane, the hydrophobic stretch in the middle presumably plays an important role in passage through the cell membrane, and the C-terminal polar amino acids supposedly serve as the recognition site for cleavage by signal peptidase. These characteristics are very similar in organisms from procaryotes to higher animals and suggest a common protein secretion mechanism [cf. M. S. Briggs and L. M. Gierasch (1986), Adv. Protein Chem., 38, 109-180; G. von Heijne (1984), EMBO J., 3, 2315-2318; G. von Heijne (1984), J. Mol. Biol., 173, 249-251; D. Perlman and H. O. Halvorson (1983), J. Mol. Biol., 167, 391-409].
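The three-part organization described above can be illustrated with a minimal Python sketch. It is not taken from the cited references: the residue classes, the fixed region lengths (`n_len`, `c_len`) and the example sequence are all assumptions chosen only for illustration, and real signal peptides have regions of variable length.

```python
# Illustrative residue classes (assumptions, not definitions from the cited work).
BASIC = set("RKH")            # arginine, lysine, histidine
HYDROPHOBIC = set("LFAIVCM")  # leucine, phenylalanine, alanine, isoleucine, valine, cysteine, methionine
POLAR = set("STNQYDEKRH")     # uncharged polar plus charged residues


def fraction(segment, residue_class):
    """Fraction of residues in `segment` that belong to `residue_class`."""
    return sum(res in residue_class for res in segment) / len(segment)


def region_profile(signal, n_len=5, c_len=5):
    """Split a signal peptide into N-terminal, middle and C-terminal parts
    (fixed, hypothetical lengths) and report how well each part matches
    the feature attributed to it: basic, hydrophobic, polar."""
    if len(signal) <= n_len + c_len:
        raise ValueError("signal peptide too short for the chosen region lengths")
    n_region = signal[:n_len]
    middle = signal[n_len:len(signal) - c_len]
    c_region = signal[len(signal) - c_len:]
    return {
        "basic_in_n": fraction(n_region, BASIC),
        "hydrophobic_in_middle": fraction(middle, HYDROPHOBIC),
        "polar_in_c": fraction(c_region, POLAR),
    }


# Invented 20-residue sequence, used only to exercise the function.
EXAMPLE_SIGNAL = "MKRSLLLLALLLAVLTSSQS"
print(region_profile(EXAMPLE_SIGNAL))
```

A high value in each of the three fields indicates that the sequence follows the general pattern; the sketch says nothing about whether a given peptide is actually secreted or correctly cleaved.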
G. von Heijne studied the amino acid sequences of signal peptides of procaryotes and eucaryotes and reported characteristic features of the region extending from the hydrophobic stretch in the middle to the signal peptidase cleavage site on the C-terminal side [G. von Heijne (1986), Nucl. Acids Res., 14, 4683-4690; G. von Heijne (1985), J. Mol. Biol., 184, 99-105; G. von Heijne (1983), Eur. J. Biochem., 133, 17-21]. According to these reports, in eucaryotic cells hydrophobic amino acids, leucine in particular, occur frequently at positions -13 to -6 (the amino acid immediately preceding the signal peptide cleavage site being numbered -1). Other hydrophobic amino acids such as phenylalanine, alanine, isoleucine and valine, as well as cysteine, methionine and the like, also occur with relatively high frequency. At positions -5 to -1, amino acids of relatively high polarity occur with high frequency. The sequence composed of the amino acids occurring with the highest frequency at each position is as follows. ##STR1## The signal peptide is cleaved between -1 and +1. The amino acids numbered +1 and +2 are therefore at the N terminus of the mature protein. Most signal peptides are composed of 15-30 amino acids, and even the signal sequence shown above further requires a sequence containing basic amino acids on its upstream, N-terminal side. The basic amino acids include arginine, lysine and histidine, and these may occur either singly or in plurality.
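The position numbering used above (-1 for the last residue of the signal peptide, +1 for the first residue of the mature protein, no position 0) can be made concrete with a short Python sketch. The function names, the example precursor and its signal length are hypothetical and serve only to illustrate the convention; the sketch does not reproduce the consensus sequence referred to in the text.

```python
HYDROPHOBIC = set("LFAIVCM")  # hydrophobic residues named in the text, plus cysteine and methionine
BASIC = set("RKH")            # arginine, lysine, histidine


def number_positions(precursor, signal_len):
    """Map cleavage-relative positions to residues.
    The last signal residue is -1 and the first mature residue is +1."""
    if not 0 < signal_len < len(precursor):
        raise ValueError("signal_len must lie strictly inside the precursor")
    numbered = {}
    for i, res in enumerate(precursor):
        pos = i - signal_len      # negative inside the signal peptide
        if pos >= 0:
            pos += 1              # skip 0: mature protein starts at +1
        numbered[pos] = res
    return numbered


def summarize(precursor, signal_len):
    """Report the features discussed in the text for one precursor."""
    pos = number_positions(precursor, signal_len)
    core = [pos[p] for p in range(-13, -5) if p in pos]     # positions -13 to -6
    tail = [pos[p] for p in range(-5, 0) if p in pos]       # positions -5 to -1
    upstream = [pos[p] for p in sorted(pos) if p < -13]     # N-terminal side of the core
    hydrophobic = sum(r in HYDROPHOBIC for r in core)
    return {
        "core": "".join(core),
        "core_hydrophobic_fraction": hydrophobic / len(core) if core else 0.0,
        "pre_cleavage": "".join(tail),
        "has_upstream_basic": any(r in BASIC for r in upstream),
        "mature_start": pos[1] + pos.get(2, ""),             # residues +1 and +2
    }


# Invented precursor: a 20-residue signal followed by a 4-residue mature stub.
EXAMPLE_PRECURSOR = "MKRSLLLLALLLAVLTSSQS" + "DTPE"
print(summarize(EXAMPLE_PRECURSOR, signal_len=20))
```

Keeping the numbering in one helper avoids off-by-one errors around the cleavage site, which matter when comparing the N terminus of a secreted product with that of the natural mature protein.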
However, when a known signal peptide is used for secretory expression of a heterologous protein, the resulting product often differs in structure from the natural form.