With advances in genetic engineering, heterologous proteins used industrially as medicines and the like have been produced in animal cells, yeasts and prokaryotes such as E. coli. E. coli in particular has been widely used as a host cell for producing foreign proteins, since it grows fast and has been studied more thoroughly than any other organism.
Unfortunately, E. coli lacks the cellular components necessary for posttranslational modifications such as glycosylation and disulfide cross-linking. Moreover, foreign proteins overexpressed in E. coli are sequestered into inclusion bodies. Although inclusion bodies can be separated easily, obtaining active protein requires that they be solubilized and denatured to the primary structure with a high concentration of urea, guanidinium HCl or the like, and then refolded by removing these reagents.
Generally, the refolding process for preparing an active protein does not always succeed, since its outcome varies from protein to protein. For example, proteins of high molecular weight, such as antibodies, tissue plasminogen activator and factor VIII, are not easily refolded into active proteins. As a result, it is difficult to produce such recombinant proteins on a large scale.
Therefore, it is very important to express foreign proteins in soluble form in E. coli in order to overcome the problems described above.
At present, the following methods are used to express foreign proteins effectively in soluble form.
First, there is a method in which the N-terminus of the foreign protein is linked to a signal peptide so that the foreign protein is secreted into the periplasm of E. coli in soluble form (Stader, J. A. and Silhavy, T. J., 1970, Methods in Enzymol., 165: 166-187). Since foreign proteins are not expressed efficiently by this process, the method is not useful industrially.
Second, there is a method in which foreign proteins are co-expressed with chaperone genes such as groES, groEL and dnaK to obtain soluble proteins (Goloubinoff, P., Gatenby, A. A. and Lorimer, G. H., 1989, Nature, 337: 44-47). However, this method is not a general means of preventing inclusion body formation, since it works only for specific proteins.
Third, there is a method in which target proteins are fused to the C-terminus of fusion partner proteins that are highly expressed in E. coli. Since the target protein is linked to the C-terminus of the fusion partner, the translation initiation signal of the fusion partner protein can be exploited. In addition, the solubility of the fused foreign protein increases, so that large amounts of foreign protein can be obtained in soluble form in E. coli.
Lac Z and Trp E proteins have been used as fusion partner proteins to produce fusion proteins in E. coli. However, active proteins cannot be obtained easily in this way, since most of these fusion proteins are expressed as inclusion bodies. Therefore, much research has been carried out to find novel fusion partner proteins that facilitate the production of active proteins. In practice, several fusion partner proteins have been developed, such as glutathione-S-transferase (Smith, D. B. and Johnson, K. S., 1988, Gene, 67: 31-40), maltose-binding protein (Bedouelle, H. and Duplay, P., 1988, Euro. J. Biochem., 171: 541-549), protein A (Nilsson, B. et al., 1985, Nucleic Acid Res., 13: 1151-1162), Z domain of protein A (Nilsson, B. et al., 1987, Prot. Eng., 1: 107-113), protein Z (Nygren, P. A. et al., 1988, J. Mol. Recog., 1: 69-74) and thioredoxin (Lavallie, E. R. et al., 1993, Bio/Technology, 11: 187-193).
Although foreign proteins have been expressed in soluble form by linking them to the fusion partners described above, some were expressed as inclusion bodies or only partly as soluble proteins, depending on the fusion partner protein.
In particular, thioredoxin is known as the most successful fusion partner protein. However, when thioredoxin is used, the E. coli transformant must be cultured at a low temperature such as 15° C. in order to express most fusion proteins in soluble form. Since E. coli grows very slowly at that temperature, a process using thioredoxin may be inefficient.
Lysyl-tRNA synthetase (hereinafter referred to as “Lys RS”) is highly expressed in E. coli and is a preferable candidate for a fusion partner protein. Lys RS and its gene have been investigated as described below.
Although in E. coli each amino acid is generally aminoacylated by a single specific aminoacyl-tRNA synthetase, two lysyl-tRNA synthetases, encoded by the lys S gene and the lys U gene, are involved in lysine aminoacylation independently. The lys S gene is expressed constitutively under normal conditions, whereas the lys U gene is induced by heat shock, low pH, anaerobiosis, L-alanine, L-leucine and L-leucyl dipeptide. The amino acid sequences derived from the two genes show 88% homology.
In addition, the X-ray crystal structure of the lysyl-tRNA synthetase expressed from the lys U gene (hereinafter referred to as “Lys U”) was elucidated at 2.8 Å resolution (Onesti, S., Miller, A. D. and Brick, P., 1995, Structure, 3: 163-176). Lys U protein is a homodimer in which each monomer has an N-terminal domain that contacts tRNA and a C-terminal domain that forms the dimer interface and carries the enzyme activity (see FIG. 1).
Furthermore, the nuclear magnetic resonance (NMR) structure of the N-terminal domain (amino acid residues 31-149) of the lysyl-tRNA synthetase expressed from the lys S gene (hereinafter referred to as “Lys S”) was determined by the group of Frederic Dardel (Stephane, C. et al., J. Mol. Biol., 253: 100-113). As Lys U protein and Lys S protein share a high degree of identity in their amino acid sequences, the N-terminal structures of the two enzymes are found to be very similar.
In detail, the N-terminal domain of lysyl-tRNA synthetase consists of three contiguous α-helices and a five-stranded antiparallel β-barrel with an α-helix (H4) located between the 3rd and 4th β-strands. The latter part of the N-terminal domain corresponds to the OB fold (A1A2A3H4A4A5), which is commonly found in proteins that bind oligosaccharides or oligonucleotides. The OB fold has been reported in aspartyl-tRNA synthetase of yeast, the β-subunit of heat-labile enterotoxin, verotoxin and staphylococcal nuclease (Murzin, A. G., 1993, EMBO J, 12: 861-867).
The N-terminal domain of Lys RS protein, whose structure is described above, shows the following characteristics as a fusion partner protein.
When the lys S gene was expressed in E. coli, the Lys S protein accumulated to 80% of total soluble protein. Since the Lys S protein is a homodimer whose contact region is located at the C-terminus of the monomer, a fusion protein that uses the intact Lys S protein, or the C-terminal domain of the Lys S protein, as a fusion partner forms a heterodimer with the endogenous Lys S protein of E. coli.
Such a heterodimer, however, is fatal to E. coli. Thus, the C-terminal domain of Lys S protein is not appropriate as a fusion partner protein, and only the N-terminal domain can be exploited as a fusion partner protein. In practice, when the N-terminal domain of Lys S protein alone (hereinafter referred to as “Lys N”) is used as the fusion partner, foreign proteins are expressed well, at approximately 40% of total protein, and are produced mostly in soluble form.
As mentioned above, the OB fold located in the N-terminal domain of Lys RS protein has a secondary structure that facilitates protein folding and increases the solubility of the expressed fusion proteins.
The present inventors have carried out research to develop a fusion partner protein useful for producing heterologous proteins by recombinant DNA technology. As a result, we have demonstrated that the N-terminal domain of lysyl-tRNA synthetase can be used as a fusion partner protein to produce foreign proteins in large quantities in soluble form. Using the lysyl-tRNA synthetase, we have also developed novel E. coli expression vectors and a process for preparing active foreign proteins effectively.