This invention relates to DNAs encoding new fusion proteins, and to use of the DNAs in production of biologically active polypeptides utilizable in the fields of pharmaceuticals and research as well as in other industries.
Polypeptide substances, e.g., hormones and physiologically active substances used as pharmaceuticals, and enzymes for diagnosis, industrial use and research, have often been obtained from organisms by means of extraction methods. However, it is difficult to obtain a pure substance in a large amount at a low cost by extraction. Recently, owing to progress in gene recombination techniques, highly pure recombinant proteins have been prepared more economically and in larger amounts by use of various cells from organisms such as microorganisms, animals and plants.
However, economical mass production of useful proteins (or polypeptides) has not yet been achieved completely, and development of new techniques has therefore been carried out continuously. In addition, the mass production systems developed until now are not capable of producing all kinds of proteins by gene recombination techniques; in practice, they have been developed individually depending upon the kind of protein.
In an expression system for recombinant proteins using Bacillus brevis, when an exogenous protein is attached so as to follow a signal peptide for a cell wall protein (referred to as "CWP") of the microorganism and the resultant fusion protein is expressed, the exogenous protein with a natural type structure is cut away from the CWP signal peptide and secreted into the medium (Japanese Patent No. 2082727; JP-A-62-201583; Yamagata, H. et al., J. Bacteriol. 169:1239-1245 (1987); Udaka, J., Journal of Japan Society for Bioscience, Biotechnology, and Agrochemistry, 61:669-676 (1987); Takano, M. et al., Appl. Microbiol. Biotechnol. 30:75-80 (1989); and Yamagata, H. et al., Proc. Natl. Acad. Sci. USA 86:3589-3593 (1989)). When human epidermal growth factor (referred to as "EGF") is expressed in the above expression system, the expression amount is 10- to 100-fold higher than that of EGF expressed in other expression systems; the expressed protein is secreted into the medium while retaining its original activity, so that separation and purification of the protein are easy; and unlike some E. coli expression systems, this system does not require complicated procedures for conversion of an inactive protein into an active protein. For these reasons, the above-mentioned expression system has attracted attention as a mass production system for recombinant proteins.
However, not all proteins that were linked with the CWP signal peptide were expressed in an amount comparable to that found for EGF, and they were not always cleaved away from the signal peptide and secreted into the medium.
A means to solve the above problem was suggested by Miyauchi et al. in Lecture Abstracts of the Annual Meeting of Japan Society for Bioscience, Biotechnology, and Agrochemistry, 67:372 (1993). That is, they prepared a gene encoding a fusion protein in which 17 amino acids from the N-terminus of an MWP protein, one of the CWPs, were inserted between an MWP signal peptide and a flounder growth hormone protein (constructs with 9 or 12 amino acids were unsuccessful), and expressed the gene in a Bacillus bacterium to obtain the fusion protein. The produced protein, however, was a nonnatural type protein with some amino acids added to the N-terminus. Miyauchi et al. suggested that the expression was influenced by the number of amino acids from the N-terminus of the MWP.
Miyauchi et al. neither teach nor suggest production of a polypeptide having the same amino acid composition as that of the corresponding natural type by utilizing introduction of a chemical or enzymatic cleavage site into its sequence. In fact, such a cleavage is difficult because the flounder growth hormone includes some sequences susceptible to chemical or enzymatic cleavage.
In this situation, it will be highly useful for an industrial purpose to develop a technique that facilitates expression and secretion of an exogenous polypeptide in a Bacillus expression system, i.e. a high expression system for recombinant proteins, where a polypeptide has the same sequence as the natural type.
The object of the present invention is to provide a Bacillus expression system comprising a DNA for a fusion protein containing a useful polypeptide sequence, the system having an ability to highly express and secrete the fusion protein which is selectively cleaved to give the polypeptide having a natural type structure.
The present invention provides a DNA comprising a nucleotide sequence encoding a fusion protein, wherein the fusion protein comprises: a sequence consisting of one or more amino acid residues from the N-terminus of a cell wall protein (CWP) from Bacillus bacterium; a sequence consisting of an amino acid residue or amino acid residues for chemical or enzymatic cleavage; and an exogenous polypeptide sequence, said sequences being linked linearly to one another in order, and wherein said nucleotide sequence is ligated to 3′-end of a nucleic acid sequence comprising a Bacillus promoter region.
The phrase "one or more amino acid residues from the N-terminus (of a cell wall protein)" as used herein means a sequence consisting of one or more consecutive amino acids starting from the N-terminal amino acid, which is numbered as 1. For example, the sequence consisting of 3 amino acid residues refers to the amino acid sequence from number 1 to number 3 of the cell wall protein.
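The numbering convention defined above can be sketched in a few lines of Python. This is only an illustration: the function name and the placeholder sequence are hypothetical and do not represent any actual cell wall protein.

```python
def n_terminal_residues(protein: str, count: int) -> str:
    """Return residues 1..count (1-based, inclusive) of a one-letter sequence."""
    if not 1 <= count <= len(protein):
        raise ValueError("count must be between 1 and the protein length")
    # Residue number 1 is index 0, so residues 1..count are protein[0:count].
    return protein[:count]


# Hypothetical placeholder sequence, NOT a real cell wall protein.
placeholder_cwp = "AEKTDGSVLR"

# Residues number 1 to number 3 of the placeholder.
print(n_terminal_residues(placeholder_cwp, 3))
```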
The fusion protein may further comprise a Bacillus CWP signal peptide sequence at the N-terminus.
The fusion protein may further comprise a sequence consisting of amino acid residues used as a tag for separation and purification and/or a sequence of amino acid residues used as a linker.
In an embodiment of the invention, the Bacillus bacterium is Bacillus brevis. 
As the amino acid residue for chemical cleavage, exemplified is methionine. In this instance, the fusion protein should not contain additional methionine residues so that the highest specificity can be achieved in a chemical cleavage reaction, for example, with cyanogen bromide.
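The reason for excluding additional methionine residues can be illustrated with a simplified model. Cyanogen bromide cleaves the peptide bond on the C-terminal side of methionine, so an internal methionine in the polypeptide of interest would split the product. The sketch below is an assumption-laden illustration: the function name and all sequences are hypothetical toy values, and the chemical conversion of methionine to homoserine lactone is ignored.

```python
from typing import List


def cnbr_fragments(protein: str) -> List[str]:
    """Split a one-letter sequence after every methionine (M).

    Simplified model: the methionine is kept at the C-terminus of the
    upstream fragment; its chemical modification is not represented.
    """
    fragments, current = [], ""
    for residue in protein:
        current += residue
        if residue == "M":
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


# Toy sequences (hypothetical, for illustration only).
leader = "AEHHHHHHGSPVPSG"   # placeholder leader + six His + linker
clean_target = "FVNQHLCGSH"  # contains no methionine
bad_target = "FVNQMLCGSH"    # contains one internal methionine

# One cleavage: the target is released intact.
print(cnbr_fragments(leader + "M" + clean_target))
# Two cleavages: the target is split into two pieces.
print(cnbr_fragments(leader + "M" + bad_target))
```

With the methionine-free target, the last fragment is the complete polypeptide; with the internal methionine, the polypeptide is recovered only in pieces.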
Amino acid residues for enzymatic cleavage can comprise a sequence that is cleavable by a protease. Examples of the protease are TEV protease, V8 protease, etc.
In the first preferred embodiment of the invention, the fusion protein comprises: a sequence consisting of one or more amino acid residues from the N-terminus of an MWP protein which is one of CWPs; a sequence consisting of six histidine residues as a tag for separation and purification; an amino acid sequence, Gly Ser Pro Val Pro Ser Gly (SEQ ID NO:1), as a linker; a methionine residue required for chemically cleaving out a polypeptide of interest; and a polypeptide sequence containing no methionine in its amino acid sequence, said sequences being linked linearly to one another in order.
In this instance, the fusion protein may comprise an MWP signal peptide sequence at the N-terminus. And an example of the polypeptide is human proinsulin. The sequence consisting of one or more amino acid residues from the N-terminus of an MWP protein preferably comprises 6, 7, 8, 10, 11, 12, 13, 14, 15, 17, 20 or 50 amino acids.
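The order of the elements in the first preferred embodiment can be sketched as a simple string assembly. The linker is SEQ ID NO:1 as given above; the MWP placeholder and the target polypeptide are hypothetical toy sequences, and the function name is an assumption made only for this illustration. The optional signal peptide is omitted.

```python
HIS_TAG = "H" * 6          # six histidine residues (purification tag)
LINKER = "GSPVPSG"         # SEQ ID NO:1: Gly Ser Pro Val Pro Ser Gly
CLEAVAGE_RESIDUE = "M"     # methionine for chemical cleavage


def build_first_embodiment(mwp_n_term: str, target: str) -> str:
    """Join the elements in order: MWP N-terminus, tag, linker, Met, target."""
    if len(mwp_n_term) < 1:
        raise ValueError("at least one MWP N-terminal residue is required")
    if "M" in target:
        raise ValueError("the polypeptide of interest must contain no methionine")
    return mwp_n_term + HIS_TAG + LINKER + CLEAVAGE_RESIDUE + target


# Hypothetical placeholders, NOT the real MWP or proinsulin sequences.
placeholder_mwp = "AEKTDGSVLR"
toy_target = "FVNQHLCGSH"

# Use the first 6 residues of the placeholder MWP.
fusion = build_first_embodiment(placeholder_mwp[:6], toy_target)
print(fusion)
```

Only the polypeptide of interest is checked for methionine here, because that is the condition stated for this embodiment; a stricter check over the whole fusion protein would be a straightforward variation.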
In the second preferred embodiment of the invention, the fusion protein comprises: a sequence consisting of 10 or 20 amino acid residues from the N-terminus of an MWP protein which is one of CWPs; a sequence consisting of six histidine residues as a tag for separation and purification; a sequence of human epidermal growth factor as a linker; an amino acid sequence, Asp Tyr Asp Ile Pro Thr Thr Glu Asn Leu Tyr Phe Gln (SEQ ID NO:2), required for cleaving out a polypeptide of interest with TEV protease; and a polypeptide sequence that contains no TEV protease recognition sequence in its amino acid sequence and has glycine or serine at the N-terminus, said sequences being linked linearly to one another in order.
In this instance, the fusion protein may further comprise an MWP signal peptide sequence at the N-terminus. As the polypeptide, human somatostatin 28 is exemplified.
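The conditions of the second preferred embodiment can be checked with a short sketch. The TEV sequence is SEQ ID NO:2 as given above, and cleavage is modeled as occurring directly after its final glutamine. Treating the six-residue motif ENLYFQ as the recognition sequence to be excluded from the target is an assumption of this sketch, as are the function name and all placeholder sequences.

```python
TEV_SITE = "DYDIPTTENLYFQ"   # SEQ ID NO:2
TEV_CORE = "ENLYFQ"          # assumed core recognition motif


def tev_release(fusion: str) -> str:
    """Return the polypeptide released downstream of the TEV site."""
    start = fusion.find(TEV_SITE)
    if start == -1:
        raise ValueError("fusion protein contains no TEV cleavage sequence")
    target = fusion[start + len(TEV_SITE):]
    if not target or target[0] not in "GS":
        raise ValueError("target must begin with glycine or serine")
    if TEV_CORE in target:
        raise ValueError("target must not contain a TEV recognition sequence")
    return target


# Hypothetical placeholders, NOT the real MWP, EGF or somatostatin sequences.
placeholder_mwp10 = "AEKTDGSVLR"   # stands in for 10 MWP N-terminal residues
placeholder_linker = "NSDSECPLSH"  # stands in for the EGF linker
toy_target = "SAGCKNFFWK"          # begins with serine

fusion = placeholder_mwp10 + "H" * 6 + placeholder_linker + TEV_SITE + toy_target
print(tev_release(fusion))
```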
In the third preferred embodiment of the invention, the fusion protein comprises: a sequence consisting of 20 amino acid residues from the N-terminus of an MWP protein which is one of CWPs; a sequence consisting of six histidine residues as a tag for separation and purification; an amino acid sequence, Gly Ser Pro Val Pro Ser Gly (SEQ ID NO:1), as a linker; an amino acid sequence, Phe Leu Glu, required for cleaving out a polypeptide of interest with V8 protease; and a polypeptide sequence containing no glutamic acid in its amino acid sequence, said sequences being linked linearly to one another in order.
In this instance, similarly, the fusion protein may further comprise an MWP signal peptide sequence at the N-terminus. Human glucagon is useful as the polypeptide.
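The requirement that the polypeptide contain no glutamic acid can be illustrated by modeling V8 protease as cleaving after every glutamic acid residue. This is a simplification (cleavage after aspartic acid, which the enzyme may show under some buffer conditions, is ignored). The 20-residue MWP stand-in is a hypothetical placeholder, and the glucagon sequence is the commonly cited 29-residue human sequence, which should be verified against a sequence database before any real use.

```python
from typing import List


def v8_fragments(protein: str) -> List[str]:
    """Split a one-letter sequence after every glutamic acid (E)."""
    fragments, current = [], ""
    for residue in protein:
        current += residue
        if residue == "E":
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


# Hypothetical 20-residue placeholder, NOT the real MWP N-terminus.
placeholder_mwp20 = "AEKTDGSVLRAAKTDGSVLR"
HIS_TAG = "H" * 6
LINKER = "GSPVPSG"     # SEQ ID NO:1
V8_SITE = "FLE"        # Phe Leu Glu
# Commonly cited human glucagon sequence (assumption; verify before use).
glucagon = "HSQGTFTSDYSKYLDSRRAQDFVQWLMNT"

fusion = placeholder_mwp20 + HIS_TAG + LINKER + V8_SITE + glucagon
fragments = v8_fragments(fusion)

# The final fragment is the intact target because it contains no E.
print(fragments[-1] == glucagon)
```

The leader may be cut into several pieces when it contains glutamic acid, as the placeholder does here, but the polypeptide downstream of Phe Leu Glu is still released as a single intact fragment.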
The present invention also provides a DNA comprising a nucleotide sequence encoding a fusion protein, wherein said fusion protein comprises: a CWP signal peptide sequence from a Bacillus bacterium; a sequence consisting of amino acid residues for enzymatic cleavage; and an exogenous polypeptide sequence, said sequences being linked linearly to one another in order, and wherein said nucleic acid sequence is ligated to 3′-end of a nucleotide sequence comprising a Bacillus promoter region.
In this invention, the signal peptide sequence may be directly followed by a sequence of one or more amino acid residues from the N-terminus of the CWP protein.
Preferably, the Bacillus bacterium is Bacillus brevis. 
In an embodiment of the invention, the sequence consisting of amino acid residues for enzymatic cleavage comprises a sequence that is cleavable by a protease.
In another embodiment of the invention, the fusion protein comprises: a signal peptide sequence for MWP which is one of CWPs; an amino acid sequence, Asp Tyr Asp Ile Pro Thr Thr Glu Asn Leu Tyr Phe Gln, (SEQ ID NO: 2) required for cleaving out a polypeptide of interest with TEV protease; and a polypeptide sequence that contains no TEV protease recognition sequence in its amino acid sequence, said sequences being linked linearly to one another in order.
In this instance, the signal peptide sequence may be directly followed by a sequence consisting of one or more amino acid residues from the N-terminus of the MWP protein. As the polypeptide, exemplified is a mutant human growth hormone with glycine or serine at the N-terminus.
The present invention further provides a vector comprising each of the DNAs as defined above.
The present invention still further provides a bacterium belonging to the genus Bacillus transformed with the above vector. The preferred bacterium is Bacillus brevis. 
The present invention still yet further provides a process for preparing a recombinant polypeptide, comprising culturing the bacterium as defined above in a medium to accumulate, outside the bacterial cells, a fusion protein comprising an exogenous polypeptide; removing the fusion protein from the medium; cleaving out the polypeptide from the removed fusion protein; and recovering the polypeptide.
This specification includes all or part of the contents as disclosed in the specification and/or drawings of Japanese Patent Application No. 10-87339, which is a priority document of the present application.