The present invention relates to a method and recombinant vectors for producing a predetermined protein which can be secreted in large amounts from a transformed microorganism. The protein so produced is a fusion protein containing an N-terminal secretion sequence derived from preprolysostaphin which is cleaved from the final product by normal processing mechanisms of the host. The invention also relates to an optional method of tagging such fusion proteins such that the transformed microorganism producing the predetermined protein of interest can be readily identified.
Well-established gene cloning techniques have allowed a wide variety of proteins to be expressed in microbial hosts. A major problem with current prokaryotic protein expression systems, which is a significant impediment to the production of commercially useful amounts of specific proteins by biotechnology techniques, is the inability of the transformed microorganism to secrete the cloned gene product (i.e., protein) into the medium in significant quantities.
Transport of proteins through membranes is a highly complex process common to all cellular organisms. The amino terminal portion of the precursors of proteins destined for secretion (i.e., preproteins) is usually a largely hydrophobic amino acid sequence called the signal peptide or signal sequence. Such sequences have been found in preproteins destined for secretion synthesized by both eukaryotes, including yeasts, and prokaryotes.
Gram negative (-) organisms, such as E. coli, although synthesizing proteins containing signal peptides, do not usually secrete the proteins into the medium. Instead, such "secreted" proteins are exported to the periplasmic space between the inner and outer membranes of the Gram(-) bacterial cell wall. Even though E. coli has remained the prokaryotic organism of choice for cloning and other genetic manipulations, its failure to secrete cloned gene products into the culture medium, from which a protein is generally readily isolated, has remained a formidable obstacle to commercializing biotechnology products. This is largely due to the fact that the outer membrane of Gram(-) bacteria is an effective barrier to release of proteins from the periplasmic space. In general, isolation of proteins produced by E. coli involves cell disruption followed by painstaking purification to separate the desired proteins from unwanted cellular products.
Gram positive (+) bacteria, such as Bacillus species, which have a single membrane and normally secrete a variety of proteins, are better organisms in which to produce large quantities of foreign proteins. On the one hand, a major advantage of Gram(+) production systems is that expressed gene products are secreted directly into the medium, from which recovery and purification of the proteins are relatively easy. On the other hand, Bacillus species, particularly B. subtilis, have a drawback in that they also secrete large amounts of proteolytic enzymes which can degrade a secreted protein, thereby resulting in significantly reduced yields of the protein. This particular disadvantage associated with the use of Bacillus, however, has been largely overcome by the recent development of substantially protease-free Bacillus organisms. See, e.g., Kawamura, F. and R. H. Doi, J. Bacteriol. 160, 442-444 (1984).
Transformed Gram(+) organisms containing a vector carrying DNA encoding a foreign protein are, in principle, excellent hosts for high yield production of that protein. In order to be useful and practical, the vectors should contain not only the structural gene sequence encoding the protein of interest, but also a DNA sequence encoding the promoter, ribosome-binding site and amino terminal signal sequence of a protein normally secreted by the host bacteria. See, e.g., U.S. Pat. No. 4,711,843 to Chang. The aforementioned promoter, ribosome-binding site and signal sequence allow the foreign protein to be produced and secreted by the host organism. There are, however, several problems associated with the use of such methods to produce large amounts of a specific protein.
The protein that is synthesized by the host is a fusion protein composed of the amino acid sequence of the predetermined protein of interest attached to the signal peptide of a host protein to allow for secretion of the product. In order for the desired protein to be readily separated from the part of the fusion protein needed for secretion, it is necessary to introduce specific cleavage sites into such fusion proteins. This has proven difficult to achieve. Also, as reported by Wickner and Lodish, Science 230: 400-407 (1985), a number of studies have shown that, although the signal sequence is clearly essential for secretion, it may not be sufficient by itself to promote secretion of a protein. The match between the signal sequence and mature secreted protein may be the critical limitation. Alternatively, other parts of the secreted protein may provide information specifying secretion.
It is also desirable, but not necessary, that the protein being secreted be tagged in some convenient way to make it readily detectable once produced. Furthermore, tagging of proteins also allows for identification of an organism producing the protein. The known tags currently used for such purposes are the enzymes β-galactosidase and alkaline phosphatase (see, for example, Matteucci et al., Biotechnology 4: 51-55 (1986)). Although both enzymes are excellent tags and are easily detected by use of chromogenic substrates, both have significant disadvantages that make them of relatively little value for large scale production of proteins.
β-galactosidase cannot be used as a protein tag if secreted proteins are desired since numerous studies have shown that β-galactosidase becomes lodged in the plasma membrane, thereby preventing secretion of the tagged protein. Alkaline phosphatase, although exported to the periplasmic space of Gram(-) bacteria, cannot be used in Gram(+) secretion systems because it is readily degraded by extracellular proteases.
Staphylococcal nuclease has a number of advantages which make it useful as a tag for proteins synthesized and secreted by both Gram(+) and Gram(-) organisms: (a) Digestion of the cloned staphylococcal nuclease gene with Sau3A yields a restriction fragment coding for the mature enzyme without its signal sequence, thereby preventing secretion of such a truncated nuclease; (b) The presence of staphylococcal nuclease as part of a secreted protein is detectable by a rapid and sensitive colony assay; (c) Staphylococcal nuclease is relatively resistant to degradation by Bacillus proteases; (d) B. subtilis produces prenuclease which it secretes and correctly processes to mature nuclease; and (e) The N-terminus of mature staphylococcal nuclease is exposed to solvent and far removed from the active site, implying that fusion of foreign protein sequences to the N-terminus of nuclease would have little or no effect on the activity of the enzyme.
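By way of illustration only, the Sau3A digestion mentioned in item (a) can be sketched in a few lines of Python. Sau3A recognizes the sequence GATC and cuts immediately 5' of the G. The function name `sau3a_fragments` and the short input string are hypothetical and chosen purely for demonstration; the input is not the staphylococcal nuclease gene, and the sketch treats the DNA as a single linear strand.

```python
SAU3A_SITE = "GATC"  # Sau3A recognition sequence; the cut falls just before the G


def sau3a_fragments(dna):
    """Split a linear DNA string at every Sau3A site (hypothetical helper)."""
    dna = dna.upper()
    cuts = []
    pos = dna.find(SAU3A_SITE)
    while pos != -1:
        cuts.append(pos)                      # cut position is the index of the G
        pos = dna.find(SAU3A_SITE, pos + 1)   # continue searching past this site
    bounds = [0] + cuts + [len(dna)]
    # Drop the empty fragment that would arise if the sequence starts with GATC
    return [dna[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


# Made-up toy sequence with two GATC sites, for demonstration only
toy = "ATGAAAGATCCGCAACTTCAGATCTAA"
for fragment in sau3a_fragments(toy):
    print(len(fragment), fragment)
```

Each fragment after the first begins with GATC, so the fragments rejoin to give the original sequence.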
Lysostaphin is a bacteriocin which lyses staphylococci. Plasmid pRG5 containing a 1.5 Kb cloned DNA fragment which codes for preprolysostaphin has been described in U.S. patent application Ser. Nos. 852,407 and 034,464 filed in the name of Paul Recsei on Apr. 16, 1986, and Apr. 10, 1987, respectively, which are incorporated herein by reference, and in Recsei, Proc. Natl. Acad. Sci. 84: 1127-1131 (1987). Among other things, the 1.5 Kb gene encodes a 389 amino acid preprolysostaphin containing an amino terminal signal peptide of 36 amino acids. Adjacent to the C-terminal side of the signal peptide are seven tandem repeat prolysostaphin sequences, each containing 13 amino acids, which are removed during post-translational processing of preprolysostaphin to mature enzyme. Mature lysostaphin contains 246 amino acids and has an N-terminal sequence, Ala-Ala-Thr-His-Glu, which corresponds to amino acids 144-148 of the preprolysostaphin sequence.
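The residue numbering given above can be checked with simple arithmetic. The following Python sketch is offered only as an illustration: the function and constant names are hypothetical, and the only inputs are the lengths stated in the text. It assumes, as the text indicates, that the tandem repeats begin immediately after the signal peptide.

```python
# Lengths quoted in the text for preprolysostaphin
PRECURSOR_LENGTH = 389  # total residues in preprolysostaphin
SIGNAL_LENGTH = 36      # amino terminal signal peptide
REPEAT_COUNT = 7        # tandem repeat prolysostaphin sequences
REPEAT_LENGTH = 13      # residues per repeat
MATURE_LENGTH = 246     # residues in mature lysostaphin


def lysostaphin_layout():
    """Return 1-based inclusive residue ranges implied by the quoted lengths."""
    # The mature enzyme occupies the C-terminal 246 residues of the precursor
    mature_start = PRECURSOR_LENGTH - MATURE_LENGTH + 1
    # Assumption: the repeats start right after the signal peptide
    repeats_start = SIGNAL_LENGTH + 1
    repeats_end = SIGNAL_LENGTH + REPEAT_COUNT * REPEAT_LENGTH
    return {
        "signal": (1, SIGNAL_LENGTH),
        "repeats": (repeats_start, repeats_end),
        "mature": (mature_start, PRECURSOR_LENGTH),
        # Residues between the last repeat and the mature N-terminus
        "unassigned": mature_start - 1 - repeats_end,
    }


layout = lysostaphin_layout()
print(layout["mature"])      # (144, 389)
print(layout["repeats"])     # (37, 127)
print(layout["unassigned"])  # 16
```

The computed start of the mature enzyme, residue 144, agrees with the position stated for the Ala-Ala-Thr-His-Glu sequence. On these figures the signal peptide and the seven repeats together span residues 1-127, so 16 residues (128-143) lie between the last repeat and the mature N-terminus; the text does not characterize them further.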
Preprolysostaphin sequences have been cloned and expressed in E. coli, B. subtilis, and B. sphaericus 00. All three organisms efficiently and correctly process preprolysostaphin and secrete mature lysostaphin into the culture medium, indicating that the DNA sequence which codes for preprolysostaphin is useful for the construction of a Gram(+) [and Gram(-)] protein production system. Furthermore, predetermined proteins secreted by this production system can be tagged with staphylococcal nuclease, thus making the product and the organisms secreting it readily detectable.