The present invention relates to a vector used in recombinant DNA techniques and also to a method for the expression of a protein using the said vector.
Production of useful proteins by recombinant DNA techniques is now widely practiced. Among the available systems, expression using Escherichia coli as a host is the most commonly used, and many proteins have been produced by means of recombinants. For the production of useful proteins by such recombinants, it is common to use a so-called expression vector, which is constructed by arranging the desired gene under the control of a promoter recognized by RNA polymerase. Examples of promoters used for expression vectors when E. coli is used as a host are lac, trp, tac, gal and ara. An example of an expression vector utilizing a promoter other than one directly recognized by the RNA polymerase of E. coli is the pET-system (manufactured by Novagen) [J. Mol. Biol., volume 189, pages 113-130 (1986); Gene, volume 56, pages 125-135 (1987)], which utilizes a promoter recognized by the RNA polymerase of bacteriophage T7, a phage which infects E. coli. In the pET-system, T7 RNA polymerase is expressed in E. coli, the desired gene arranged downstream of the T7 promoter on the expression vector is transcribed by the said T7 RNA polymerase, and the desired protein is synthesized by the translation system of the host.
However, when the desired protein is expressed at a high level in many E. coli expression systems, including the pET-system, it often happens that the desired protein forms an insoluble complex (the so-called inclusion body) and the amount of the desired protein of an active type becomes very small. It has been reported that, for some polypeptides, the inclusion body can be solubilized and then subjected to a refolding operation to give a polypeptide of an active type. Usually, however, the recovered amount is low and, in addition, an appropriate refolding condition must be investigated for each desired protein. Therefore, there has been a demand for a system in which the protein of an active type is directly expressed in E. coli.
Formation of an inclusion body is believed to take place in such a manner that, at the stage of an intermediate before the translated polypeptide chain is folded into the correct steric configuration, entanglement with other polypeptides takes place due to intermolecular interaction, whereupon a very large insoluble complex is formed. It has been known that, in such a case, when a recombinant E. coli is incubated at a temperature (20-30° C.) lower than the commonly used temperature of 37° C., the expressed amount of the protein of an active type increases. This is presumed to be because sufficient time is available for folding of the intermediate into the correct structure owing to the lower translation rate of the ribosome, and because the stability of the expressed protein of an active type increases owing to the reduced action of intracellular proteases under a low temperature condition. Accordingly, in the production of a protein which is apt to form an inclusion body, incubation of a recombinant E. coli at low temperature has been attracting attention as an effective means.
On the other hand, when the incubating temperature of E. coli in the logarithmic growth phase is lowered from 37° C. to 10 to 20° C., growth of E. coli temporarily stops and, during that period, expression of a group of proteins called cold-shock proteins is induced. The said proteins are classified into group I (10-fold or higher) and group II (lower than 10-fold) according to their induction levels, and examples of the proteins of group I are CspA, CspB, CspG and CsdA. In the case of CspA, its expressed amount 1.5 hours after a temperature shift from 37° C. to 10° C. reaches 13% of the total cell protein [Proc. Natl. Acad. Sci. USA, volume 87, pages 283-287 (1990)] and, therefore, utilization of the promoter of the cspA gene for the production of recombinant protein at low temperature has been attempted.
In addition to the above-mentioned advantage that the promoter of the cspA gene initiates transcription with high efficiency at low temperature, the following features have been shown to be effective for a recombinant protein expression system under a low temperature condition using the cspA gene.
(1) When a translatable mRNA transcribed from the cspA gene does not code for a functional CspA protein or, to be more specific, when it codes for only a part of the N-terminal sequence of the CspA protein, such an mRNA inhibits the expression of other E. coli proteins, including the cold-shock proteins, for a long time and, during that period, translation of the said mRNA is carried out preferentially [J. Bacteriol., volume 178, pages 4919-4925 (1996)].
(2) At a position 12 bases downstream from the initiation codon of the cspA gene, there is a sequence of 15 bases called a downstream box, which increases the translation efficiency under a low temperature condition.
(3) A 5′-untranslated region consisting of 159 bases, extending from the initiation point of transcription of the cspA gene mRNA to the initiation codon, has a negative influence on the expression of CspA at 37° C. and a positive influence at low temperature.
However, although the promoter of the said gene is certainly able to initiate transcription with high efficiency at low temperature, it actually acts even at the common incubating temperature (37° C.), and it has been suggested that the stability of the mRNA transcribed from the said gene also regulates the expression of the said gene [Molecular Microbiology, volume 23, pages 355-364 (1997)]. Therefore, in an expression vector constructed using the promoter of the cspA gene, regulation of expression is incomplete and, in the case of a gene whose product is harmful to the host, it is sometimes difficult to incubate E. coli containing the expression vector up to a state in which induction is possible, or even impossible to construct the expression vector.
For example, U.S. Pat. No. 5,654,169 describes that, even when the β-galactosidase gene, which is commonly used for evaluation of promoters, is inserted into an expression plasmid using the promoter of the cspA gene, it is difficult to maintain the constructed product in E. coli because of the expressed product.
On the other hand, it has been known that the ability of the promoter of the cspA gene to initiate transcription resides in the region downstream of −37 counted from the initiation point of transcription; however, the essential region has not been confirmed yet. Further, the above U.S. Patent shows a region of −40 to 96 counted from the initiation point of transcription of the said gene as the region essential for the function as a promoter of the cspA gene. However, the said region contains a region of nearly 100 bases which is transcribed into mRNA and, in addition, does not code for the protein. Thus, the minimum region of the cspA promoter necessary for achieving efficient transcription at low temperature has not been clarified yet.
Accordingly, an object of the present invention is to provide a vector with which a transformant for expression of a gene can be prepared and the gene product can be expressed with high efficiency under a low temperature condition, even for a gene for which construction of an expression system or efficient production of the gene product has been difficult in the prior art.
In order to achieve the above object, the present inventors have attempted to insert a lac operator sequence downstream of the promoter of the cspA gene so that gene expression from the said promoter is regulated during the construction of the plasmid and during incubation up to the inducible state. By the use of an expression vector constructed in this way, having a cspA promoter which can be regulated by a lac operator, the present inventors have succeeded in constructing an expression plasmid for an endo-sulfated-fucose-containing polysaccharide degrading enzyme (Fdase 2), which could not be constructed with an expression vector utilizing a cspA promoter having no lac operator sequence. The present inventors have further found that, when the function of the lac operator is cancelled during incubation of the transformant transformed by the said plasmid and the incubating temperature is lowered at the same time, expression of the said enzyme can be induced. This shows that introduction of an operator sequence makes it possible to construct a low-temperature expression vector in which expression at ordinary temperature (37° C.) can be regulated.
The present inventors furthermore determined the minimum necessary region of the cspA promoter for being able to maintain its function whereupon they have accomplished the present invention.
The present invention will be summarized as follows. Thus, the first characteristic feature of the present invention relates to a vector which is characterized in containing each of the following elements:
(1) a promoter which shows its action in the host to be used;
(2) a regulatory region for regulating the action of the promoter of (1); and
(3) a region which codes for the 5′-untranslated region derived from cold-shock protein gene mRNA, or a region which codes for a region where substitution, deletion, insertion or addition of at least one base is applied to the said untranslated region.
The second characteristic feature of the present invention relates to a method for expression of the desired protein which is characterized in containing the following steps.
(1) a step where a host is transformed by the vector of the first characteristic feature of the present invention wherein a gene coding for the desired protein to be expressed is integrated;
(2) a step where the resulting transformant is incubated; and
(3) a step where action of promoter is induced via a function of a regulatory region and, at the same time, incubating temperature is made lower than the ordinary temperature to express the desired protein.
Further, the third characteristic feature of the present invention relates to an isolated cspA promoter which is characterized in containing a base sequence as shown in SEQ ID NO:5 in the Sequence Listing and consisting of a base sequence having 135 or less bases.
Now the present invention will be illustrated more specifically as hereunder.
There is no particular limitation on the promoter in (1) of the first characteristic feature of the present invention, and any promoter may be used so far as it has an activity of initiating the transcription of RNA in the host used. When such a promoter is used together with a region coding for the 5′-untranslated region derived from the cold-shock protein gene mRNA in the above (3), it can be used as a promoter which responds to low temperature. When a high transcription efficiency is desired during the induction of expression, a promoter derived from one of the above-mentioned cold-shock protein genes such as cspA, cspB, cspG and csdA is suitable for the present invention and, among them, a promoter derived from the cspA gene is particularly preferred.
With regard to the regulatory region of the above (2), there is no particular limitation so far as it is able to regulate the expression of a gene located downstream of the promoter of (1). For example, when a region which transcribes an RNA complementary to the mRNA transcribed by the promoter (i.e. an antisense RNA) is introduced into a vector, translation of the desired protein from the gene located downstream of the promoter can be inhibited. When transcription of the antisense RNA is placed under the control of an appropriate promoter which is different from that of (1), expression of the desired protein can be regulated. Alternatively, an operator existing in the expression regulatory regions of various genes may be utilized as well. For example, the lac operator derived from the E. coli lactose operon can be used in the present invention. The function of the lac operator can be cancelled by an appropriate inducer such as lactose or a substance having a similar structure or, preferably, isopropyl-β-D-thiogalactoside (IPTG), whereby the promoter is allowed to act. Such an operator sequence is usually arranged downstream of the promoter, near the initiation point of transcription.
The region coding for the 5′-untranslated region derived from the cold-shock protein mRNA as mentioned in (3) is a region which codes for the part of the mRNA on the 5′-side of the initiation codon. In the cold-shock protein genes of E. coli (cspA, cspB, cspG and csdA), such a region has been characteristically found [J. Bacteriol., volume 178, pages 4919-4925 (1996); J. Bacteriol., volume 178, pages 2994-2997 (1996)], and a stretch of 100 or more bases from the 5′-terminal of the mRNA transcribed from those genes is not translated into protein. This region is important for the low temperature dependency of the gene expression and, when the said 5′-untranslated region is added to the 5′-terminal of the mRNA of any protein, translation from the said mRNA into protein takes place under a low temperature condition. The 5′-untranslated region derived from the cold-shock protein mRNA may be one in which one or more substitutions, deletions, insertions or additions are applied to the base sequence so far as the function is maintained.
In the present specification, the term “region” stands for a certain range on a nucleic acid (DNA or RNA). The term “5′-untranslated region of mRNA” in the present specification stands for a region of the mRNA synthesized by transcription from DNA which is present at its 5′-side and does not code for protein. In the present specification, the said region will be referred to as “5′-UTR”, meaning 5′-untranslated region. Incidentally, unless otherwise stipulated, 5′-UTR stands for the 5′-untranslated region of the mRNA of the cspA gene of E. coli or a modified one thereof.
In the vector of the present invention, a region coding for the 5′-UTR derived from any of the above-listed cold-shock protein genes can be used, and that derived from the cspA gene can be used particularly appropriately. A region whose base sequence is partially modified may be used as well; for example, a region whose base sequence is modified by introduction of the operator mentioned in the above (2) can be used. As will be shown in the Examples later, it is possible to use a region coding for mRNA containing the base sequence shown in SEQ ID NO:1 in the Sequence Listing, such as a region coding for mRNA of the base sequence shown in SEQ ID NO:2, NO:3 or NO:4 in the Sequence Listing or, further, to use a region containing a region coding for mRNA in which such a sequence is modified. The region coding for the 5′-UTR of the cold-shock protein gene is arranged between the promoter of (1) and the initiation codon of the gene coding for the protein to be expressed, and an operator may be introduced into the said region. For example, the 5′-UTR of each of the base sequences shown in SEQ ID NO:2 to NO:4 in the Sequence Listing contains a lac operator sequence and is effective for expression of the desired protein selectively at low temperature.
When a base sequence having complementarity to the anti-downstream box sequence of the ribosomal RNA of the host used is contained downstream of the 5′-untranslated region in addition to the above constituting elements, the expression efficiency can be increased. In the case of E. coli, for example, an anti-downstream box sequence is present at the position of 1467-1471 of the 16S ribosomal RNA, and it is possible to use a region coding for an N-terminal peptide of a cold-shock protein containing a base sequence showing a high complementarity with that sequence. For example, the base sequence shown in SEQ ID NO:28 of the Sequence Listing or a sequence having a high homology with that sequence can be artificially introduced. It is effective to arrange the sequence having complementarity with the anti-downstream box sequence so that it starts at about the first to fifteenth base from the initiation codon. The gene coding for the desired protein is integrated into the vector so that the said protein is expressed as a fused protein with such an N-terminal peptide, or base substitutions are introduced by means of site-directed mutagenesis so that the gene coding for the desired protein itself has complementarity with the anti-downstream box sequence. When integration into the vector is carried out so as to express the desired protein as a fused protein, the said peptide may be of any length so far as the desired protein does not lose its activity. A vector for expression of such a fused protein may, for example, be one in which the connecting part is modified so that the desired protein can be isolated from the said fused protein, or one modified so that the desired protein is expressed as a fused protein with a peptide which can be utilized for purification or detection.
Further, the vector in which a sequence for completion of transcription (terminator) is arranged at the downstream of the desired protein gene is advantageous for a high expression of the desired protein because of improvement in stability of the vector.
The vector of the present invention may be any commonly used vector, such as a plasmid, a phage or a virus, so far as it can achieve its object as a vector. Further, the region other than the above-mentioned constituting elements contained in the vector of the present invention may, for example, contain a replication origin, a drug-resistance gene used as a selective marker, a regulatory gene necessary for the function of the operator such as the lacIq gene for the lac operator, etc. Furthermore, the vector of the present invention may be integrated into the genomic DNA of the host after being introduced into the host.
Expression of the desired protein using the vector of the present invention constructed as a plasmid can be carried out according to the following steps for example. Thus, when gene coding for the desired protein is cloned to the plasmid vector of the present invention so that an appropriate host is transformed by the said plasmid, it is possible to obtain a transformant for expressing the said protein. Since an operation of promoter is suppressed by an operator in such a transformant, the above protein is not expressed under a noninducible state and, even if the above protein is toxic to the host, the above vector can be held in the host in a stable manner.
After the above transformant is incubated at an ordinary incubating temperature such as 37° C. under a noninducible state, the action of the operator is cancelled to induce transcription, whereby the desired protein is expressed. In that case, when the incubating temperature is lowered before or together with the induction of transcription, formation of an inclusion body of the desired protein can be suppressed, whereby the desired protein can be obtained in an active form.
The present invention will now be further illustrated by showing the construction of a plasmid vector in a specific manner. Incidentally, in the present specification, the E. coli CspA protein, the region on the gene participating in expression of the said protein and the promoter region of the said gene will be referred to as “CspA”, “cspA gene” and “cspA promoter”, respectively, unless otherwise stipulated. The base sequence of the natural cspA gene, which has been registered and laid open under Accession No. M30139 in the GenBank gene database, is shown as SEQ ID NO:6 of the Sequence Listing. In the said sequence, base numbers 426-430 and 448-453 are core sequences of the promoter; base number 462 is a major initiation point for transcription (+1); base numbers 609-611 are the SD sequence (ribosome binding sequence); and base numbers 621-623 and 832-834 are the initiation codon and termination codon, respectively, of CspA. Accordingly, the part of the said sequence which codes for the 5′-UTR is the area of base numbers 462-620.
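The base numbers given above can be cross-checked against transcription-relative positions with a short sketch. It assumes the usual convention that there is no position 0, so that the base immediately upstream of +1 is −1; the function and variable names are illustrative only and do not appear in the specification.

```python
# Base number of the major transcription start (+1) in SEQ ID NO:6, from the text.
TSS_BASE = 462

def rel_to_base(rel: int) -> int:
    """Convert a position relative to the transcription start into a base number."""
    if rel == 0:
        raise ValueError("position 0 is not used in this convention")
    return TSS_BASE + rel - 1 if rel > 0 else TSS_BASE + rel

def base_to_rel(base: int) -> int:
    """Convert a base number of SEQ ID NO:6 into a position relative to +1."""
    return base - TSS_BASE + 1 if base >= TSS_BASE else base - TSS_BASE

# Features quoted in the text (1-based, inclusive base numbers).
FEATURES = {
    "5'-UTR": (462, 620),
    "SD sequence": (609, 611),
    "initiation codon": (621, 623),
}

def describe(features=FEATURES):
    """Return {name: (relative start, relative end, length in bases)}."""
    return {
        name: (base_to_rel(start), base_to_rel(end), end - start + 1)
        for name, (start, end) in features.items()
    }

if __name__ == "__main__":
    for name, (rel_start, rel_end, length) in describe().items():
        print(f"{name}: {rel_start:+d} to {rel_end:+d} ({length} bases)")
```

Under this convention the 5′-UTR spans +1 to +159, which agrees with the 159-base length stated for the 5′-untranslated region of cspA mRNA.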
First, the construction of the pMM031 series of plasmid vectors (pMM031 and pMM031F1), which use the cspA gene per se, the construction of foreign gene expression plasmids by introducing a foreign gene into those vectors, and the expression of protein using the said plasmids will be illustrated.
A detailed method for the construction of such expression plasmids is given in Example 1-(1). For example, plasmid pMM031 has a structure in which the region containing the lac promoter between the AflIII-EcoRI sites of plasmid vector pTV118N (manufactured by Takara Shuzo), which contains an ampicillin-resistance gene, the replication origin of pUC plasmid, etc., is substituted with a region consisting of the promoter region of the cspA gene, a region coding for the 5′-UTR and a region coding for the N-terminal part of CspA of 13 amino acid residues. Incidentally, in plasmid pMM031F1, the codon for asparagine, which is the 13th residue from the N-terminal of CspA, is changed to a codon for lysine. The promoter region of the cspA gene used for the pMM031 series is the region from −67 onward, counted from the initiation point of transcription of the said gene, and contains the region necessary for the function. Further, the region coding for the N-terminal 13 amino acid residues of CspA contains the downstream box sequence, which contributes to the high translation efficiency of the cspA gene under a low temperature condition. For those reasons, the pMM031 series are expression vectors which well reflect the high protein-expressing efficiency of the cspA gene under a low temperature condition.
The fact that the plasmids of the pMM031 series function as expression vectors of a low-temperature induction type and are able to express a useful protein as a protein of an active type was confirmed in Example 1-(2), using a reverse transcriptase derived from Rous-associated virus 2 (RAV-2) as an example. However, when E. coli transformed by a reverse transcriptase-expressing vector constructed by the use of the pMM031 series was incubated at 37° C., it was observed that, as compared with transformants carrying a plasmid of the pMM031 series containing no foreign gene, the resulting colonies were small and the growth rate of the cells was slow as well. This suggests that, when the cspA gene (particularly, the promoter of the said gene) is utilized, control of expression at 37° C. is insufficient, and that this is a problem for the production of protein.
It has been shown that the incompleteness of control of expression of the cspA gene at 37° C. is so serious that, when the protein to be expressed is highly toxic to the host, construction of the expression plasmid is impossible. As shown in Example 1-(3), when construction of a plasmid which expresses an endo-sulfated-fucose-containing polysaccharide degrading enzyme (Fdase 2) was attempted using a plasmid vector of the pMM031 series, construction of the said plasmid was impossible due to the toxicity of the expression product to the host. That the expressed Fdase 2 affected the host and thereby made construction of the expression vector impossible can be easily inferred from the fact that, as a by-product of the construction, a plasmid no longer able to express the said enzyme was obtained, in which the open reading frame was shifted by deletion of one or two bases in the gene coding for the said enzyme. Further, the toxicity of Fdase 2 to the host E. coli was shown by the fact that no transformant was obtained when an expression vector for the said enzyme was constructed using plasmid pET3d of the pET-system (manufactured by Novagen), mentioned above as one of the prior art, and transformation of the expression host E. coli BL21 (DE3), which has the T7 RNA polymerase gene, was attempted.
Now, the present inventors have developed the new expression vectors which are effective in actual use based upon those results and have found the plasmid vector of the present invention.
Thus, the present inventors have developed a low temperature expressing plasmid vector pMM037 which lowers the expression level under the noninducible state (37° C.) and is able to control the expression of the desired protein. The pMM037 has entirely the same structure as pMM031F1 of the pMM031 series except that a sequence of 31 bases, designed so as to form a functional lac operator, is inserted in place of the region of +2 to +18 downstream of the initiation point (+1) of transcription on pMM031F1. The base sequence of the 5′-UTR coded on the plasmid vector pMM037, i.e. that from the initiation point of transcription to the base immediately before the initiation codon for CspA, is shown in SEQ ID NO:2 of the Sequence Listing.
The method for the construction of this expression plasmid vector is given in Example 2-(1). Thus, it is possible to synthesize a primer CSA+1RLAC (the base sequence of the said primer is shown in SEQ ID NO:11 of the Sequence Listing) which is designed so as to form a functional lac operator downstream of the cspA promoter and contains the sequence of the region upstream of the initiation point of transcription of the cspA gene and the region of the lac operator. When a PCR is carried out using this primer, phosphorylated at its 5′-terminal, together with the primer CSA−67FN (the base sequence of the said primer is shown in SEQ ID NO:7 of the Sequence Listing) used for the construction of the plasmids of the pMM031 series, and using a plasmid pJJG02 containing a wild type cspA gene [J. Bacteriol., volume 178, pages 4919-4925 (1996)] as a template, a DNA fragment in which a lac operator region is arranged downstream of the promoter of the cspA gene can be obtained. When a restriction enzyme recognition sequence is designed near the terminal of each primer used at that time, like the NcoI site on the primer CSA−67FN and the NheI site on the primer CSA+1RLAC, subsequent construction and modification are convenient. The resulting DNA fragment is digested with NcoI and inserted between the NcoI and SmaI sites of the plasmid pTV118N (manufactured by Takara Shuzo), whereby a plasmid pMM034 can be constructed.
The resulting pMM034 is cleaved at the NcoI site and the AflIII site on pTV118N, the terminals are made blunt using Klenow fragments and a self-ligation is carried out whereupon a plasmid pMM035 wherefrom lac promoter derived from pTV118N is removed can be constructed.
After that, a sequence coding for the 5′-untranslated region of cspA mRNA can be inserted downstream of the lac operator region of pMM035. Thus, a PCR is carried out using pMM031F1 constructed in Example 1-(1) as a template and using a primer CSA+20FN (the base sequence of the primer CSA+20FN is shown in SEQ ID NO:12 in the Sequence Listing) and an M13 primer M4 (manufactured by Takara Shuzo), whereupon it is possible to obtain a DNA fragment extending from the 19th base downstream of the initiation point of transcription of the cspA gene to the multi-cloning site of pMM031F1. The DNA fragment is cleaved at the NheI site arranged on CSA+20FN and at the XbaI site in the multi-cloning site and, after that, the fragment is inserted between the NheI and XbaI sites of the already-prepared pMM035 in such a direction that each site is regenerated, whereupon pMM037 can be constructed.
The ability of the resulting plasmid pMM037 to control expression of the desired protein at ordinary temperature (37° C.) and its ability to express the desired protein at low temperature can be tested by introducing the gene which codes for the desired protein into the multi-cloning site derived from pTV118N on pMM037.
An expression plasmid having a gene coding for the above-mentioned endo-sulfated-fucose-containing polysaccharide degrading enzyme (Fdase 2) is unable to be even constructed by a plasmid vector of a pMM031 series having no operator. However, the plasmid pMFDA102 which is an expression plasmid constructed by insertion of the said gene into the plasmid vector pMM037 in such a manner that the same open reading frame as in the sequence coding for the N-terminal part of CspA is resulted can be retained in a stable manner in E. coli of a strain which highly expresses the lac repressor such as E. coli JM109. As such, it has been clarified that the above-mentioned plasmid vector has an ability of substantially effective control of expression.
Further, the resulting transformant is incubated at ordinary temperature (37° C.) and, when a turbidity suitable for induction is attained, the incubating temperature is lowered to, for example, 15° C. and, at the same time, an appropriate inducing agent such as isopropyl-β-D-thiogalactoside (hereinafter referred to as IPTG) is added to a final concentration of 1 mM, followed by incubation for an appropriate period. The cells obtained from the culture liquid are analyzed for the proteins expressed therein by means of SDS polyacrylamide gel electrophoresis (SDS-PAGE) to detect the band of the said fused polypeptide, whereby the ability of pMM037 to express the desired protein at low temperature can be confirmed. Alternatively, when the resulting cells are subjected to an ultrasonic treatment or the like to prepare a cell extract and a physiological activity of the desired protein contained in the said cell extract is measured, the amount of the desired protein expressed as an active type can be determined. E. coli transformed by the above-mentioned plasmid pMFDA102 expressed the active Fdase 2 protein upon the above inducing operation.
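The amount of inducer needed to reach the 1 mM final concentration mentioned above follows from the usual dilution relation, stock concentration × added volume = final concentration × (culture volume + added volume). The following is a minimal sketch only; the 100 mM stock and the 100 ml culture volume are hypothetical example values that are not given in the text.

```python
def stock_volume_ml(culture_ml: float, stock_mM: float, final_mM: float = 1.0) -> float:
    """Volume of inducer stock to add so the culture reaches final_mM.

    Solves stock_mM * v = final_mM * (culture_ml + v) for v, so the added
    volume itself is counted in the final volume.
    """
    if stock_mM <= final_mM:
        raise ValueError("stock must be more concentrated than the target")
    return final_mM * culture_ml / (stock_mM - final_mM)

if __name__ == "__main__":
    # Hypothetical example values: 100 ml culture, 100 mM IPTG stock.
    volume = stock_volume_ml(culture_ml=100.0, stock_mM=100.0)
    print(f"add {volume:.3f} ml of stock")
```

For small additions the result is close to the simpler estimate obtained by ignoring the added volume, which is why a 1:100 dilution of a 100 mM stock is commonly treated as giving about 1 mM.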
Incidentally, the cspA promoter still retained its inherent activity of initiating transcription at low temperature in spite of the fact that, in the construction of the above plasmid vector pMM037, the lac operator was introduced at a position (the position of +2 and thereafter) immediately after the initiation point of transcription of the cspA gene. From this fact, it is understood that the function of the cspA promoter resides in the region up to the initiation point of transcription. Accordingly, for the function of the cspA promoter, the region from the above position of −37 to the initiation point of transcription or, in other words, the area of base numbers 425-461 in the base sequence of the natural cspA gene shown in SEQ ID NO:6 in the Sequence Listing is essential. The base sequence of the essential region of the cspA promoter is shown in SEQ ID NO:5 of the Sequence Listing.
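The correspondence between the position −37 and base numbers 425-461 can be reproduced as follows. This is a sketch under the assumption that +1 is base number 462 of SEQ ID NO:6 and that there is no position 0; the placeholder string of N's stands in for the actual sequence, which is not reproduced here, and all names are illustrative.

```python
TSS_BASE = 462  # base number of +1 in SEQ ID NO:6, as stated in the text

def upstream_region(first_rel: int, tss_base: int = TSS_BASE):
    """Base numbers (1-based, inclusive) covering first_rel .. -1."""
    if first_rel >= 0:
        raise ValueError("first_rel must be negative (upstream of +1)")
    return tss_base + first_rel, tss_base - 1

def extract(seq: str, start: int, end: int) -> str:
    """Return bases start..end (1-based, inclusive) of seq."""
    return seq[start - 1:end]

if __name__ == "__main__":
    start, end = upstream_region(-37)
    print(start, end, end - start + 1)
    # Placeholder only: a run of N's, not the real cspA sequence.
    placeholder = "N" * 900
    print(len(extract(placeholder, start, end)))
```

The resulting 37-base span lies well within the limit of 135 or less bases recited for the isolated cspA promoter.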
The plasmid vector pMM037 constructed as such can be modified by introduction of changes such as deletion, addition, insertion and substitution of base and the thing where such a change is introduced into the constituting element of the present invention is within a coverage of the present invention as well. As hereunder, examples of the modification of the vector of the present invention using pMM037 as a fundamental structure carried out by the present inventors will be illustrated.
First, a deletion mutation can be introduced into the region coding for the 5′-UTR. As mentioned in Example 3-(1), it is possible to construct a plasmid in which the SD sequence of the cspA gene is connected immediately after the lac operator region of pMM037 or, in other words, a plasmid vector pMM036 coding for the 5′-UTR coded on pMM037, as shown in SEQ ID NO:2 of the Sequence Listing, from which the portion of base numbers 33-161 is deleted. This pMM036 has entirely the same structure as pMM037 except for the deletion mutation introduced into the sequence coding for the 5′-UTR.
Next, the number of amino acid residues of the N-terminal part of CspA to be fused with the protein to be expressed can be changed. As shown in Example 3-(2), the region coding for the N-terminal part of CspA on pMM037 is extended to the region coding for the entire amino acid sequence of CspA (70 amino acid residues) and a multi-cloning site is arranged after it, whereby it is possible to construct a plasmid vector pMM038 in which the desired gene is expressed as a polypeptide fused with the 70 amino acid residues of CspA. This pMM038 has entirely the same structure as pMM037 except that it contains the full-length sequence coding for CspA, which is expressed as part of the fused polypeptide.
It is also possible to introduce a substitution mutation into the sequence coding for the 5′-UTR. As mentioned in Example 3-(3), it is possible to construct a plasmid vector pMM047 in which a substitution mutation of 6 bases is introduced into the region corresponding to +20 to +26, counting from the transcription initiation point of the natural cspA gene, within the region coding for the 5′-UTR on pMM037. This pMM047 has entirely the same structure as pMM037 except for the above substitution mutation. Incidentally, E. coli JM109 transformed by the plasmid vector pMM047 has been named and designated as Escherichia coli JM109/pMM047, deposited as of Oct. 31, 1997 at the National Institute of Bioscience and Human Technology, Ministry of International Trade and Industry (1-3, Higashi 1 chome, Tsukuba-shi, Ibaraki-ken, Japan; post office code: 305-8566) as FERM P-16496 and deposited at the same institute as FERM BP-6523 (date of request for transfer to the international deposition: Sep. 24, 1998). The base sequence of the 5′-UTR coded on the plasmid vector pMM047, that is, the sequence from the transcription initiation point to the base immediately before the initiation codon for CspA, is shown in SEQ ID NO:3 of the Sequence Listing.
It is also possible to introduce two or more such mutations at the same time. As shown in Example 3-(4), it is possible to construct a plasmid vector pMM048 in which a deletion mutation of 30 bases is further introduced into the sequence coding for the 5′-UTR of the above plasmid pMM047. This pMM048 has entirely the same structure as pMM037 except that it contains the same 6-base substitution as pMM047 and a deletion of the region corresponding to +56 to +85, counting from the transcription initiation point of the natural cspA gene or, in other words, the region of base numbers 70-99 in the base sequence shown in SEQ ID NO:3 of the Sequence Listing. The base sequence of the 5′-UTR coded on the plasmid vector pMM048 is shown in SEQ ID NO:4 of the Sequence Listing.
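The two numbering systems used for the deletion in pMM048 can be checked by simple arithmetic. The following is merely an illustrative sketch: the function names are hypothetical, the toy sequence is not a real cspA sequence, and the constant offset of 14 bases is inferred only from the two ranges stated above (+56 to +85 in the natural cspA numbering and base numbers 70-99 in SEQ ID NO:3), so it is assumed to hold only for that stretch.

```python
def span_length(start, end):
    """Number of bases in an inclusive, 1-based span."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


def shift_span(span, offset):
    """Translate an inclusive span from one numbering system to another."""
    return (span[0] + offset, span[1] + offset)


def delete_span(sequence, start, end):
    """Return the sequence with the inclusive, 1-based span removed."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("span lies outside the sequence")
    return sequence[:start - 1] + sequence[end:]


# Ranges quoted in the text for the deletion introduced into pMM048.
NATURAL_DELETION = (56, 85)   # counted from the transcription initiation point
SEQ3_DELETION = (70, 99)      # base numbers in SEQ ID NO:3

# Assumption: a constant shift between the two numberings in this stretch.
OFFSET = SEQ3_DELETION[0] - NATURAL_DELETION[0]

if __name__ == "__main__":
    print("deleted bases:", span_length(*NATURAL_DELETION))
    print("span in SEQ ID NO:3 numbering:", shift_span(NATURAL_DELETION, OFFSET))
    # Toy sequence only, to show how a 1-based inclusive deletion behaves.
    print(delete_span("ACGTACGTAC", 3, 5))
```

The span of +56 to +85 contains 30 bases, which agrees with the 30-base deletion described for pMM048.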
From the fact that the above-mentioned plasmid vectors pMM047 and pMM048 retain the ability to express protein at low temperature, it is shown that the mutations introduced into the 5′-UTR inherent to the cspA gene in the construction of those two plasmids do not affect the function. Therefore, it is shown that the regions of the 5′-UTR derived from the cspA gene which are essential for the expression of protein at low temperature are the regions of +27 to +55 and +86 to +159, counted from the transcription initiation point of the natural cspA gene, which are coded on the plasmid pMM048. The base sequence of the said regions is shown in SEQ ID NO:1 of the Sequence Listing.
The ability to control expression of the desired protein at ordinary temperature (37° C.) and the effectiveness of the ability to express the desired protein at low temperature of the modified plasmid vectors of pMM037, i.e. pMM036, pMM038, pMM047 and pMM048, can be evaluated, for example, by utilizing the β-galactosidase gene (lac Z gene), which is widely used for evaluating the expressing ability of expression vectors.
Thus, as mentioned in Example 3-(5), DNA fragments of about 6.2 kbp containing the lac Z gene obtained from a plasmid pKM005 [Experimental Manipulation of Gene Expression, pages 15-32, edited by M. Inouye, published by Academic Press, New York, 1983] are inserted into the plasmid vector pMM037 and the modified plasmids thereof, whereupon it is possible to construct a fused β-galactosidase expression vector in which 12 amino acid residues at the N-terminal of CspA and 10 amino acid residues derived from the multi-cloning site are connected at the tenth amino acid residue of β-galactosidase. In the case of pMM038, the vector codes for a fused β-galactosidase in which 70 amino acid residues at the N-terminal of CspA and 9 amino acid residues derived from the multi-cloning site are connected at the tenth amino acid residue of β-galactosidase. The resulting plasmids containing the lac Z gene are named plasmid pMM037lac, pMM036lac, pMM038lac, pMM047lac and pMM048lac, respectively.
E. coli JM109 transformed by each of the plasmids is incubated at ordinary temperature (37° C.) and, when a turbidity suitable for induction is reached, the incubating temperature is lowered to 15° C., for example, and, at the same time, an appropriate inducing agent such as IPTG is added to a final concentration of 1 mM. Incubation is then continued for an appropriate period and the β-galactosidase activity in the resulting culture is measured, whereby the ability to express the protein at low temperature can be compared. When the cells just before the induction are used, the expressed amount in the noninduced state at 37° C. can be compared as well.
The β-galactosidase activity can be measured by a method described in “Experiments in Molecular Genetics”, pages 352-355, edited by J. H. Miller and published by Cold Spring Harbor Laboratory in 1972.
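Activity measured by this type of assay is customarily expressed in so-called Miller units. The following is a minimal sketch, assuming the commonly cited form of the calculation, 1000 × (OD420 − 1.75 × OD550) / (t × v × OD600); the formula itself is not given in the present text, and the function name, parameter names and numerical readings are hypothetical values used only for illustration.

```python
def miller_units(od420, od550, od600, minutes, volume_ml):
    """Beta-galactosidase activity in Miller units.

    od420     : absorbance of the reaction mixture at 420 nm (o-nitrophenol)
    od550     : absorbance at 550 nm (light scattering by cell debris)
    od600     : cell density of the culture at the time of sampling
    minutes   : reaction time in minutes
    volume_ml : volume of culture used in the assay, in ml
    """
    if minutes <= 0 or volume_ml <= 0 or od600 <= 0:
        raise ValueError("minutes, volume_ml and od600 must be positive")
    # 1.75 * OD550 corrects the 420 nm reading for scattering by debris.
    return 1000.0 * (od420 - 1.75 * od550) / (minutes * volume_ml * od600)


if __name__ == "__main__":
    # Hypothetical readings, not data from the present specification.
    print(miller_units(od420=0.9, od550=0.02, od600=0.5,
                       minutes=10, volume_ml=0.1))
```

Because the value is normalized by cell density, reaction time and sample volume, cultures sampled before induction at 37° C. and after induction at 15° C. can be compared on the same scale.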
As shown in Table 1, the β-galactosidase activity at 37° C. of E. coli transformed by any of the plasmids is at the same level as that of pTV118Nlac, used as a control, in which the lac Z gene is introduced downstream of the lac promoter, and it is thus understood that the expression at 37° C. is effectively controlled. Incidentally, the β-galactosidase activity detected at 37° C. at that time is at a level that results from slight induction by lactose, etc. present as contaminants in the LB medium (1% tryptone, 0.5% yeast extract and 0.5% NaCl; pH 7.0) used for the incubation.
On the other hand, in E. coli transformed by any of the plasmids, an increase in the β-galactosidase activity was noted upon a temperature shift to 15° C. and addition of an inducing agent. This shows that each plasmid has an ability of high expression of the desired protein at low temperature. Incidentally, in the case of the plasmid pMM036, where most of the 5′-UTR derived from cspA gene mRNA is lost, the expressed amount of β-galactosidase is low as compared with the other plasmids.
Further, the results obtained for pMM047lac and pMM048lac show that the mutations introduced into the 5′-UTR of the mRNA for which those plasmids code do not adversely affect the expression of protein at low temperature, or that the region into which those mutations are not introduced, whose base sequence is shown in SEQ ID NO:1 of the Sequence Listing, is essential for its function.
Furthermore, the expressed amount of β-galactosidase at each temperature was tested for the transformants obtained with those plasmids, and it was found that, as compared with pTV118Nlac (a control), all of the plasmids pMM038lac, pMM037lac and pMM047lac showed a higher expressed amount at temperatures as low as 10° C. or 15° C., an expressed amount of the same level at 20° C. and a lower expressed amount at 37° C. The result shows that the 5′-UTR of the mRNA for which those plasmids code is effective for the expression of protein mostly at temperatures of 15° C. and lower. On the other hand, the plasmid pMM048lac showed a higher expressed amount at temperatures of 20° C. or lower and showed a similar expressed amount even at 37° C. as compared with pTV118Nlac. This shows that, as a result of the mutation introduced into pMM048, the said plasmid acquired a high ability for expression of protein at both ordinary and low temperatures.
On the other hand, it goes without saying that, in the vector of the present invention, the region other than the constituting elements of the present invention may have various functions. For example, the vector of the present invention may contain a multi-cloning site containing substantially no termination codon and a transcription terminator for stabilization of the plasmid. As mentioned in Example 4, it is possible to construct a series of vectors each having a different open reading frame at the multi-cloning site, in which the site containing the initiation codon of the cspA gene is converted to an NcoI site or an NdeI site, the multi-cloning site connected to the region coding for the N-terminal of CspA is changed to a sequence containing substantially no termination codon, the downstream thereof has a sequence in which a termination codon appears in each of the three open reading frames, and the further downstream thereof contains a transcription terminator region derived from the cspA gene. A plasmid containing pMM047 as a fundamental skeleton is named pCold01NC series (including NcoI site) or pCold01ND series (including NdeI site) plasmid, while a plasmid containing pMM048 as a fundamental skeleton is named pCold02NC series or pCold02ND series plasmid. In such a series of plasmids having a multi-cloning site which contains substantially no termination codon and in which each open reading frame is different, insertion of a foreign gene is easy, whereby an expression vector can be easily constructed.
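The two sequence properties described above, namely a multi-cloning site containing no termination codon and a downstream sequence in which a termination codon appears in each of the three open reading frames, can be checked mechanically. The following is only an illustrative sketch: the function names are hypothetical, only the given strand is scanned, and the two toy sequences are invented for the example and are not sequences of the pCold series plasmids.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_positions_by_frame(dna):
    """Map each forward reading frame (0, 1, 2) to the 0-based start
    positions of the termination codons found in that frame."""
    dna = dna.upper()
    hits = {0: [], 1: [], 2: []}
    for frame in range(3):
        # Step through complete codons only.
        for i in range(frame, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                hits[frame].append(i)
    return hits


def is_stop_free(dna):
    """True when no forward frame contains a termination codon."""
    return all(not found for found in stop_positions_by_frame(dna).values())


def stops_in_every_frame(dna):
    """True when each of the three forward frames contains a termination codon."""
    return all(found for found in stop_positions_by_frame(dna).values())


if __name__ == "__main__":
    toy_cloning_site = "ATGGCCGGATCC"   # hypothetical stop-free stretch
    toy_stop_region = "TAACTAGCTGA"     # hypothetical three-frame stop stretch
    print(is_stop_free(toy_cloning_site))
    print(stop_positions_by_frame(toy_stop_region))
    print(stops_in_every_frame(toy_stop_region))
```

A stretch that passes the first check lets an inserted gene be read through in whichever frame it is cloned, while a stretch that passes the second check ensures that translation terminates regardless of the frame.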
Whether those pCold01 series and pCold02 series plasmids have expressing abilities similar to those of their fundamental skeletons, i.e. plasmids pMM047 and pMM048, respectively, can be evaluated by utilizing the lac Z gene. As shown in Table 5, E. coli transformed by a plasmid prepared by inserting the lac Z gene into each of the above-mentioned plasmids shows a β-galactosidase expression pattern similar to that of E. coli transformed by pMM047lac or pMM048lac shown in Table 1, respectively, and it is shown that the above-mentioned plasmids retain an expressing ability similar to that of the plasmid pMM047 or pMM048, respectively.
Since all of the vectors of the present invention specifically exemplified hereinabove use the lac operator as an operator, it is necessary to use a strain of E. coli expressing a high amount of lac repressor (lac Iq strain), such as E. coli JM109, when expression of a gene is the object. Other operators may be used for the vector of the present invention and, in that case, it is a matter of course that a control method suitable for the said operator is used. Further, as is obvious to persons skilled in the art, even when the lac operator is used as in the case of the above pCold series plasmids, the limitation on the host can be eliminated by introducing the lac repressor gene (lac I gene) onto the plasmid.
For example, as mentioned in Example 5, it is possible to construct pCold03 series and pCold04 series containing lac I gene. Each of those plasmids has the entirely same structure as pCold01 series and pCold02 series, respectively except that lac I gene is contained therein. It is also possible to similarly construct a plasmid using lac Iq gene, which is a gene expressing a high amount of lac repressor instead of lac I gene, such as pCold05 series and pCold06 series plasmids.
The ability to control expression of the desired protein of the plasmids containing the lac I gene or lac Iq gene thus constructed, and also the effectiveness of their ability to express the desired protein at low temperature, can be easily evaluated by inserting the lac Z gene into those plasmids and using E. coli DH5α, which has no lac I gene, as a host. Table 6 shows the effect of the lac I and lac Iq genes. At 37° C., which is a noninduced state, expression is not fully controlled in the case of pCold01NC2lac, which has no lac I gene, and a high β-galactosidase activity is noted. Incidentally, in the case of pCold02 (a derivative of pMM048), which has a higher expressing ability than pCold01 at 37° C., no transformant is obtained using E. coli DH5α as a host. On the contrary, in the case of pCold03NC2lac and pCold04NC2lac, which have the lac I gene, expression at 37° C. is effectively controlled and, further, in the case of pCold05NC2lac and pCold06NC2lac, where the lac Iq gene is present, expression is more effectively suppressed, whereby an effective control is found to be achieved. In addition, in those plasmids, there is no substantial change in the ability to express the desired protein in an induced state. Therefore, it is shown that, when the lac I gene or lac Iq gene is introduced into the vector of the present invention containing the lac operator as a constituting element, there is no limitation on the host regardless of whether or not it has lac repressor.
It is also possible that, in order to improve the expression efficiency of the desired gene, a base sequence (downstream box sequence) having a high complementarity to anti-downstream box sequence existing in 16S ribosomal RNA is introduced into the vector of the present invention. The downstream box sequence existing in the region which codes for the N-terminal part of E. coli CspA has only 67% of complementarity to the above-mentioned anti-downstream box sequence. When this is made into a base sequence having higher complementarity or preferably 80% or more complementarity, it is possible that the gene which is connected in its downstream is expressed in higher efficiency.
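The degree of complementarity referred to above can be computed as the percentage of positions at which the downstream box pairs with the anti-downstream box when the two strands are aligned antiparallel. The following is a minimal sketch under stated assumptions: only Watson-Crick pairs are counted (G-U wobble pairs are ignored), both sequences are written 5′ to 3′ and have the same length, the function names are hypothetical, and the 15-base toy sequence is invented for illustration and is not the actual anti-downstream box of 16S ribosomal RNA.

```python
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def to_rna(seq):
    """Upper-case the sequence and write T as U."""
    return seq.upper().replace("T", "U")


def percent_complementarity(box, anti_box):
    """Percentage of Watson-Crick pairs between two equal-length strands,
    both written 5'->3' and aligned antiparallel."""
    box_rna = to_rna(box)
    anti_rna = to_rna(anti_box)
    if not box_rna or len(box_rna) != len(anti_rna):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        1
        for b, a in zip(box_rna, reversed(anti_rna))
        if WATSON_CRICK.get(b) == a
    )
    return 100.0 * paired / len(box_rna)


def perfect_partner(anti_box):
    """Sequence (5'->3') that is fully complementary to anti_box."""
    return "".join(WATSON_CRICK[a] for a in reversed(to_rna(anti_box)))


if __name__ == "__main__":
    toy_anti_box = "ACGUACGUACGUACG"          # hypothetical, 15 bases
    perfect = perfect_partner(toy_anti_box)
    # Break the first five pairs to imitate a weakly complementary box.
    swap = {"A": "C", "C": "A", "G": "U", "U": "G"}
    weakened = "".join(swap[b] for b in perfect[:5]) + perfect[5:]
    print(percent_complementarity(perfect, toy_anti_box))
    print(round(percent_complementarity(weakened, toy_anti_box)))
```

For a 15-base box, 10 paired positions correspond to about 67% complementarity and 12 paired positions correspond to 80%.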
Further, a base sequence coding for a tag sequence which is a peptide for making the purification of the expressed desired gene product easier or a protease-recognizing amino acid sequence utilizable for removal of an excessive peptide in the desired gene product such as a tag sequence can be introduced into the vector of the present invention.
With regard to a tag sequence for the purification, a histidine tag consisting of several histidine residues, maltose-binding protein, glutathione-S-transferase, etc. may be used. A protein to which a histidine tag is added can be easily purified using a chelating column and, with regard to other tags, the protein may also be easily purified using a ligand having a specific affinity for them. Examples of the protease which is utilized for removal of an excessive peptide are factor Xa, thrombin and enterokinase, and it is possible to introduce a base sequence coding for an amino acid sequence which is specifically cleaved by those proteases into the vector of the present invention.
For example, Example 6 describes plasmids (pCold07 series and pCold08 series) into which are introduced a downstream box sequence which is completely complementary to the anti-downstream box sequence existing in 16S ribosomal RNA, and a base sequence coding for a recognition amino acid sequence of factor Xa and a histidine tag consisting of six histidine residues. When the protein expressing ability of the said plasmids is evaluated using the lac Z gene, it is shown that the expressed β-galactosidase activity significantly increases as compared with a plasmid having a downstream box sequence exhibiting a low complementarity. In addition, although the expressed amount of β-galactosidase before induction more or less increases, it is within an allowable level and, as mentioned above, it can be effectively suppressed by changing the lac I gene on those plasmids to the lac Iq gene.
When pCold07 series or pCold08 series plasmids are used, the desired protein is expressed as a protein fused with a leader peptide containing a peptide coded by the downstream box sequence, a histidine tag and a recognition amino acid sequence of factor Xa. Since the fused protein contains a histidine tag, it can be purified in a single step using a chelating column. After that, the said protein is treated with factor Xa to cleave the leader peptide from the desired protein and is then passed through a chelating column once again, whereupon only the desired protein, from which the leader peptide has been removed, can be obtained.
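The cleavage step can be pictured at the sequence level as follows. This is a minimal sketch, assuming the recognition sequence Ile-Glu-Gly-Arg (IEGR), after which factor Xa is generally known to cleave; the present text does not name the recognition sequence, and the function names and the toy fused protein below are hypothetical and do not represent the actual leader peptide of the pCold07 or pCold08 series.

```python
FACTOR_XA_SITE = "IEGR"   # assumed recognition sequence; cleavage after Arg


def cleave_after_site(fusion, site=FACTOR_XA_SITE):
    """Split a fused protein (one-letter code) after the first occurrence
    of the recognition site and return (leader, desired_protein)."""
    index = fusion.find(site)
    if index == -1:
        raise ValueError("recognition site not found in fused protein")
    cut = index + len(site)
    return fusion[:cut], fusion[cut:]


def has_histidine_tag(peptide, length=6):
    """True when the peptide contains a run of `length` histidine residues."""
    return "H" * length in peptide


if __name__ == "__main__":
    toy_leader = "MNHKV" + "HHHHHH" + "IEGR"   # hypothetical leader peptide
    toy_target = "MKTAYIAK"                    # hypothetical desired protein
    leader, target = cleave_after_site(toy_leader + toy_target)
    # The leader keeps the histidine tag, so it binds to the chelating
    # column in the second pass while the desired protein flows through.
    print(leader, has_histidine_tag(leader))
    print(target, has_histidine_tag(target))
```

Since the histidine tag remains on the leader peptide after cleavage, the second passage through the chelating column retains the leader and lets the desired protein pass, which corresponds to the two-step procedure described above.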