The subject of the invention is a method of transcription, which makes it possible to synthesize RNA strands complementary to an RNA template, as well as new RNA polymerases which make it possible to carry out this method.
The method of the invention leads to the amplification of RNA present in small quantities in a biological sample, and thus allows the detection and/or quantification of the RNA in the sample, or the sequencing of the product of amplification, in particular in the field of microbiology and virology, and more generally in the field of medical diagnosis. The method of the invention may also be used in the synthesis of RNA probes.
It is known that in microbiology and in virology, the microorganisms which it is sought to identify are often viable bacteria (therefore containing more RNA than DNA) or RNA viruses such as the HIV and HCV viruses. It is also known that in various pathologies, it is often advantageous to monitor the variations in the expression of genes, and therefore in the synthesis of messenger RNA.
It is therefore important to be able to have a simple and effective method of amplifying an RNA target.
The PCR method, which makes it possible to cyclically amplify a DNA target, uses a single enzyme but requires the production of temperature cycles, generally at three different temperatures. The PCR method may be adapted to the amplification of an RNA target by adding an additional enzymatic activity of RNA-dependent DNA polymerase, which further complicates this method.
The so-called NASBA/TMA method of amplification has the advantage of being an isothermal method, but requires the use of three enzymatic activities (RNA-dependent DNA polymerase, RNase H and DNA-dependent RNA polymerase) carried by two or three enzymes.
It is therefore desirable to be able to have a simple and automatable method of amplification of RNA, and in particular an isothermal method using only one enzyme.
To avoid the disadvantages, which have just been mentioned, of known amplification techniques, it therefore appears to be necessary to use, for the amplification of RNA, an RNA-dependent RNA polymerase activity.
Unfortunately, the known natural RNA-dependent RNA polymerases (RNAdRNAp) are not suitable for such a use because they have specific requirements as regards the RNA template, and their activity requires the presence of protein cofactors (also called auxiliary protein factors or associated protein factors).
It has now been discovered that some known DNA-dependent RNA polymerases are capable of transcribing a single-stranded RNA in the presence of a double-stranded DNA promoter. Furthermore, some of these enzymes, which are transformed by mutation, are capable of synthesizing a transcriptional product with a better yield when the template consists of RNA than when the template consists of DNA.
In the present application, the term “transcription” designates the synthesis of several strands of RNA in the presence of a polynucleotide template and of ribonucleoside triphosphates, in an appropriate reaction medium and under conditions allowing the catalytic activity of an RNA polymerase to be exerted. The transcription occurs by synthesis of a complementary and antiparallel copy of the template. The strand of the template which is copied is called the transcribed strand or the template strand. The synthesis of the RNA progresses in the 5′-3′ direction.
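By way of illustration only, the notion of a complementary and antiparallel copy may be sketched as follows. The function name `transcribe` and the lookup table are assumptions introduced for this sketch; it models only base pairing, not initiation, yield or any enzymatic behavior.

```python
# Watson-Crick partners written as RNA bases.
# "T" is accepted so that a DNA template can be handled as well.
RNA_COMPLEMENT = {"A": "U", "U": "A", "T": "A", "G": "C", "C": "G"}


def transcribe(template_5to3: str) -> str:
    """Return the transcript, written 5'->3', of a template written 5'->3'.

    Hypothetical helper: the template is read from its 3' end toward its
    5' end, so the transcript is the reverse complement of the template.
    """
    template = template_5to3.upper()
    # Walk the template 3'->5' while the transcript grows 5'->3'.
    return "".join(RNA_COMPLEMENT[base] for base in reversed(template))
```

For example, an RNA template `AUGC` and the comparable DNA template `ATGC` both give the transcript `GCAU` in this model.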
It is known that some RNA polymerases function under the control of a promoter. A promoter is a double-stranded nucleotide sequence recognized by the RNA polymerase and necessary for the initiation of transcription.
It should be recalled that when the template strand is linked to the promoter, the first nucleotide transcribed on the template strand, linked by its 3′ end to the 5′ end of one of the strands of the promoter, is designated by +1. The strand of the promoter which is linked to the template strand is called the antisense strand. The other strand of the promoter, which is complementary to the antisense strand, and hybridized to it, is called the sense strand. The successive nucleotides which are situated on the side of the promoter, with respect to nucleotide +1, are, starting from +1, numbered −1, −2, −3, and the like.
The position −1 therefore corresponds to the 5′ end of the antisense strand of the promoter, and to the 3′ end of the sense strand. However, some authors include the nucleotide sequence corresponding to the region where the transcription starts (in particular the sequence from +1 to +6, for which a consensus sequence can generally be defined) in the definition of the sequence of the promoter.
On the template strand, the positions of the successive nucleotides copied, starting from +1, and therefore in the 3′-5′ direction, are noted +2, +3, and the like.
In the text which follows, the terms sense strand and antisense strand are generally used for the promoter itself (positions numbered negatively), and the term non-template strand is used for any strand linked to the 3′ end of the sense strand, and the term template strand for any strand linked to the 5′ end of the antisense strand or for any strand hybridized to the non-template strand. In a given polynucleotide strand, “upstream region” refers to a region situated on the side of the 5′ end, and “downstream region” a region situated on the side of the 3′ end. However, in the domain of transcription under the control of a promoter, and without taking into consideration a particular strand, “upstream” region traditionally refers to the region which, relative to position +1, is on the side of the promoter (positions indicated by negative numbers), and “downstream” region the region situated on the side of the template copied (positions indicated by positive numbers), such that the downstream direction then corresponds to the 3′-5′ direction on the template strand, and to the 5′-3′ direction on the newly-synthesized RNA strand.
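The numbering convention recalled above, in which position −1 is immediately adjacent to position +1 and no position 0 exists, may be illustrated by the following sketch. The helper names `position_labels` and `span_length` are assumptions made for this illustration and do not form part of the method.

```python
def position_labels(n_promoter: int, n_template: int) -> list[str]:
    """Labels along the upstream -> downstream axis; there is no position 0."""
    upstream = [f"-{i}" for i in range(n_promoter, 0, -1)]    # -n ... -1
    downstream = [f"+{i}" for i in range(1, n_template + 1)]  # +1 ... +n
    return upstream + downstream


def span_length(start: int, end: int) -> int:
    """Number of nucleotides from `start` to `end` inclusive."""
    if start == 0 or end == 0 or start > end:
        raise ValueError("positions are nonzero and start must not exceed end")
    length = end - start + 1
    # Remove the nonexistent position 0 when the span crosses the -1/+1 junction.
    return length - 1 if start < 0 < end else length
```

Under this convention, a span running from −17 to +6 covers 23 nucleotides, and a span from −17 to −1 covers 17.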
The template strand is not necessarily linked to the 5′ end of the antisense strand of the promoter. However, it should, in this case, be hybridized to a complementary and antiparallel strand (non-template strand) which is itself linked by its 5′ end to the 3′ end of the sense strand of the promoter; see ZHOU W. and DOETSCH P. W., Biochemistry 33, 14926-14934 (1994) and ZHOU W. et al., Cell 82, 577-585 (1995). In such a case, the transcription may start in any position, which may range from +1 to +24, corresponding to the 3′ end of the template strand or of the part of the template strand hybridized with the non-template strand.
Compared with bacterial, eukaryotic or mitochondrial RNA polymerases, the phage RNA polymerases are very simple enzymes. The best known among them are the RNA polymerases of the T7, T3 and SP6 bacteriophages. The bacteriophage RNA polymerase has been cloned; see in particular U.S. Pat. No. 4,952,496. These enzymes are highly homologous to one another and consist of a single subunit. The natural promoters specific for the RNA polymerases of the T7, T3 and SP6 phages are well known. The sequencing of the whole genome of the bacteriophage T7 (Dunn et al., J. Mol. Biol. 166, 477-535 (1983)) has made it possible to define the existence of 17 promoters on the DNA of this phage. Comparison of these 17 sequences shows that 23 contiguous nucleotides situated between positions −17 and +6 relative to the site of initiation (position +1) of transcription, are highly conserved. These nucleotides are even identical in five so-called class III promoters, which are the most efficient in particular in vitro. Likewise, many promoter sequences specific for the T3 RNA polymerase also exhibit a very high homology, in particular between positions −17 and +6. Moreover, several different sequences of promoter for phage SP6 RNA polymerase have been identified and also exhibit a high homology; see Brown J. E., et al., Nucleic Acids Res. 14, 3521-3526 (1986).
It is therefore possible to consider that the various phage RNA polymerases mentioned above form part of a family of RNA polymerases which recognize promoters having a consensus sequence from position −17 to position +6, and in particular from position −17 to position −1.
The method of the invention makes it possible to transcribe any RNA sequence because the RNA polymerases, capable of transcribing RNA under the control of a promoter, which are described in the present application, may transcribe RNA without a high sequence specificity. It is known, nevertheless, that some sequences for initiation of transcription, in particular from position +1 to position +6, are more favorable than others for obtaining transcripts of the expected length with a given phage RNA polymerase, in the case of the transcription of DNA; see for example Milligan J. F. et al., Nucleic Acids Research, 15, 8783-8798 (1987). The RNA polymerases capable of transcribing RNA which are described in the present application may also function with variable yields according to the sequence of the region for initiation of transcription. The sequences which are most suitable for a given RNA polymerase may be determined, where appropriate, by simple routine experiments similar to those described by Milligan et al. in the article which has just been mentioned. In addition, as will be seen below, the method of transcription of the invention makes it possible, where appropriate, either to start the transcription in a favorable region of the RNA to be transcribed, or to provide a reagent-promoter which already contains a region for initiation of transcription having a sequence favorable for a given RNA polymerase.
The subject of the present invention is therefore a method of amplifying any RNA target sequence, by transcription under the control of a promoter, in an RNA sample comprising said target sequence, in which said sample is brought into contact:
with a reagent capable of hybridizing with said RNA comprising said target sequence,
and with an enzymatic system comprising an RNA-dependent RNA polymerase activity, under conditions allowing the hybridization of said reagent with said RNA comprising said target sequence and under conditions allowing the functioning of said RNA-dependent RNA polymerase activity; in which said reagent contains:
(i) a first nucleotide strand comprising: a) a first nucleotide segment capable of playing the role of sense strand of a promoter for said RNA polymerase activity and b), downstream of said first segment, a second nucleotide segment comprising a sequence capable of hybridizing with a region of said RNA, and
(ii) in the hybridized state on the first strand, a second nucleotide strand comprising a third nucleotide segment capable of hybridizing with said first segment so as to form with it a functional double-stranded promoter;
and in which said RNA polymerase activity is capable of transcribing an RNA template, in the presence of said reagent hybridized with said template, in the absence of associated protein factor and in the absence of a ligase activity.
The general conditions allowing the hybridization of nucleotide strands are known, and specific conditions may be easily determined, by routine experiments, for strands of a given sequence. The conditions allowing the functioning of the RNA polymerase activity, in the presence of ribonucleoside triphosphates, may also be easily determined by experiments, optionally with the aid of the information provided in the experimental section below.
The 3′ end of the first segment corresponds to position −1 in the transcriptional system used. The first segment contains a sufficient number of nucleotides to be able, in the hybridized state, to play the role of a promoter for an RNA polymerase. According to a specific embodiment, the first segment contains at least 9 nucleotides.
In patent FR 2,714,062, it has been shown that short sequences of 6 to 9 consecutive nucleotides chosen from the −12 to −4 region of the sense strand of a promoter for a phage RNA polymerase are capable of playing the role of functional promoters in the transcription of a DNA target sequence.
The reagent used in the method of the invention may also exhibit at least one of the following characteristics:
said third segment is flanked, at its upstream end, by a fourth nucleotide segment which is shorter than said second segment of the first strand;
said fourth segment is capable of hybridizing with a portion opposite said second segment;
said first and third segments consist of DNA;
said third and fourth segments consist of DNA or RNA.
The third segment may have the same length as the first segment. It may also be shorter or longer, but its 5′ end must correspond to position −1 (that is to say the position immediately preceding the position for initiation of transcription in the case where the template strand is linked to the promoter), when it is hybridized with the first segment.
When the second strand of the reagent does not contain the fourth segment, the reagent may be used in particular to transcribe an RNA whose 3′ end region, or a region close to the 3′ end, has a known sequence, and in this case the second nucleotide segment of the first strand is constructed so that said RNA, in the vicinity of its 3′ end, is capable of hybridization with at least part of the sequence of said second nucleotide segment. The 3′ end of the part of the RNA to be transcribed which is hybridized to the second segment may be contiguous to the 5′ end of the third segment, or it may be distant therefrom by a number x of nucleotides (counted on the second segment), x representing zero or an integer from 1 to 24. Of course, the length of the second segment (in number of nucleotides) is greater than x, in order to be able to ensure the binding of the RNA template to be transcribed, by hybridization with a downstream region of the second segment.
The fourth segment, containing for example from 1 to 18 nucleotides, and in particular from 1 to 12 nucleotides, preferably has a sequence chosen so as to favor the initiation of transcription for a given RNA polymerase (see in particular the experimental section below). The fourth segment may be produced in particular in DNA. Its sequence may be complementary to the upstream region of the second segment facing it and to which it is then hybridized. In this case, the choice of the sequence of the 5′ region of the second segment is dictated by the choice of the sequence of the fourth segment. It is not necessary for the fourth segment to be linked to the third segment since in any case its correct positioning in order to favor the initiation of transcription may be ensured by its hybridization to the second segment. However, in a specific embodiment, the fourth segment is linked to the third segment.
As above, the 3′ end of the target RNA part which is hybridized with the second segment may be distant from the 5′ end of the fourth segment by a number of nucleotides equal to x, as defined above.
For obvious reasons, the second segment contains a number of nucleotides at least equal to the sum of the number of nucleotides of the fourth segment, if it is present, and of the number of nucleotides of said sequence of the second segment which is capable of hybridizing with said region of the RNA to be transcribed.
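The length relationships stated above for the reagent may be checked with a short sketch such as the following. The class `ReagentDesign`, its field names and the function `design_problems` are hypothetical names introduced for illustration; only the three numerical conditions are taken from the description (x is zero or an integer from 1 to 24, the second segment is longer than x, and the second segment is at least as long as the fourth segment plus the hybridizing sequence).

```python
from dataclasses import dataclass


@dataclass
class ReagentDesign:
    second_segment_len: int      # nucleotides in the second segment (first strand)
    hybridizing_len: int         # part of the second segment pairing with the target RNA
    fourth_segment_len: int = 0  # 0 when the second strand has no fourth segment
    x: int = 0                   # gap counted on the second segment


def design_problems(d: ReagentDesign) -> list[str]:
    """Return the list of violated length conditions (empty if none)."""
    problems = []
    if not 0 <= d.x <= 24:
        problems.append("x must be zero or an integer from 1 to 24")
    if d.second_segment_len <= d.x:
        problems.append("second segment must be longer than x")
    if d.second_segment_len < d.fourth_segment_len + d.hybridizing_len:
        problems.append(
            "second segment is shorter than fourth segment plus hybridizing sequence"
        )
    return problems
```

For instance, `design_problems(ReagentDesign(30, 20, 6, 2))` returns an empty list, whereas a second segment of 10 nucleotides cannot accommodate a fourth segment of 6 and a hybridizing sequence of 8.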
The method of transcription of the invention may be carried out with a virus or phage wild-type RNA polymerase, and in particular with an RNA polymerase chosen from the family of RNA polymerases, mentioned above, which includes the T7 RNA polymerase, T3 RNA polymerase and SP6 RNA polymerase.
It has indeed been discovered that these RNA polymerases, known to be DNA-dependent, were also capable of transcribing an RNA template, optionally choosing (for example by virtue of the fourth segment described above) a sequence favoring the initiation of transcription.
The discovery of this RNA-dependent RNA polymerase activity makes it possible to have for the first time RNA polymerases capable of transcribing RNA under the control of a promoter, starting from position +1, and in the absence of associated protein factor.
It is also possible to carry out the method of the invention with mutated RNA polymerases which will be described in greater detail below. The importance of these mutated RNA polymerases is that some of them are capable of carrying out the transcription with a better yield when the template consists of RNA than when the template consists of a comparable DNA (that is to say containing deoxyribonucleotides A, C, G in place of the ribonucleotides A, C, G, respectively, and containing the deoxyribonucleotide T in place of the ribonucleotide U).
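The notion of a comparable DNA template, as defined in the preceding paragraph, amounts to a base-for-base substitution and may be sketched as follows. The function name `comparable_dna` is an assumption made for this illustration.

```python
def comparable_dna(rna: str) -> str:
    """Return the DNA sequence comparable to an RNA sequence.

    A, C and G are kept as the corresponding deoxyribonucleotides and
    U is replaced by T, as in the definition given in the text.
    """
    rna = rna.upper()
    unexpected = set(rna) - set("ACGU")
    if unexpected:
        raise ValueError(f"not an RNA sequence: {sorted(unexpected)}")
    return rna.replace("U", "T")
```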
The invention also relates to the use of an RNA polymerase capable of transcribing an RNA template, under the control of a promoter, in the absence of auxiliary protein factor, in a method of transcription of a template strand comprising an RNA target sequence, in which said RNA polymerase is chosen from the T7 RNA polymerase, the SP6 RNA polymerase and the mutated RNA polymerases as defined above. The template strand may consist of RNA, or may consist of DNA in the region for initiation of transcription, and then of RNA next.
The invention also relates to the use of an RNA polymerase capable of transcribing an RNA template, under the control of a promoter, in the absence of auxiliary protein factor, in a method of transcription of a template strand comprising an RNA target sequence, in which said template strand consists of RNA at least between position +5 and the 5′ end of the target. The template strand therefore consists of RNA starting from one of the positions +1 to +5, and may therefore consist of DNA from position +1 to position +2, +3 or +4. The article by S. Leary et al. mentioned below describes the transcription of a template consisting of DNA for positions +1 to +6 and then of RNA for positions +7 onward, with T3 RNA polymerase.
The invention also relates to the RNA-dependent RNA polymerases (RNAdRNAp) obtained by modification of DNA-dependent RNA polymerases and which are capable of synthesizing RNA strands complementary to an RNA template. They may be used, for example, in the sequencing of RNA, the synthesis of RNA probes, and amplification techniques allowing in particular the detection and quantification of RNA.
The known natural RNAdRNAp are not suitable for use as polymerases in several applications because they have acquired a strong discriminatory capacity with respect to their specific RNA template. Furthermore, these enzymes have not been well characterized. For the majority, they form supramolecular complexes composed of both viral and cellular factors; these complexes, which are generally associated with the membranes, are difficult to purify and are unstable during isolation (B. N. Fields, D. M. Knipe, Virology, Vols 1 and 2, Raven Press, New York, (1990); G. P. Pfeifer, R. Drouin, G. P. Holmquist, Mutat. Res. Fundam. Mol. Mech. Mutagen. 288, 39 (1993)).
Few RNAdRNAp have been cloned, sequenced and expressed. The enzyme Qβ replicase is the best characterized. This enzyme is composed of 4 subunits, of which 3 are host factors (M. Kajitani, A. Ishihama, Nucleic Acids Res. 19, 1063 (1991)). The Qβ enzyme has been isolated; it shows good processivity and is capable of carrying out cyclic reactions (P. M. Lizardi, C. E. Guerra, H. Lomeli, I. Tussie-Luna, F. R. Kramer, Bio/Technology 6, 1197 (1988)). However, this enzyme remains very limited in its applications because it recognizes as template only a restricted class of highly structured RNA molecules (V. D. Axelrod, E. Brown, C. Priano, D. R. Mills, Virology 184, 595 (1991)).
Another RNAdRNAp has been partially characterized. It is an enzyme from the Saccharomyces cerevisiae L-A virus. This polymerase, which has been cloned, requires assembling of the viral particle; it binds first of all to the plus RNA strand, and then induces the assembly of the proteins of the particle (T. Fujimura, R. Esteban, L. M. Esteban, R. B. Wickner, Cell 62, 819 (1990)). At least three factors are known to combine with the viral particles. These factors are necessary for the replication of RNA, for transcription and for the coherent maintenance of the particle (T. Fujimura, R. B. Wickner, Molec. Cell. Biol. 7, 420, (1987)). Studies in vitro have shown that an intact viral particle is necessary for the synthesis of the minus strand (replication) (T. Fujimura, R. B. Wickner, Cell 55, 663 (1988)) and for the synthesis of the plus strand (transcription) (T. Fujimura, R. B. Wickner, J. Biol. Chem. 264, 10872 (1989)). Thus, the complexity of this system does not make it easily adaptable to an in vitro transcription system. Furthermore, just like the Qβ system, this system is very discriminatory, accepting only M and L-A viral RNAs as template (T. Fujimura, R. B. Wickner, Cell 55, 663 (1988)).
An RNAdRNAp for which a broad template accepting capacity has been shown is the enzyme of the poliomyelitis virus (S. J. Plotch, O. Palant, Y. Gluzman, J. Virol. 63, 216 (1989)). However, several problems exist with this enzymatic system: the priming is dependent either on an unidentified host factor or on the addition of a poly(U) oligonucleotide. Moreover, given that priming with a poly(U) oligonucleotide is not selective with respect to the template, many products of different sizes are synthesized, in particular products having twice the length of the template. Furthermore, the sequential synthesis of the plus and minus strands has not been demonstrated (S. J. Plotch, O. Palant, Y. Gluzman, J. Virol. 63, 216 (1989), T. D. Hey, O. C. Richards, E. Ehrenfeld, J. Virol. 61, 802 (1987), J. M. Lubinski, L. J. Ransone, A. Dasgupta, J. Virol 61, 2997 (1987)).
Among the DNAdRNAp polymerases, the enzymes of the T3 and T7 bacteriophages are capable of using RNA as template under particular conditions. For example, the T3 DNAdRNAp may transcribe a single-stranded RNA template (i.e. the messenger RNA for the gene for resistance to neomycin) if it is ligated to the antisense strand of the T3 promoter including the sequence for initiation from +1 to +6 (S. Leary, H. J. Baum, Z. G. Loewy, Gene 106, 93 (1991)). It is also known that the T7 RNA polymerase can transcribe, from one end to the other, an RNA template in the absence of the promoter sequence (M. Chamberlin, J. Ring, J. Biol. Chem. 248, 2235 (1973)). Furthermore, it has been shown that the T7 RNA polymerase can efficiently transcribe two small specific RNA templates, the “X” and “Y” RNAs, producing both plus and minus RNA copies. This replication, which is obtained in the absence of a consensus promoter sequence, appears to be dependent on the presence of a specific secondary structure (M. M. Konarska, P. A. Sharp, Cell 57, 423 (1989), M. M. Konarska, P. A. Sharp, Cell 63, 609 (1990)). On the other hand, the “X” and “Y” RNAs are not replicated by the T3 RNA polymerase, and it is not known whether this enzyme is incapable of replicating highly structured RNAs, or whether the sequence specificity of this enzyme prevents its recognition of the “X” and “Y” RNAs. In the absence of a promoter, it has also been shown that the T7 DNAdRNAp was capable of carrying out the extension of two overlapping RNA strands in antisense (C. Cazenave and O. C. Uhlenbeck, Proc. Natl. Acad. Sci. USA 91, 6972 (1994)). Likewise, it has been shown that the wild-type T7 DNAdRNAp is capable of carrying out the extension of an RNA primer on a single-stranded DNA template, in the absence of a promoter (S. S. Daube and P. H. von Hippel, Biochemistry 33, 340 (1994)).
The best characterized bacteriophage enzyme is the T7 RNA polymerase, a monomeric enzyme of 98 kDa (B. A. Moffatt, J. J. Dunn, F. W. Studier, J. Mol. Biol. 173, 265 (1984)). This monomeric polymerase has all the essential properties of an RNA polymerase, that is to say recognition of a promoter, initiation of transcription, extension and termination (M. Chamberlin, T. Ryan, The Enzymes XV, 87 (1982)). Furthermore, the catalytic activity requires few elements, namely a template, ribonucleoside triphosphates and the divalent Mg2+ ion, and it does not require any auxiliary protein factor for the initiation or termination of transcription, unlike the other RNA polymerases (M. Chamberlin, T. Ryan, The Enzymes XV, 87 (1982)).
Mutagenesis of the T7 RNA polymerase gene has made it possible to identify and to define regions or residues involved in the polymerase function. A mutagenesis strategy has consisted in exchanging elements between the T7 RNA polymerase and its close relative the T3 RNA polymerase whose amino acid sequence is 82% identical (K. E. Joho, L. B. Gross, N. J. McGraw, C. Raskin, W. T. McAllister, J. Mol. Biol. 215, 31 (1990)). This strategy has led to the identification of polymerase elements involved in the recognition of the promoter. It has been shown, for example, that the substitution of a single amino acid in the T3 (or T7) enzyme allows the mutated enzyme to specifically recognize the heterologous T7 (or T3) promoter (C. A. Raskin, G. Diaz, K. Joho, W. T. McAllister, J. Mol. Biol. 228, 506 (1992)). In the same manner, reciprocal substitutions in the respective promoter sequences confer on the mutated promoter the capacity to be recognized by the heterologous enzyme (C. A. Raskin, G. Diaz, K. Joho, W. T. McAllister, J. Mol. Biol. 228, 506, (1992)).
T7 RNA polymerase has been crystallized and its structure determined at a resolution of 3.3 Å (R. Sousa, Y. J. Chung, J. P. Rose, B.-C. Wang, Nature 364, 593 (1993)). From this structural study, sequence alignments (K. E. Joho, L. B. Gross, N. J. McGraw, C. Raskin, W. T. McAllister, J. Mol. Biol. 215, 31 (1990); S. Mungal, B. M. Steinberg, L. B. Taichman, J. Virol. 66, 3220 (1992), W. T. McAllister, C. A. Raskin, Molec. Microbiol. 10, 1 (1993)) and mutagenic studies (D. Patra, E. M. Lafer, R. Sousa, J. Mol. Biol. 224, 307 (1992); L. Gross, W-J. Chen, W. T. McAllister, J. Mol. Biol. 228, 1 (1992)), it has been possible to correlate the functional elements of the T7 enzyme with the structural elements. T7 RNA polymerase may be divided into two functional domains: a promoter recognition domain and a catalytic domain (R. Sousa, Y. J. Chung, J. P. Rose, B.-C. Wang, Nature 364, 593 (1993), W. T. McAllister, Cell. Molec. Biol. 39, 385 (1993)).
The T7 RNA polymerase asparagine 748 has been shown to interact with nucleotides −10 and −11 in the promoter sequence, an interaction shown to be responsible for the promoter specificity (C. A. Raskin, G. Diaz, K. Joho, W. T. McAllister, J. Mol. Biol. 228, 506 (1992)). The possibility that a sigma-type interaction between the T7 polymerase and its promoter can exist in the bacteriophage system has been mentioned. Indeed, a sigma-type sequence, corresponding to the 2.4 region of sigma, i.e. the region of sigma interacting with the “Pribnow box” (TATAATG sequence recognized by the E. coli sigma 70 transcription factor) (C. Waldburger, T. Gardella, R. Wong, M. M. Susskind, J. Mol. Biol. 215, 267 (1990); D. A. Siegele, J. C. Hu, W. A. Walter, C. A. Gross, J. Mol. Biol. 206, 591 (1989)), exists in the N-terminal region of the T7 RNA polymerase between amino acids 137 and 157 (L. Gross, W-J. Chen, W. T. McAllister, J. Mol. Biol. 228, 1 (1992)). Moreover, although it has not been possible to attribute any function to it, the 230 to 250 region exhibits sequence homologies with the E. coli λ repressor (McGraw, N. J., Bailey, J. N., Cleaves, G. R., Dembinski, D. R., Gocke, C. R., Joliffe, L. K., MacWright, R. S. and McAllister, W. T. Nucleic Acids Res. 13, 6753 (1985)).
The catalytic domain consists of a pocket resulting from the bringing into close proximity of several regions dispersed over the primary structure (R. Sousa, Y. J. Chung, J. P. Rose, B.-C. Wang, Nature 364, 593 (1993), W. T. McAllister, C. A. Raskin, Molec. Microbiol. 10, 1 (1993); D. Moras, Nature 364, 572 (1993)). This pocket contains in particular several conserved motifs among which the A and C motifs are the best conserved in the polymerases (Poch, O., Sauvaget, I., Delarue, M. and Tordo, N. EMBO J. 8, 3867 (1989); Delarue, M., Poch, O., Tordo, N. and Moras, D. Protein Engineering 3, 461 (1990); W. T. McAllister, C. A. Raskin, Molec. Microbiol. 10, 1 (1993)). A third motif, the B motif, is conserved in DNA-dependent RNA and DNA polymerases whereas a B′ motif which is different (both for the sequence and for the apparent structure) exists in the RNA-dependent RNA and DNA polymerases (Poch, O., Sauvaget, I., Delarue, M. and Tordo, N. EMBO J. 8, 3867 (1989); Delarue, M., Poch, O., Tordo, N. and Moras, D. Protein Engineering, 3, 461 (1990); L. A. Kohlstaedt, J. Wang, J. M. Friedman, P. A. Rice, T. A. Steitz, Science 256, 1783 (1992); W. T. McAllister, C. A. Raskin, Molec. Microbiol. 10, 1 (1993)).
One of the aspects of the present invention is based on the discovery that certain mutated DNA-dependent RNA polymerases are capable of transcribing a single-stranded or double-stranded RNA in the presence of a double-stranded DNA promoter. Furthermore, these mutant enzymes are not very capable or are incapable of transcribing single-stranded or double-stranded DNA in the presence of a double-stranded DNA promoter. They are therefore preferably or strictly RNA-dependent. Their use is particularly advantageous in cases where it is desired to selectively transcribe the RNA, in particular when the starting biological sample contains or risks containing DNA having a sequence identical or similar to that of the RNA to be amplified.
The subject of the invention is therefore an RNA polymerase capable of transcribing a polynucleotide segment of interest of any sequence contained in a polynucleotide template, by synthesizing, in the presence of said template, and under the control of a promoter, a product of transcription containing an RNA sequence complementary to the sequence of said polynucleotide segment of interest, characterized in that it is capable of synthesizing said product of transcription with a better yield when said sequence of interest contained in the template consists of RNA than when said sequence of interest contained in the template consists of DNA.
The invention relates in particular to an RNA polymerase defined as above such that the ratio of the yield of product of transcription of a DNA template to the yield of product of transcription of an RNA template, expressed in %, is less than 95%, especially less than 85% and in particular less than 70%.
The subject of the invention is in particular an RNA polymerase as defined above, characterized in that the ratio of the yield of product of transcription of the RNA template to the yield of product of transcription of the DNA template is at least equal to 2 and in particular at least equal to 10.
The “yield” of transcription is the molar ratio of the quantity of product of transcription to the quantity of polynucleotide template present at the origin. This yield may be easily determined experimentally, by introducing into the reaction medium a determined quantity of the polynucleotide template. For comparison of the yields obtained with a DNA template and an RNA template, conditions other than those of the nature of the template must obviously be comparable.
The RNA polymerase of the invention is capable of transcribing a polyribonucleotide template of any sequence, and it differs from Qβ-replicase in this respect. It preferentially or exclusively transcribes an RNA template, and it differs from known phage DNA-dependent RNA polymerases in this respect.
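The yield and the two yield ratios defined above lend themselves to a simple worked calculation, sketched below. The function names are assumptions made for this illustration, and the quantities used in the example are hypothetical values, not experimental results.

```python
def transcription_yield(product_mol: float, template_mol: float) -> float:
    """Molar ratio of transcription product to template initially present."""
    if template_mol <= 0:
        raise ValueError("template quantity must be positive")
    return product_mol / template_mol


def dna_to_rna_percent(yield_dna: float, yield_rna: float) -> float:
    """Yield on a DNA template relative to the yield on an RNA template, in %."""
    return 100.0 * yield_dna / yield_rna


def rna_to_dna_ratio(yield_rna: float, yield_dna: float) -> float:
    """Yield on an RNA template divided by the yield on a DNA template."""
    return yield_rna / yield_dna


# Hypothetical quantities in pmol, chosen only to illustrate the arithmetic.
y_rna = transcription_yield(product_mol=50.0, template_mol=1.0)
y_dna = transcription_yield(product_mol=5.0, template_mol=1.0)
```

With these illustrative figures the DNA-to-RNA ratio is 10%, below each of the thresholds of 95%, 85% and 70%, and the RNA-to-DNA ratio is 10, which is at least equal to 2 and to 10. Both yields are assumed to be measured under otherwise comparable conditions.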
The RNA polymerases of the invention, unlike the known natural RNAdRNAP polymerases, are in particular RNA polymerases capable of functioning without associated protein cofactor(s). They may however be provided in the form of multimers, and in particular of dimers.
The mutated RNA polymerases of the invention are therefore generally obtained from RNA polymerases which are themselves capable of functioning without protein cofactors.
The RNA polymerases of the invention may be in particular RNA polymerases which are derived by mutation from a virus or phage DNA-dependent RNA polymerase, and in particular from a DNA-dependent RNA polymerase of an E. coli phage. Among the E. coli phages, there may be mentioned in particular T3, T7 and SP6.
An RNA polymerase according to the invention may possess a protein sequence homology greater than 50%, and in particular greater than 80% with a wild-type RNA polymerase of the family of DNA-dependent RNA polymerases including the T7 RNA polymerase, T3 RNA polymerase and SP6 RNA polymerase.
The abovementioned family of DNA-dependent RNA polymerases is known; see for example the article by R. Sousa, TIBS 21, 186-190 (1996), and the references cited in that article.
Among the polymerases of the invention, there may be mentioned in particular those which contain at least one mutation in a region corresponding to the T7 RNA polymerase sequence containing amino acids 625-652, and in particular those which have the composition of a wild-type DNA-dependent RNA polymerase, with the exception of the fact that they contain at least one mutation in said region. “Mutation” is understood here to mean the replacement, deletion or insertion of an amino acid.
There may be mentioned for example the RNA polymerases containing at least one mutation at a position corresponding to one of positions 627, 628, 631, 632 and 639 of the T7 RNA polymerase amino acid sequence; in particular said mutation may comprise the replacement of an amino acid residue, chosen from arginine, lysine, serine and tyrosine, of the wild-type RNA polymerase with another amino acid residue. The amino acid replaced is for example an arginine or a lysine. The replacement amino acid may be chosen in particular from alanine, valine, leucine, isoleucine, glycine, threonine or serine. It is understood that the expression “amino acid” designates here, by a misuse of language, an amino acid residue engaged in a peptide bond.
Reference was made above to the peptide sequence of the T7 RNA polymerase. The numbering of the amino acid residues adopted here is that described by Dunn, J. J. and Studier, F. W., J. Mol. Biol. 148(4), 303-330 (1981), and by Stahl, S. J. and Zinn, K., J. Mol. Biol. 148(4), 481-485 (1981).
The invention also relates to:
a gene encoding an RNA polymerase as defined above; such a gene may be obtained for example according to a method similar to that described below in the experimental part;
an expression vector into which such a gene is inserted, said vector being capable of expressing said RNA polymerase in a host cell; this vector may be obtained in a manner known per se;
a host cell containing such a vector.
The invention also relates to a method of producing an RNA polymerase as defined above, characterized in that: a) a gene encoding a wild-type RNA polymerase is obtained in a known manner, b) at least one mutation is performed on said gene, c) the mutated gene obtained is inserted into an expression vector, d) said vector is expressed in a host cell in order to obtain a mutated RNA polymerase and e) among the mutated RNA polymerases obtained, those which exhibit at least one of the properties of an RNA polymerase as defined above are selected.
A more detailed description of a particular embodiment of the method of the invention will be given below in the case of the use of the T7 RNA polymerase as starting material.
A modular gene for T7 DNAdRNAp was prepared, this gene resulting from the assembly of different cassettes (see Example 1 and FIG. 1).
The modular gene thus defined is characterized in that it contains 10 cassettes bordered by unique restriction sites in the cloning vector.
In particular, these cassettes, bordered by unique restriction sites, are characterized in that each cassette comprises a region of interest, in particular those involved in promoter recognition (region exhibiting homology with the E. coli σ factor; region exhibiting homology with the E. coli "sgr" repressor; region conferring promoter specificity) and those involved in the catalytic site (motif A; motif B; motif C).
For the definition of motifs A, B and C, see for example R. Sousa, TIBS 21, 186-190 (1996).
These cassettes, derived from the T7 DNAdRNAp gene 1, were obtained with the aid of conventional molecular biology techniques (F. M. Ausubel, R. Brent, R. E. Kingston, D. D. Moore, J. G. Seidman, et al. Current Protocols in Molecular Biology (Current Protocols, 1993)), in particular PCR, which made it possible to introduce restriction sites by silent site-directed mutagenesis, and subcloning into a cloning vector.
The modular gene thus obtained is characterized by the presence of restriction sites bordering the cassettes, these restriction sites being the NcoI (−2, +4), BclI (218, 223), HindIII (539, 544), SacI (776, 781), PstI (1587, 1592), BglII (1811, 1816), NdeI (1865, 1870), XhoI (1951, 1956), ClaI (2305, 2310), SalI (2497, 2502) and XbaI (2660, 2665) sites; position 1, in nucleic acids, corresponds to the adenine of the initiator ATG codon and position 2652 to the third base of the TAA terminator codon. The NdeI site (2504, 2509) was destroyed. All the mutations inducing these restriction sites are silent, except the mutation generating the NcoI site, which induces the replacement of asparagine at position +2 with a glycine. Position 1, in amino acids, corresponds to the first methionine, and position 883 corresponds to the carboxy-terminal alanine.
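The site map given above can be checked for internal consistency with a short sketch such as the following. The pairing of consecutive sites into cassettes, and the variable and function names, are assumptions made for illustration; only the site names and positions are taken from the text.

```python
# Restriction sites bordering the cassettes, with the first and last
# nucleotide positions quoted in the text (position 1 = A of the ATG codon).
SITES = [
    ("NcoI", -2, 4),
    ("BclI", 218, 223),
    ("HindIII", 539, 544),
    ("SacI", 776, 781),
    ("PstI", 1587, 1592),
    ("BglII", 1811, 1816),
    ("NdeI", 1865, 1870),
    ("XhoI", 1951, 1956),
    ("ClaI", 2305, 2310),
    ("SalI", 2497, 2502),
    ("XbaI", 2660, 2665),
]

N_RESIDUES = 883          # carboxy-terminal alanine
STOP_LAST_BASE = 2652     # third base of the TAA terminator codon


def cassettes(sites):
    """Pair consecutive sites; each pair is assumed to delimit one cassette."""
    return [
        (left[0], right[0], left[1], right[2])
        for left, right in zip(sites, sites[1:])
    ]


def coding_length(n_residues):
    """Nucleotides needed for n_residues codons plus one stop codon."""
    return 3 * (n_residues + 1)


for left, right, start, end in cassettes(SITES):
    print(f"{left}-{right}: {start}..{end}")

# Eleven sites delimit ten cassettes, and 883 codons plus the stop codon
# end at position 2652, in agreement with the positions given in the text.
print(len(cassettes(SITES)), coding_length(N_RESIDUES) == STOP_LAST_BASE)
```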
The modular gene, cloned into a cloning vector pGNEX derived from pGEM-1 in which the polylinker has been replaced by an adaptor containing the NcoI, EcoRI and XbaI restriction sites, constitutes the basic support for subsequent mutageneses. This is made possible by the fact that each cassette containing a region of interest is bordered by unique restriction sites in the cloning vector.
The introduction of nonsilent mutations, with the aid of PCR techniques, into one or more cassettes of the modular gene previously defined, led to genes encoding polymerases exhibiting an amino acid sequence differing by at least one amino acid with respect to the T7 RNA polymerase expressed from the modular gene. Mutant genes were in particular prepared which encode at least one modified amino acid in the B motif of the wild-type enzyme, for example with an alanine (A) in place of an arginine (R) at position 627 and/or an alanine (A) in place of a serine (S) at position 628 and/or an alanine (A) in place of a lysine (K) at position 631 and/or an alanine (A) in place of an arginine (R) at position 632 and/or an alanine (A) in place of a tyrosine (Y) at position 639.
Mutant genes were also obtained which encode a polymerase whose 625VTRSVTKRSVMTLAYGSKEFGFRQQVLD652 (SEQ ID NO: 6) region comprising the B motif has been replaced as a whole or in part by the homologous region B′ present in some RNA-dependent polymerases, in particular those of the polymerases of the hepatitis C virus (NCGYRRCRASGVLTTSCGNTLTCYI) (SEQ ID NO: 7), and of the yeast integrase 32 (HNTTLGIPQGSWVSPILCNIFLDKL) (SEQ ID NO: 8).
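The alanine substitutions mentioned above for positions 627, 628, 631, 632 and 639 can be checked against the 625-652 region (SEQ ID NO: 6) with a sketch such as the following. The shorthand "R627A" (wild-type residue, position, replacement residue) and the function names are conventions assumed here for illustration; they are not part of the described method.

```python
REGION_START = 625
# Residues 625-652 of T7 RNA polymerase as quoted in the text (SEQ ID NO: 6).
REGION = "VTRSVTKRSVMTLAYGSKEFGFRQQVLD"


def apply_substitution(region, mutation, start=REGION_START):
    """Apply one substitution such as 'R627A' after checking the wild-type residue."""
    wild, position, new = mutation[0], int(mutation[1:-1]), mutation[-1]
    index = position - start
    if not 0 <= index < len(region):
        raise ValueError(f"position {position} lies outside the region")
    if region[index] != wild:
        raise ValueError(
            f"expected {wild} at position {position}, found {region[index]}"
        )
    return region[:index] + new + region[index + 1:]


def apply_substitutions(region, mutations, start=REGION_START):
    """Apply several substitutions at distinct positions, one after another."""
    for mutation in mutations:
        region = apply_substitution(region, mutation, start)
    return region


single = apply_substitution(REGION, "R627A")
multiple = apply_substitutions(
    REGION, ["R627A", "S628A", "K631A", "R632A", "Y639A"]
)
print(single)
print(multiple)
```

The wild-type check confirms that the residues named in the text (R627, S628, K631, R632, Y639) match the quoted sequence when its first residue is numbered 625.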
The genes described above have been cloned into a vector pMR resulting from the ligation of the SspI fragment of pMal-c (Biolabs), containing in particular the lacIq repressor, and of the SspI fragment of pMH (V. Cheynet, B. Verrier, F. Mallet, Protein Expression and Purification 4, 367 (1993)), containing a minicistron making it possible to achieve a high level of expression, as well as a sequence encoding a poly-histidine tail fused with the terminal end of the cloned gene (Example 1, FIG. 2). The expression of the recombinant T7 RNAp proteins in the bacterial strain BL21 represents up to 30% of the total proteins of the bacterium. The proteins, solubilized in a deoxycholate buffer containing a high salt concentration, are deposited on a TALON column (Clontech), which allows specific purification, by metal-ion chelation, of proteins having a poly-histidine tail. 130 to 2200 µg of polymerases are thus obtained for 20 ml of culture, with a purity greater than 95% as indicated (i) by a Coomassie blue staining (Example 1, FIG. 3), (ii) by a Western-blot analysis with a guinea-pig polyclonal antibody anti-T7 RNAP (bioMérieux) and a mouse monoclonal antibody (Qiagen) anti-MRGSHHHHHH (SEQ ID NO: 9), and (iii) by the absence of endonuclease, single-stranded and double-stranded exonuclease or ribonuclease activity, as determined essentially according to the method of He et al. (B. A. He, M. Rong, D. L. Lyakhov, H. Gartenstein, G. Diaz, et al., Protein Expr Purif 9, 142 (1997)). This result reflects the performance of the pMR vector/BL21 host pair and of the method of purification with respect to the MRGSHHHHHSVLE (SEQ ID NO: 10) tail.
The subject of the invention is also the use of an RNA polymerase as defined above, in a method of transcription of a polynucleotide segment of interest having any sequence, said segment, of the RNA type, being contained in a polynucleotide template, so as to synthesize, in the presence of said template, a product of transcription containing an RNA sequence complementary to the sequence of said polynucleotide segment of interest.
According to a particular embodiment, said use is characterized in that said polynucleotide template comprises, upstream of said polynucleotide segment of interest, a promoter recognized by said RNA polymerase, and in that said product of transcription is an RNA complementary to a sequence of the template starting at a site of initiation of transcription for said promoter.
The RNA polymerases of the invention may be used in particular to carry out (i) an amplification of an RNA target isothermally, (ii) a direct sequencing of RNA, and (iii) the synthesis of RNA of special interest (for example probes, ribozymes and the like). In addition, RNA polymerases of the invention are capable of incorporating modified bases into the newly-synthesized strand, which facilitates in particular the quantification or the use of said strand.
The invention relates in particular to the use of these recombinant enzymes thus expressed and purified in a method of synthesizing RNA from an RNA template, under the control of a promoter.
The enzymes thus purified were evaluated, in a promoter-dependent context, on different templates (Example 2, FIG. 4 on the one hand, and Example 3, FIG. 6, on the other hand), in particular a template containing a single-stranded RNA. It has been shown that a mutated polymerase obtained according to the invention is capable of generating a specific transcript of the correct size, in particular on a single-stranded RNA template. In Example 2, the wild-type enzyme produced in the same way does not appear to be able to generate such a transcript. However, Example 3 shows that the wild-type enzyme does in fact possess the property of transcribing an RNA template. These apparently divergent results are explained by the fact that the experimental conditions in these two examples are different. Indeed, in Example 2, the presence of the transcript is identified by incorporation of a UTP labeled with radioactive phosphorus, whereas in Example 3 the technique used for the group 2 templates is Northern blotting, which is 40 times more sensitive than detection by incorporation of radioactive phosphorus. Example 2 below therefore shows that a mutated polymerase obtained according to the invention is capable of generating a specific transcript of the correct size on a single-stranded RNA template, and Example 3 shows that the corresponding wild-type polymerase is in fact capable of generating a specific transcript of the correct size on RNA templates independently of their sequence.
Furthermore, such a mutated polymerase is incapable, unlike the wild-type polymerase, of generating a transcript of the correct size on a single-stranded or double-stranded DNA template. If the Mg2+ ion present in the reaction medium is replaced by the Mn2+ ion, the mutated enzyme, on a single-stranded RNA template, does not generate under these conditions a specific transcript of the correct size, but is nevertheless capable of generating large quantities of abortive products. Such a mutated polymerase is in addition capable of displacing an RNA/RNA hybrid.