The present invention relates to a modified promoter for RNA polymerase and to its applications. More specifically, it relates to the use of a promoter for phage RNA polymerases, this promoter being modified relative to the promoters present in nature or which have already been described. It has been discovered that it is possible to reduce the size of the promoter sequence and to introduce mispairings between the two strands of the promoter. The modified promoters obtained according to the invention make it possible, in particular, to effect the transcription of a nucleotide target from a site which is not normally a transcription start site for the RNA polymerase used. By means of the invention, it becomes possible to transcribe in vitro a wide diversity of sequences which are not transcribed by phage RNA polymerases when the wild-type promoters (that is to say the natural promoters) of the said RNA polymerases are used. The use of these modified promoters makes it possible, furthermore, to obtain different efficiencies of initiation of transcription, and hence the differential transcription of several sequences within the same reaction medium.
RNA (ribonucleic acid) is known to be the transcription product of a molecule of DNA (deoxyribonucleic acid) synthesized under the action of an enzyme, DNA-dependent RNA polymerase.
It is advantageous in several respects to be able to obtain several RNA sequences from a DNA sequence. Various applications of specific RNA sequences obtained in this way have already been described, such as, for example, the synthesis of RNA probes or of oligoribonucleotides (see, in particular, Milligan, J. F., Groebe, D. R., Witherell, G. W. and Uhlenbeck, O. C. (1987) Nucleic Acids Res. 15, 8783-8798), or the expression of genes (see, in particular, Steen, R. et al. (1986) EMBO J. 5, 1099-1103 and Fuerst, T. R. et al. (1987) Molecular and Cellular Biology 7, 2538-2544 and Patent Applications WO 91/05,866 and EP 0,178,863), or alternatively gene amplification as described by Kievits, T. et al. (Journal of Virological Methods 35, 273-286 (1991)) and Kwoh, D. Y. et al. (Proc. Natl. Acad. Sci. USA 86, 1173-1177 (1989)) or in Patent Applications WO 88/10,315 and WO 91/02,818.
One of the distinctive features of DNA-dependent RNA polymerases is that they initiate RNA synthesis on a DNA template from a particular start site as a result of the recognition of a nucleic acid sequence, termed a promoter, which defines the precise location and the strand on which initiation is to be effected. Unlike DNA-dependent DNA polymerases, DNA-dependent RNA polymerases do not initiate polymerization from a 3'-OH end, and their natural substrate is an intact DNA double strand.
A few generic terms which will be employed are defined below.
Oligonucleotide or polynucleotide is understood to mean nucleotide sequences containing the natural bases (A, C, G, T, U) and/or one or more modified bases such as inosine, 5-methyldeoxycytidine, deoxyuridine, 5-(dimethylamino)deoxyuridine, 2,6-diaminopurine, 5-bromodeoxyuridine or any modified base permitting hybridization, especially those carrying any suitable chemical modification enabling the hybridization yield to be increased. Examples of such modifications are, in particular, the introduction between at least two nucleotides of a group chosen from diphosphate, alkyl- and/or arylphosphonate and/or phosphorothioate esters, or the replacement of at least one sugar (ribose or deoxyribose) by a polyamide; see, for example, Nielsen, P. E. et al., Science, 254, 1497-1500 (1991). The term "nucleotide" or "base" denotes not only a natural deoxyribonucleotide or ribonucleotide, but any nucleotide modified as has just been mentioned.
Promoter is understood to mean any nucleic acid sequence capable of being recognized by an RNA polymerase to initiate transcription, that is to say the synthesis of an RNA. This RNA polymerase can be either DNA-dependent or RNA-dependent.
Natural promoter is understood to mean the promoter sequences present in the genome which codes for the RNA polymerase for which this promoter is specific. Promoter sequence is understood to mean the sequence of an oligonucleotide which participates in the composition of one of the strands of the promoter.
Transcription is understood to mean the neosynthesis of several RNA strands complementary to the sequence (or target) which is under the control of the promoter, the promoter permitting initiation of this reaction. This sequence is in general adjacent to the 5' end of the template strand of the promoter. The RNA strands synthesized are termed transcripts.
Differential transcription is understood to mean a transcription giving different numbers of transcripts from different templates.
Mutation is understood to mean any change in sequence relative to the natural sequence. The mutation can affect both complementary bases positioned on each strand of a nucleic acid duplex, or affect only one of them. If the mutation affects only one of the two complementary bases, it gives rise to the appearance of what is called a mispairing.
Mispairing is understood to mean the introduction within a nucleic acid double strand of pairs of bases other than A:T, C:G or A:U.
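By way of illustration only, the definition of a mispairing can be expressed as a short check over the two strands of a duplex. The sketch below is not part of the invention; the function names `reverse_complement` and `find_mispairings` are hypothetical, both strands are assumed to be written 5' to 3' and to have the same length, and the only pairs accepted as correct are A:T, C:G and A:U, as stated above.

```python
# Pairs regarded as correctly paired in the definition above (both orientations).
WATSON_CRICK = {
    ("A", "T"), ("T", "A"),
    ("C", "G"), ("G", "C"),
    ("A", "U"), ("U", "A"),
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the perfectly complementary DNA strand, written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def find_mispairings(strand_5to3, partner_5to3):
    """List (index, base, facing_base) for every pair that is not A:T, C:G or A:U.

    The index refers to strand_5to3. The partner strand is reversed so that
    the two strands face each other in antiparallel orientation.
    """
    if len(strand_5to3) != len(partner_5to3):
        raise ValueError("this sketch only handles strands of equal length")
    facing = partner_5to3.upper()[::-1]
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(strand_5to3.upper(), facing))
        if (a, b) not in WATSON_CRICK
    ]
```

For example, changing a single base on only one of the two strands yields exactly one entry in the returned list, which corresponds to a mutation affecting only one of the two complementary bases.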
Non-pairing is understood to mean the presence of nucleotides which are not paired with a base of the complementary nucleic acid strand, by virtue of their nature or by virtue of their position in the sequence. If several bases of the same strand are unpaired, they can form bonds with one another which can cause the appearance of secondary structures termed "loops" or "stem-loops".
Deletion is understood to mean the removal of some nucleotides from a sequence. A deletion can affect only one base, or several adjacent bases. If several non-adjacent bases are affected, this will be referred to as multiple deletions. On a double-stranded nucleic acid, a deletion generally affects, together, both complementary bases positioned on each nucleic acid strand. However, a deletion can also take place only on one or several bases of only one strand, and cause non-pairings.
Consensus sequence is understood to mean a theoretical nucleotide sequence in which the nucleotide at each site is the one which appears most commonly at this site in the different natural forms of the genetic element in question (for example the promoter). The term "consensus" also denotes any actual sequence very closely similar to the theoretical consensus sequence.
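As an illustration of this definition, a consensus can be derived by taking, at each site, the nucleotide that appears most often among aligned sequences. The sketch below is only one simple way of doing so (a plurality vote, with ties resolved in favour of the first sequence encountered, which is an assumption of this sketch). The function name `consensus` is hypothetical, and the input is the set of 17 T7 promoter sequences (positions -17 to +6, non-template strand) as listed in Table 1 of this document.

```python
from collections import Counter

# Positions -17 to +6 of the 17 T7 promoters, copied from Table 1.
T7_PROMOTERS = [
    "TAATACAACTCACTATAAGGAGA",  # phi OL
    "CAATACGACTCACTATAGAGGGA",  # phi 1-1A
    "TAATACGACTCACTATAGGAGGA",  # phi 1-1B
    "TAATACGACTCAGTATAGGGACA",  # phi 1-3
    "TAATACGACTCACTAAAGGAGGT",  # phi 1-5
    "TAATACGACTCACTAAAGGAGAC",  # phi 1-6
    "TAATACGACTCACTATTAGGGAA",  # phi 2-5
    "TAATTGAACTCACTATTAGGGAA",  # phi 3-8
    "CAATCCGACTCACTAAAGAGAGA",  # phi 4c
    "TAATACGACTCACTAAAGGAGAG",  # phi 4-3
    "CTATTCGACTCACTATAGGAGAT",  # phi 4-7
] + ["TAATACGACTCACTATAGGGAGA"] * 6  # five class III promoters and phi OR


def consensus(sequences):
    """Return the most common nucleotide at each site of aligned sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned and of equal length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*sequences)
    )
```

Applied to `T7_PROMOTERS`, this vote returns the sequence given as the conserved sequence in Table 1.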
It will be recalled that transcription takes place by synthesis, in the 5'→3' direction, of an antiparallel RNA complementary to the nucleotide strand transcribed (termed "template strand" or "antisense strand"). The DNA strand which is complementary to the transcribed strand is termed "non-template strand" or "sense strand". By convention, the nucleotide from which transcription starts is designated +1 (or simply 1). On the template strand, the successive nucleotides located beside the 3' end (upstream region) are, starting from +1, numbered -1, -2, and the like. A downstream region relative to a given nucleotide (or to a given sequence) is located towards the 5' end of the template strand, and hence towards the 3' end of the RNA strand synthesized. Starting from +1, nucleotides downstream correspond successively to positions +2, +3, and the like.
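The numbering convention just recalled has no position 0: position -1 is immediately followed by position +1. A minimal sketch of this convention is given below for a sequence written as the non-template strand, 5' to 3'; the function names `position_labels` and `base_at` are hypothetical and serve only as an illustration.

```python
def position_labels(first_position, length):
    """Return the position label of each nucleotide, skipping 0.

    first_position is the label of the first (5'-most) nucleotide of the
    non-template strand, for example -17.
    """
    if first_position == 0:
        raise ValueError("there is no position 0 in this convention")
    labels = []
    position = first_position
    while len(labels) < length:
        if position != 0:  # -1 is followed directly by +1
            labels.append(position)
        position += 1
    return labels


def base_at(sequence, first_position, position):
    """Return the nucleotide found at a given position label."""
    labels = position_labels(first_position, len(sequence))
    return sequence[labels.index(position)]
```

For a 23-nucleotide sequence whose first nucleotide is at -17, the labels run from -17 to -1 and then from +1 to +6.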
Compared to bacterial, eukaryotic or mitochondrial RNA polymerases, phage RNA polymerases are very simple enzymes. Among these, the best known are the RNA polymerases of bacteriophages T7, T3 and SP6. Bacteriophage T7 RNA polymerase has been cloned (see, in particular, U.S. Pat. No. 4,952,496). These enzymes are very homologous with one another, and are composed of a single subunit of 98 to 100 kDa. Two other phage polymerases share these homologies: that of Klebsiella phage K11 and that of phage BA14; see Diaz et al., J. Mol. Biol. 229: 805-811 (1993).
The natural promoters specific for the RNA polymerases of phages T7, T3 and SP6 are well known. Sequencing of the whole bacteriophage T7 genome in 1983 by Dunn et al. (J. Mol. Biol. 166, 477-535 (1983)) enabled the existence of 17 promoters to be defined on this DNA, these being shown in Table 1 below (non-template strand).
TABLE 1
Promoters for T7 RNA polymerase
(sequence columns, left to right: positions -27 to -18, positions -17 to +6, positions +7 and +8)

Promoter               -27 to -18   -17 to +6                 +7, +8   SEQ ID NO
Conserved sequence                  TAATACGACTCACTATAGGGAGA            10
Replication promoter
  φOL                  TTGTCTTTAT   TAATACAACTCACTATAAGGAGA   GA       11
Class II promoters
  φ1-1A                AACGCCAAAT   CAATACGACTCACTATAGAGGGA   CA       12
  φ1-1B                TTCTTCCGGT   TAATACGACTCACTATAGGAGGA   CC       13
  φ1-3                 GGACTGGAAG   TAATACGACTCAGTATAGGGACA   AT       14
  φ1-5                 AGTTAACTGG   TAATACGACTCACTAAAGGAGGT   AC       15
  φ1-6                 TGGTCACGCT   TAATACGACTCACTAAAGGAGAC   AC       16
  φ2-5                 AGCACCGAAG   TAATACGACTCACTATTAGGGAA   GA       17
  φ3-8                 CGTGGATAAT   TAATTGAACTCACTATTAGGGAA   GA       18
  φ4c                  CCGACTGAGA   CAATCCGACTCACTAAAGAGAGA   GA       19
  φ4-3                 AGTCCCATTC   TAATACGACTCACTAAAGGAGAG   AC       20
  φ4-7                 TTCATGAATA   CTATTCGACTCACTATAGGAGAT   AT       21
Class III promoters
  φ6-5                 GTCCCTAAAT   TAATACGACTCACTATAGGGAGA   TA       22
  φ9                   GCCGGGAATT   TAATACGACTCACTATAGGGAGA   CC       23
  φ10                  ACTTCGAAAT   TAATACGACTCACTATAGGGAGA   CC       24
  φ13                  GGCTCGAAAT   TAATACGACTCACTATAGGGAGA   AC       25
  φ17                  GCGTACCAAA   TAATACGACTCACTATAGGGAGA   GG       26
Replication promoter
  φOR                  CACGATAAAT   TAATACGACTCACTATAGGGAGA   GG       27
As is apparent from the above table, twenty-three adjacent nucleotides located between positions -17 and +6 relative to the transcription start site (position 1) are highly conserved. These nucleotides are even identical in the 5 class III promoters, which are the most efficient in vivo and in vitro. Among the 11 different promoters, the majority diverge in respect of the nucleotides lying between positions +3 and +6. Among the nucleotides lying between positions -16 and -1, nine are completely conserved, and four others are conserved in 16 promoters. These findings suggest that these 13 nucleotides are important factors in the definition of a promoter. These conclusions have been confirmed by demonstrating the efficacy of a promoter after cleavage of the DNA upstream of position -21 (Osterman, H. L. et al. (1981) Biochemistry 20, 4884-4892) and -17 (Martin, C. T. et al. (1987) Biochemistry 26, 2690-2696). In contrast, these same publications show that a cleavage upstream of -10 or -12 abolishes this efficacy.
Phage T3 RNA polymerase is almost as well known as T7 RNA polymerase. These two enzymes have a very similar structure (80% identity) (McGraw, N. J. et al. (1985) Nucleic Acids Res. 13, 6753-6766), but possess, however, an almost absolute template specificity. The sequences of 11 promoters specific for T3 RNA polymerase have been determined and are presented in Table 2 below (non-template strand).
TABLE 2
Promoters for T3 RNA polymerase
(sequence columns, left to right: positions -20 to -18, positions -17 to +6, positions +7 to +10)

Promoter             -20 to -18   -17 to +6                 +7 to +10   SEQ ID NO
Conserved sequence                TATTAACCCTCACTAAAGGGAGA               28
                     GTC          TATTTACCCTCACTAAAGGGAAT   AAGG        29
                     TAG          CATTAACCCTCACTAACGGGAGA   CTAC        30
                     TAC          AGTTAACCCTGACTAACGGGAGA   GTTA        31
                     AAG          TAATAACCCTCACTAACAGGAGA   ATCC        32
                     GGG          CATTAACCCTCACTAACAGGAGA   CACA        33
                     GCC          TAATTACCCTCACTAAAGGGAAC   AACC        34
                     TAC          AATTAACCCTCACTAAAGGGAAG   AGGG        35
                     TCT          AATTAACCCTCACTAAAGGGAGA   GACC        36
                     ACC          TAATTACCCTCACTAAAGGGAGA   CCTC        37
                     GTG          AATTAACCCTCACTAAAGGGAGA   CACT        38
                     TTG          CATTAACCCTCACTAAAGGGAGA   GAGG        39
At the present time, 4 different sequences of promoters for phage SP6 RNA polymerase have been demonstrated by Brown, J. E. et al. (Nucleic Acids Res. 14, 3521-3526 (1986)). Phage SP6 RNA polymerase also displays many similarities with phage T7 RNA polymerase. These sequences are presented in Table 3 below (non-template strand).
TABLE 3
Promoters for SP6 RNA polymerase
(sequence columns, left to right: positions -26 to -18, positions -17 to +6, positions +7 and +8)

Promoter             -26 to -18   -17 to +6                 +7, +8   SEQ ID NO
Conserved sequence                ATTTAGGTGACACTATAGAAGGG            40
pSF64                ACACATACG    ATTTAGGTGACACTATAGAATAC   AA       41
pJEB1                TAATTGCCT    ATTTAGGTGACACTATAGAAGGG   AG       42
pJEB4                GGACTTGGT    AATTAGGGGACACTATAGAAGGA   GG       43
pJEB6                GTGTCTCTT    ATTTAGGGGACACTATAGAAGAG   AG       44
As is the case for T7 RNA polymerase, the sequences of the promoters for T3 RNA polymerase and for SP6 RNA polymerase are very similar, in particular between positions -17 and +6. Comparison of the sequences of these three promoters (FIG. 1) shows the existence of a common sequence from position -7 to -3; see in this connection, in particular: Brown, J. E. et al., Nucleic Acids Res., 14, 3521-3526 (1986) and Bailey et al., Proc. Natl. Acad. Sci. USA 80: 2814-2818 (1983).
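The comparison described above can be reproduced, purely as an illustration, by looking for the longest stretch of positions at which the three conserved sequences of Tables 1 to 3 carry the same nucleotide. The function name `longest_shared_run` is hypothetical, and the sketch assumes that the three sequences are already aligned and all begin at position -17.

```python
# Conserved sequences (positions -17 to +6, non-template strand) from Tables 1 to 3.
T7 = "TAATACGACTCACTATAGGGAGA"
T3 = "TATTAACCCTCACTAAAGGGAGA"
SP6 = "ATTTAGGTGACACTATAGAAGGG"


def longest_shared_run(sequences, first_position=-17):
    """Return (start, end, bases) of the longest run of identical columns.

    start and end are position labels in the convention without position 0.
    Returns None if no column is identical in all sequences.
    """
    length = len(sequences[0])
    # Labels such as -17 ... -1, +1 ... +6 (position 0 does not exist).
    labels = [p for p in range(first_position, first_position + length + 1)
              if p != 0][:length]
    best_start, best_len, run_start = 0, 0, None
    for i, column in enumerate(zip(*sequences)):
        if len(set(column)) == 1:
            if run_start is None:
                run_start = i
            run_len = i - run_start + 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_start = None
    if best_len == 0:
        return None
    end = best_start + best_len - 1
    return labels[best_start], labels[end], sequences[0][best_start:end + 1]
```

With the three sequences above, the longest shared run extends from position -7 to position -3.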
Hence it is possible to consider that the various phage RNA polymerases studied above belong to a family of RNA polymerases which recognize promoters possessing a consensus sequence from position -17 to position +6, and in particular from -17 to -1.
To obtain the RNA corresponding to a given DNA sequence through the action of an RNA polymerase, it is necessary to place this sequence under the control of the promoter of this RNA polymerase. This is the so-called step of installation of a promoter, immediately upstream of the sequence to be transcribed. This installation in the present state of the art requires the use of laborious methods.
The most traditional method of installation of a promoter is the cloning of the sequence to be transcribed into a vector containing a promoter for a phage RNA polymerase upstream of a cloning site. Several vectors of this type are on the market, such as, for example, pT3/T7-LUC (Clontech Laboratories Inc.), or Lambda ZAP II, Uni-ZAP XR, Lambda DASH II, Lambda FIX II, pWE 15 and SuperCos cosmid (Stratagene Cloning Systems), or pT7-0, pT7-1 or pT7-2 (United States Biochemical) or alternatively pT7/T3a-18 or pT7/T3a-19 (GIBCO BRL), or are described in publications, such as the pET vectors (Rosenberg, A. H. et al., Gene 56, 125-135 (1987)). After the DNA fragment to be cloned has been obtained and inserted into the vector, and the vector has been amplified, transcription may be effected. This method enables any sequence to be transcribed. However, it is very laborious, and not every kind of end can be obtained on the RNA in this way, since the location of this end is imposed by the restriction enzyme site used for the cloning.
Some known methods for the in vitro synthesis of oligoribonucleotides require the synthesis by chemical means of two complementary oligodeoxynucleotides comprising a promoter for a phage RNA polymerase. It has been demonstrated (Milligan et al., paper cited) that a partially single-stranded template which is double-stranded only on the promoter sequence, from position +1 to -14, is as active in transcription as a double-stranded template. Thus, the method requires the synthesis of a 15-mer nucleotide comprising the sequence of the non-template strand of a phage promoter from position -14 to +1, and of a second oligonucleotide containing at its 3' end the sequence complementary to the first oligonucleotide, and on its 5' region the sequence complementary to the oligoribonucleotide which it is desired to synthesize. Since chemical synthesis does not enable oligonucleotides of good quality to be obtained if their size is larger than 70 bases, this technique is not applicable for oligoribonucleotides of more than 55 bases if the promoter sequences already described are used. Furthermore, the method requires exact knowledge of the RNA sequence which it is desired to synthesize.
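The design of the two oligodeoxynucleotides used in this method can be sketched as follows. This is an illustration under stated assumptions, not a prescribed protocol: the function names `reverse_complement` and `design_oligos` are hypothetical, the promoter segment is taken as positions -14 to +1 of the conserved T7 sequence of Table 1, and the desired oligoribonucleotide is assumed to begin with the nucleotide at position +1 of the promoter.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Positions -14 to +1 of the conserved T7 sequence of Table 1 (15 nucleotides).
T7_MINUS14_TO_PLUS1 = "TACGACTCACTATAG"


def reverse_complement(dna):
    """Return the complementary DNA strand, written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def design_oligos(promoter_nontemplate, rna):
    """Return (first_oligo, second_oligo), both written 5' to 3'.

    first_oligo  : non-template strand of the promoter, ending at +1.
    second_oligo : template strand; its 3' end is complementary to the first
                   oligo and its 5' region is complementary to the desired RNA.
    """
    promoter = promoter_nontemplate.upper()
    rna_as_dna = rna.upper().replace("U", "T")
    if rna_as_dna[0] != promoter[-1]:
        # Assumption of this sketch: the transcript starts at the promoter's +1.
        raise ValueError("the RNA must start with the +1 nucleotide of the promoter")
    sense_strand = promoter + rna_as_dna[1:]
    return promoter, reverse_complement(sense_strand)
```

Only the promoter region of the resulting pair is double-stranded; the remainder of the second oligonucleotide stays single-stranded and serves as the template.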
Another method of installation of a promoter on a sequence, by ligation, is possible (Leary, S. et al., Gene 106, 93-96 (1991)) if this sequence possesses a well-defined 3' end. It comprises the hybridization with this 3' end of an oligonucleotide carrying the sequence of the non-template strand of the promoter, followed by the sequence complementary to the 3' end of the target sequence, over a sufficient length to permit hybridization. A second oligonucleotide is involved, the sequence of which is complementary to the promoter region of the first oligonucleotide, and which carries a phosphate group at the 5' end. Hybridization of these two oligonucleotides and the target brings the 5'-phosphate end of the oligonucleotide carrying the sequence of the template strand of the promoter into contact with the 3'-OH end of the target. Through the action of a ligase, a phosphodiester link between these two ends is established, permitting the formation of a complex in which the target is under the control of the promoter.
The appearance of the "polymerase chain reaction" (PCR) technique has made possible the creation of new, more efficient methods for installing a promoter. The transcription template is synthesized by PCR. The upstream primer supplies the promoter for the phage RNA polymerase and defines the 5' end of the transcription product (or transcript), while the downstream primer defines the 3' end of the amplified DNA and of the transcript. This technique enables an RNA of any size and with any 3' end to be obtained.
The installation of a phage promoter by means of a reaction modelled on the cycle of retroviruses is possible from an RNA (Kwoh, D. Y. et al., P.N.A.S. USA, 86, 1173-1177 (1989)). An upstream primer carrying the sequence of the promoter is used to synthesize the DNA complementary to the RNA by means of a reverse transcriptase. The RNA of the RNA:DNA duplex thereby formed is digested with RNaseH. The single-stranded DNA thus liberated hybridizes with the second primer which, by means of the reverse transcriptase, enables the second complementary DNA strand to be synthesized. However, this first step does not enable the 3' end of the RNA to be defined. It is the second synthesis, with the same enzymes and by the same method, of a second generation of template for the phage RNA polymerase, from the first transcripts obtained, which enables a 3' end to be obtained which is strictly defined by the second primer used.