The worldwide rate of DNA sequencing is about 50 million bp (base pairs) a year, doubling approximately every 18 months. With both molecular biology and genetic engineering becoming information oriented, the need for sequence information far exceeds the technical capacity to obtain it.
At present, a major DNA sequencing technique is the "walking primer" method, used in the framework of enzymatic sequencing by incorporation of dideoxy chain terminators. In this method, enzymatic sequencing determines the sequence of a few hundred bases of the strand that a DNA polymerase extends in the 5' to 3' direction from the primer that primed the extension reaction. This sequence information is then used to synthesize a new primer, which primes the next sequencing reaction further along in the 5' to 3' direction of the extended strand, and so on. A major advantage of the walking primer method is that it uses the same template for a number of sequencing reactions, thus minimizing the need for subcloning and for other steps in template preparation. Another advantage is that it facilitates integration of the sequences obtained from individual runs into contiguous stretches of sequence (contigs). A major disadvantage of the walking primer method is the high cost of synthesizing a new primer for each sequencing reaction. Another disadvantage is the delay in sequencing caused by the lengthy procedures of synthesis and purification of each walking primer.
The present invention addresses the following quantitative discrepancy in the walking primer method. On the one hand, the molar amount of primer required for a typical reaction of enzymatic DNA sequencing is on the order of a picomole. On the other hand, oligonucleotide synthesizers typically produce between 0.2 and 1.0 micromoles of primer, which is five to six orders of magnitude more than is required. This discrepancy has been noted, and it has been suggested to create a library of primers containing a sufficient variety of sequences to cover the needs of the walking primer method (Studier, W. 1989 Proc. Natl. Acad. Sci. USA 86, pp. 6917-6921). An advantage of such a library is that only a minuscule part of a primer sample is used for each sequencing reaction. Each primer sample is used many times, dramatically reducing the normally high cost of a walking primer per reaction. Sequencing is also sped up by the use of ready-made primers.
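The size of this discrepancy can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the figures quoted above (about one picomole per reaction, 0.2 to 1.0 micromoles per synthesis); the constant and function names are illustrative and not part of any protocol.

```python
import math

# Figures quoted in the text, expressed in moles.
PRIMER_PER_REACTION_MOL = 1e-12            # about one picomole per sequencing reaction
SYNTHESIS_SCALE_MOL = (0.2e-6, 1.0e-6)     # typical synthesizer yield: 0.2 to 1.0 micromoles


def reactions_per_synthesis(scale_mol, per_reaction_mol=PRIMER_PER_REACTION_MOL):
    """Number of sequencing reactions one synthesis could supply (idealized, no losses)."""
    return scale_mol / per_reaction_mol


# Ratio of synthesized amount to required amount, and its order of magnitude.
ratios = [reactions_per_synthesis(s) for s in SYNTHESIS_SCALE_MOL]
orders = [math.log10(r) for r in ratios]

for scale, ratio, order in zip(SYNTHESIS_SCALE_MOL, ratios, orders):
    print(f"{scale * 1e6:.1f} micromole -> {ratio:,.0f} reactions (10^{order:.1f})")
```

Under these assumptions a single synthesis would cover between about 200,000 and 1,000,000 reactions, which is why reusing each primer sample many times is attractive.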
However, creating such a library would be an enormous enterprise with present synthesis capabilities. A primer from such a library can have a reasonably unique priming site in most templates only if it contains at least 8 or 9 nucleotides. The number of all possible 8-mers is 4^8 = 65,536. Even if only a third of them, about 22,000 different sequences, are needed for a representative library, the synthesis of such a large number of oligonucleotides would be very expensive and time consuming. On the user side, storage and utilization of such a huge library would also present serious problems.
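The library sizes follow from counting: with four nucleotides there are 4^n distinct sequences of length n. The sketch below evaluates this for a few lengths; the function name and the one-third fraction applied to 8-mers (taken from the paragraph above) are used purely for illustration.

```python
def library_size(length, alphabet_size=4):
    """Number of distinct oligonucleotides of the given length (A, C, G, T)."""
    return alphabet_size ** length


# Complete libraries for the primer lengths discussed in this section.
sizes = {n: library_size(n) for n in (5, 6, 8, 9)}

# A "representative" library of one third of all 8-mers, as assumed in the text.
representative_8mers = sizes[8] // 3

for n, size in sizes.items():
    print(f"{n}-mers: {size:,}")
print(f"one third of all 8-mers: {representative_8mers:,}")
```

Each added nucleotide multiplies the library size by four, so the step from 6-mers to 8-mers alone raises the count sixteenfold.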
To overcome the problem of the large size and, therefore, the high cost of such a library, it was recently suggested (Szybalski, W. 1990 Gene, 90, pp. 177-178) to make a 12-mer primer by ligating two 6-mer oligonucleotides annealed next to each other on the template. The two 6-mers were to be selected from a library of possible 6-mers. To the best of my knowledge, experiments with this technique (unpublished) have so far shown that the efficiency of ligation of the two 6-mers on the template depends on the position of the priming site within the template. Sometimes this efficiency is prohibitively low. The ligation efficiency drops dramatically as the length of the ligated oligonucleotides is reduced from 6 to 5 nucleotides. Ligation of two 5-mers on the template usually does not work even when ligation of two 6-mers does, presumably because the combined length of two adjacent 5-mers (10 nucleotides) is not a long enough substrate for the ligase to work efficiently. Moreover, even if a ligation technique were found to work efficiently, it would introduce a very inconvenient additional enzymatic reaction (ligation) into the sequencing procedure.
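The appeal of composing a primer from two shorter oligonucleotides is combinatorial: any 12-mer can be written as two adjacent 6-mers, so a complete 6-mer library of 4^6 = 4,096 members covers all 4^12 possible 12-mers. The following sketch illustrates only this bookkeeping; the helper names and the example sequence are hypothetical, and nothing here models annealing or ligation efficiency.

```python
from itertools import product


def all_kmers(k):
    """Set of every DNA sequence of length k over the alphabet A, C, G, T."""
    return {"".join(bases) for bases in product("ACGT", repeat=k)}


def split_primer(primer, k):
    """Split a primer of length 2*k into its two adjacent k-mers (5' half, 3' half)."""
    if len(primer) != 2 * k:
        raise ValueError(f"expected a primer of length {2 * k}, got {len(primer)}")
    return primer[:k], primer[k:]


hexamers = all_kmers(6)

# Arbitrary example 12-mer, chosen only for illustration.
left, right = split_primer("ACGTTGCAAGCT", 6)

print(len(hexamers))                            # size of the complete 6-mer library
print(left, right)                              # the two 6-mers forming the 12-mer
print(left in hexamers and right in hexamers)   # both halves are library members
```

The same decomposition with 5-mers would need a library of only 4^5 = 1,024 members, which is the motivation for wanting oligonucleotides shorter than 6-mers to work.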
The present invention obviates the need for such an additional enzymatic reaction (ligation) in the sequencing protocol. Moreover, the present invention allows oligonucleotides shorter than 6-mers to form a composite primer.