The desired end-product of most DNA characterization procedures is the nucleotide sequence of the DNA. The laboratory procedures which commonly precede DNA sequencing are 1) restriction mapping, and 2) subcloning or polymerase chain reaction ("PCR") amplification. The function of subcloning is to divide large DNA fragments into smaller fragments which can be propagated in a suitable host, thereby providing large amounts of template, i.e., the DNA segments or fragments of interest that are being cloned for sequencing. The subcloning procedure also provides primer sites carried on the vector DNA which may be used in DNA sequencing.
DNA sequencing and PCR rely on the ability of a DNA polymerase to extend an oligonucleotide primer that is annealed to a single-stranded DNA template. During thermal cycle DNA sequencing, a thermostable DNA polymerase is used with a single oligonucleotide primer. PCR ordinarily extends primers from two complementary DNA strands, and thus makes double-stranded DNA product, whereas asymmetric PCR produces predominantly single-stranded DNA by limiting the concentration of one primer while adding an excess of the other.
For the DNA sequencing and PCR methods described above to succeed, the primer(s) must be complementary to the primer site(s) on the template DNA, i.e., the primer site on the template DNA and the oligonucleotide primer must be closely matched to ensure that annealing takes place at the desired location. Since the potential priming sites are known on vector DNA (and on the ends of PCR product), sequencing can be carried out directly on this DNA; however, due to practical limitations of sequencing technology, DNA may only be sequenced from the priming site to a maximum length of about 300 nucleotides ("nt"). If the DNA sequence on the ends of restriction fragments were known, primers complementary thereto could be synthesized, and the DNA sequenced. However, when uncharacterized DNA is restriction mapped, only a remnant of the nucleotide sequence generated by the action of the endonucleases (typically 2-4 nt) is known. The length of this cleavage site is inadequate to design a primer for DNA sequencing or PCR amplification.
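The complementarity requirement described above can be illustrated with a minimal Python sketch. It is an illustration only, not part of the described procedures: the function names, the example sequences, and the optional mismatch tolerance are all hypothetical, and annealing is reduced to a plain string comparison with no thermodynamics.

```python
from typing import List

# Watson-Crick pairing for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_priming_sites(template: str, primer: str, max_mismatches: int = 0) -> List[int]:
    """Return 0-based template positions at which the primer could anneal.

    Both sequences are written 5'->3'. The primer anneals wherever the
    template contains the primer's reverse complement, allowing at most
    `max_mismatches` mismatched positions (a hypothetical tolerance).
    """
    site = reverse_complement(primer)
    template = template.upper()
    hits = []
    for start in range(len(template) - len(site) + 1):
        window = template[start:start + len(site)]
        mismatches = sum(a != b for a, b in zip(window, site))
        if mismatches <= max_mismatches:
            hits.append(start)
    return hits


# Hypothetical template and an 8-nt primer complementary to positions 8-15.
template = "GGATCCTTACGGAATTCAGC"
primer = "AATTCCGT"
print(find_priming_sites(template, primer))  # [8]
```

In this sketch a primer can be placed only where the sequence of the site is already known, which is why a 2-4 nt cleavage remnant is too short to serve as a primer site.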
Another problem inherent in current sequencing technology is as follows. Because restriction mapping is a "top-to-bottom approach" where large fragments are progressively reduced to equimolar amounts of smaller fragments, those smaller fragments become increasingly difficult to detect and manipulate. In the past, this limitation was remedied by cloning into small multicopy plasmids, and transforming laboratory strains of easily propagated and genetically well-characterized hosts. However, plasmid rearrangement, insert size limitation, possible dissemination of antibiotic resistance, containment of recombinant organisms, and "unclonable" fragments are just a small sampling of the problems associated with maintaining and propagating DNA in living organisms. Another limitation of current DNA methods is that most restriction fragments are refractory to nucleotide sequencing techniques, because the minimum sequence information necessary to design an oligonucleotide primer is unavailable. Furthermore, when subcloning is used, the total procedure (if successful) requires many steps and at least 80 hours of total elapsed time; it would obviously be a great advantage to reduce this time. As such, improvements in the ease of sequencing DNA that address these problems have long been desired.
Yet another limitation of currently practiced DNA sequencing techniques is the throughput limitation imposed by the sensitivity of detection of sequencing cloning products by, most notably, autoradiography. In conventional manual sequencing protocols, for example, the products of cloning, the DNA fragment(s) of interest having a known priming site sequence at one end, are 1) labeled and 2) terminated. In the labeling step, one takes the products of cloning and labels them using DNA polymerase. The mix in which the reaction takes place contains one radioactive (generally, either 35S-, 32P- or 33P-labeled) deoxynucleotide triphosphate ("dNTP"), as well as non-radioactive dNTPs. The enzyme will incorporate the radioactive nucleotide(s) into the DNA fragments so the fragments are visible via autoradiography. The ease with which sequencing product is detected is proportional to the amount of label which is incorporated during the labeling step. One dNTP is deliberately left out of the labeling mixture. This controls the length of the labeled primer in the labeling step. This process is generally repeated about 60 times in thermal cycle sequencing protocols.
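The effect of omitting one dNTP from the labeling mixture can be sketched in Python as follows. This is a simplified model under stated assumptions, not a description of any particular protocol: the function name and example template are hypothetical, misincorporation is ignored, and the polymerase is assumed to stop at the first template position that calls for the omitted nucleotide.

```python
from typing import Tuple

# Base that the polymerase adds opposite each template base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def labeling_extension(template_downstream: str, omitted: str, radioactive: str) -> Tuple[str, int]:
    """Model primer extension in a labeling mix lacking one dNTP.

    `template_downstream` lists the template bases in the order the
    polymerase reads them, starting just past the primer's 3' end.
    Returns the bases added to the primer and how many of them carry label.
    """
    if omitted == radioactive:
        raise ValueError("the omitted dNTP cannot also be the labeled dNTP")
    added = []
    for base in template_downstream.upper():
        incoming = COMPLEMENT[base]
        if incoming == omitted:
            break  # the required dNTP is absent, so extension halts here
        added.append(incoming)
    extension = "".join(added)
    return extension, extension.count(radioactive)


# Hypothetical priming site: dCTP omitted, dATP radioactive.
print(labeling_extension("TATTGAC", omitted="C", radioactive="A"))  # ('ATAA', 3)
```

Under this model the sequence immediately downstream of the priming site fixes both the length of the labeled primer and the number of radioactive nucleotides it can carry.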
In the termination step, the labeled, extended primer is added to freshly prepared termination mixes. In manual sequencing protocols, there are four separate termination mixes into which the labeling reaction is divided. Each contains one of four different dNTP analogs (either dideoxy-GTP, dideoxy-ATP, dideoxy-TTP, or dideoxy-CTP) in addition to the four normal dNTPs. The analogs are incorporated into the growing DNA strand by polymerase at low frequency, but once this has been done, the DNA strand cannot be extended any further. The ratio of normal dNTPs to analogs (dideoxynucleotide triphosphates ("ddNTPs")) is adjusted so that ddNTP is only incorporated occasionally, leading to sequencing products of many different lengths, but all terminating in the same ddNTP. The four termination mixtures (i.e., ddGTP, ddATP, ddTTP, and ddCTP) are electrophoresed side-by-side in polyacrylamide. The resultant DNA sequencing ladder is detected by autoradiography, whereupon analysis of the image will elucidate the DNA sequence.
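How the four termination mixes yield a readable ladder can be sketched in Python. The sketch is deliberately idealized and its names are hypothetical: rather than simulating random ddNTP incorporation, it assumes every possible termination product is present, so each lane holds one band for every position at which the newly synthesized strand carries that lane's base.

```python
from typing import Dict, List


def termination_ladder(new_strand: str, primer_length: int = 0) -> Dict[str, List[int]]:
    """Return, for each ddNTP lane, the lengths of the fragments ending in that base.

    `new_strand` is the sequence synthesized past the primer, written 5'->3'.
    Only the bases A, C, G and T are handled in this sketch.
    """
    lanes: Dict[str, List[int]] = {"G": [], "A": [], "T": [], "C": []}
    for position, base in enumerate(new_strand.upper(), start=1):
        lanes[base].append(primer_length + position)
    return lanes


def read_ladder(lanes: Dict[str, List[int]]) -> str:
    """Read the sequence from the shortest fragment to the longest,
    as one would read bands from the bottom of the gel upward."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


lanes = termination_ladder("GATTCAG")
print(lanes)               # {'G': [1, 7], 'A': [2, 6], 'T': [3, 4], 'C': [5]}
print(read_ladder(lanes))  # GATTCAG
```

Reading the bands in order of increasing fragment length across the four lanes recovers the sequence of the newly synthesized strand.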
One can see that the sensitivity of detection is determined by the amount of template DNA utilized, and the duration of the autoradiography. These limitations may be seen as an impediment to high throughput DNA sequencing. In conventional DNA sequencing, as described above, priming sites are selected from vector sequences which are adjacent to insert DNA. The number of radioactive nucleotides incorporated (sensitivity) is controlled by priming site selection. Naturally-occurring priming sites rarely allow the incorporation of more than 4 or 5 radioactive nucleotides using the labeling method described above. As such, if the autoradiographic image corresponding to the banding pattern in the polyacrylamide gel is too faint, another film must be exposed, resulting in a delay in obtaining sequencing information equal to the time necessary to expose and develop the film, which can be up to 72 hours in some cases. Improvements in the sensitivity of autoradiographic detection, and therefore, faster sequencing, have been sought.