DNA sequencing has become a vitally important technique in modern biology and biotechnology, providing information relevant to fields ranging from basic biological research to drug discovery to clinical medicine. Because of the large volume of DNA sequence data to be collected, automated techniques have been developed to increase the throughput and decrease the cost of DNA sequencing methods (Smith; Connell; Trainor).
A preferred automated DNA sequencing method is based on the enzymatic replication technique developed by Sanger (Sanger). In Sanger's technique, the DNA sequence of a single-stranded template DNA is determined using a DNA polymerase to synthesize a set of polynucleotide fragments wherein the fragments (i) have a sequence complementary to the template sequence, (ii) vary in length by a single nucleotide, and (iii) have a 3′-end terminating in a known nucleotide, e.g., A, C, G, or T. In the method, an oligonucleotide primer is annealed to a 3′-end of a template DNA to be sequenced, the 3′-end of the primer serving as the initiation site for polymerase-mediated polymerization of a complementary polynucleotide fragment. The enzymatic polymerization step is carried out by combining the template-primer hybrid with the four natural deoxynucleotides (“dNTPs”), a DNA polymerase enzyme, and a 2′,3′-dideoxynucleotide triphosphate (“ddNTP”) “terminator”. The incorporation of the terminator forms a fragment which lacks a hydroxy group at the 3′-terminus and thus cannot be further extended, i.e., the fragment is “terminated”. The competition between the ddNTP and its corresponding dNTP for incorporation results in a distribution of different-sized fragments, each fragment terminating with the particular terminator used in the reaction. To determine the complete DNA sequence of the template, four parallel reactions are run, each reaction using a different ddNTP terminator. To determine the size distribution of the fragments, the fragments are separated by electrophoresis such that fragments differing in size by a single nucleotide are resolved.
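The logic of the four-reaction scheme described above can be illustrated with a minimal Python sketch. It is an idealized model, not a description of any actual protocol: it assumes every possible terminated fragment is produced exactly once, ignores the primer sequence, and treats electrophoresis as a perfect sort by fragment length. The function names and the example template are hypothetical and chosen only for illustration.

```python
# Idealized model of Sanger chain-termination sequencing (illustrative only).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def extension_product(template_3_to_5: str) -> str:
    """Full-length strand synthesized 5'->3' against a template read 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3_to_5)


def terminated_fragments(product: str, terminator: str) -> list[str]:
    """Fragments from one reaction: each ends (3' end) at the terminator base."""
    return [product[: i + 1] for i, base in enumerate(product) if base == terminator]


def read_sequence(template_3_to_5: str) -> str:
    """Run four idealized reactions, sort fragments by size, read terminal bases."""
    product = extension_product(template_3_to_5)
    # One reaction per ddNTP terminator.
    lanes = {t: terminated_fragments(product, t) for t in "ACGT"}
    # Electrophoresis modeled as sorting by length; every length occurs once.
    bands = sorted((len(frag), t) for t, frags in lanes.items() for frag in frags)
    return "".join(t for _, t in bands)


if __name__ == "__main__":
    template = "TACGGTCA"  # hypothetical template, written 3'->5'
    print(read_sequence(template))
```

Reading the terminal base of each fragment in order of increasing length reconstructs the strand complementary to the template, which is why single-nucleotide resolution in the separation step matters.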
In a modern variant of the classical Sanger technique, the nucleotide terminators are labeled with fluorescent dyes (Prober; Hobbs), and a thermostable DNA polymerase enzyme is used (Murray). Several advantages are realized by utilizing dye-labeled terminators: (i) problems associated with the storage, use and disposal of radioactive isotopes are eliminated; (ii) the requirement to synthesize dye-labeled primers is eliminated; and (iii) when using a different dye label for each A, G, C, or T nucleotide, all four reactions can be performed simultaneously in a single tube. Using a thermostable polymerase enzyme (i) permits the polymerization reaction to be run at elevated temperature, thereby disrupting any secondary structure of the template and resulting in fewer sequence-dependent artifacts, and (ii) permits the sequencing reaction to be thermocycled, thereby serving to linearly amplify the amount of extension product produced, thus reducing the amount of DNA template required to obtain a sequence.
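The effect of linear amplification on template requirements can be shown with simple arithmetic. The sketch below assumes that each thermal cycle yields at most one extension product per template molecule, scaled by a hypothetical per-cycle `efficiency` factor; the function names and numbers are illustrative assumptions, not values taken from any protocol.

```python
# Illustrative arithmetic for linear amplification by thermocycling.
# Assumption: each cycle yields at most one extension product per template.


def linear_yield(template_molecules: float, cycles: int, efficiency: float = 1.0) -> float:
    """Extension products accumulated after the given number of cycles."""
    return template_molecules * cycles * efficiency


def template_required(target_products: float, cycles: int, efficiency: float = 1.0) -> float:
    """Template needed to reach a target amount of extension product."""
    return target_products / (cycles * efficiency)


if __name__ == "__main__":
    # Hypothetical numbers: 25 cycles at ideal efficiency.
    print(linear_yield(1000, 25))
    print(template_required(25000, 25))
```

Under these assumptions, the template required for a given amount of product falls in proportion to the number of cycles, in contrast to a single-pass reaction in which each template molecule yields at most one product.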
While these modern variants on Sanger sequencing methods have proven effective, several problems remain with respect to optimizing their performance and economy. One problem encountered when using dye-labeled terminators in combination with thermostable polymerase enzymes, particularly in the case of fluorescein-type dye labels, is that a large excess of dye-labeled terminator over the unlabeled dNTPs is required, up to a ratio of 50:1. This large excess of labeled terminator makes it necessary to purify the sequencing reaction products prior to performing the electrophoretic separation step. This clean-up step is required in order to avoid interference caused by the comigration of unincorporated labeled terminator species and bona fide sequencing fragments. A typical clean-up method includes an ethanol precipitation or a chromatographic separation (ABI PRISM™ Dye Terminator Cycle Sequencing Core Kit Protocol). Such a clean-up step greatly complicates the task of developing totally automated sequencing systems wherein the sequencing reaction products are transferred directly into an electrophoretic separation process. A second problem encountered when using presently available dye-labeled terminators in combination with a thermostable polymerase is that an uneven distribution of peak heights is obtained in Sanger-type DNA sequencing.