Today, there are two predominant methods for DNA sequence determination: the chemical degradation method (Maxam and Gilbert, Proc. Natl. Acad. Sci. 74:560-564 (1977)), and the dideoxy chain termination method (Sanger et al., Proc. Natl. Acad. Sci. 74:5463-5467 (1977)). Most automated sequencers are based on the chain termination method and use fluorescent detection of product formation. There are two common variations of these systems: (1) dye-labeled primers to which deoxynucleotides and dideoxynucleotides are added, and (2) primers to which deoxynucleotides and fluorescently labeled dideoxynucleotides are added. In addition, labeled deoxynucleotides can be used in conjunction with unlabeled dideoxynucleotides. The chain termination method is based upon the ability of an enzyme to add specific nucleotides onto the 3' hydroxyl end of a primer annealed to a template. The base pairing property of nucleic acids determines the specificity of nucleotide addition. The extension products are separated electrophoretically on a polyacrylamide gel and detected by an optical system utilizing laser excitation.
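The readout principle of the chain termination method described above can be illustrated with a minimal sketch. It is an idealized toy model, not a description of any actual sequencer: it assumes every possible terminated product is present exactly once and that sizes are resolved perfectly. The function names and the example template are hypothetical.

```python
# Toy model of chain-termination readout (illustrative assumptions only).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def extend_primer(template_3_to_5):
    """Return the newly synthesized strand (5'->3') for a template written 3'->5'.

    Base pairing alone decides which nucleotide is added at each position.
    """
    return "".join(COMPLEMENT[base] for base in template_3_to_5)


def terminated_products(synthesized):
    """Return (length, terminal_base) for each possible terminated product.

    A dideoxynucleotide can end the chain at any position, so the ideal
    reaction holds one product per position of the synthesized strand.
    """
    return [(i + 1, base) for i, base in enumerate(synthesized)]


def read_by_size(products):
    """Order products by length, as electrophoresis does, and read the ends."""
    return "".join(base for _, base in sorted(products))


if __name__ == "__main__":
    strand = extend_primer("TACGGT")
    products = terminated_products(strand)
    print(read_by_size(reversed(products)))
```

Because the products are ordered only by length, the order in which they are generated or loaded does not matter, which is why the separation step is essential to the method.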
Although both the chemical degradation method and the dideoxy chain termination method are in widespread use, both have significant disadvantages. For example, the methods require gel-electrophoretic separation, and typically only 400-800 base pairs can be sequenced from a single clone. As a result, the systems are both time- and labor-intensive. Methods avoiding gel separation have been developed in attempts to increase the sequencing throughput.
Sequencing by hybridization (SBH) methods have been proposed by Crkvenjakov (Drmanac et al., Genomics 4:114 (1989); Strezoska et al., Proc. Natl. Acad. Sci. USA 88:10089 (1991); Drmanac et al., Science 260:1649 (1991)) and by Bains and Smith (Bains and Smith, J. Theoretical Biol. 135:303 (1988)). These methods can potentially increase sequencing throughput because multiple hybridization reactions are performed simultaneously. This type of system uses the information obtained from multiple hybridizations of the polynucleotide of interest with short oligonucleotides to determine the nucleic acid sequence (Drmanac, U.S. Pat. No. 5,202,231). Reconstructing the sequence requires an extensive computer search algorithm to determine the optimal order of all fragments obtained from the multiple hybridizations.
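The reconstruction step can be sketched as a search over overlapping oligonucleotides. The following is a minimal illustration, not the algorithm of the cited references: it assumes an error-free set of hybridizing k-mers in which each k-mer occurs once in the target, and it uses a simple backtracking search. The function names are hypothetical.

```python
# Toy SBH reconstruction by backtracking over (k-1)-base overlaps.
# Assumptions: no false positives or negatives, no repeated k-mers.


def spectrum(sequence, k):
    """Return the set of all k-mers contained in the sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def reconstruct(kmers):
    """Order the k-mers so that neighbors overlap by k-1 bases.

    Returns the assembled sequence, or None if no ordering uses every k-mer.
    """
    kmers = set(kmers)

    def extend(path, remaining):
        if not remaining:
            return path
        suffix = path[-1][1:]
        for candidate in sorted(remaining):
            if candidate[:-1] == suffix:
                result = extend(path + [candidate], remaining - {candidate})
                if result is not None:
                    return result
        return None

    for start in sorted(kmers):
        path = extend([start], kmers - {start})
        if path is not None:
            # First k-mer in full, then the last base of each later k-mer.
            return path[0] + "".join(kmer[-1] for kmer in path[1:])
    return None


if __name__ == "__main__":
    probes = spectrum("ATGCGTAC", 3)
    print(reconstruct(probes))
```

The search grows quickly with the number of k-mers, and repeated or missing k-mers can leave the ordering ambiguous or unsolvable, which is why the reconstruction is computationally demanding in practice.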
These methods are problematic in several respects. For example, hybridization depends on the sequence composition of the duplex formed by the oligonucleotide and the polynucleotide of interest, so that duplexes in GC-rich regions are more stable than those in AT-rich regions. As a result, false positives and false negatives frequently occur during hybridization detection and complicate sequence determination. Furthermore, the sequence of the polynucleotide is not determined directly, but is inferred from the sequence of the known probe, which increases the possibility of error. A great need remains for efficient and accurate methods of nucleic acid sequence determination.
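The composition dependence of duplex stability can be illustrated with the Wallace rule, a rough rule of thumb for short oligonucleotides that assigns about 2 °C per A or T and 4 °C per G or C. The rule is not taken from the text above and is only a coarse approximation; the function name and the example probes are hypothetical.

```python
# Rough melting-temperature estimate for short oligonucleotides (Wallace rule).
# This is an approximation for illustration, not a thermodynamic model.


def wallace_tm(oligo):
    """Estimate the melting temperature in degrees Celsius: 2*(A+T) + 4*(G+C)."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    if at + gc != len(oligo):
        raise ValueError("oligo must contain only A, C, G and T")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for probe in ("GCGCGCGC", "ATATATAT"):
        print(probe, wallace_tm(probe))
```

Under this estimate, two probes of equal length can differ substantially in stability, so a single hybridization temperature cannot be optimal for all probes, which is consistent with the false positives and false negatives noted above.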