Nucleic acid sequencing involves the determination of the sequence of nucleotides of a particular nucleic acid molecule. Knowledge of the sequence of a nucleic acid molecule is typically fundamental to elucidating the function of the molecule and facilitating manipulation of the molecule. Further, variations in individual genomes often account for differences in susceptibility to diseases and pharmacological responses to treatment. To illustrate, changes in a single base of a nucleic acid molecule, which are commonly referred to as single nucleotide polymorphisms (SNPs), can affect an individual's risk for a given disease. By comparing these variations, for example, researchers are gaining an understanding of the medical utility of SNPs, thereby enhancing our ability to effectively diagnose, prognosticate, and treat disease.
Nucleic acid sequencing technology began in the late 1960s with efforts to sequence RNA. In particular, the sequences of 5S-ribosomal RNA from Escherichia coli (Brownlee et al. (1967) “Nucleotide sequence of 5S-ribosomal RNA from Escherichia coli,” Nature 215(102):735) and R17 bacteriophage RNA coding for coat protein (Adams et al. (1969) “Nucleotide sequence from the coat protein cistron of R17 bacteriophage RNA,” Nature 223(210):1009) are some of the early examples of RNA sequencing. Subsequently, Sanger described the sequencing of bacteriophage f1 DNA by primed synthesis with DNA polymerase (Sanger et al. (1973) “Use of DNA polymerase I primed by a synthetic oligonucleotide to determine a nucleotide sequence in phage f1 DNA,” Proc. Natl. Acad. Sci. USA 70(4):1209), while Gilbert and Maxam reported on the DNA nucleotide sequence of the lac operator (Gilbert and Maxam (1973) “The nucleotide sequence of the lac operator,” Proc. Natl. Acad. Sci. USA 70(12):3581).
In 1977, Sanger described the use of modified nucleoside triphosphates (including dideoxyribose) in combination with deoxyribonucleotides to terminate chain elongation (Sanger et al. (1977) “DNA sequencing with chain-terminating inhibitors,” Biotechnology 24:104). In that same year, Maxam and Gilbert reported a method for sequencing DNA that utilized chemical cleavage of DNA preferentially at guanines, at adenines, at cytosines and thymines equally, and at cytosines alone (Maxam and Gilbert (1977) “A new method for sequencing DNA,” Proc. Natl. Acad. Sci. USA 74:560). These two methods accelerated manual sequencing based on electrophoretic separation of DNA fragments labeled with radioactive markers and subsequent detection via autoradiography.
The Sanger dideoxy method for sequencing DNA has become far more widely used than the Maxam-Gilbert chemical cleavage method. The Sanger method includes the synthesis of a new strand of DNA starting from a specific priming site and ending with the incorporation of a chain terminating or terminator nucleotide. In particular, a DNA polymerase extends a primer nucleic acid annealed to a specific location on a DNA template by incorporating deoxynucleotides (dNTPs) complementary to the template. Synthesis of the new DNA strand continues until the reaction is randomly terminated by the inclusion of a dideoxynucleotide (ddNTP). These nucleotide analogs are incapable of supporting further chain extension since the ribose moiety of the ddNTP lacks the 3′-hydroxyl necessary for forming a phosphodiester bond with the next incoming dNTP. This produces a population of truncated sequencing fragments, each with a defined or fixed 5′-end and a varying 3′-end. Among the disadvantages of the dideoxy method is the expense associated with making ddNTPs.
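The random termination described above can be illustrated with a short simulation. The following Python sketch is a simplified model and not a description of any particular laboratory protocol: the function names, the example strand, the primer length, and the termination probability `p_term` are all assumptions chosen for illustration. It extends many copies of a primer along the same new strand and records the lengths at which a ddNTP ends the chain.

```python
import random


def extend_once(product, primer_len, dd_base, p_term, rng):
    """Extend one primer along `product`, the full-length new strand written 5'->3'.

    Returns the fragment length if a ddNTP terminates the chain, else None.
    """
    for i in range(primer_len, len(product)):
        # A ddNTP can only take the place of the matching dNTP, and does so at random.
        if product[i] == dd_base and rng.random() < p_term:
            return i + 1
    return None  # reached the end of the template without termination


def termination_lengths(product, primer_len, dd_base, p_term=0.3, n=5000, seed=0):
    """Distinct lengths of terminated fragments among n simulated molecules."""
    rng = random.Random(seed)
    lengths = {extend_once(product, primer_len, dd_base, p_term, rng) for _ in range(n)}
    lengths.discard(None)
    return sorted(lengths)


# Hypothetical new strand; the first 4 bases stand in for the primer.
product = "ACGTGATCCAGT"
for base in "ACGT":
    print(base, termination_lengths(product, primer_len=4, dd_base=base))
```

Every simulated fragment begins at the same primer, which corresponds to the fixed 5′-end, while the recorded lengths mark the varying 3′-ends. In this model, each position beyond the primer is expected to appear in the output of exactly one of the four base-specific reactions.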
Two frequently used automated sequencing methodologies are dye-primer and dye-terminator sequencing. These methods are suitable for use with fluorescent label moieties. Although sequencing can also be done using radioactive label moieties, fluorescence-based sequencing is increasingly preferred. Briefly, in dye-primer sequencing, a fluorescently labeled primer is used in combination with unlabeled ddNTPs. The procedure typically utilizes four synthesis reactions and up to four lanes on a gel for each template to be sequenced (one corresponding to each of the base-specific termination products). Following primer nucleic acid extension, the sequencing reaction mixtures containing dideoxynucleotide-incorporated termination products are routinely electrophoresed on a DNA sequencing gel. Following separation by electrophoresis, the fluorescently labeled products are excited in the gel with a laser and the fluorescence is detected with an appropriate detector. In automated systems, a detector scans the bottom of the gel during electrophoresis to detect whatever label moiety has been employed as the reaction products pass through the gel matrix (Smith et al. (1986) “Fluorescence detection in automated DNA sequence analysis,” Nature 321:674). In a modification of this method, four primers are each labeled with a different fluorescent marker. After the four separate sequencing reactions are completed, the mixtures are combined and the combined mixture is subjected to gel analysis in a single lane, where the different fluorescent tags (one corresponding to each of the four different base-specific termination products) are individually detected.
Alternatively, dye-terminator sequencing methods are employed. In these methods, a DNA polymerase is used to incorporate dNTPs and fluorescently labeled ddNTPs onto the growing end of a DNA primer (Lee et al. (1992) “DNA sequencing with dye-labeled terminators and T7 DNA polymerase: effect of dyes and dNTPs on incorporation of dye-terminators and probability analysis of termination fragments,” Nucleic Acids Res. 26:2471). This process offers the advantage of not having to synthesize dye-labeled primers. Furthermore, dye-terminator reactions are more convenient in that all four reactions can be performed in the same tube.
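To illustrate how a sequence may be read when four differently labeled termination products are resolved in a single lane, the following Python sketch orders detected fragments by length and converts each label to a base. It is a minimal sketch under stated assumptions: the dye names, the dye-to-base assignment in `DYE_TO_BASE`, and the `(length, dye)` input format are hypothetical and do not correspond to any particular instrument or dye set.

```python
# Hypothetical dye-to-base assignment; actual dye sets differ between systems.
DYE_TO_BASE = {"blue": "C", "green": "A", "yellow": "G", "red": "T"}


def call_bases(peaks, primer_len):
    """Read the new strand 5'->3' from (fragment_length, dye) observations.

    Shorter fragments migrate faster, so ordering by length gives the order of
    the bases added after the primer. A length with no observed fragment, or
    with an unrecognized dye, is reported as "N".
    """
    by_length = {length: dye for length, dye in peaks}
    if not by_length:
        return ""
    longest = max(by_length)
    return "".join(
        DYE_TO_BASE.get(by_length.get(n), "N")
        for n in range(primer_len + 1, longest + 1)
    )


# Observations may arrive in any order; no fragment of length 8 was detected here.
peaks = [(7, "red"), (5, "yellow"), (6, "green"), (9, "blue")]
print(call_bases(peaks, primer_len=4))
```

Because each fragment carries a label identifying its terminal base, a single ordered pass over the fragment lengths is sufficient in this model, which reflects why one lane can replace four.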
Other methods of deconvoluting sequencing reaction mixtures include the use of gas phase ion spectrometry. For example, matrix assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS) is one approach that has been successfully utilized in high-throughput sequencing and SNP genotyping analyses (see, e.g., Sauer et al. (2002) “Facile method for automated genotyping of single nucleotide polymorphisms by mass spectrometry,” Nucleic Acids Res. 30(5):e22).
From the foregoing, it is apparent that additional methods of sequencing and genotyping nucleic acids are desirable. The present invention provides new nucleic acid sequencing methods that utilize 2′-terminator nucleotides, as well as a variety of additional features including approaches to nucleic acid labeling that will be apparent upon a complete review of the following disclosure.