This invention relates to the field of nucleic acid sequence detection. The detection of nucleic acid sequences can be used in two general contexts. First, the detection of nucleic acid sequences can be used to determine the presence or absence of a particular genetic element. Second, the detection of nucleic acid sequences can be used to determine the specific type of a particular genetic element that is present. Variant genetic elements usually exist. Many techniques have been developed (1) to determine the presence of specific nucleic acid sequences, and (2) to compare homologous segments of nucleic acid sequence to determine if the segments are identical or if they differ at one or more nucleotides. Practical applications of these techniques include genetic disease diagnoses, infectious disease diagnoses, forensic techniques, paternity determinations, and genome mapping.
In general, the detection of nucleic acids in a sample and the subtypes thereof depends on the technique of specific nucleic acid hybridization, in which an oligonucleotide probe is annealed under conditions of high stringency to nucleic acids in the sample, and the successfully annealed probes are subsequently detected (see Spiegelman, S., Scientific American, Vol. 210, p. 48 (1964)).
The most definitive method for comparing DNA segments is to determine the complete nucleotide sequence of each segment. Examples of how sequencing has been used to study mutations in human genes are included in the publications of Engelke, et al., Proc. Natl. Acad. Sci. U.S.A., 85:544-548 (1988) and Wong, et al., Nature, 330:384-386 (1987). At the present time, it is not practical to use extensive sequencing to compare more than just a few DNA segments because the effort required to determine, interpret, and compare sequence information is time-consuming.
A commonly used screen for DNA polymorphisms arising from DNA sequence variation consists of digesting DNA with restriction endonucleases and analyzing the resulting fragments by means of Southern blots, as described by Botstein, et al., Am. J. Hum. Genet., 32:314-331 (1980) and White, et al., Sci. Am., 258:40-48 (1988). Mutations that affect the recognition sequence of the endonuclease will preclude enzymatic cleavage at that site, thereby altering the cleavage pattern of that DNA. DNAs are compared by looking for differences in restriction fragment lengths. A major problem with this method (known as restriction fragment length polymorphism mapping or RFLP mapping) is its inability to detect mutations that do not affect cleavage with a restriction endonuclease. Thus, many mutations are missed with this method. One study, by Jeffreys, Cell, 18:1-18 (1979), was able to detect only 0.7% of the mutational variants estimated to be present in a 40,000 base pair region of human DNA. Another problem is that the methods used to detect restriction fragment length polymorphisms are very labor intensive, particularly the techniques involved in Southern blot analysis.
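The limitation of RFLP mapping described above can be illustrated with a minimal sketch. It assumes a single-stranded, linear toy sequence and uses the EcoRI recognition sequence GAATTC, cut after the G; the function names `digest` and `substitute` and the example sequences are hypothetical and chosen only for illustration.

```python
def digest(seq: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting `seq` at every occurrence of `site`."""
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


def substitute(seq: str, pos: int, base: str) -> str:
    """Return `seq` with the nucleotide at index `pos` replaced by `base`."""
    return seq[:pos] + base + seq[pos + 1:]


SITE = "GAATTC"  # EcoRI recognition sequence
CUT_OFFSET = 1   # EcoRI cuts between G and AATTC on this strand

# Toy 30-nucleotide sequence carrying two EcoRI sites (illustrative only).
wild_type = "ACGTACGT" + SITE + "ACGTAC" + SITE + "TTTT"
site_mutant = substitute(wild_type, 22, "G")    # change inside the second site
silent_mutant = substitute(wild_type, 16, "A")  # change between the two sites

for name, dna in [("wild type", wild_type),
                  ("site mutant", site_mutant),
                  ("silent mutant", silent_mutant)]:
    print(name, digest(dna, SITE, CUT_OFFSET))
```

In this sketch only the mutation that destroys a recognition site changes the fragment pattern; the substitution between the sites yields the same fragment lengths as the wild type and would therefore be missed.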
A technique for detecting specific mutations in any segment of DNA is described in Wallace, et al., Nucl. Acids Res., 9:879-894 (1981). It involves hybridizing the DNA to be analyzed (target DNA) with a complementary, labeled oligonucleotide probe. Due to the thermal instability of DNA duplexes containing even a single base pair mismatch, differential melting temperature can be used to distinguish target DNAs that are perfectly complementary to the probe from target DNAs that differ by as little as a single nucleotide. In a related technique, described in Landegren, et al., Science, 241:1077-1080 (1988), oligonucleotide probes are constructed in pairs such that their junction corresponds to the site on the DNA being analyzed for mutation. These oligonucleotides are then hybridized to the DNA being analyzed. Base pair mismatch between either oligonucleotide and the target DNA at the junction location prevents the efficient joining of the two oligonucleotide probes by DNA ligase.
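The paired-probe ligation technique can be sketched as follows. This is a simplified model, not the published protocol: it assumes the two probes together span the whole toy target, treats ligation as all-or-nothing, and checks only the two nucleotides flanking the junction. The names `reverse_complement` and `ligation_product` and the sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the strand that pairs with `seq`, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def ligation_product(target: str, upstream: str, downstream: str):
    """Return the joined probe if both junction bases pair with `target`, else None."""
    probe_strand = reverse_complement(target)
    joined = upstream + downstream
    if len(joined) != len(probe_strand):
        raise ValueError("this sketch assumes the probe pair spans the whole target")
    junction = len(upstream)
    # Model assumption: ligase joins the probes only if the base on each
    # side of the nick is correctly paired with the target.
    for i in (junction - 1, junction):
        if joined[i] != probe_strand[i]:
            return None
    return joined


# Toy targets (illustrative only); the variant differs at index 7.
normal_target = "ACGTTGCAAGGCTA"
variant_target = "ACGTTGCGAGGCTA"

# Probes designed against the normal target, meeting at the variable site.
probe_strand = reverse_complement(normal_target)
upstream_probe, downstream_probe = probe_strand[:7], probe_strand[7:]

print(ligation_product(normal_target, upstream_probe, downstream_probe))
print(ligation_product(variant_target, upstream_probe, downstream_probe))
```

With probes designed against the normal target, the sketch returns a joined product for the normal target and no product for the variant, mirroring the mismatch-dependent failure of ligation described above.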
A. Nucleic acid hybridization
The base pairing of nucleic acids in a hybridization reaction forms the basis of most nucleic acid analytical and diagnostic techniques. In practice, tests based only on parameters of nucleic acid hybridization function poorly in cases where the sequence complexity of the test sample is high. This is partly due to the small thermodynamic differences in hybrid stability, generated by single nucleotide changes, and the fact that increasing specificity by lengthening the probe has the effect of further diminishing this differential stability. Nucleic acid hybridization is, therefore, generally combined with some other selection or enrichment procedure for analytical and diagnostic purposes.
Combining hybridization with size fractionation of hybridized molecules as a selection technique has been one general diagnostic approach. Size selection can be carried out prior to hybridization. The best known prior size selection technique is Southern Blotting (see Southern, E., Methods in Enzymology, 69:152 (1980)). In this technique, a DNA sample is subjected to digestion with restriction enzymes which introduce double stranded breaks in the phosphodiester backbone at or near the site of a short sequence of nucleotides which is characteristic for each enzyme. The resulting heterogeneous mixture of DNA fragments is then separated by gel electrophoresis, denatured, and transferred to a solid phase where it is subjected to hybridization analysis in situ using a labeled nucleic acid probe. Fragments which contain sequences complementary to the labeled probe are revealed visually or densitometrically as bands of hybridized label. A variation of this method is Northern Blotting for RNA molecules. Size selection has also been used after hybridization in a number of techniques, in particular by hybrid protection techniques, by subjecting probe/nucleic acid hybrids to enzymatic digestion before size analysis.
B. Polymerase extension of duplex primer:template complexes
Hybrids between primers and DNA targets can be analyzed by polymerase extension of the hybrids. A modification of this methodology is the polymerase chain reaction in which the purification is produced by sequential hybridization reactions of anti-parallel primers, followed by enzymatic amplification with DNA polymerase (see Saiki, et al., Science 239:487-491 (1988)). By selecting for two hybridization reactions, this methodology provides the specificity lacking in techniques that depend only upon a single hybridization reaction.
It has long been known that primer-dependent DNA polymerases have, in general, a low error rate for the addition of nucleotides complementary to a template. This feature is essential in biology for the prevention of genetic mistakes which would have detrimental effects on progeny. The specificity inherent in this enzymological reaction has been widely exploited as the basis of the "Sanger" or dideoxy chain termination sequencing methodology which is the ultimate nucleic acid typing experiment. One type of Sanger DNA sequencing method makes use of mixtures of the four deoxynucleoside triphosphates, which are normal DNA precursors, and one of the four possible dideoxynucleoside triphosphates, which have a hydrogen atom instead of a hydroxyl group attached to the 3' carbon atom of the ribose sugar component of the nucleotide. DNA chain elongation in the 5' to 3' direction ("downstream") requires this hydroxyl group. As such, when a dideoxynucleotide is incorporated into the growing DNA chain, no further elongation can occur. With one dideoxynucleotide in the mixture, DNA polymerases can, from a primer:template combination, produce a population of molecules of varying length, all of which terminate after the addition of one out of the four possible nucleotides. The series of four independent reactions, each with a different dideoxynucleotide, generates a nested set of fragments, all starting at the same 5' terminus of the priming DNA molecule and terminating at all possible 3' nucleotide positions.
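The nested-fragment logic of the four-reaction dideoxy method can be sketched as below. The sketch works directly with the newly synthesized strand rather than the template, assumes termination is possible at every matching position, and ignores labeling and gel resolution; `sanger_lanes`, `read_gel`, and the sequences are hypothetical names and data.

```python
def sanger_lanes(primer: str, extension: str) -> dict[str, list[int]]:
    """Map each dideoxynucleotide to the fragment lengths its reaction can produce.

    `extension` is the sequence synthesized downstream of the primer (5' to 3').
    """
    lanes = {base: [] for base in "ACGT"}
    for added, base in enumerate(extension, start=1):
        # A ddNTP of this base can terminate the chain at this position.
        lanes[base].append(len(primer) + added)
    return lanes


def read_gel(lanes: dict[str, list[int]]) -> str:
    """Read the sequence by ordering fragments from shortest to longest."""
    base_by_length = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(base_by_length[length] for length in sorted(base_by_length))


lanes = sanger_lanes(primer="GATTACA", extension="TGCAT")
print(lanes)
print(read_gel(lanes))
```

Each lane holds only fragments ending in one nucleotide, and every fragment shares the primer's 5' terminus, so sorting all fragments by length across the four lanes recovers the extension sequence.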
Another utilization of dideoxynucleoside triphosphates and a polymerase in the analysis of DNA involves labeling the 3' end of a molecule. One prominent manifestation of this technique provides the means for sequencing a DNA molecule from its 3' end using the Maxam-Gilbert method. In this technique, a molecule with a protruding 3' end is treated with terminal transferase in the presence of radioactive dideoxy-ATP. One radioactive nucleotide is added, rendering the molecule suitable for sequencing. Both methods of DNA sequencing using labeled dideoxynucleotides require electrophoretic separation of reaction products in order to derive the typing information. Most methods require four separate gel tracks for each typing determination.
The following two patents describe other methods of typing nucleic acids which employ primer extension and labeled nucleotides. Mundy (U.S. Pat. No. 4,656,127) describes a method whereby a primer is constructed complementary to a region of a target nucleic acid of interest such that its 3' end is close to a nucleotide in which variation can occur. This hybrid is subject to primer extension in the presence of a DNA polymerase and four deoxynucleoside triphosphates, one of which is an α-thionucleotide. The hybrid is then digested using an exonuclease enzyme which cannot use thio-derivatized DNA as a substrate for its nucleolytic action (for example Exonuclease III of E. coli). If the variant nucleotide in the template is complementary to one of the thionucleotides in the reaction mixture, the resulting extended primer molecule will be of a characteristic size and resistant to the exonuclease; hybrids without thio-derivatized DNA will be digested. After an appropriate enzyme digest to remove underivatized molecules, the thio-derivatized molecule can be detected by gel electrophoresis or other separation technology.
Vary and Diamond (U.S. Pat. No. 4,851,331) describe a method similar to that of Mundy wherein the last nucleotide of the primer corresponds to the variant nucleotide of interest. Since mismatching of the primer and the template at the 3' terminal nucleotide of the primer is counterproductive to elongation, significant differences in the amount of incorporation of a tracer nucleotide will result under normal primer extension conditions. This method depends on the use of a DNA polymerase, e.g., AMV reverse transcriptase, that does not have an associated 3' to 5' exonuclease activity. The methods of Mundy and of Vary and Diamond have drawbacks. The method of Mundy is useful but cumbersome due to the requirements of the second, different enzymological system where the non-derivatized hybrids are digested. The method of Vary and Diamond is complicated by the fact that it does not generate discrete reaction products. Any "false" priming will generate significant noise in such a system which would be difficult to distinguish from a genuine signal.
The present invention circumvents the problems associated with the methods of Mundy and of Vary and Diamond for typing nucleic acid with respect to particular nucleotides. With methods employing primer extension and a DNA polymerase, the current invention will generate a discrete molecular species one base longer than the primer itself. In many methods, particularly those employing the polymerase chain reaction, the type of reaction used to purify the nucleic acid of interest in the first step can also be used in the subsequent detection step. Finally, with terminators which are labeled with different detector moieties (for example different fluorophores having different spectral properties), it will be possible to use only one reagent for all sequence detection experiments. Furthermore, if techniques are used to separate the terminated primers post-reaction, sequence detection experiments at more than one locus can be carried out in the same tube.
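The idea of extending a primer by a single terminator can be sketched as follows. This is an illustrative model under stated assumptions, not the claimed method: the primer is assumed to anneal with perfect complementarity immediately adjacent to the nucleotide of interest, all four terminators are assumed to be present, and the polymerase is assumed to add only the complementary one. The name `single_base_extension` and the sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the strand that pairs with `seq`, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def single_base_extension(template: str, primer: str):
    """Return the terminator added to the primer's 3' end, or None.

    Both sequences are written 5' to 3'. The primer pairs with the template
    region equal to its reverse complement; the next template nucleotide to be
    copied lies immediately 5' of that region on the template.
    """
    pos = template.find(reverse_complement(primer))
    if pos <= 0:
        # -1: the primer does not anneal; 0: no template nucleotide left to copy.
        return None
    return template[pos - 1].translate(COMPLEMENT)


# Toy alleles differing only at the nucleotide next to the primer site.
primer = "GCCAAGCT"
primer_site = reverse_complement(primer)
allele_1 = "TTCA" + primer_site + "AA"
allele_2 = "TTCG" + primer_site + "AA"

for name, template in [("allele 1", allele_1), ("allele 2", allele_2)]:
    terminator = single_base_extension(template, primer)
    print(name, "-> primer +", terminator, "=", primer + terminator)
```

In each case the product is exactly one nucleotide longer than the primer, and the identity of the added terminator reports the template nucleotide at the site of interest.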
A recent article by Mullis (Scientific American, April 1990, pp. 56-65) suggests an experiment, which apparently was not performed, to determine the identity of a targeted base pair in a piece of double-stranded DNA. Mullis suggests using four types of dideoxynucleoside triphosphates, with one type of dideoxynucleoside triphosphate being radioactively labeled.
The present invention permits analyses of nucleic acid sequences that can be useful in the diagnosis of infectious diseases, the diagnosis of genetic disorders, and in the identification of individuals and their parentage.
A number of methods have been developed for these purposes. Although powerful, such methodologies have been cumbersome and expensive, generally involving a combination of techniques such as gel electrophoresis, blotting, hybridization, and autoradiography or non-isotopic revelation. Simpler technologies are needed to allow the more widespread use of nucleic acid analysis. In addition, tests based on nucleic acids are currently among the most expensive of laboratory procedures and for this reason cannot be used on a routine basis. Finally, current techniques are not adapted to automated procedures which would be necessary to allow the analysis of large numbers of samples and would further reduce the cost.
The current invention provides a method that can be used to diagnose or characterize nucleic acids in biological samples without recourse to gel electrophoretic size separation of the nucleic acid species. This feature renders this process easily adaptable to automation and thus will permit the analysis of large numbers of samples at relatively low cost. Because nucleic acids are the essential blueprint of life, each organism or individual can be uniquely characterized by identifiable sequences of nucleic acids. It is, therefore, possible to identify the presence of particular organisms or demonstrate the biological origin of certain samples by detecting these specific nucleic acid sequences.