This invention relates to the field of nucleic acid sequence detection. The detection of nucleic acid sequences can be used in two general contexts. First, the detection of nucleic acid sequences can be used to determine the presence or absence of a particular genetic element. Second, the detection of nucleic acid sequences can be used to determine the specific type of a particular genetic element that is present. Variant genetic elements usually exist. Many techniques have been developed (1) to determine the presence of specific nucleic acid sequences, and (2) to compare homologous segments of nucleic acid sequence to determine if the segments are identical or if they differ at one or more nucleotides. Practical applications of these techniques include genetic disease diagnoses, infectious disease diagnoses, forensic techniques, paternity determinations, and genome mapping.
In general, the detection of nucleic acids in a sample and the subtypes thereof depends on the technique of specific nucleic acid hybridization in which the oligonucleotide probe is annealed under conditions of high stringency to nucleic acids in the sample, and the successfully annealed probes are subsequently detected (see Spiegelman, S., Scientific American, Vol. 210, p. 48 (1964)).
The most definitive method for comparing DNA segments is to determine the complete nucleotide sequence of each segment. Examples of how sequencing has been used to study mutations in human genes are included in the publications of Engelke, et al., Proc. Natl. Acad. Sci. U.S.A., 85:544-548 (1988) and Wong, et al., Nature, 330:384-386 (1987). At the present time, it is not practical to use extensive sequencing to compare more than just a few DNA segments because the effort required to determine, interpret, and compare sequence information is time-consuming.
A commonly used screen for DNA polymorphisms arising from DNA sequence variation consists of digesting DNA with restriction endonucleases and analyzing the resulting fragments by means of Southern blots, as described by Botstein, et al., Am. J. Hum. Genet., 32:314-331 (1980) and White, et al., Sci. Am., 258:40-48 (1988). Mutations that affect the recognition sequence of the endonuclease will preclude enzymatic cleavage at that site, thereby altering the cleavage pattern of that DNA. DNAs are compared by looking for differences in restriction fragment lengths. A major problem with this method (known as restriction fragment length polymorphism mapping or RFLP mapping) is its inability to detect mutations that do not affect cleavage with a restriction endonuclease. Thus, many mutations are missed with this method. One study, by Jeffreys, Cell, 18:1-18 (1979), was able to detect only 0.7% of the mutational variants estimated to be present in a 40,000 base pair region of human DNA. Another problem is that the methods used to detect restriction fragment length polymorphisms are very labor intensive, in particular, the techniques involved with Southern blot analysis.
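The limitation described above can be illustrated with a minimal Python sketch. The sequences, the function name `digest`, and the single-strand, linear-DNA model are hypothetical simplifications chosen for illustration; the EcoRI recognition sequence GAATTC, cleaved after the G, serves as the example enzyme.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting seq at every occurrence of site.

    cut_offset is the number of bases of the recognition site that stay on
    the upstream fragment. Only one strand of a linear molecule is modeled.
    """
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(len(seq) - start)
    return fragments


# Hypothetical sequences assembled from labeled pieces for readability.
wild_type = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TTTT"
# A substitution inside the second site abolishes cleavage there.
site_mutant = "AAAA" + "GAATTC" + "CCCCCC" + "GAGTTC" + "TTTT"
# A substitution outside any site leaves the fragment pattern unchanged.
silent_mutant = "AAAA" + "GAATTC" + "CCTCCC" + "GAATTC" + "TTTT"

print(digest(wild_type))
print(digest(site_mutant))
print(digest(silent_mutant))
```

In this sketch the mutation within the recognition sequence changes the fragment pattern, while the mutation between the two sites yields the same fragment lengths as the unmutated sequence and would therefore go undetected.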
A technique for detecting specific mutations in any segment of DNA is described in Wallace, et al., Nucl. Acids Res., 9:879-894 (1981). It involves hybridizing the DNA to be analyzed (target DNA) with a complementary, labeled oligonucleotide probe. Due to the thermal instability of DNA duplexes containing even a single base pair mismatch, differential melting temperature can be used to distinguish target DNAs that are perfectly complementary to the probe from target DNAs that differ by as little as a single nucleotide. In a related technique, described in Landegren, et al., Science, 241:1077-1080 (1988), oligonucleotide probes are constructed in pairs such that their junction corresponds to the site on the DNA being analyzed for mutation. These oligonucleotides are then hybridized to the DNA being analyzed. Base pair mismatch between either oligonucleotide and the target DNA at the junction location prevents the efficient joining of the two oligonucleotide probes by DNA ligase.
A. Nucleic Acid Hybridization
The base pairing of nucleic acids in a hybridization reaction forms the basis of most nucleic acid analytical and diagnostic techniques. In practice, tests based only on parameters of nucleic acid hybridization function poorly in cases where the sequence complexity of the test sample is high. This is partly due to the small thermodynamic differences in hybrid stability, generated by single nucleotide changes, and the fact that increasing specificity by lengthening the probe has the effect of further diminishing this differential stability. Nucleic acid hybridization is, therefore, generally combined with some other selection or enrichment procedure for analytical and diagnostic purposes.
Combining hybridization with size fractionation of hybridized molecules as a selection technique has been one general diagnostic approach. Size selection can be carried out prior to hybridization. The best known prior size selection technique is Southern Blotting (see Southern, E., Methods in Enzymology, 69:152 (1980)). In this technique, a DNA sample is subjected to digestion with restriction enzymes which introduce double stranded breaks in the phosphodiester backbone at or near the site of a short sequence of nucleotides which is characteristic for each enzyme. The resulting heterogeneous mixture of DNA fragments is then separated by gel electrophoresis, denatured, and transferred to a solid phase where it is subjected to hybridization analysis in situ using a labeled nucleic acid probe. Fragments which contain sequences complementary to the labeled probe are revealed visually or densitometrically as bands of hybridized label. A variation of this method is Northern Blotting for RNA molecules. Size selection has also been used after hybridization in a number of techniques, in particular by hybrid protection techniques, by subjecting probe/nucleic acid hybrids to enzymatic digestion before size analysis.
B. Polymerase Extension of Duplex Primer:template Complexes
Hybrids between primers and DNA targets can be analyzed by polymerase extension of the hybrids. A modification of this methodology is the polymerase chain reaction in which the purification is produced by sequential hybridization reactions of anti-parallel primers, followed by enzymatic amplification with DNA polymerase (see Saiki, et al., Science 239:487-491 (1988)). By selecting for two hybridization reactions, this methodology provides the specificity lacking in techniques that depend only upon a single hybridization reaction.
It has long been known that primer-dependent DNA polymerases have, in general, a low error rate for the addition of nucleotides complementary to a template. This feature is essential in biology for the prevention of genetic mistakes which would have detrimental effects on progeny. The specificity inherent in this enzymological reaction has been widely exploited as the basis of the “Sanger” or dideoxy chain termination sequencing methodology which is the ultimate nucleic acid typing experiment. One type of Sanger DNA sequencing method makes use of mixtures of the four deoxynucleoside triphosphates, which are normal DNA precursors, and one of the four possible dideoxynucleoside triphosphates, which have a hydrogen atom instead of a hydroxyl group attached to the 3′ carbon atom of the ribose sugar component of the nucleotide. DNA chain elongation in the 5′ to 3′ direction (“downstream”) requires this hydroxyl group. As such, when a dideoxynucleotide is incorporated into the growing DNA chain, no further elongation can occur. With one dideoxynucleotide in the mixture, DNA polymerases can, from a primer:template combination, produce a population of molecules of varying length, all of which terminate after the addition of one out of the four possible nucleotides. The series of four independent reactions, each with a different dideoxynucleotide, generates a nested set of fragments, all starting at the same 5′ terminus of the priming DNA molecule and terminating at all possible 3′ nucleotide positions.
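The nested set of fragments produced by the four dideoxy reactions can be sketched in Python as follows. This is an idealized model under stated assumptions: the template sequence, the primer length, and the function name `termination_lengths` are hypothetical, and every position at which a dideoxynucleotide could be incorporated is assumed to be represented in the product population.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def termination_lengths(template_3to5, primer_len, dd_base):
    """Lengths of extension products that can end in the given dideoxy base.

    template_3to5 is the template written 3'->5', so index i pairs with
    index i of the growing strand written 5'->3'. The first primer_len
    positions are already covered by the primer.
    """
    lengths = []
    for i in range(primer_len, len(template_3to5)):
        if COMPLEMENT[template_3to5[i]] == dd_base:
            lengths.append(i + 1)
    return lengths


# Hypothetical template; the primer anneals to the first five bases.
template = "TTGCA" + "GATCCA"

# One reaction per dideoxynucleotide, as in the four-track format.
ladder = {dd: termination_lengths(template, 5, dd) for dd in "ACGT"}

# Ordering all products by length recovers the newly synthesized sequence.
read = "".join(
    base for _, base in sorted((n, dd) for dd, ns in ladder.items() for n in ns)
)
print(ladder)
print(read)
```

Each product length appears in exactly one of the four reactions, which is why ordering the fragments by size across the four tracks yields the sequence.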
Another utilization of dideoxynucleoside triphosphates and a polymerase in the analysis of DNA involves labeling the 3′ end of a molecule. One prominent manifestation of this technique provides the means for sequencing a DNA molecule from its 3′ end using the Maxam-Gilbert method. In this technique, a molecule with a protruding 3′ end is treated with terminal transferase in the presence of radioactive dideoxy-ATP. One radioactive nucleotide is added, rendering the molecule suitable for sequencing. Both methods of DNA sequencing using labeled dideoxynucleotides require electrophoretic separation of reaction products in order to derive the typing information. Most methods require four separate gel tracks for each typing determination.
The following two patents describe other methods of typing nucleic acids which employ primer extension and labeled nucleotides. Mundy (U.S. Pat. No. 4,656,127) describes a method whereby a primer is constructed complementary to a region of a target nucleic acid of interest such that its 3′ end is close to a nucleotide in which variation can occur. This hybrid is subject to primer extension in the presence of a DNA polymerase and four deoxynucleoside triphosphates, one of which is an α-thionucleotide. The hybrid is then digested using an exonuclease enzyme which cannot use thio-derivatized DNA as a substrate for its nucleolytic action (for example Exonuclease III of E. coli). If the variant nucleotide in the template is complementary to one of the thionucleotides in the reaction mixture, the resulting extended primer molecule will be of a characteristic size and resistant to the exonuclease; hybrids without thio-derivatized DNA will be digested. After an appropriate enzyme digest to remove underivatized molecules, the thio-derivatized molecule can be detected by gel electrophoresis or other separation technology.
Vary and Diamond (U.S. Pat. No. 4,851,331) describe a method similar to that of Mundy wherein the last nucleotide of the primer corresponds to the variant nucleotide of interest. Since mismatching of the primer and the template at the 3′ terminal nucleotide of the primer is counterproductive to elongation, significant differences in the amount of incorporation of a tracer nucleotide will result under normal primer extension conditions. This method depends on the use of a DNA polymerase, e.g., AMV reverse transcriptase, that does not have an associated 3′ to 5′ exonuclease activity.
The methods of Mundy and of Vary and Diamond have drawbacks. The method of Mundy is useful but cumbersome due to the requirements of the second, different enzymological system where the non-derivatized hybrids are digested. The method of Vary and Diamond is complicated by the fact that it does not generate discrete reaction products. Any “false” priming will generate significant noise in such a system which would be difficult to distinguish from a genuine signal.
The present invention circumvents the problems associated with the methods of Mundy and of Vary and Diamond for typing nucleic acid with respect to particular nucleotides. With methods employing primer extension and a DNA polymerase, the current invention will generate a discrete molecular species one base longer than the primer itself. In many methods, particularly those employing the polymerase chain reaction, the type of reaction used to purify the nucleic acid of interest in the first step can also be used in the subsequent detection step. Finally, with terminators which are labeled with different detector moieties (for example different fluorophors having different spectral properties), it will be possible to use only one reagent for all sequence detection experiments. Furthermore, if techniques are used to separate the terminated primers post-reaction, sequence detection experiments at more than one locus can be carried out in the same tube.
A recent article by Mullis (Scientific American, April 1990, pp. 56-65) suggests an experiment, which apparently was not performed, to determine the identity of a targeted base pair in a piece of double-stranded DNA. Mullis suggests using four types of dideoxynucleoside triphosphates, with one type of dideoxynucleoside triphosphate being radioactively labeled.
The present invention permits analyses of nucleic acid sequences that can be useful in the diagnosis of infectious diseases, the diagnosis of genetic disorders, and in the identification of individuals and their parentage.
A number of methods have been developed for these purposes. Although powerful, such methodologies have been cumbersome and expensive, generally involving a combination of techniques such as gel electrophoresis, blotting, hybridization, and autoradiography or non-isotopic revelation. Simpler technologies are needed to allow the more widespread use of nucleic acid analysis. In addition, tests based on nucleic acids are currently among the most expensive of laboratory procedures and for this reason cannot be used on a routine basis. Finally, current techniques are not adapted to automated procedures which would be necessary to allow the analysis of large numbers of samples and would further reduce the cost.
The current invention provides a method that can be used to diagnose or characterize nucleic acids in biological samples without recourse to gel electrophoretic size separation of the nucleic acid species. This feature renders this process easily adaptable to automation and thus will permit the analysis of large numbers of samples at relatively low cost. Because nucleic acids are the essential blueprint of life, each organism or individual can be uniquely characterized by identifiable sequences of nucleic acids. It is, therefore, possible to identify the presence of particular organisms or demonstrate the biological origin of certain samples by detecting these specific nucleic acid sequences.
The subject invention provides a reagent composition comprising an aqueous carrier and an admixture of at least two different terminators of a nucleic acid template-dependent, primer extension reaction. Each of the terminators is capable of specifically terminating the extension reaction in a manner strictly dependent on the identity of the unpaired nucleotide base in the template immediately adjacent to, and downstream of, the 3′ end of the primer. In addition, at least one of the terminators is labeled with a detectable marker.
The subject invention further provides a reagent composition comprising an aqueous carrier and an admixture of four different terminators of a nucleic acid template-dependent, primer extension reaction. Each of the terminators is capable of specifically terminating the extension reaction as above, and one, two, three, or four of the terminators are labeled with a detectable marker.
The subject invention further provides a reagent as described above wherein the terminators comprise nucleotides, nucleotide analogs, dideoxynucleotides, or arabinoside triphosphates. The subject invention also provides a reagent wherein the terminators comprise one or more of dideoxyadenosine triphosphate (ddATP), dideoxycytosine triphosphate (ddCTP), dideoxyguanosine triphosphate (ddGTP), dideoxythymidine triphosphate (ddTTP), or dideoxyuridine triphosphate (ddUTP).
The subject invention also provides a method for determining the identity of a nucleotide base at a specific position in a nucleic acid of interest. First, a sample containing the nucleic acid of interest is treated, if such nucleic acid is double-stranded, so as to obtain unpaired nucleotide bases spanning the specific position. If the nucleic acid of interest is single-stranded, this step is not necessary. Second, the sample containing the nucleic acid of interest is contacted with an oligonucleotide primer under hybridizing conditions. The oligonucleotide primer is capable of hybridizing with a stretch of nucleotide bases present in the nucleic acid of interest, immediately adjacent to the nucleotide base to be identified, so as to form a duplex between the primer and the nucleic acid of interest such that the nucleotide base to be identified is the first unpaired base in the template immediately downstream of the 3′ end of the primer in the duplex of primer and the nucleic acid of interest. Enzymatic extension of the oligonucleotide primer in the resultant duplex by one nucleotide, catalyzed, for example, by a DNA polymerase, thus depends on correct base pairing of the added nucleotide to the nucleotide base to be identified.
The duplex of primer and the nucleic acid of interest is then contacted with a reagent containing four labeled terminators, each terminator being labeled with a different detectable marker. The duplex of primer and the nucleic acid of interest is contacted with the reagent under conditions permitting base pairing of a complementary terminator present in the reagent with the nucleotide base to be identified and the occurrence of a template-dependent, primer extension reaction so as to incorporate the terminator at the 3′ end of the primer. The net result is that the oligonucleotide primer has been extended by one terminator. Next, the identity of the detectable marker present at the 3′ end of the extended primer is determined. The identity of the detectable marker indicates which terminator has base paired to the next base in the nucleic acid of interest. Since the terminator is complementary to the next base in the nucleic acid of interest, the identity of the next base in the nucleic acid of interest is thereby determined.
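The logic of this method, from primer hybridization through identification of the base, can be sketched in Python. This is an illustrative model only, not a description of any laboratory protocol: the sequences, the marker names in `MARKERS`, and the function names are hypothetical, and hybridization is modeled as exact complementarity.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Hypothetical marker assignment; any four distinguishable markers would do.
MARKERS = {
    "ddATP": "marker_1",
    "ddCTP": "marker_2",
    "ddGTP": "marker_3",
    "ddTTP": "marker_4",
}


def reverse_complement(seq):
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def incorporated_terminator(template, primer):
    """Return the terminator added to the 3' end of the annealed primer.

    Both sequences are written 5'->3'. The primer pairs antiparallel with
    the template, so the first unpaired template base past the primer's
    3' end sits just before the binding site in the template string.
    """
    start = template.find(reverse_complement(primer))
    if start < 1:
        raise ValueError("primer does not anneal next to an unpaired base")
    return "dd" + COMPLEMENT[template[start - 1]] + "TP"


def base_from_marker(marker):
    """Infer the template base from the marker detected on the primer."""
    for terminator, assigned in MARKERS.items():
        if assigned == marker:
            return COMPLEMENT[terminator[2]]
    raise ValueError("unknown marker")


# Hypothetical template: the base to be identified is the lone "G".
binding_site = "TGCATCGG"
template = "CCGA" + "G" + binding_site
primer = reverse_complement(binding_site)

terminator = incorporated_terminator(template, primer)
identified = base_from_marker(MARKERS[terminator])
print(terminator, identified)
```

Because each terminator carries a different marker in this sketch, a single reaction suffices: the detected marker names the terminator, and the complement of the terminator names the template base.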
The subject invention also provides another method for determining the identity of a nucleotide base at a specific position in a nucleic acid of interest. This additional method uses a reagent containing four terminators, only one of the terminators having a detectable marker.
The subject invention also provides a method of typing a sample of nucleic acids which comprises identifying the base or bases present at each of one or more specific positions, each such nucleotide base being identified using one of the methods for determining the identity of a nucleotide base at a specific position in a nucleic acid of interest as outlined above. Each specific position in the nucleic acid of interest is determined using a different primer. The identity of each nucleotide base or bases at each position can be determined individually or the identities of the nucleotide bases at different positions can be determined simultaneously.
The subject invention further provides a method for identifying different alleles in a sample containing nucleic acids which comprises identifying the base or bases present at each of one or more specific positions. The identity of each nucleotide base is determined by the method for determining the identity of a nucleotide base at a specific position in a nucleic acid of interest as outlined above.
The subject invention also provides a method for determining the genotype of an organism at one or more particular genetic loci which comprises obtaining from the organism a sample containing genomic DNA and identifying the nucleotide base or bases present at each of one or more specific positions in nucleic acids of interest. The identity of each such base is determined by using one of the methods for determining the identity of a nucleotide base at a specific position in a nucleic acid of interest as outlined above. The identities of the nucleotide bases determine the different alleles and, thereby, determine the genotype of the organism at one or more particular genetic loci.
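How base identities at several loci translate into a genotype can be sketched as follows. The sketch assumes a diploid organism, so that at most two different terminators are incorporated at any one position; that assumption, the locus names, and the function name `genotype_from_terminators` are hypothetical and introduced only for illustration.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def genotype_from_terminators(terminators):
    """Map the terminators incorporated at one position to a genotype.

    Terminator names are of the form "ddNTP"; the template base is the
    complement of N. One distinct terminator is read as homozygous and
    two as heterozygous (diploid assumption).
    """
    bases = sorted(COMPLEMENT[t[2]] for t in set(terminators))
    if len(bases) == 1:
        return bases[0] + "/" + bases[0]
    if len(bases) == 2:
        return bases[0] + "/" + bases[1]
    raise ValueError("expected one or two distinct terminators")


# Hypothetical observations, one primer per locus.
observed = {
    "locus_1": {"ddCTP"},
    "locus_2": {"ddATP", "ddGTP"},
}
genotypes = {locus: genotype_from_terminators(t) for locus, t in observed.items()}
print(genotypes)
```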