The detection and characterization of specific nucleic acid sequences and sequence changes have been utilized to detect the presence of viral or bacterial nucleic acid sequences indicative of an infection, to detect the presence of variants or alleles of mammalian genes associated with disease and cancers, and to identify the source of nucleic acids found in forensic samples, as well as in paternity determinations. As nucleic acid sequence data for genes from humans and pathogenic organisms accumulates, the demand for fast, cost-effective, and easy-to-use tests for as yet unknown, as well as known, mutations within specific sequences is rapidly increasing.
A handful of methods have been devised to scan nucleic acid segments for mutations. One option is to determine the entire gene sequence of each test sample (e.g., a clinical sample suspected of containing a bacterial strain). For sequences under approximately 600 nucleotides, this may be accomplished using amplified material (e.g., PCR products). This avoids the time and expense associated with cloning the segment of interest. However, specialized equipment and highly trained personnel are required for DNA sequencing, and the method is too labor-intensive and expensive to be practical and effective in the clinical setting.
In view of the difficulties associated with sequencing, a given segment of nucleic acid may be characterized on several other levels. At the lowest resolution, the size of the molecule can be determined by electrophoresis by comparison to a known standard run on the same gel. A more detailed picture of the molecule may be achieved by cleavage with combinations of restriction enzymes prior to electrophoresis, to allow construction of an ordered map. The presence of specific sequences within the fragment can be detected by hybridization of a labeled probe, or the precise nucleotide sequence can be determined by partial chemical degradation or by primer extension in the presence of chain-terminating nucleotide analogs.
For detection of single-base differences between like sequences (e.g., the wild type and a mutant form of a gene), the requirements of the analysis are often at the highest level of resolution. For cases in which the position of the nucleotide in question is known in advance, several methods have been developed for examining single base changes without direct sequencing. For example, if a mutation of interest happens to fall within a restriction recognition sequence, a change in the pattern of digestion can be used as a diagnostic tool (e.g., restriction fragment length polymorphism [RFLP] analysis). In this way, single point mutations can be detected by the creation or destruction of RFLPs.
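The creation or destruction of a restriction site by a point mutation can be sketched in a few lines of Python. This is an illustrative sketch only: the two sequences are hypothetical, the helper name `digest` is an assumption introduced here, and the only enzyme fact relied upon is that EcoRI recognizes GAATTC and cuts after the first G.

```python
def digest(seq, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at every
    occurrence of the recognition site (single strand shown for simplicity)."""
    seq = seq.upper()
    site = site.upper()
    cuts = []
    start = 0
    while True:
        idx = seq.find(site, start)
        if idx == -1:
            break
        cuts.append(idx + cut_offset)  # position of the cut within seq
        start = idx + 1
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical 24-nucleotide sequences differing by one base inside the site.
wild_type = "ATGCCGTA" + "GAATTC" + "GGCTAACGTT"
mutant = "ATGCCGTA" + "GAGTTC" + "GGCTAACGTT"  # A -> G destroys the EcoRI site

print(digest(wild_type, "GAATTC", 1))  # [9, 15]
print(digest(mutant, "GAATTC", 1))     # [24]
```

The wild-type sequence yields two fragments and the mutant a single uncut fragment, which is the change in digestion pattern that RFLP analysis reads out on a gel.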
Single-base mutations have also been identified by cleavage of RNA-RNA or RNA-DNA heteroduplexes using RNase A (Myers et al., Science 230:1242 [1985] and Winter et al., Proc. Natl. Acad. Sci. USA 82:7575 [1985]). Mutations are detected and localized by the presence and size of the RNA fragments generated by cleavage at the mismatches. Single nucleotide mismatches in DNA heteroduplexes are also recognized and cleaved by some chemicals, providing an alternative strategy to detect single base substitutions, generically named "Mismatch Chemical Cleavage" (MCC) (Gogos et al., Nucl. Acids Res., 18:6807-6817 [1990]). However, this method requires the use of osmium tetroxide and piperidine, two highly noxious chemicals which are not suited for use in a clinical laboratory. In addition, all of the mismatch cleavage methods lack sensitivity to some mismatch pairs, and all are prone to background cleavage at sites removed from the mismatch.
RFLP analysis suffers from low sensitivity and requires a large amount of sample. When RFLP analysis is used for the detection of point mutations, it is, by its nature, limited to the detection of only those single base changes which fall within a restriction sequence of a known restriction endonuclease. Moreover, the majority of the available enzymes have 4 to 6 base-pair recognition sequences, and cleave too frequently for many large-scale DNA manipulations (Eckstein and Lilley (eds.), Nucleic Acids and Molecular Biology, vol. 2, Springer-Verlag, Heidelberg [1988]). Thus, RFLP analysis is applicable only in a small fraction of cases, as most mutations do not fall within such sites.
A handful of rare-cutting restriction enzymes with 8 base-pair specificities have been isolated and these are widely used in genetic mapping, but these enzymes are few in number, are limited to the recognition of G+C-rich sequences, and cleave at sites that tend to be highly clustered (Barlow and Lehrach, Trends Genet., 3:167 [1987]). Recently, endonucleases encoded by group I introns have been discovered that might have greater than 12 base-pair specificity (Perlman and Butow, Science 246:1106 [1989]), but again, these are few in number.
If the change is not in a restriction enzyme recognition sequence, then allele-specific oligonucleotides (ASOs) can be designed to hybridize in proximity to the unknown nucleotide, such that a primer extension or ligation event can be used as the indicator of a match or a mismatch. Hybridization with radioactively labeled allele-specific oligonucleotides (ASOs) also has been applied to the detection of specific point mutations (Conner, Proc. Natl. Acad. Sci., 80:278 [1983]). The method is based on the differences in the melting temperature of short DNA fragments differing by a single nucleotide (Wallace et al., Nucl. Acids Res. 6:3543 [1979]). Similarly, hybridization with large arrays of short oligonucleotides was proposed as a method for DNA sequencing (Bains and Smith, J. Theor. Biol. 135:303 [1988]) (Drmanac et al., Genomics 4:114 [1989]). To perform either method it is necessary to work under conditions in which the formation of mismatched duplexes is eliminated or reduced while perfect duplexes still remain stable. Such conditions are termed "high stringency" conditions. The stringency of hybridization conditions can be altered in a number of ways known in the art. In general, changes in conditions that enhance the formation of nucleic acid duplexes, such as increases in the concentration of salt, or reduction in the temperature of the solution, are considered to reduce the stringency of the hybridization conditions. Conversely, reduction of salt and elevation of temperature are considered to increase the stringency of the conditions. Because it is easy to change and control, variation of the temperature is commonly used to control the stringency of nucleic acid hybridization reactions.
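As a rough illustration of how the stability of a short perfect duplex depends on base composition, the rule of thumb often called the Wallace rule estimates the dissociation temperature of a short oligonucleotide as 2°C per A or T plus 4°C per G or C. The sketch below is not taken from the text above: the function name `wallace_tm` and the 18-nucleotide probe are hypothetical, the rule is a coarse approximation generally applied to short oligonucleotides under high-salt conditions, and it does not model mismatches or salt concentration.

```python
def wallace_tm(oligo):
    """Estimate the dissociation temperature (degrees C) of a short
    oligonucleotide duplex as 2*(A+T) + 4*(G+C). Rule of thumb only."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    if at + gc != len(oligo):
        raise ValueError("sequence must contain only A, C, G and T")
    return 2 * at + 4 * gc


# Hypothetical 18-nucleotide probe with 9 A/T and 9 G/C residues.
probe = "ACGTTGCAAGCTTGACCA"
print(wallace_tm(probe))  # 54
```

An estimate of this kind would serve only as a starting point; the hybridization temperature that separates perfect from mismatched duplexes would still have to be determined empirically for each probe.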
Discrimination of hybridization based solely on the presence of a mismatch imposes a limit on probe length because the effect of a single mismatch on the stability of a duplex is smaller for longer duplexes. For oligonucleotides designed to detect mutations in genomes of high complexity, such as human DNA, it has been shown that the optimal length for hybridization is between 16 and 22 nucleotides, and the temperature window within which the hybridization stringency will allow single base discrimination can be as large as 10°C (Wallace [1979], supra). Usually, however, it is much narrower, and for some mismatches, such as G-T, it may be as small as 1 to 2°C. These windows may be even smaller if any other reaction conditions, such as temperature, pH, concentration of salt and the presence of destabilizing agents (e.g., urea, formamide, dimethylsulfoxide) alter the stringency. Thus, for successful detection of mutations using such high stringency hybridization methods, a tight control of all parameters affecting duplex stability is critical.
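The dependence on probe length can be made concrete with simple arithmetic: one mismatch disrupts a fraction 1/N of the base pairs in a duplex of N base pairs. The sketch below uses that fraction as a crude proxy for the relative destabilization. It is an assumption made for illustration, not a thermodynamic model; the function name `mismatch_fraction` is hypothetical, and the lengths 50 and 200 are added only for contrast with the 16 to 22 nucleotide range given above.

```python
def mismatch_fraction(duplex_length, mismatches=1):
    """Fraction of base pairs in a duplex that are mismatched."""
    if duplex_length <= 0 or not 0 <= mismatches <= duplex_length:
        raise ValueError("invalid duplex length or mismatch count")
    return mismatches / duplex_length


# Compare probes in the 16-22 nucleotide range with longer duplexes.
for n in (16, 22, 50, 200):
    print(f"{n:>3} bp duplex: {100 * mismatch_fraction(n):.2f}% of pairs mismatched")
```

The fraction falls steadily as the duplex lengthens, which is consistent with the observation that single-base discrimination becomes harder for longer probes.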
In addition to the degree of homology between the oligonucleotide probe and the target nucleic acid, efficiency of hybridization also depends on the secondary structure of the target molecule. Indeed, if the region of the target molecule that is complementary to the probe is involved in the formation of intramolecular structures with other regions of the target, this will reduce the binding efficiency of the probe. Interference with hybridization by such secondary structure is another reason why high stringency conditions are so important for sequence analysis by hybridization. High stringency conditions reduce the probability of secondary structure formation (Gamper et al., J. Mol. Biol. 197:349 [1987]). Another way of reducing the probability of secondary structure formation is to decrease the length of target molecules, so that fewer intrastrand interactions can occur. This can be done by a number of methods, including enzymatic, chemical or thermal cleavage or degradation. Currently, it is standard practice to perform such a step in commonly used methods of sequence analysis by hybridization to fragment the target nucleic acid into short oligonucleotides (Fodor et al., Nature 364:555 [1993]).
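Whether a probe-binding region could pair with another part of the same target strand can be screened, very crudely, by searching the target for the reverse complement of short stretches of the binding site. The following sketch makes several simplifying assumptions: only Watson-Crick pairs are considered, no thermodynamics are modeled, and the function names, the minimum stem length of 4, and both target sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def self_pairing_hits(target, site_start, site_len, min_stem=4):
    """List (site_pos, partner_pos, stem) for each stretch of the
    probe-binding site whose reverse complement occurs elsewhere
    in the same target strand (a potential intramolecular stem)."""
    target = target.upper()
    site_end = site_start + site_len
    hits = []
    for i in range(site_start, site_end - min_stem + 1):
        stem = target[i:i + min_stem]
        partner = reverse_complement(stem)
        j = target.find(partner)
        while j != -1:
            # Skip partners that overlap the stem itself.
            if j + min_stem <= i or j >= i + min_stem:
                hits.append((i, j, stem))
            j = target.find(partner, j + 1)
    return hits


# Hypothetical targets: the probe-binding site GCCTGA starts at index 2.
structured = "AAGCCTGA" + "TTTTTT" + "CAGGCAA"    # downstream region can pair with the site
unstructured = "AAGCCTGA" + "TTTTTTTTTTTTT"       # no complementary region

print(self_pairing_hits(structured, 2, 6))
print(self_pairing_hits(unstructured, 2, 6))
```

Shortening the target, as described above, removes candidate partner regions and therefore reduces the number of such hits.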
Two other methods of mutation detection rely on detecting changes in electrophoretic mobility in response to minor sequence changes. One of these methods, termed "Denaturing Gradient Gel Electrophoresis" (DGGE), is based on the observation that slightly different sequences will display different patterns of local melting when electrophoretically resolved on a gradient gel. In this manner, variants can be distinguished: homoduplexes and heteroduplexes differing in a single nucleotide have different melting properties, and the corresponding changes in their electrophoretic mobilities can be used to detect the presence of mutations in the target sequences. The fragments to be analyzed, usually PCR products, are "clamped" at one end by a long stretch of G-C base pairs (30-80) to allow complete denaturation of the sequence of interest without complete dissociation of the strands. The attachment of a GC "clamp" to the DNA fragments increases the fraction of mutations that can be recognized by DGGE (Abrams et al., Genomics 7:463 [1990]). Attaching a GC clamp to one primer is critical to ensure that the amplified sequence has a low dissociation temperature (Sheffield et al., Proc. Natl. Acad. Sci., 86:232 [1989]; and Lerman and Silverstein, Meth. Enzymol. 155:482 [1987]). Modifications of the technique have been developed, using temperature gradient gels (Wartell et al., Nucl. Acids Res. 18:2699-2701 [1990]), and the method can also be applied to RNA:RNA duplexes (Smith et al., Genomics 3:217 [1988]).
Limitations on the utility of DGGE include the requirement that the denaturing conditions must be optimized for each specific nucleic acid sequence to be tested. Furthermore, the method requires specialized equipment to prepare the gels and maintain the high temperatures required during electrophoresis. The expense associated with the synthesis of the clamping tail on one oligonucleotide for each sequence to be tested is also a major consideration. In addition, long running times are required for DGGE. The long running time of DGGE was shortened in a modification of DGGE called constant denaturant gel electrophoresis (CDGE) (Borrensen et al., Proc. Natl. Acad. Sci. USA 88:8405 [1991]). CDGE requires that gels be performed under different denaturant conditions in order to reach high efficiency for the detection of unknown mutations. Both DGGE and CDGE are unsuitable for use in clinical laboratories.
A technique analogous to DGGE, termed temperature gradient gel electrophoresis (TGGE), uses a thermal gradient rather than a chemical denaturant gradient (Scholz, et al., Hum. Mol. Genet. 2:2155 [1993]). TGGE requires the use of specialized equipment which can generate a temperature gradient perpendicularly oriented relative to the electrical field. TGGE can detect mutations only in relatively small fragments of DNA; therefore, scanning of large gene segments requires the use of multiple PCR products prior to running the gel.
Another common method, called "Single-Strand Conformation Polymorphism" (SSCP), was developed by Hayashi, Sekiya and colleagues (reviewed by Hayashi, PCR Meth. Appl., 1:34-38, [1991]) and is based on the observation that single strands of nucleic acid can take on characteristic conformations under non-denaturing conditions, and these conformations influence electrophoretic mobility. The complementary strands assume sufficiently different structures that the two strands may be resolved from one another. Changes in the sequence of a given fragment will also change the conformation, consequently altering the mobility and allowing this to be used as an assay for sequence variations (Orita, et al., Genomics 5:874 [1989]).
The SSCP process involves denaturing a DNA segment (e.g., a PCR product) that is labelled on both strands, followed by slow electrophoretic separation on a non-denaturing polyacrylamide gel, so that intra-molecular interactions can form and not be disturbed during the run. This technique is extremely sensitive to variations in gel composition and temperature. A serious limitation of this method is the relative difficulty encountered in comparing data generated in different laboratories, under apparently similar conditions.
The dideoxy fingerprinting (ddF) technique is another method developed to scan genes for the presence of unknown mutations (Liu and Sommer, PCR Meth. Appl., 4:97 [1994]). The ddF technique combines components of Sanger dideoxy sequencing with SSCP. A dideoxy sequencing reaction is performed using one dideoxy terminator and then the reaction products are electrophoresed on nondenaturing polyacrylamide gels to detect alterations in mobility of the termination segments as in SSCP analysis. While ddF is an improvement over SSCP in terms of increased sensitivity, ddF requires the use of expensive dideoxynucleotides and this technique is still limited to the analysis of fragments of the size suitable for SSCP (i.e., fragments of 200-300 bases for optimal detection of mutations).
In addition to the above limitations, all of these methods are limited as to the size of the nucleic acid fragment that can be analyzed. For the direct sequencing approach, sequences of greater than 600 base pairs require cloning, with the consequent delays and expense of either deletion sub-cloning or primer walking, in order to cover the entire fragment. SSCP and DGGE have even more severe size limitations. Because of reduced sensitivity to sequence changes, these methods are not considered suitable for larger fragments. Although SSCP is reportedly able to detect 90% of single-base substitutions within a 200 base-pair fragment, the detection drops to less than 50% for 400 base pair fragments. Similarly, the sensitivity of DGGE decreases as the length of the fragment reaches 500 base-pairs. The ddF technique, as a combination of direct sequencing and SSCP, is also limited by the relatively small size of the DNA that can be screened.
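The practical cost of these size limits can be estimated by counting how many overlapping fragments are needed to cover a region of interest. The sketch below is illustrative only: the function name `fragments_needed`, the 1500 base-pair region and the 20 base-pair overlap between neighboring fragments are hypothetical values, while the 200 and 600 base-pair fragment sizes are the figures discussed above for SSCP and direct sequencing.

```python
def fragments_needed(region_len, fragment_len, overlap=0):
    """Minimum number of fragments of a given maximum length, each
    overlapping its neighbor by `overlap` bases, needed to cover a region."""
    if fragment_len <= overlap:
        raise ValueError("fragment length must exceed the overlap")
    if region_len <= fragment_len:
        return 1
    step = fragment_len - overlap
    # Ceiling division using integers only.
    return -(-(region_len - overlap) // step)


region = 1500  # hypothetical region length in base pairs
print(fragments_needed(region, 200, overlap=20))  # 9
print(fragments_needed(region, 600, overlap=20))  # 3
```

Under these assumptions, scanning the same region takes three times as many separately amplified and separately run fragments at the 200 base-pair limit as at the 600 base-pair limit.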
Another method of detecting sequence polymorphisms based on the conformation assumed by strands of nucleic acid is the Cleavase® Fragment Length Polymorphism (CFLP®) method (Brow et al., J. Clin. Microbiol. 34:3129 [1996]; PCT International Application No. PCT/US95/14673 [WO 96/15267]; co-pending application Ser. Nos. 08/484,956 and 08/520,946). This method uses the actions of a structure-specific nuclease to cleave the folded structures, thus creating a set of product fragments that can be resolved by size, e.g., by electrophoresis. This method is much less sensitive to size so that entire genes, rather than gene fragments, may be analyzed.
In many situations, e.g., in many clinical laboratories, electrophoretic separation and analysis may not be technically feasible, or may not be able to accommodate the processing of a large number of samples in a cost-effective manner. There is a clear need for a method of analyzing the characteristic conformations of nucleic acids without the need for either electrophoretic separation of conformations or fragments or for elaborate and expensive methods of visualizing gels (e.g., darkroom supplies, blotting equipment or fluorescence imagers).