Understanding the variation, or polymorphism, in a species's genome serves multiple scientific and practical goals. Examples include the study of evolutionary relatedness, the study of the genetic relatedness and geographical dispersion of populations within a species, the study of disease mechanism and predisposition, genetic linkage mapping in genome study and animal and plant breeding, the forensic identification and differentiation of individuals, the clinical diagnosis and prognosis of inherited diseases and cancer, and the clinical diagnosis and prognosis of drug resistance in infectious microorganisms.
A variety of methods have been described for discovering mutations, a process sometimes called "mutation scanning". Representative methods include denaturing gradient gel electrophoresis, constant denaturant gel electrophoresis, temporal temperature gradient gel electrophoresis, single-stranded conformational polymorphism analysis, denaturing HPLC, direct heteroduplex electrophoretic analysis, protein truncation test, immobilized mismatch binding protein assay, cleavage fragment length polymorphism analysis, enzymatic mismatch scanning, chemical cleavage of mismatch analysis (CCM), and complete sequence analysis (e.g., Sanger sequencing). As presently practiced, each of these methods suffers from one or more disadvantages that limit its value for cost-effective, high-throughput, reliable mutation scanning, including low clinical specificity (a significant false-positive rate); low clinical sensitivity (a significant false-negative rate); low sensitivity to mutations present in only a small fraction of the nucleic acid (with a high normal background); limitation to relatively short sequences; low information content; low throughput; low potential for automation; high labor requirement; long turnaround time; dependence on expensive, labile, or proprietary reagents; or dependence on toxic or noxious reagents.
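The two clinical performance terms used above can be made concrete with a short calculation. The sketch below uses the usual definitions (sensitivity falls as false negatives rise, specificity falls as false positives rise); the function names and the sample counts are hypothetical and are not taken from any of the cited methods.

```python
# Minimal sketch of clinical sensitivity and specificity for a scanning assay.
# The counts passed in are illustrative assumptions, not measured data.

def clinical_sensitivity(true_positives, false_negatives):
    """Fraction of mutant samples that the assay reports as mutant."""
    return true_positives / (true_positives + false_negatives)


def clinical_specificity(true_negatives, false_positives):
    """Fraction of normal samples that the assay reports as normal."""
    return true_negatives / (true_negatives + false_positives)


if __name__ == "__main__":
    # Hypothetical validation panel: 100 mutant and 100 normal samples.
    print(clinical_sensitivity(true_positives=90, false_negatives=10))
    print(clinical_specificity(true_negatives=95, false_positives=5))
```

A significant false-negative rate lowers the first number and a significant false-positive rate lowers the second, which is the sense in which the two disadvantages are distinguished above.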
Of the mutation scanning methods listed above, CCM is one of the most attractive, especially for clinical applications requiring specificity, sensitivity, high information content, and low cost. However, conventional CCM methods (Cotton; Meo) suffer from two major weaknesses: (1) the use of toxic and noxious reagents and (2) the need for multiple, labor-intensive, hard-to-automate, technically demanding separation steps to remove the CCM reagents from the nucleic acid sample.
Conventional CCM entails the application of Maxam-Gilbert sequencing chemistry (Maxam and Gilbert) to the reactive mismatched bases generated when largely identical normal and mutant ("parental") sequences are mixed, denatured, and re-annealed to create a mixture of parental (perfectly matched) duplexes and heteroduplexes, where the heteroduplexes, containing one strand from each of the parental sequences, are mismatched at any mutant positions. There are two common ways to create heteroduplexes. (1) Nucleic acid samples purified from control and test individuals or tissues can be mixed, denatured by heating or adding alkali, and renatured by cooling or adding acid to return the pH approximately to neutrality. This method works whether the test sample is homozygous or heterozygous for the mutation. (2) Alternatively, as long as the test sample is likely to contain a mixture of normal and mutant sequences (e.g., because it is heterozygous for any mutation or because it is purified from a mixture of normal and mutant cells, as in many cancers), it can be denatured and renatured to form heteroduplex without adding control nucleic acid (Dianzani). The nucleic acid in question is almost always entirely DNA, but CCM is distinguished from some of the other mutation scanning methods in that one or both nucleic acid samples could be RNA. The parental nucleic acids could, in principle, be single-stranded, as long as the mixture contained the complementary strands needed to form a duplex molecule; but in practice the parental molecules are almost always duplex DNA, generated by PCR amplification of a sequence from genomic DNA or from cDNA generated from RNA by a reverse transcription reaction.
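The mixing, denaturing, and re-annealing step described above can be illustrated with a small model. The sketch below is an idealization that assumes equal-length parental sequences differing only by base substitutions, and it simply enumerates every top/bottom strand pairing; the function names and the example sequences are hypothetical.

```python
# Illustrative model of heteroduplex formation; all names here are hypothetical.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the complementary strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def reannealed_duplexes(normal_top, mutant_top):
    """List the four duplexes expected after mixing, denaturing, and re-annealing.

    Each entry records which parent supplied each strand and every mismatched
    position as (index on the top strand, top base, opposing bottom base).
    """
    if len(normal_top) != len(mutant_top):
        raise ValueError("this sketch handles base substitutions only")
    parents = {"normal": normal_top, "mutant": mutant_top}
    n = len(normal_top)
    duplexes = []
    for top_name, top in parents.items():
        for bottom_name, bottom_source in parents.items():
            bottom = reverse_complement(bottom_source)
            # Top index i pairs with bottom index n - 1 - i (antiparallel strands).
            mismatches = [
                (i, top[i], bottom[n - 1 - i])
                for i in range(n)
                if COMPLEMENT[top[i]] != bottom[n - 1 - i]
            ]
            duplexes.append(
                {"top": top_name, "bottom": bottom_name, "mismatches": mismatches}
            )
    return duplexes


if __name__ == "__main__":
    for duplex in reannealed_duplexes("ACGTAGC", "ACGCAGC"):
        print(duplex)
```

For the hypothetical T-to-C substitution in this example, the two parental duplexes report no mismatches, while the two heteroduplexes report T opposite G and C opposite A at the mutant position.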
In conventional CCM, chemical treatment of potential heteroduplex to cleave at mismatched sites occurs in two stages. First, in a base-removal step, the nucleic acid is treated either (a) with moderately concentrated (16 mM) aqueous osmium tetroxide (OsO₄) in fairly concentrated (0.3 M) pyridine under conditions favoring destruction of mismatched thymines while minimally affecting matched base pairs and mismatched purines and cytosines, or (b) with concentrated (about 4 M) aqueous hydroxylamine (NH₂OH) in concentrated (about 2 M) diethylammonium chloride under conditions favoring destruction of mismatched cytosines, substantially sparing matched base pairs and mismatched purines and thymines. The two base-removal reactions are run separately, in parallel, in order to assure that all mutations are detected and to supply some information about the chemical nature of the mutation. Following the base-removal reaction, the base-removal reagent is separated from the nucleic acid, typically by ethanol precipitation. Next, in a cleavage reaction, the nucleic acid is incubated with concentrated (1 M) aqueous piperidine at about 90 °C for about 20 minutes to cleave any abasic sites generated by the base-removal reaction. Because the base-removal reaction usually removes only pyrimidines, cleavage commonly generates nicked duplex nucleic acid rather than double-stranded breaks. Then ethanol precipitation is used again, to eliminate the cleavage reagent, before dissolving the nucleic acid in a solvent suitable for subsequent analysis. Typically, analysis of the cleaved heteroduplex is performed by denaturing gel electrophoresis, in a high-resolution sequencing gel capable of resolving the sizes of the parental strands and any cleavage fragments to within one or a few nucleotides.
Also typically, the nucleic acid strands are tagged at the 5' end with either radioactivity or fluorescent dyes, in a way which allows approximate or exact location of the cleavage point in the nucleic acid sequence. For example, one PCR primer is 5'-tagged with one fluorescent dye, the other primer is tagged with a different dye (a rhodamine dye), and the cleavage product is analyzed on a multicolor, automated DNA sequencer (Meo). Any heteroduplex molecule with only one mismatched position can be cleaved on only the strand(s) containing a mismatched pyrimidine, but every heteroduplex population generated from duplex parental molecules must contain two "complementary" mismatched heteroduplex molecules. Therefore, several different fluorescent electropherograms can be obtained for base-substitution mutations, depending on whether the mismatch pair is A-A and T-T, C-C and G-G, A-G and T-C, or A-C and G-T, and on whether OsO₄ or NH₂OH was used for mismatched pyrimidine removal. With two-dye fluorescent analysis, the position and chemical nature of the mutation can be determined from the two patterns obtained with the two base-removal reagents.
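The dependence of the electropherogram pattern on the mismatch pair and the base-removal reagent can be sketched as follows. This is an idealized model, not the published procedure: it assumes that osmium tetroxide removes only mismatched thymines, that hydroxylamine removes only mismatched cytosines, that both strands carry a 5' label, and that the labeled fragment length equals the number of nucleotides 5' of the removed base. The function and dictionary names are hypothetical.

```python
# Idealized prediction of 5'-labeled cleavage fragments in conventional CCM.
# Reagent specificities below are simplifying assumptions for illustration.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
REAGENT_TARGET = {"OsO4": "T", "NH2OH": "C"}


def labeled_fragments(normal_top, mutant_top):
    """Map each reagent to a list of (labeled strand, fragment length in nt).

    Sequences are the top strands of the two parental duplexes, 5'->3',
    equal in length and differing only by base substitutions.
    """
    if len(normal_top) != len(mutant_top):
        raise ValueError("this sketch handles base substitutions only")
    n = len(normal_top)
    fragments = {reagent: [] for reagent in REAGENT_TARGET}
    for i, (a, b) in enumerate(zip(normal_top, mutant_top)):
        if a == b:
            continue  # matched position: assumed inert to both reagents
        # In the two heteroduplexes, top strands carry a and b at this site,
        # and bottom strands carry their complements; all four are mismatched.
        for reagent, target in REAGENT_TARGET.items():
            for base in (a, b):
                if base == target:
                    fragments[reagent].append(("top", i))
            for base in (COMPLEMENT[a], COMPLEMENT[b]):
                if base == target:
                    # Bottom strand is read 5'->3' from the opposite end.
                    fragments[reagent].append(("bottom", n - 1 - i))
    return fragments


if __name__ == "__main__":
    # Hypothetical T-to-C substitution: T-G and C-A mismatches.
    print(labeled_fragments("ACGTAGC", "ACGCAGC"))
    # Hypothetical A-to-T substitution: A-A and T-T mismatches.
    print(labeled_fragments("ACGAAGC", "ACGTAGC"))
```

Under these assumptions the T-to-C example yields one top-strand fragment with each reagent, whereas the A-to-T example yields a top-strand and a bottom-strand fragment with osmium tetroxide and nothing with hydroxylamine, which illustrates how the two reagent patterns together indicate the position and chemical nature of the substitution.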
Informative though the conventional CCM method is, it is unsuitable for economical, high-throughput, clinical mutation screening. OsO₄ is volatile, toxic, and noxious; pyridine is volatile and noxious; NH₂OH is volatile, putatively carcinogenic, and mildly noxious; and piperidine is volatile and noxious. These hazards are aggravated by the high concentrations needed for effectiveness. Furthermore, the several nucleic acid precipitation and washing steps increase operator exposure to the undesirable reagents, require considerable operator skill and judgment to get good recoveries, and are difficult to automate. Finally, hot aqueous piperidine occasionally causes some background cleavage at sites where no base removal has occurred.
One published effort to reduce the operational complexity of conventional CCM effected mismatched cytosine and thymine removal serially on one sample rather than in parallel on two samples, but could not avoid the inclusion of a separation step after each chemical step (Ramus and Cotton). It was reported to be important to remove cytosine before thymine.
Gogos et al. showed that dilute (0.1 mM) potassium permanganate (KMnO₄, a non-noxious and effectively non-toxic reagent) could replace OsO₄/pyridine for mismatched thymine removal from heteroduplexes, with much improved sensitivity and specificity if the reaction occurred in 3 M tetramethylammonium chloride (TMAC) (Gogos). They also showed that 0.5 to 4 M tetraethylammonium chloride (TEAC) (2 M preferred) increased the sensitivity and specificity of mismatched cytosine removal by NH₂OH. They reduced the need for demanding separation steps by adsorbing the DNA to an ion-exchange paper, performing the chemical treatments on paper-bound DNA, and changing chemical conditions simply by shifting the paper from one reagent solution to another with an intervening water wash. Unfortunately, the commercial supplier of the ion-exchange paper (Amersham) has since discontinued its sale. Furthermore, the piperidine treatment removed the cleaved DNA from the paper, so that time-consuming lyophilization was needed to remove the concentrated piperidine from the DNA before electrophoretic analysis. In addition, Gogos et al. stated that their improved reagents were more specific (and therefore successful) for immobilized DNA than for DNA treated in solution. Another limitation of this work was that almost all experiments studied the cleavage of labeled oligonucleotide probes interrogating only rather short (approximately 30 nt) sequences: unrealistic models for the PCR-generated polynucleotide samples needed for cost-effective clinical mutation scanning.
In a totally different application not dependent on base-removal reagents (chemical detection of photoproducts in UV-damaged DNA), McHugh and Knowland showed that 1,2-ethylenediamine and two related compounds (piperazine and N,N'-dimethylethylenediamine) can replace piperidine for cleaving abasic sites in duplex DNA. These relatively non-toxic reagents generate cleavage products with subtle structural differences yielding observable electrophoretic mobility differences, but they have several striking advantages over piperidine which could greatly benefit mutation detection. (1) They are effective at much lower concentrations than piperidine (as low as 20 mM for ethylenediamine). (2) They cause much less background cleavage of perfectly matched duplex DNA than does piperidine.
The TMAC found by Gogos to improve KMnO₄ sensitivity and specificity may be crucial to this advance because of the documented binding to and stabilization of duplex DNA by tetramethylammonium ion (Shapiro; Melchior and von Hippel). The necessarily high concentration of TMAC, a strong electrolyte, should interfere with direct electrophoretic analysis because it increases conductivity of the electrophoresis sample by over an order of magnitude. It may also block the electrostatic interaction of ethylenediamine and its analogs with duplex DNA which was theorized by McHugh and Knowland to be responsible for their unusual cleavage efficiency.
Betaine, a zwitterionic analog of tetramethylammonium ion which has no net charge at pH values near and above neutrality, appears to affect duplex DNA stability much the same way that tetramethylammonium ion does (Rees). The benefit of high betaine concentrations already has been applied to the sequencing and PCR amplification of difficult, usually GC-rich, nucleic acid targets, where this additive improves the interaction of the DNA polymerase with its substrate (in some way not yet completely understood) without increasing the ionic strength to levels which would inhibit the enzyme (Papp; Mytelka and Chamberlin; Baskaran; Weissensteiner and Lanchbury).
The operational simplicity of the present invention derives significantly from finding chemical conditions where both pyrimidines (and even purines) are removed from mismatched positions while remaining substantially untouched in correctly matched duplex DNA. Uracil can be completely substituted for thymine in PCR with some reduction of amplification efficiency; radical reduction in the sensitivity of hybridization-based detection of dU-containing amplicon implies that replacement of T with dU significantly changes the conformation or stability of AT-rich sequences of duplex DNA (Carmody and Vary). It has been noted in sequencing applications, which examine single-stranded DNA, that although thymine resists hydroxylamine attack at all pH values, uracil can be removed by NH₂OH at high pH (Rubin and Schmid). This observation raises the question whether, under appropriate conditions, NH₂OH can function as a single reagent for removal of both mismatched C and U from duplex DNA. The technical challenge in improving mutation scanning by CCM is to optimize and streamline the use of chemistries developed to analyze single-stranded DNA in an application which preserves the structure and chemical inertness of perfectly matched regions of duplex DNA.