The ability to detect mutations in double stranded polynucleotides, and especially in DNA fragments, is of great importance in medicine, as well as in the physical and social sciences. The Human Genome Project is providing an enormous amount of genetic information which is setting new criteria for evaluating the links between mutations and human disorders (Guyer et al., Proc. Natl. Acad. Sci. U.S.A 92:10841 (1995)). The ultimate source of disease, for example, is described by genetic code that differs from wild type (Cotton, TIG 13:43 (1997)). Understanding the genetic basis of disease can be the starting point for a cure. Similarly, determination of differences in genetic code can provide powerful and perhaps definitive insights into the study of evolution and populations (Cooper et al., Human Genetics 69:201 (1985)).
Understanding these and other issues related to genetic coding is based on the ability to identify anomalies, i.e., mutations, in a DNA fragment relative to the wild type. A need exists, therefore, for a methodology to detect mutations in an accurate, reproducible and reliable manner.
DNA molecules are polymers comprising sub-units called deoxynucleotides. The four deoxynucleotides found in DNA comprise a common cyclic sugar, deoxyribose, which is covalently bonded to any of the four bases, adenine (a purine), guanine (a purine), cytosine (a pyrimidine), and thymine (a pyrimidine), hereinbelow referred to as A, G, C, and T respectively. A phosphate group links a 3′-hydroxyl of one deoxynucleotide with the 5′-hydroxyl of another deoxynucleotide to form a polymeric chain. In double stranded DNA, two strands are held together in a helical structure by hydrogen bonds between what are called complementary bases. The complementarity of bases is determined by their chemical structures. In double stranded DNA, each A pairs with a T and each G pairs with a C, i.e., a purine pairs with a pyrimidine. Ideally, DNA is replicated in exact copies by DNA polymerases during cell division in the human body or in other living organisms.
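The pairing rule described above can be illustrated with a minimal sketch. The function name `complement_strand` is hypothetical and chosen only for illustration. The sketch also relies on the well-known fact, not stated above, that the two strands of a duplex run antiparallel, so the complement is reversed to be written 5′ to 3′.

```python
# Watson-Crick pairing rules stated in the text: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_strand(strand: str) -> str:
    """Return the complementary strand, written 5'->3' (hypothetical helper).

    The strands of a duplex are antiparallel, so each base is complemented
    and the result is read in the reverse direction.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


if __name__ == "__main__":
    print(complement_strand("ATGC"))
```

A base other than A, G, C, or T raises a `KeyError` in this sketch, which is a deliberate simplification.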
Sometimes, exact replication fails and an incorrect base pairing occurs, which after further replication of the new strand results in double stranded DNA offspring containing a heritable difference in the base sequence from that of the parent. Such heritable changes in base pair sequence are called mutations.
In the present invention, double stranded DNA is referred to as a duplex. When the base sequence of one strand is entirely complementary to the base sequence of the other strand, the duplex is called a homoduplex. When a duplex contains at least one base pair which is not complementary, the duplex is called a heteroduplex. A heteroduplex can be formed during DNA replication when an error is made by a DNA polymerase enzyme and a non-complementary base is added to a polynucleotide chain being replicated. A heteroduplex can also be formed during repair of a DNA lesion. Further replications of a heteroduplex will, ideally, produce homoduplexes which are heterozygous, i.e., these homoduplexes will have an altered sequence compared to the original parent DNA strand. When the parent DNA has the sequence which predominates in a natural population it is generally called the “wild type.”
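The distinction between a homoduplex and a heteroduplex can be sketched as follows. This is an illustrative example only: the function names are hypothetical, and it assumes the two strands have equal length and are supplied already aligned position by position (that is, the second strand is written in the direction that lines it up with the first).

```python
from typing import List

# Watson-Crick pairing rules: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_positions(top: str, bottom: str) -> List[int]:
    """Return the 0-based positions at which the aligned bases do not pair."""
    if len(top) != len(bottom):
        raise ValueError("strands must have the same length")
    return [
        i
        for i, (a, b) in enumerate(zip(top.upper(), bottom.upper()))
        if COMPLEMENT[a] != b
    ]


def classify_duplex(top: str, bottom: str) -> str:
    """Label a duplex as a homoduplex (no mismatch) or a heteroduplex."""
    return "heteroduplex" if mismatch_positions(top, bottom) else "homoduplex"
```

For example, `classify_duplex("ATGC", "TACG")` reports a homoduplex, while changing the third base of the second strand to T introduces a single mismatch.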
Many different types of DNA mutations are known. Examples of DNA mutations include, but are not limited to, “point mutations” or “single base pair mutations” wherein an incorrect base pairing occurs. The most common point mutations comprise “transitions” wherein one purine is replaced by another purine or one pyrimidine is replaced by another pyrimidine, and “transversions” wherein a purine is substituted for a pyrimidine (and vice versa). Point mutations also comprise mutations wherein a base is added or deleted from a DNA chain. Such “insertions” or “deletions” are also known as “frameshift mutations”. Although they occur with less frequency than point mutations, larger mutations affecting multiple base pairs can also occur and may be important. A more detailed discussion of mutations can be found in U.S. Pat. No. 5,459,039 to Modrich (1995), and U.S. Pat. No. 5,698,400 to Cotton (1997). These references and the references contained therein are incorporated in their entireties herein.
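The difference between a transition and a transversion follows directly from the purine and pyrimidine classes of the bases, as the following minimal sketch shows. The function name `classify_substitution` is hypothetical and is used here only for illustration.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base substitution from base `ref` to base `alt`."""
    ref, alt = ref.upper(), alt.upper()
    for base in (ref, alt):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"not a DNA base: {base!r}")
    if ref == alt:
        return "no change"
    # Same chemical class (purine->purine or pyrimidine->pyrimidine)
    # is a transition; a change of class is a transversion.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"
```

Insertions and deletions are outside the scope of this sketch, since they change the length of the sequence rather than substituting one base for another.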
The sequence of base pairs in DNA codes for the production of proteins. In particular, a DNA sequence in the exon portion of a DNA chain codes for a corresponding amino acid sequence in a protein. Therefore, a mutation in a DNA sequence may result in an alteration in the amino acid sequence of a protein. Such an alteration in the amino acid sequence may be completely benign or may inactivate a protein or alter its function to be life threatening or fatal. Intronic mutations at splice sites may also be causative of disease (e.g. β-thalassemia). Detection of a mutation in an intron may therefore be important, because such a mutation can alter the splicing of mRNA transcribed from the DNA. Detection of intronic mutations may also be useful, for example, in a forensic investigation.
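How a single base change in an exon can leave a protein unchanged, alter one amino acid, or truncate the protein can be sketched with a deliberately partial codon table. The table below contains only four codons of the standard genetic code, and the labels “silent”, “missense”, and “nonsense” are standard terminology that does not appear in the text above; the function name is hypothetical.

```python
# Partial codon table (standard genetic code, coding-strand DNA codons).
# Only the four codons needed for the illustration are included.
CODON_TABLE = {
    "GAG": "Glu",
    "GAA": "Glu",
    "GTG": "Val",
    "TAG": "Stop",
}


def classify_codon_change(ref_codon: str, alt_codon: str) -> str:
    """Describe the protein-level effect of replacing one codon by another."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "silent"      # same amino acid, protein unchanged
    if alt_aa == "Stop":
        return "nonsense"    # premature stop codon
    return "missense"        # a different amino acid is encoded
```

A codon that is not in the partial table raises a `KeyError`; a complete implementation would need all 64 codons.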
Detection of mutations is, therefore, of great interest and importance in diagnosing diseases, understanding the origins of disease and the development of potential treatments. Detection of mutations and identification of similarities or differences in DNA samples is also of critical importance in increasing the world food supply by developing disease-resistant and/or higher yielding crop strains, in forensic science, in the study of evolution and populations, and in scientific research in general (Guyer et al., Proc. Natl. Acad. Sci. U.S.A 92:10841 (1995); Cotton, TIG 13:43 (1997)). These references and the references contained therein are incorporated in their entireties herein.
Analysis of DNA samples has historically been done using gel electrophoresis. Capillary electrophoresis has been used to separate and analyze mixtures of DNA. However, these methods cannot distinguish DNA fragments containing point mutations from homoduplexes having the same base pair length.
Recently, a chromatographic method called ion-pair reverse-phase high pressure liquid chromatography (IP-RP-HPLC), also referred to as Matched Ion Polynucleotide Chromatography (MIPC), was introduced to effectively separate mixtures of double stranded polynucleotides in general, and DNA in particular, wherein the separations are based on base pair length (Huber, et al., Chromatographia 37:653 (1993); Huber, et al., Anal. Biochem. 212:351 (1993); U.S. Pat. Nos. 5,585,236; 5,772,889; 5,972,222; 5,986,085; 5,997,742; 6,017,457; 6,030,527; 6,056,877; 6,066,258; 6,210,885; and U.S. patent application Ser. No. 09/129,105 filed Aug. 4, 1998).
As the use and understanding of IP-RP-HPLC developed it became apparent that when IP-RP-HPLC analyses were carried out at a partially denaturing temperature, i.e., a temperature sufficient to denature a heteroduplex at the site of base pair mismatch, homoduplexes could be separated from heteroduplexes having the same base pair length (Hayward-Lester, et al., Genome Research 5:494 (1995); Underhill, et al., Proc. Natl. Acad. Sci. U.S.A 93:193 (1996); Doris, et al., DHPLC Workshop, Stanford University, (1997)). Thus, the use of denaturing high performance liquid chromatography (DHPLC) was applied to mutation detection (Underhill, et al., Genome Research 7:996 (1997); Liu, et al., Nucleic Acids Res. 26:1396 (1998)). These chromatographic methods are generally used to detect whether or not a mutation exists in a test DNA fragment.
DHPLC, as known in the art, provides a method for separating heteroduplex and homoduplex nucleic acid molecules (e.g., DNA or RNA) in a mixture using high performance liquid chromatography. In the separation method, a mixture containing both heteroduplex and homoduplex nucleic acid molecules is applied to a stationary reverse-phase support. The sample mixture is then eluted with a mobile phase containing an ion-pairing reagent and an organic solvent. Sample elution is carried out under conditions effective to at least partially denature the heteroduplexes and results in the separation, or at least partial separation, of the heteroduplex and homoduplex molecules.
Single nucleotide polymorphisms (SNPs) are thought to be ideally suited as genetic markers for establishing genetic linkage and as indicators of genetic diseases (Landegren et al. Science 242:229–237 (1988)). In some cases a single SNP is responsible for a genetic disease. According to estimates the human genome may contain over 3 million SNPs. Due to their abundance they lend themselves to very high resolution genotyping. The SNP Consortium, a joint effort of 10 major pharmaceutical companies, has announced the development of 300,000 SNP markers and their placement in the public domain by mid 2001.
The efficiency of DHPLC for detection of novel mutations (frequently termed scanning) has been quantified by several authors. Results ranged from 87% detection when a single-temperature analysis was used without any amplicon design (Cargill, et al. Nature Genet. 22:231–238 (1999)) to 100% detection in a blinded study of many polymorphisms within a single, well-behaved amplicon (O'Donovan et al., Genomics 52:44–49 (1998)). Comparisons with single-strand conformation polymorphism (SSCP) (Choy et al., Ann. Hum. Genet. 63:383–391 (1999); Gross et al., Hum. Genet. 105:72–78 (1999); Dobson-Stone et al., Eur. J. Hum. Genet. 8:24–32 (2000)) and denaturing gradient gel electrophoresis (DGGE) (Skopek et al., Mutat. Res. 430:13–21 (1999)) have shown DHPLC to have a superior detection rate, and most recently DHPLC has been shown to detect mutations reliably in BRCA1 and BRCA2 (Wagner et al., Genomics 62:369–376 (1999)).
The ability of DHPLC to detect mutations may be less than 100% in some cases, for example if a mutation site is within a region having high GC content. There is a need for methods, compositions, and devices for improving the ability of DHPLC to detect such mutations.
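As an illustration of how a region of high GC content might be located in a fragment, the following minimal sketch computes the GC fraction of a sequence and flags GC-rich windows. The function names, the window length, and the threshold are all hypothetical values chosen for illustration; the text above does not specify what GC content counts as high.

```python
from typing import List


def gc_fraction(seq: str) -> float:
    """Return the fraction of bases in `seq` that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(1 for base in seq if base in "GC") / len(seq)


def gc_rich_windows(seq: str, window: int = 10, threshold: float = 0.7) -> List[int]:
    """Return start positions of windows whose GC fraction meets `threshold`.

    `window` and `threshold` are illustrative assumptions, not values
    taken from the text.
    """
    return [
        start
        for start in range(len(seq) - window + 1)
        if gc_fraction(seq[start:start + window]) >= threshold
    ]
```

A mutation site falling inside one of the flagged windows would be a candidate for the reduced detection described above, although this sketch does not model the chromatography itself.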
In DHPLC, the required analysis temperature depends on the sequence of the fragment and on the position of the mutation. DHPLC analysis is typically run at an elevated column temperature (e.g. in the range of about 50° C. to about 80° C.) with the temperature being thermostatically controlled. As described in the prior art (U.S. Pat. Nos. 6,287,882 and 6,103,112), it is preferred to maintain the temperature control within a range of ±0.1° C. in order to reliably detect heteroduplexes. However, this requirement increases the cost and complexity of the column temperature control system. For multiple different DNA samples, the column oven must be adjusted for each sample, thus potentially slowing down sample throughput.
Algorithms and software for predicting the temperature for conducting DHPLC have been described (U.S. Pat. Nos. 5,795,976; 6,197,516; 6,287,882; and U.S. patent application Ser. No. 09/469,551 filed Dec. 22, 1999) and are available commercially (Wavemaker® software and Navigator™ software (Transgenomic)) and on the World Wide Web (http://insertion.stanford.edu/melt.html). These programs require that the user input the sequence of the DNA being analyzed. Additional empirical analyses often must be performed, below and above the predicted temperature value, in order to arrive at a suitable temperature for detecting heteroduplexes.
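The empirical step described above, in which analyses are run below and above a predicted temperature, can be sketched as a simple list of candidate column temperatures. This is an illustrative example only and is not the algorithm of any of the cited programs; the function name, the span, and the step size are hypothetical.

```python
from typing import List


def candidate_temperatures(
    predicted_c: float, span_c: float = 2.0, step_c: float = 1.0
) -> List[float]:
    """List column temperatures (deg C) bracketing a predicted value.

    `span_c` is how far below and above the prediction to test, and
    `step_c` is the spacing between runs. Both defaults are assumptions
    made for illustration.
    """
    if span_c < 0 or step_c <= 0:
        raise ValueError("span_c must be >= 0 and step_c must be > 0")
    steps = int(round(span_c / step_c))
    # Round to 0.1 deg C to avoid floating-point noise in the listed values.
    return [round(predicted_c + k * step_c, 1) for k in range(-steps, steps + 1)]


if __name__ == "__main__":
    print(candidate_temperatures(60.0))
```

Each entry in the returned list corresponds to one additional chromatographic run, which is the source of the throughput cost noted above.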
There is a need for DHPLC methods, compositions and systems that minimize the requirement for costly column temperature control devices. There is a need for eliminating the requirement for using a separate column temperature for each fragment being analyzed. There is a need for methods and compositions that can avoid the use of costly software for temperature prediction, and that can avoid the need for conducting multiple empirical analyses.