The ability to detect mutations in double stranded polynucleotides, and especially in DNA fragments, is of great importance in medicine, as well as in the physical and social sciences. The Human Genome Project is providing an enormous amount of genetic information which is setting new criteria for evaluating the links between mutations and human disorders (Guyer et al., Proc. Natl. Acad. Sci. USA 92:10841 (1995)). The ultimate source of disease, for example, is described by genetic code that differs from wild type (Cotton, TIG 13:43 (1997)). Understanding the genetic basis of disease can be the starting point for a cure. Similarly, determination of differences in genetic code can provide powerful and perhaps definitive insights into the study of evolution and populations (Cooper et al., Human Genetics 69:201 (1985)). Understanding these and other issues related to genetic coding is based on the ability to identify anomalies, i.e., mutations, in a DNA fragment relative to the wild type. A need exists, therefore, for a methodology to detect mutations in an accurate, reproducible and reliable manner.
DNA molecules are polymers comprising sub-units called deoxynucleotides. The four deoxynucleotides found in DNA comprise a common cyclic sugar, deoxyribose, which is covalently bonded to any of the four bases, adenine (a purine), guanine (a purine), cytosine (a pyrimidine), and thymine (a pyrimidine), hereinbelow referred to as A, G, C, and T respectively. A phosphate group links a 3'-hydroxyl of one deoxynucleotide with the 5'-hydroxyl of another deoxynucleotide to form a polymeric chain. In double stranded DNA, two strands are held together in a helical structure by hydrogen bonds between what are called complementary bases. The complementarity of bases is determined by their chemical structures. In double stranded DNA, each A pairs with a T and each G pairs with a C, i.e., a purine pairs with a pyrimidine. Ideally, DNA is replicated in exact copies by DNA polymerases during cell division in the human body or in other living organisms. DNA strands can also be replicated in vitro by means of the Polymerase Chain Reaction (PCR).
Sometimes, exact replication fails and an incorrect base pairing occurs, which after further replication of the new strand results in double stranded DNA offspring containing a heritable difference in the base sequence from that of the parent. Such heritable changes in base pair sequence are called mutations.
In the present invention, double stranded DNA is referred to as a duplex. When the base sequence of one strand is entirely complementary to the base sequence of the other strand, the duplex is called a homoduplex. When a duplex contains at least one base pair which is not complementary, the duplex is called a heteroduplex. A heteroduplex is formed during DNA replication when an error is made by a DNA polymerase enzyme and a non-complementary base is added to a polynucleotide chain being replicated. Further replications of a heteroduplex will, ideally, produce homoduplexes which are heterozygous, i.e., these homoduplexes will have an altered sequence compared to the original parent DNA strand. When the parent DNA has the sequence which predominates in a natural population it is generally called the "wild type."
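The homoduplex and heteroduplex definitions above can be illustrated with a minimal sketch. It is offered only as an illustration and forms no part of the method described herein. The function names are hypothetical, and the sketch assumes that both strands are written 5' to 3', are of equal length, and contain only the bases A, G, C and T.

```python
# Illustrative sketch only: names and conventions are assumptions,
# not part of the described method.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand):
    """Return the fully complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def mismatch_sites(top, bottom):
    """Return 0-based positions along `top` where pairing is not complementary.

    Both strands are assumed to be written 5' to 3', so `bottom` is compared
    against `top` after taking its reverse complement.
    """
    if len(top) != len(bottom):
        raise ValueError("strands must have the same length")
    expected_top = reverse_complement(bottom)
    return [i for i, (a, b) in enumerate(zip(top, expected_top)) if a != b]


def classify_duplex(top, bottom):
    """Label a duplex as a homoduplex (no mismatch) or a heteroduplex."""
    return "heteroduplex" if mismatch_sites(top, bottom) else "homoduplex"
```

For example, under these assumptions `classify_duplex("ATGC", "GCAT")` reports a homoduplex, while replacing the second strand with `"GCGT"` introduces one non-complementary pair and the duplex is reported as a heteroduplex.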
Many different types of DNA mutations are known. Examples of DNA mutations include, but are not limited to, "point mutations" or "single base pair mutations" wherein an incorrect base pairing occurs. The most common point mutations comprise "transitions" wherein one purine or pyrimidine base is replaced by another and "transversions" wherein a purine is substituted for a pyrimidine (and vice versa). Point mutations also comprise mutations wherein a base is added or deleted from a DNA chain. Such "insertions" or "deletions" are also known as "frameshift mutations". Although they occur with less frequency than point mutations, larger mutations affecting multiple base pairs can also occur and may be important. A more detailed discussion of mutations can be found in U.S. Pat. No. 5,459,039 to Modrich (1995), and U.S. Pat. No. 5,698,400 to Cotton (1997). These references and the references contained therein are incorporated in their entireties herein.
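The distinction between transitions and transversions can be expressed in a short sketch. This is an illustrative example only, and the function name is hypothetical. It assumes a single base substitution described by a reference base and an altered base, each one of A, G, C or T.

```python
# Illustrative sketch only: the function name is an assumption.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Classify a single base substitution as a transition or a transversion.

    A transition replaces a purine with a purine or a pyrimidine with a
    pyrimidine; a transversion exchanges a purine and a pyrimidine.
    """
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, G, C, T")
    if ref == alt:
        raise ValueError("reference and altered base are identical")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"
```

Under these assumptions, an A to G or a C to T substitution is classified as a transition, and an A to C or a G to T substitution as a transversion.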
The sequence of base pairs in DNA codes for the production of proteins. In particular, a DNA sequence in the exon portion of a DNA chain codes for a corresponding amino acid sequence in a protein. Therefore, a mutation in a DNA sequence may result in an alteration in the amino acid sequence of a protein. Such an alteration in the amino acid sequence may be completely benign or may inactivate a protein or alter its function to be life threatening or fatal. On the other hand, mutations in an intron portion of a DNA chain would not be expected to have a biological effect since an intron section does not contain code for protein production. Nevertheless, mutation detection in an intron section may be important, for example, in a forensic investigation.
Detection of mutations is, therefore, of great interest and importance in diagnosing diseases, understanding the origins of disease and the development of potential treatments. Detection of mutations and identification of similarities or differences in DNA samples is also of critical importance in increasing the world food supply by developing disease-resistant and/or higher yielding crop strains, in forensic science, in the study of evolution and populations, and in scientific research in general (Guyer et al., Proc. Natl. Acad. Sci. USA 92:10841 (1995); Cotton, TIG 13:43 (1997)). These references and the references contained therein are incorporated in their entireties herein.
Alterations in a DNA sequence which are benign or have no negative consequences are sometimes called "polymorphisms". In the present invention, any alterations in the DNA sequence, whether they have negative consequences or not, are called "mutations". It is to be understood that the method of this invention has the capability to detect mutations regardless of biological effect or lack thereof. For the sake of simplicity, the term "mutation" will be used throughout to mean an alteration in the base sequence of a DNA strand compared to a reference strand. It is to be understood that in the context of this invention, the term "mutation" includes the term "polymorphism" or any other similar or equivalent term of art.
There exists a need for an accurate and reproducible analytical method for mutation detection which is easy to implement. Such a method, which can be automated and provide high throughput sample screening with a minimum of operator attention, is also highly desirable.
Analysis of DNA samples has historically been done using gel electrophoresis. Capillary electrophoresis has been used to separate and analyze mixtures of DNA. However, these methods cannot distinguish fragments containing point mutations from homoduplexes having the same base pair length.
The "heteroduplex site separation temperature" is defined herein to mean the temperature at which one or more base pairs denature, i.e., separate, at the site of base pair mismatch in a heteroduplex DNA fragment. Since at least one base pair in a heteroduplex is not complementary, it takes less energy to separate the bases at that site compared to its fully complementary base pair analog in a homoduplex. This results in the lower melting temperature of a heteroduplex compared to a homoduplex. The local denaturation creates what is generally called a "bubble" at the site of base pair mismatch. The bubble distorts the structure of a DNA fragment compared to a fully complementary homoduplex of the same base pair length. This structural distortion under partially denaturing conditions has been used in the past to separate heteroduplexes and homoduplexes by denaturing gel electrophoresis and denaturing capillary electrophoresis. However, these techniques are operationally difficult to implement and require highly skilled personnel. In addition, the analyses are lengthy and require a great deal of set up time. A denaturing capillary gel electrophoresis analysis of a 90 base pair fragment takes more than 30 minutes and a denaturing gel electrophoresis analysis may take 5 hours or more. The long analysis time of the gel methodology is further exacerbated by the fact that the movement of DNA fragments in a gel is inversely proportional to the length of the fragments.
In addition to the deficiencies of denaturing gel methods mentioned above, these techniques are not always reproducible or accurate since the preparation of a gel and the running of an analysis are highly variable from one operator to another.
Recently, a chromatographic method called Matched Ion Polynucleotide Chromatography (MIPC) was introduced to effectively separate mixtures of double stranded polynucleotides in general, and DNA in particular, wherein the separations are based on base pair length (U.S. Pat. No. 5,585,236 to Bonn (1996); Huber, et al., Chromatographia 37:653 (1993); Huber, et al., Anal. Biochem. 212:351 (1993)). These references and the references contained therein are incorporated herein in their entireties. MIPC is not limited by any of the deficiencies associated with gel based separation methods.
The term "Matched Ion Polynucleotide Chromatography" as used herein is defined as a process for separating single and double stranded polynucleotides using non-polar separation media, wherein the process uses a counter-ion agent, and an organic solvent to release the polynucleotides from the separation media. MIPC separations are complete in less than 10 minutes, and frequently in less than 5 minutes. MIPC systems (WAVE™ DNA Fragment Analysis System, Transgenomic, Inc., San Jose, Calif.) are equipped with computer controlled ovens which enclose the columns and column inlet areas.
As the use and understanding of MIPC developed, it became apparent that when MIPC analyses were carried out at a partially denaturing temperature, i.e., a temperature sufficient to denature a heteroduplex at the site of base pair mismatch, homoduplexes could be separated from heteroduplexes having the same base pair length (Hayward-Lester, et al., Genome Research 5:494 (1995); Underhill, et al., Proc. Natl. Acad. Sci. USA 93:193 (1996); Doris, et al., DHPLC Workshop, Stanford University, (1997)). These references and the references contained therein are incorporated herein in their entireties. Thus, this technique, referred to in the literature as Denaturing High Performance Liquid Chromatography (DHPLC), was applied to mutation detection (Underhill, et al., Genome Research 7:996 (1997); Liu, et al., Nucleic Acids Res. 26:1396 (1998)).
DHPLC can separate heteroduplexes that differ by as little as one base pair. However, separations of homoduplexes and heteroduplexes can be poorly resolved. Artifacts and impurities can also interfere with the interpretation of DHPLC separation chromatograms in the sense that it may be difficult to distinguish between an artifact or impurity and a putative mutation (Underhill, et al., Genome Res. 7:996 (1997)). The presence of mutations may even be missed entirely (Liu, et al., Nucleic Acids Res. 26:1396 (1998)). The references cited above and the references contained therein are incorporated in their entireties herein.
The accuracy and reproducibility of mutation detection assays based on DHPLC have been compromised in the past for two principal reasons: DHPLC system related problems and PCR related problems.
When used under partially denaturing conditions, MIPC is defined herein as Denaturing Matched Ion Polynucleotide Chromatography (DMIPC).
Samples to be analyzed for the presence or absence of mutations often contain amounts of material too small to detect. The first step in mutation detection assays is, therefore, sample amplification using the PCR process. PCR amplification comprises steps such as primer design, choice of DNA polymerase enzyme, the number of amplification cycles and concentration of reagents. Each of these steps, as well as other steps involved in the PCR process, affects the purity of the amplified product. Although the PCR process and the factors which affect fidelity of replication and product purity are well known in the PCR art, these factors have not been addressed, heretofore, in relation to mutation detection using MIPC. As a result, PCR induced mutations, wherein a non-complementary base is added to a template, are often formed during sample amplification. Such PCR induced mutations make mutation detection results ambiguous, since it may not be clear if a detected mutation was present in the sample or was produced during the PCR process. Unfortunately, many workers in the PCR and mutation detection fields make the erroneous assumption that PCR replication is perfect or close to perfect, and PCR induced mutations are generally not taken into consideration in mutation detection analyses. This approach can result in false positives. Applicants have recognized the importance of optimizing PCR sample amplification in order to minimize the formation of PCR induced mutations and ensure an accurate and unambiguous analysis of putative mutation containing samples. The use of MIPC by Applicants to identify and optimize the factors affecting PCR replication fidelity will be discussed in the Detailed Description.
Other aspects of mutation detection by MIPC which have not been heretofore addressed comprise the treatment of, and materials comprising, chromatography system components; the treatment of, and materials comprising, separation media; solvent pre-selection to minimize methods development time; optimum temperature pre-selection to effect partial denaturation of a heteroduplex during MIPC; and optimization of MIPC for automated high throughput mutation detection screening assays. These factors are essential in order to achieve unambiguous, accurate and reproducible mutation detection results using MIPC.
A need exists to identify and optimize all the aspects of the MIPC methodology in order to minimize artifacts and remove ambiguity from the analysis of samples containing putative mutations.