Mixtures of double stranded nucleic acid fragments having different base pair lengths are separated for numerous and diverse reasons. The ability to detect mutations in double stranded polynucleotides, and especially in DNA fragments which have been amplified by PCR, presents a somewhat different problem since DNA fragments containing mutations are generally the same length as their corresponding wild type (defined herein below) but differ in base sequence.
DNA separation and mutation detection are of great importance in medicine, in the physical and social sciences, and in forensic investigations. The Human Genome Project is providing an enormous amount of genetic information which is setting new criteria for evaluating the links between mutations and human disorders (Guyer, et al., Proc. Natl. Acad. Sci. USA 92:10841 (1995)). The ultimate source of disease, for example, is described by genetic code that differs from wild type (Cotton, TIG 13:43 (1997)). Understanding the genetic basis of disease can be the starting point for a cure. Similarly, determination of differences in genetic code can provide powerful and perhaps definitive insights into the study of evolution and populations (Cooper, et al., Human Genetics 69:201 (1985)). Understanding these and other issues related to genetic coding is based on the ability to identify anomalies, i.e., mutations, in a DNA fragment relative to the wild type. A need exists, therefore, for a methodology which can separate DNA fragments based on size differences as well as separate DNA having the same length but differing in base pair sequence (mutations from wild type), in an accurate, reproducible, reliable manner. Ideally, such a method would be efficient and could be adapted to routine high throughput sample screening applications.
DNA molecules are polymers comprising sub-units called deoxynucleotides. The four deoxynucleotides found in DNA comprise a common cyclic sugar, deoxyribose, which is covalently bonded to any of the four bases, adenine (a purine), guanine (a purine), cytosine (a pyrimidine), and thymine (a pyrimidine), hereinbelow referred to as A, G, C, and T respectively. A phosphate group links a 3'-hydroxyl of one deoxynucleotide with the 5'-hydroxyl of another deoxynucleotide to form a polymeric chain. In double stranded DNA, two strands are held together in a helical structure by hydrogen bonds between what are called complementary bases. The complementarity of bases is determined by their chemical structures. In double stranded DNA, each A pairs with a T and each G pairs with a C, i.e., a purine pairs with a pyrimidine. Ideally, DNA is replicated in exact copies by DNA polymerases during cell division in the human body or in other living organisms. DNA strands can also be replicated in vitro by means of the Polymerase Chain Reaction (PCR).
Sometimes, exact replication fails and an incorrect base pairing occurs which, after further replication of the new strand, results in double stranded DNA offspring containing a heritable difference in the base sequence from that of the parent. Such heritable changes in base pair sequence are called mutations.
In the present invention, double stranded DNA is referred to as a duplex. When a base sequence of one strand is entirely complementary to a base sequence of the other strand, the duplex is called a homoduplex. When a duplex contains at least one base pair which is not complementary, the duplex is called a heteroduplex. A heteroduplex is formed during DNA replication when an error is made by a DNA polymerase enzyme and a non-complementary base is added to a polynucleotide chain being replicated. Further replications of a heteroduplex will, ideally, produce homoduplexes which are heterozygous, i.e., these homoduplexes will have an altered sequence compared to the original parent DNA strand. When the parent DNA has a sequence which predominates in a naturally occurring population, it is generally called "wild type".
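The distinction between a homoduplex and a heteroduplex can be illustrated with a minimal Python sketch. The names `mismatch_positions` and `classify_duplex`, and the convention that the two strands are written so that position i of one strand faces position i of the other, are assumptions made for illustration only and form no part of the method described herein.

```python
# Watson-Crick pairing rules: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_positions(strand, opposite):
    """Return the 0-based positions at which the facing bases do not pair.

    Assumption: both strands have the same length and are aligned so that
    strand[i] faces opposite[i] (the opposite strand is written 3'->5').
    """
    if len(strand) != len(opposite):
        raise ValueError("strands must have the same length")
    return [
        i
        for i, (base, facing) in enumerate(zip(strand, opposite))
        if COMPLEMENT[base] != facing
    ]


def classify_duplex(strand, opposite):
    """Label a duplex as 'homoduplex' (no mismatch) or 'heteroduplex'."""
    return "heteroduplex" if mismatch_positions(strand, opposite) else "homoduplex"


if __name__ == "__main__":
    print(classify_duplex("ATGC", "TACG"))      # every base is paired
    print(classify_duplex("ATGC", "TACA"))      # last position is mismatched
    print(mismatch_positions("ATGC", "TACA"))
```

Note that both duplexes in the usage example have the same base pair length, which is why a purely size based separation cannot tell them apart.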
Many different types of DNA mutations are known. Examples of DNA mutations include, but are not limited to, "point mutations" or "single base pair mutations" wherein an incorrect base pairing occurs. The most common point mutations comprise "transitions" wherein one purine is replaced by another purine or one pyrimidine is replaced by another pyrimidine, and "transversions" wherein a purine is substituted for a pyrimidine (and vice versa). Point mutations also comprise mutations wherein a base is added to or deleted from a DNA chain. Such "insertions" or "deletions" are also known as "frameshift mutations". Although they occur with less frequency than point mutations, larger mutations affecting multiple base pairs can also occur and may be important. A more detailed discussion of mutations can be found in U.S. Pat. No. 5,459,039 to Modrich (1995), and U.S. Pat. No. 5,698,400 to Cotton (1997). These references and the references contained therein are incorporated in their entireties herein.
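The transition/transversion distinction follows directly from the purine and pyrimidine classes of the four bases, as the following Python sketch shows. The function name `classify_substitution` is hypothetical and is used here for illustration only.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref_base, alt_base):
    """Classify a single-base substitution by the chemical class of the bases.

    Returns 'transition' when a purine replaces a purine or a pyrimidine
    replaces a pyrimidine, 'transversion' when the class changes, and
    'no change' when the two bases are identical.
    """
    valid = PURINES | PYRIMIDINES
    if ref_base not in valid or alt_base not in valid:
        raise ValueError("bases must be one of A, G, C, T")
    if ref_base == alt_base:
        return "no change"
    same_class = {ref_base, alt_base} <= PURINES or {ref_base, alt_base} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    for ref, alt in [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]:
        print(ref, "->", alt, ":", classify_substitution(ref, alt))
```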
The sequence of base pairs in DNA codes for the production of proteins. In particular, a DNA sequence in the exon portion of a DNA chain codes for a corresponding amino acid sequence in a protein. Therefore, a mutation in a DNA sequence may result in an alteration in the amino acid sequence of a protein. Such an alteration in the amino acid sequence may be completely benign or may inactivate a protein or alter its function to be life threatening or fatal. On the other hand, mutations in an intron portion of a DNA chain would not be expected to have a biological effect since an intron section does not contain code for protein production. Nevertheless, mutation detection in an intron section may be important, for example, in a forensic investigation.
Detection of mutations is, therefore, of great interest and importance in diagnosing diseases, understanding the origins of disease and the development of potential treatments. Detection of mutations and identification of similarities or differences in DNA samples is also of critical importance in increasing the world food supply by developing disease resistant and/or higher yielding crop strains, in forensic science, in the study of evolution and populations, and in scientific research in general (Guyer, et al., Proc. Natl. Acad. Sci. USA 92:10841 (1995); Cotton, TIG 13:43 (1997)).
Alterations in a DNA sequence which are benign or have no negative consequences are sometimes called "polymorphisms". In the present invention, any alterations in the DNA sequence, whether they have negative consequences or not, are denoted as "mutations". It is to be understood that the method and system of this invention have the capability to detect mutations regardless of biological effect or lack thereof. For the sake of simplicity, the term "mutation" will be used throughout to mean an alteration in the base sequence of a DNA strand compared to a reference strand (generally, but not necessarily, wild type). It is to be understood that in the context of this invention, the term "mutation" includes the term "polymorphism" or any other similar or equivalent term of art.
A need exists for an accurate and reproducible analytical method for mutation detection which is easy to implement. Ideally, the method would be automated and would provide high throughput sample screening with a minimum of operator attention.
Prior to this invention, size based analysis of DNA samples relied upon separation by gel electrophoresis (GEP). Capillary gel electrophoresis (CGE) has also been used to separate and analyze mixtures of DNA fragments having different lengths, e.g., the different lengths resulting from restriction enzyme cleavage. However, these methods cannot distinguish DNA fragments which differ in base sequence, but have the same base pair length. Therefore, gel electrophoresis cannot be used directly for mutation detection. This is a serious limitation of GEP.
Gel based analytical methods, such as denaturing gradient gel electrophoresis (DGGE) and denaturing gradient gel capillary electrophoresis (DGGC), can detect mutations in heteroduplex DNA strands under "partially denaturing" conditions. The term "partially denaturing" means the separation of a mismatched base pair (caused by temperature, pH, solvent, or other factors) in a DNA double strand while the remainder of the double strand remains intact. The phenomenon of "partial denaturation" is well known in the art and occurs because a heteroduplex will denature at the site of base pair mismatch at a lower temperature than is required to denature the remainder of the strand. However, these gel based techniques are operationally difficult to implement and require highly skilled personnel. In addition, the analyses are lengthy and require a great deal of set up time. A denaturing capillary gel electrophoresis analysis is limited to relatively small fragments. Separation of a 90 base pair fragment takes more than 30 minutes. A gradient denaturing gel runs overnight and requires about a day of set up time. Additional deficiencies of gradient gels are that isolation of separated DNA fragments requires specialized techniques and equipment, and that analysis conditions must be experimentally developed for each fragment (Laboratory Methods for the Detection of Mutations and Polymorphisms, ed. G. R. Taylor, CRC Press, 1997). The long analysis time of the gel methodology is further exacerbated by the fact that the movement of DNA fragments in a gel is inversely proportional, in a geometric relationship, to their length. Therefore, the analysis time of longer DNA fragments can often be untenable.
In addition to the deficiencies of denaturing gel methods mentioned above, these techniques are not always reproducible or accurate since the preparation of a gel and running an analysis can be highly variable from one operator to another, and in general, suffer from serious deficiencies which are inherent to the art.
Recently, an HPLC based ion pairing chromatographic method was introduced to effectively separate mixtures of double stranded polynucleotides in general, and DNA in particular, wherein the separations are based on base pair length. This method is described in the following references which are incorporated herein in their entireties: U.S. Pat. No. 5,795,976 (1998) to Oefner; U.S. Pat. No. 5,585,236 (1996) to Bonn; Huber, et al., Chromatographia 37:653 (1993); Huber, et al., Anal. Biochem. 212:351 (1993).
As the use and understanding of HPLC developed, it became apparent that when HPLC analyses were carried out at a partially denaturing temperature, i.e., a temperature sufficient to denature a heteroduplex at the site of base pair mismatch, homoduplexes could be separated from heteroduplexes having the same base pair length (Hayward-Lester, et al., Genome Research 5:494 (1995); Underhill, et al., Proc. Natl. Acad. Sci. USA 93:193 (1996); Doris, et al., DHPLC Workshop, Stanford University, (1997)). These references and the references contained therein are incorporated herein in their entireties. Thus, the use of Denaturing HPLC (DHPLC) was applied to mutation detection (Underhill, et al., Genome Research 7:996 (1997); Liu, et al., Nucleic Acid Res. 26:1396 (1998)).
DHPLC can separate heteroduplexes that differ by as little as one base pair. However, in certain cases, separations of homoduplexes and heteroduplexes are poorly resolved. Artifacts and impurities can interfere with the interpretation of DHPLC separation chromatograms in the sense that it may be difficult to distinguish between an artifact or impurity and a putative mutation (Underhill, et al., Genome Res. 7:996 (1997)). The presence of mutations may even be missed entirely (Liu, et al., Nucleic Acid Res. 26:1396 (1998)). The references cited above and the references contained therein are incorporated in their entireties herein.
The accuracy, reproducibility, convenience and speed of DNA fragment separations and mutation detection assays based on HPLC have been compromised in the past because of HPLC system related problems. Applicants have addressed these problems and applied the term "Matched Ion Polynucleotide Chromatography" (MIPC) to the separation method and system which is used in connection with the present invention. When used under partially denaturing conditions, MIPC is defined herein as Denaturing Matched Ion Polynucleotide Chromatography (DMIPC).
The term "Matched Ion Polynucleotide Chromatography" as used herein is defined as a process for separating single and double stranded polynucleotides using non-polar separation media, wherein the process uses a counterion agent, and an organic solvent to release the polynucleotides from the separation media. MIPC separations are routinely complete in less than 10 minutes, and frequently in less than 5 minutes. MIPC systems (WAVE™ DNA Fragment Analysis System, Transgenomic, Inc. San Jose, Calif.) are equipped with computer controlled ovens which enclose the columns and column inlet areas. Non-limiting examples of key distinguishing features of MIPC include a) the use of hardware having liquid contacting surfaces which do not release multivalent cations therefrom, b) protection of liquid contacting surfaces from exogenous multivalent cations by means of cartridges containing multivalent cation capture resins, c) the use of a special washing protocol for MIPC separation media, d) automated selection of an optimum solvent gradient for elution of a specific base length DNA fragment, and e) automated determination of the temperature required to effect partial denaturation of a heteroduplex when MIPC is used under partially denaturing conditions (DMIPC) for mutation detection.
The present invention can be used in the separation of RNA or of double- or single-stranded DNA. For purposes of simplifying the description of the invention, and not by way of limitation, the separation of double-stranded DNA will be described in the examples herein, it being understood that all polynucleotides are intended to be included within the scope of this invention. The invention applies to size-dependent separations and denaturing separations by MIPC. Both these separations can include separations of DNA fragments having non-polar tags.
Important aspects of DNA separation and mutation detection by HPLC and DHPLC which have not been heretofore addressed, comprise a) the treatment of, and materials comprising chromatography system components, b) the treatment of, and materials comprising separation media, c) solvent pre-selection to minimize methods development time, d) optimum temperature pre-selection to effect partial denaturation of a heteroduplex during HPLC and e) optimization of DHPLC for automated high throughput mutation detection screening assays. These factors, which comprise MIPC/DMIPC but not HPLC/DHPLC, are essential when using chromatographic methods in order to achieve unambiguous, accurate, reproducible and high throughput DNA separations and mutation detection results. A comprehensive description of MIPC systems and separation media, including the critical importance of maintaining an environment which is free of multivalent cations, is presented in U.S. Pat. No. 5,772,889 (1998) to Gjerde and U.S. patent application Ser. Nos. 09/129,105 filed Aug. 4, 1998; 09/081,040 filed May 18, 1998; 09/080,547 filed May 18, 1998; 09/058,580 filed Apr. 10, 1998; 09/058,337 filed Apr. 10, 1998; 09/065,913 filed Apr. 24, 1998; 09/039,061 filed Mar. 13, 1998; 09/081,039 filed May 18, 1998. These references and the references contained therein are incorporated in their entireties herein.
All the liquid chromatographic separations discussed herein above comprise gradient elution, i.e., they utilize a multi-component mobile phase wherein the concentration of the driving component, usually an organic solvent, is increased during the course of the chromatography. This approach reduces the time required to complete an analysis. However, the separation of mixture components can be compromised. Efforts have been made to improve the resolving power of MIPC. These efforts have centered on improving the gradient process, changing the column particle size, or changing the column length. However, only small improvements have been achieved with these efforts. Therefore, there exists a need to improve the separation of poorly resolved or close running components. Such improvement is especially useful when it is important to isolate a component in pure form, as for example, for PCR amplification, sequencing, mutation detection, and numerous other applications.
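Gradient elution as described above can be pictured with a short Python sketch of a mobile phase program. The sketch assumes a simple linear ramp; the function name `organic_percent` and the numerical values in the usage example (35% to 55% organic solvent over 10 minutes) are hypothetical and are not taken from any of the cited methods.

```python
def organic_percent(t_min, start_pct, end_pct, duration_min):
    """Percent of the driving (organic) component at time t_min.

    Assumption: a linear gradient that holds start_pct before time zero
    and end_pct once the programmed duration has elapsed.
    """
    if duration_min <= 0:
        raise ValueError("duration_min must be positive")
    if t_min <= 0:
        return float(start_pct)
    if t_min >= duration_min:
        return float(end_pct)
    return start_pct + (end_pct - start_pct) * t_min / duration_min


if __name__ == "__main__":
    # Hypothetical program: 35% -> 55% organic solvent over 10 minutes.
    for t in (0, 2.5, 5, 10):
        print(f"t = {t:>4} min: {organic_percent(t, 35.0, 55.0, 10.0):.1f}% organic")
```

In this picture the steepness of the ramp is the quantity (end_pct - start_pct) / duration_min, i.e., the rate at which the concentration of the driving component is increased during the run.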