Mixtures of double stranded nucleic acid fragments having different base pair lengths are separated for numerous and diverse reasons. The ability to detect mutations in double stranded polynucleotides, and especially in DNA fragments which have been amplified by PCR, presents a somewhat different problem since DNA fragments containing mutations are generally the same length as their corresponding wild type (defined herein below) but differ in base sequence.
DNA separation and mutation detection are of great importance in medicine, in the physical and social sciences, and in forensic investigations. The Human Genome Project is providing an enormous amount of genetic information which is setting new criteria for evaluating the links between mutations and human disorders (Guyer, et al., Proc. Natl. Acad. Sci. USA 92:10841 (1995)). The ultimate source of disease, for example, is described by genetic code that differs from wild type (Cotton, TIG 13:43 (1997)). Understanding the genetic basis of disease can be the starting point for a cure. Similarly, determination of differences in genetic code can provide powerful and perhaps definitive insights into the study of evolution and populations (Cooper, et al., Human Genetics 69:201 (1985)). Understanding these and other issues related to genetic coding is based on the ability to identify anomalies, i.e., mutations, in a DNA fragment relative to the wild type. A need exists, therefore, for a methodology which can separate DNA fragments based on size differences as well as separate DNA having the same length but differing in base pair sequence (mutations from wild type), in an accurate, reproducible, reliable manner. Ideally, such a method would be efficient and could be adapted to routine high throughput sample screening applications.
DNA molecules are polymers comprising sub-units called deoxynucleotides. The four deoxynucleotides found in DNA comprise a common cyclic sugar, deoxyribose, which is covalently bonded to any of the four bases, adenine (a purine), guanine (a purine), cytosine (a pyrimidine), and thymine (a pyrimidine), hereinbelow referred to as A, G, C, and T respectively. A phosphate group links a 3'-hydroxyl of one deoxynucleotide with the 5'-hydroxyl of another deoxynucleotide to form a polymeric chain. In double stranded DNA, two strands are held together in a helical structure by hydrogen bonds between what are called complementary bases. The complementarity of bases is determined by their chemical structures. In double stranded DNA, each A pairs with a T and each G pairs with a C, i.e., a purine pairs with a pyrimidine. Ideally, DNA is replicated in exact copies by DNA polymerases during cell division in the human body or in other living organisms. DNA strands can also be replicated in vitro by means of the Polymerase Chain Reaction (PCR).
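By way of illustration only, the pairing rule described above (A with T, G with C) can be sketched in a few lines of Python. The names PAIR, complement, and reverse_complement are hypothetical and chosen for this sketch; strands are assumed to be written as plain strings of the letters A, G, C, and T.

```python
# Pairing rule from the text: each A pairs with a T, each G pairs with a C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement, in the same left-to-right order."""
    return "".join(PAIR[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the complement read in the opposite direction.

    The two strands of a duplex run antiparallel, so this is the partner
    strand written in its own 5' to 3' direction.
    """
    return complement(strand)[::-1]
```

For example, complement("ATGC") gives "TACG", and reverse_complement("ATGC") gives "GCAT". A character other than A, G, C, or T raises a KeyError in this sketch.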
Sometimes, exact replication fails and an incorrect base pairing occurs, which after further replication of the new strand, results in double stranded DNA offspring containing a heritable difference in the base sequence from that of the parent. Such heritable changes in base pair sequence are called mutations.
In the present invention, double stranded DNA (dsDNA) is referred to as a duplex. When a base sequence of one strand is entirely complementary to a base sequence of the other strand, the duplex is called a homoduplex. When a duplex contains at least one base pair which is not complementary, the duplex is called a heteroduplex. A heteroduplex is formed during DNA replication when an error is made by a DNA polymerase enzyme and a non-complementary base is added to a polynucleotide chain being replicated. Further replications of a heteroduplex will, ideally, produce homoduplexes which are heterozygous, i.e., these homoduplexes will have an altered sequence compared to the original parent DNA strand. When the parent DNA has a sequence which predominates in a naturally occurring population, it is generally called "wild type".
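The homoduplex and heteroduplex definitions given above can be expressed as a short Python sketch, offered as an illustration only. It assumes the two strands are supplied as equal-length strings already aligned position by position (the first strand written 5' to 3' and the second written 3' to 5'). The names mismatch_positions and classify_duplex are hypothetical.

```python
# Pairing rule: A pairs with T, G pairs with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_positions(top: str, bottom: str) -> list:
    """Return the 0-based positions at which the aligned bases are not complementary."""
    if len(top) != len(bottom):
        raise ValueError("strands must have the same length")
    # An unrecognized character has no partner in PAIR and is counted as a mismatch.
    return [
        i
        for i, (a, b) in enumerate(zip(top.upper(), bottom.upper()))
        if PAIR.get(a) != b
    ]


def classify_duplex(top: str, bottom: str) -> str:
    """Label a duplex 'homoduplex' if fully complementary, else 'heteroduplex'."""
    return "heteroduplex" if mismatch_positions(top, bottom) else "homoduplex"
```

For example, classify_duplex("ATGC", "TACG") yields "homoduplex", whereas classify_duplex("ATGC", "TACA") yields "heteroduplex", with the single mismatch reported at position 3.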
Many different types of DNA mutations are known. Examples of DNA mutations include, but are not limited to, "point mutations" or "single base pair mutations" wherein an incorrect base pairing occurs. The most common point mutations comprise "transitions" wherein one purine is replaced by another purine, or one pyrimidine by another pyrimidine, and "transversions" wherein a purine is substituted for a pyrimidine (and vice versa). Point mutations also comprise mutations wherein a base is added or deleted from a DNA chain. Such "insertions" or "deletions" are also known as "frameshift mutations". Although they occur with less frequency than point mutations, larger mutations affecting multiple base pairs can also occur and may be important. A more detailed discussion of mutations can be found in U.S. Pat. No. 5,459,039 to Modrich (1995), and U.S. Pat. No. 5,698,400 to Cotton (1997). These references and the references contained therein are incorporated in their entireties herein.
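As an illustration only, the distinction between transitions and transversions can be sketched in Python using the conventional definitions: a transition exchanges a purine for a purine or a pyrimidine for a pyrimidine, and a transversion exchanges a purine for a pyrimidine or the reverse. The function name classify_substitution is hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases:
        raise ValueError("bases must be one of A, G, C, T")
    if ref == alt:
        return "no change"
    # Same chemical class on both sides means a transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"
```

For example, A to G and C to T are classified as transitions, while A to C and G to T are classified as transversions. Insertions and deletions are outside the scope of this sketch.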
The sequence of base pairs in DNA codes for the production of proteins. In particular, a DNA sequence in the exon portion of a DNA chain codes for the corresponding amino acid sequence in a protein. Therefore, a mutation in a DNA sequence may result in an alteration in the amino acid sequence of a protein. Such an alteration in the amino acid sequence may be completely benign or may inactivate a protein or alter its function to be life threatening or fatal. On the other hand, mutations in an intron portion of a DNA chain would not be expected to have a biological effect since an intron section does not contain code for protein production. Nevertheless, mutation detection in an intron section may be important, for example, in a forensic investigation.
Detection of mutations is, therefore, of great interest and importance in diagnosing diseases, understanding the origins of disease, and developing potential treatments. Detection of mutations and identification of similarities or differences in DNA samples is also of critical importance in increasing the world food supply by developing disease resistant and/or higher yielding crop strains, in forensic science, in the study of evolution and populations, and in scientific research in general (Guyer, et al., Proc. Natl. Acad. Sci. USA 92:10841 (1995); Cotton, TIG 13:43 (1997)).
Alterations in a DNA sequence which are benign or have no negative consequences are sometimes called "polymorphisms". In the present invention, any alterations in the DNA sequence, whether they have negative consequences or not, are called "mutations". It is to be understood that the method and system of this invention have the capability to detect mutations regardless of biological effect or lack thereof. For the sake of simplicity, the term "mutation" will be used throughout to mean an alteration in the base sequence of a DNA strand compared to a reference strand (generally, but not necessarily, wild type). It is to be understood that in the context of this invention, the term "mutation" includes the term "polymorphism" or any other similar or equivalent term of art.
There exists a need for an accurate and reproducible analytical method for mutation detection which is easy to implement. Such a method, which can be automated and provide high throughput sample screening with a minimum of operator attention, is also highly desirable.
Size based analysis of DNA samples has historically been done using gel electrophoresis (GEP). Capillary gel electrophoresis (CGE) has also been used to separate and analyze mixtures of DNA fragments having different lengths, e.g., the result of restriction enzyme cleavage. However, these methods cannot distinguish DNA fragments which differ in base sequence, but have the same base pair length. Therefore, gel electrophoresis cannot be used directly for mutation detection. This is a serious limitation of GEP.
Gel based analytical methods, such as denaturing gradient gel electrophoresis (DGGE) and denaturing gradient gel capillary electrophoresis (DGGC), can detect mutations in heteroduplex DNA strands under "partially denaturing" conditions. The term "partially denaturing" means the separation of a mismatched base pair (caused by temperature, pH, solvent, or other factors) in a DNA double strand while the remainder of the double strand remains intact. The phenomenon of "partial denaturation" is well known in the art and occurs because a heteroduplex will denature at the site of base pair mismatch at a lower temperature than is required to denature the remainder of the strand. However, these gel based techniques are operationally difficult to implement and require highly skilled personnel. In addition, the analyses are lengthy and require a great deal of set up time. A denaturing capillary gel electrophoresis analysis is limited to relatively small fragments. Separation of a 90 base pair fragment takes more than 30 minutes. A gradient denaturing gel runs overnight and requires about a day of set up time. Additional deficiencies of gradient gels are that isolation of separated DNA fragments requires specialized techniques and equipment, and that analysis conditions must be experimentally developed for each fragment (Laboratory Methods for the Detection of Mutations and Polymorphisms, ed. G. R. Taylor, CRC Press, 1997). The long analysis time of the gel methodology is further exacerbated by the fact that the movement of DNA fragments in a gel is inversely proportional, in a geometric relationship, to their length. Therefore, the analysis time of longer DNA fragments can often be untenable.
Another problem encountered under partially denaturing conditions occurs when a mutation is located in a domain of a DNA fragment which has a high melting temperature relative to other domains of the same fragment. In such a case, partially denaturing conditions cannot be achieved since the entire double strand will denature before the site of base mismatch (mutation site) denatures. To circumvent this problem, a "G-C clamp" can be applied to a terminal domain of the DNA fragment as described by Myers, et al., in Nucleic Acids Res. 13:3111 (1985) and Sheffield, et al., in Proc. Natl. Acad. Sci. USA 86:232 (1989), both of which publications are hereby incorporated by reference. A "G-C clamp" is a sequence of several G-C base pairs, generally 10-20, located at a terminus of the DNA fragment. Since G-C base pairs have stronger hydrogen bonds than those of other bases, their melting temperature is higher. Therefore, partial denaturing can occur at a mutation site, while the G-C clamp keeps the DNA strand from denaturing entirely. G-C clamps are introduced into DNA fragments by connecting a G-C sequence of desired length to a primer to be used in PCR amplification of a target DNA fragment. However, this is an expensive and labor intensive technique.
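The step of connecting a G-C sequence of desired length to a primer can be illustrated, in a purely schematic way, by the following Python sketch. It makes several assumptions that are not taken from the cited publications: the clamp is modeled as a simple alternating G and C run, it is attached at the 5' end of a primer written 5' to 3', and the names add_gc_clamp and gc_fraction are hypothetical. Clamp sequences used in practice may differ.

```python
def add_gc_clamp(primer: str, clamp_length: int = 20) -> str:
    """Prepend an illustrative G/C-only run of clamp_length bases to a primer.

    Assumption: an alternating "GCGC..." pattern stands in for a real clamp sequence.
    """
    if clamp_length < 0:
        raise ValueError("clamp_length must be non-negative")
    clamp = ("GC" * clamp_length)[:clamp_length]
    return clamp + primer.upper()


def gc_fraction(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)
```

For example, add_gc_clamp("ATTACG", 10) returns "GCGCGCGCGCATTACG". The gc_fraction helper shows only that the clamped terminus is richer in G-C pairs than the remainder of the primer; it does not model melting temperature.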
In addition to the deficiencies of denaturing gel methods mentioned above, these techniques are not always reproducible or accurate, since the preparation of a gel and the running of an analysis can be highly variable from one operator to another. In general, these techniques suffer from serious deficiencies which are inherent to the art.
Separation of double stranded nucleic acid fragment mixtures by GEP or DGGE produces a linear array of bands, wherein each band in the array represents a separated double stranded nucleic acid component of that mixture. Since many mixtures are typically separated and analyzed simultaneously in separate lanes on the same gel slab, a parallel series of such linear arrays of bands is produced. Bands are often curved rather than straight, their mobility and shape can change across the width of the gel, and lanes and bands can mix with each other. The sources of such inaccuracies stem from the lack of uniformity and homogeneity of the gel bed, electroendosmosis, thermal gradient and diffusion effects, as well as a host of other factors. Inaccuracies of this sort are well known in the GEP art and can lead to serious distortions and inaccuracies in the display of the separation results. In addition, the band display data obtained from GEP separations is not quantitative or accurate because of the uncertainties related to the shape and integrity of the bands. True quantitation of linear band array displays produced by GEP separations cannot be achieved, even when the linear band arrays are scanned with a detector and the resulting data is integrated, because the linear band arrays are scanned only across the center of the bands. Since the detector only sees a small portion of any given band and the bands are not uniform, the results produced by the scanning method are not accurate and can even be misleading.
Methods for visualizing GEP and DGGE separations, such as staining or autoradiography, are also cumbersome and time consuming. In addition, separation data is in hard copy form and cannot be electronically stored for easy retrieval and comparison, nor can it be enhanced to improve the visualization of close separations. Fluorescent tags have been covalently attached to DNA fragments which have been separated on a gel in order to enhance detection of the separated DNA fragments (for example, U.S. Pat. No. 4,855,255 (1989) to Fung). This reference is incorporated by reference herein in its entirety. However, this approach still suffers from the inherent disadvantages related to gel based separations described above.
Recently, an ion pairing reverse phase HPLC method was introduced to effectively separate mixtures of double stranded polynucleotides in general, and DNA in particular, wherein the separations are based on base pair length. This method is described in the following references which are incorporated herein in their entireties: U.S. Pat. No. 5,795,976 (1998) to Oefner; U.S. Pat. No. 5,585,236 (1996) to Bonn; Huber, et al., Chromatographia 37:653 (1993); Huber, et al., Anal. Biochem. 212:351 (1993).
As the use and understanding of HPLC developed, it became apparent that when HPLC analyses were carried out at a partially denaturing temperature, i.e., a temperature sufficient to denature a heteroduplex at the site of base pair mismatch, homoduplexes could be separated from heteroduplexes having the same base pair length as disclosed in the following references: Hayward-Lester, et al., Genome Research 5:494 (1995); Underhill, et al., Proc. Natl. Acad. Sci. USA 93:193 (1996); Oefner, et al., DHPLC Workshop, Stanford University, Palo Alto, Calif., (Mar. 17, 1997); Underhill, et al., Genome Research 7:996 (1997); Liu, et al., Nucleic Acids Res. 26:1396 (1998). These references and the references contained therein are incorporated herein in their entireties. DHPLC can separate heteroduplexes that differ by as little as one base pair. However, as demonstrated in these references, in certain cases, separations of homoduplexes and heteroduplexes are poorly resolved. Artifacts and impurities can interfere with the interpretation of DHPLC separation chromatograms in the sense that it may be difficult to distinguish between an artifact or impurity and a putative mutation. The presence of mutations may even be missed entirely.
The accuracy, reproducibility, convenience and speed of DNA fragment separations and mutation detection assays based on HPLC have been compromised in the past because of HPLC system related problems. Applicants have addressed these problems and applied the term "Matched Ion Polynucleotide Chromatography" (MIPC) to the separation method and system which is used in connection with the present invention. When used under partially denaturing conditions, MIPC is defined herein as Denaturing Matched Ion Polynucleotide Chromatography (DMIPC).
The term "Matched Ion Polynucleotide Chromatography" as used herein is defined as a process for separating single and double stranded polynucleotides using separation media having a non-polar surface, wherein the process uses a counterion agent, and an organic solvent to release the polynucleotides from the separation media. MIPC separations are routinely complete in less than 10 minutes, and frequently in less than 5 minutes. MIPC systems (WAVE™ DNA Fragment Analysis System, Transgenomic, Inc. San Jose, Calif.) are equipped with computer controlled ovens which enclose the columns and column inlet areas. Non-limiting examples of key distinguishing features of MIPC include a) the use of hardware having liquid contacting surfaces which do not release multivalent cations therefrom, b) protection of liquid contacting surfaces from exogenous multivalent cations by means of cartridges containing multivalent cation capture resins, c) the use of a special washing protocol for MIPC separation media, d) automated selection of an optimum solvent gradient for elution of a specific base length DNA fragment, and e) automated determination of the temperature required to effect partial denaturation of a heteroduplex when MIPC is used under partially denaturing conditions (DMIPC) for mutation detection.
Important aspects of DNA separation and mutation detection by HPLC and DHPLC which have been recognized and addressed by Applicants comprise a) the treatment of, and materials comprising, chromatography system components, b) the treatment of, and materials comprising, separation media, c) solvent pre-selection to minimize methods development time, d) optimum temperature pre-selection to effect partial denaturation of a heteroduplex during HPLC and e) optimization of DHPLC for automated high throughput mutation detection screening assays. These factors, which comprise MIPC/DMIPC but not HPLC/DHPLC, are essential when using chromatographic methods in order to achieve unambiguous, accurate, reproducible and high throughput DNA separations and mutation detection results. A comprehensive description of MIPC systems and separation media, including the critical importance of maintaining an environment which is free of multivalent cations, is presented in U.S. Pat. No. 5,772,889 (1998) to Gjerde and U.S. patent application Ser. No. 09/129,105 filed Aug. 4, 1998; Ser. No. 09/081,040 filed May 18, 1998 now U.S. Pat. No. 5,997,742; Ser. No. 09/080,547 filed May 18, 1998 now U.S. Pat. No. 6,017,457; Ser. No. 09/058,580 filed Apr. 10, 1998 now abandoned; Ser. No. 09/058,337 filed Apr. 10, 1998 now abandoned; Ser. No. 09/065,913 filed Apr. 24, 1998; Ser. No. 09/039,061 filed Mar. 13, 1998 now U.S. Pat. No. 5,986,913; Ser. No. 09/081,039 filed May 18, 1998 now U.S. Pat. No. 5,972,222. These references and the references contained therein are incorporated in their entireties herein.
DNA fragments which have been separated by MIPC or other chromatographic methods have been detected using a uv detector set at the DNA absorption maximum of about 260 nm. Although uv detection is generally effective, a detection method which is more sensitive than uv is often required, for example, when only very small amounts of sample are available or when trying to detect a DNA fragment in the presence of a very large excess of another fragment(s), e.g., in cancer screening.
The use of radioactive labels is a well known method of detection in the DNA separation art. However, this method is costly, developing autoradiograms to visualize a separation is a very lengthy process, and radioactivity poses a health hazard.
A need exists, therefore, for a detection method which is capable of detecting DNA fragments at a lower threshold than uv, is not hazardous, and wherein the detection method is coupled to a separation system which allows for the efficient and reproducible separation of DNA fragments. Ideally, such a method is coupled to a system which can be automated for use in high throughput screening assays.