A need exists for rapid and efficient procedures for isolating, separating and purifying single-stranded oligonucleotides, double-stranded DNA fragments, RNA, single-stranded DNA fragments, plasmids and the like. Traditional methods such as ion exchange chromatography, high pressure reverse phase chromatography, gel electrophoresis, capillary electrophoresis and the like are slow, laborious and inefficient, and they require the services of a highly skilled chromatographic expert. Furthermore, many of these methods are incapable of separating such fragments on the basis of base pair length, and they yield only minute quantities of separated materials.
Mixtures of double-stranded nucleic acid fragments having different base pair lengths are separated for numerous and diverse reasons. The ability to detect mutations in double-stranded polynucleotides, and especially in DNA fragments which have been amplified by PCR, presents a somewhat different problem since DNA fragments containing mutations are generally the same length as their corresponding wild type (defined herein below) but differ in base sequence.
DNA separation and mutation detection are of great importance in medicine, in the physical and social sciences, and in forensic investigations. The Human Genome Project is providing an enormous amount of genetic information which is setting new criteria for evaluating the links between mutations and human disorders (Guyer, et al., Proc. Natl. Acad. Sci. USA 92:10841 (1995)). The ultimate source of disease, for example, is described by genetic code that differs from wild type (Cotton, TIG 13:43 (1997)). Understanding the genetic basis of disease can be the starting point for a cure. Similarly, determination of differences in genetic code can provide powerful and perhaps definitive insights into the study of evolution and populations (Cooper, et al., Human Genetics 69:201 (1985)). Understanding these and other issues related to genetic coding is based on the ability to identify anomalies, i.e., mutations, in a DNA fragment relative to the wild type. A need exists, therefore, for a methodology which can separate DNA fragments based on size differences as well as separate DNA having the same length but differing in base pair sequence (mutations from wild type), in an accurate, reproducible, reliable manner. Ideally, such a method would be efficient and could be adapted to routine high throughput sample screening applications.
DNA molecules are polymers comprising sub-units called deoxynucleotides. The four deoxynucleotides found in DNA comprise a common cyclic sugar, deoxyribose, which is covalently bonded to any of the four bases, adenine (a purine), guanine (a purine), cytosine (a pyrimidine), and thymine (a pyrimidine), hereinbelow referred to as A, G, C, and T respectively. A phosphate group links a 3'-hydroxyl of one deoxynucleotide with the 5'-hydroxyl of another deoxynucleotide to form a polymeric chain. In double-stranded DNA, two strands are held together in a helical structure by hydrogen bonds between what are called complementary bases. The complementarity of bases is determined by their chemical structures. In double-stranded DNA, each A pairs with a T and each G pairs with a C, i.e., a purine pairs with a pyrimidine. Ideally, DNA is replicated in exact copies by DNA polymerases during cell division in the human body or in other living organisms. DNA strands can also be replicated in vitro by means of the Polymerase Chain Reaction (PCR).
Sometimes, exact replication fails and an incorrect base pairing occurs, which after further replication of the new strand, results in double-stranded DNA offspring containing a heritable difference in the base sequence from that of the parent. Such heritable changes in base pair sequence are called mutations.
In the present invention, double-stranded DNA is referred to as a duplex. When a base sequence of one strand is entirely complementary to a base sequence of the other strand, the duplex is called a homoduplex. When a duplex contains at least one base pair which is not complementary, the duplex is called a heteroduplex. A heteroduplex is formed during DNA replication when an error is made by a DNA polymerase enzyme and a non-complementary base is added to a polynucleotide chain being replicated. Further replications of a heteroduplex will, ideally, produce homoduplexes which are heterozygous, i.e., these homoduplexes will have an altered sequence compared to the original parent DNA strand. When the parent DNA has a sequence which predominates in a naturally occurring population, it is generally called "wild type".
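The homoduplex and heteroduplex definitions above can be illustrated with a minimal sketch. The function names and the convention that the two strands are supplied already aligned (one read 5' to 3', the other 3' to 5') are assumptions made for illustration only and are not part of the described method.

```python
from typing import List

# Pairing rules as stated in the text: A pairs with T, and G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_positions(strand_a: str, strand_b: str) -> List[int]:
    """Return 0-based positions where two aligned strands are not complementary.

    Assumption: strand_a is written 5'->3' and strand_b 3'->5', so that
    position i of one strand faces position i of the other.
    """
    if len(strand_a) != len(strand_b):
        raise ValueError("aligned strands must have the same length")
    pairs = zip(strand_a.upper(), strand_b.upper())
    return [i for i, (a, b) in enumerate(pairs) if COMPLEMENT.get(a) != b]


def classify_duplex(strand_a: str, strand_b: str) -> str:
    """A duplex with no mismatch is a homoduplex; otherwise a heteroduplex."""
    return "heteroduplex" if mismatch_positions(strand_a, strand_b) else "homoduplex"
```

For example, `classify_duplex("ATGC", "TACG")` reports a homoduplex, while changing the last base of the second strand to `A` yields a heteroduplex with a single mismatch at position 3.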
Many different types of DNA mutations are known. Examples of DNA mutations include, but are not limited to, "point mutations" or "single base pair mutations" wherein an incorrect base pairing occurs. The most common point mutations comprise "transitions" wherein one purine or pyrimidine base is substituted for another and "transversions" wherein a purine is substituted for a pyrimidine (and vice versa). Point mutations also comprise mutations wherein a base is added or deleted from a DNA chain. Such "insertions" or "deletions" are also known as "frameshift mutations". Although they occur with less frequency than point mutations, larger mutations affecting multiple base pairs can also occur and may be important. A more detailed discussion of mutations can be found in U.S. Pat. No. 5,459,039 to Modrich (1995), and U.S. Pat. No. 5,698,400 to Cotton (1997). These references and the references contained therein are incorporated in their entireties herein.
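The distinction between transitions and transversions drawn above can be sketched as a small classifier. The function name and return strings are hypothetical and serve only to illustrate the definitions; the purine and pyrimidine assignments are those given earlier in the text.

```python
# Base classes as given in the text: A and G are purines; C and T are pyrimidines.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base substitution relative to a reference base.

    Purine-to-purine or pyrimidine-to-pyrimidine is a transition;
    a change between the two classes is a transversion.
    """
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, G, C, T")
    if ref == alt:
        return "no change"
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"
```

Under these definitions, A to G and C to T are transitions, whereas A to C and G to T are transversions.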
The sequence of base pairs in DNA codes for the production of proteins. In particular, a DNA sequence in the exon portion of a DNA chain codes for the corresponding amino acid sequence in a protein. Therefore, a mutation in DNA sequence may result in an alteration in the amino acid sequence of a protein. Such an alteration in the amino acid sequence may be completely benign or may inactivate a protein or alter its function to be life threatening or fatal. On the other hand, mutations in an intron portion of a DNA chain would not be expected to have a biological effect since an intron section does not contain code for protein production. Nevertheless, mutation detection in an intron section may be important, for example, in a forensic investigation.
Detection of mutations is, therefore, of great interest and importance in diagnosing diseases, understanding the origins of disease and the development of potential treatments. Detection of mutations and identification of similarities or differences in DNA samples is also of critical importance in increasing the world food supply by developing disease-resistant and/or higher yielding crop strains, in forensic science, in the study of evolution and populations, and in scientific research in general (Guyer, et al., Proc. Natl. Acad. Sci. USA 92:10841 (1995); Cotton, TIG 13:43 (1997)).
Alterations in a DNA sequence which are benign or have no negative consequences are sometimes called "polymorphisms". In the present invention, any alterations in the DNA sequence, whether they have negative consequences or not, are denoted as "mutations". It is to be understood that the method and system of this invention have the capability to detect mutations regardless of biological effect or lack thereof. For the sake of simplicity, the term "mutation" will be used throughout to mean an alteration in the base sequence of a DNA strand compared to a reference strand (generally, but not necessarily, wild type). It is to be understood that in the context of this invention, the term "mutation" includes the term "polymorphism" or any other similar or equivalent term of art.
A need exists for an accurate and reproducible analytical method for mutation detection which is easy to implement. Ideally, the method would be automated and would provide high throughput sample screening with a minimum of operator attention.
Prior to this invention, size-based analysis of DNA samples relied upon separation by gel electrophoresis (GEP). Capillary gel electrophoresis (CGE) has also been used to separate and analyze mixtures of DNA fragments having different lengths, e.g., the different lengths resulting from restriction enzyme cleavage. However, these methods cannot distinguish DNA fragments which differ in base sequence, but have the same base pair length. Therefore, gel electrophoresis cannot be used directly for mutation detection. This is a serious limitation of GEP.
Gel-based analytical methods, such as denaturing gradient gel electrophoresis (DGGE) and denaturing gradient gel capillary electrophoresis (DGGC), can detect mutations in heteroduplex DNA strands under "partially denaturing" conditions. The phenomenon of "partial denaturation" is well known in the art and occurs because a heteroduplex will denature at the site of base pair mismatch at a lower temperature than is required to denature the remainder of the strand. However, these gel-based techniques are operationally difficult to implement and require highly skilled personnel. In addition, the analyses are lengthy and require a great deal of set up time. A denaturing capillary gel electrophoresis analysis is limited to relatively small fragments. Separation of a 90 base pair fragment takes more than 30 minutes. A gradient denaturing gel runs overnight and requires about a day of set up time. Additional deficiencies of gradient gels are that isolation of separated DNA fragments requires specialized techniques and equipment, and that analysis conditions must be experimentally developed for each fragment (Laboratory Methods for the Detection of Mutations and Polymorphisms, ed. G. R. Taylor, CRC Press, 1997). The long analysis time of the gel methodology is further exacerbated by the fact that the movement of DNA fragments in a gel is inversely proportional, in a geometric relationship, to their length. Therefore, the analysis time of longer DNA fragments can often be untenable.
Recently, an HPLC based ion pairing chromatographic method was introduced to effectively separate mixtures of double-stranded polynucleotides in general, and DNA in particular, wherein the separations are based on base pair length. This method is described in the following references which are incorporated herein in their entireties: U.S. Pat. No. 5,795,976 (1998) to Oefner; U.S. Pat. No. 5,585,236 (1996) to Bonn; Huber, et al., Chromatographia 37:653 (1993); Huber, et al., Anal. Biochem. 212:351 (1993).
As the use and understanding of HPLC developed, it became apparent that when HPLC analyses were carried out at a partially denaturing temperature, i.e., a temperature sufficient to denature a heteroduplex at the site of base pair mismatch, homoduplexes could be separated from heteroduplexes having the same base pair length (Hayward-Lester, et al., Genome Research 5:494 (1995); Underhill, et al., Proc. Natl. Acad. Sci. USA 93:193 (1996); Doris, et al., DHPLC Workshop, Stanford University, (1997)). These references and the references contained therein are incorporated herein in their entireties. Thus, the use of Denaturing HPLC (DHPLC) was applied to mutation detection (Underhill, et al., Genome Research 7:996 (1997); Liu, et al., Nucleic Acid Res. 26:1396 (1998)).
DHPLC can separate heteroduplexes that differ by as little as one base pair. However, in certain cases, separations of homoduplexes and heteroduplexes are poorly resolved. Artifacts and impurities can interfere with the interpretation of DHPLC separation chromatograms in the sense that it may be difficult to distinguish between an artifact or impurity and a putative mutation (Underhill, et al., Genome Res. 7:996 (1997)). The presence of mutations may even be missed entirely (Liu, et al., Nucleic Acid Res. 26:1396 (1998)). The references cited above and the references contained therein are incorporated in their entireties herein.
The accuracy, reproducibility, convenience and speed of DNA fragment separations and mutation detection assays based on HPLC have been compromised in the past because of HPLC system related problems. This invention addresses these problems and applies the term "Matched Ion Polynucleotide Chromatography" (MIPC) to the separation method and system which is used in connection with the present invention. When used under partially denaturing conditions, MIPC is defined herein as Denaturing Matched Ion Polynucleotide Chromatography (DMIPC).
MIPC systems (WAVE® DNA Fragment Analysis System, Transgenomic, Inc., San Jose, Calif.) are equipped with computer controlled ovens which enclose the columns and column inlet areas. Non-limiting examples of key distinguishing features of MIPC include a) the use of hardware having liquid contacting surfaces which do not release multivalent cations therefrom, b) protection of liquid contacting surfaces from exogenous multivalent cations by means of cartridges containing multivalent cation capture resins, c) the use of a special washing protocol for MIPC separation media, d) automated selection of an optimum solvent gradient for elution of a specific base length DNA fragment, and e) automated determination of the temperature required to effect partial denaturation of a heteroduplex when MIPC is used under partially denaturing conditions (DMIPC) for mutation detection.
The present invention can be used in the separation of RNA or of double- or single-stranded DNA. For purposes of simplifying the description of the invention, and not by way of limitation, the separation of double-stranded DNA will be described in the examples herein, it being understood that all polynucleotides are intended to be included within the scope of this invention. The invention applies to size-dependent separations and denaturing separations by MIPC. Both these separations can include separations of DNA fragments having nonpolar tags.
Important aspects of DNA separation and mutation detection by HPLC and DHPLC which have not been heretofore addressed comprise a) the treatment of, and materials comprising, chromatography system components, b) the treatment of, and materials comprising, separation media, c) solvent pre-selection to minimize methods development time, d) optimum temperature pre-selection to effect partial denaturation of a heteroduplex during HPLC, and e) optimization of DHPLC for automated high throughput mutation detection screening assays. These factors, which comprise MIPC/DMIPC but not HPLC/DHPLC, are essential when using chromatographic methods in order to achieve unambiguous, accurate, reproducible and high throughput DNA separations and mutation detection results. A comprehensive description of MIPC systems and separation media, including the critical importance of maintaining an environment which is free of multivalent cations, is presented in U.S. Pat. No. 5,772,889 (1998) to Gjerde and U.S. patent applications Ser. No. 09/129,105 filed Aug. 4, 1998; Ser. No. 09/081,040 filed May 18, 1998 (now U.S. Pat. No. 6,017,457); Ser. No. 09/080,547 filed May 18, 1998; Ser. No. 09/058,580 filed Apr. 10, 1998; Ser. No. 09/058,337 filed Apr. 10, 1998; Ser. No. 09/065,913 filed Apr. 24, 1998 (now U.S. Pat. No. 5,986,085); Ser. No. 09/039,061 filed Mar. 13, 1998; Ser. No. 09/081,039 filed May 18, 1998. These references and the references contained therein are incorporated in their entireties herein.
All of the liquid chromatographic separations discussed herein above comprise gradient elution, i.e., they utilize a multi-component mobile phase wherein the concentration of the driving component, usually an organic solvent, is increased during the course of the chromatography. This approach reduces the time required to complete an analysis. However, the separation of mixture components can be compromised. Efforts have been made to improve the resolving power of MIPC. These efforts have centered on improving the gradient process, changing the column particle size, or changing the column length. However, only small improvements have been achieved with these efforts. Therefore, there exists a need to improve the separation of poorly resolved or close running components. Such improvement is especially useful when it is important to isolate a component in pure form, as for example, for PCR amplification, sequencing, mutation detection, and numerous other applications.
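As a purely illustrative sketch of gradient elution as described above, the concentration of the driving component can be modeled as a function of time. A linear profile is assumed here for simplicity; the function name, the parameter names and the example percentages are hypothetical and are not taken from any of the cited methods.

```python
def percent_driving_component(t_min: float, start_pct: float,
                              end_pct: float, duration_min: float) -> float:
    """Percentage of the driving component (e.g., organic solvent) at time t_min.

    Assumes a single linear ramp from start_pct to end_pct over duration_min
    minutes, held constant before the ramp starts and after it ends.
    """
    if duration_min <= 0:
        raise ValueError("duration_min must be positive")
    if t_min <= 0:
        return start_pct
    if t_min >= duration_min:
        return end_pct
    return start_pct + (end_pct - start_pct) * t_min / duration_min
```

With a hypothetical ramp from 35% to 55% over 10 minutes, the midpoint value is `percent_driving_component(5, 35, 55, 10)`, i.e., 45%. A steeper ramp shortens the analysis, which is consistent with the trade-off noted above between analysis time and separation of mixture components.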
Many tasks within molecular biology require prior purification of nucleic acids. Current strategies involve the use of gel electrophoresis or solid-phase extraction (typically on silica gel or an anion exchange resin). While these procedures lead to overall improvements in nucleic acid purity relative to original unpurified materials, they suffer from negative characteristics. See Hecker, Karl et al, "Optimization of cloning efficacy by pre-cloning DNA Fragment Analysis", Biotechniques, 26:216-222 February, 1999 which shows the limitations of the prior art separation methods and the superior purity obtained with the methods of this invention. Gel electrophoresis suffers from a lack of automation, incomplete separation of distinct fragments, as well as incomplete recovery of fragments. While solid-phase extraction procedures lend themselves to automation, they also can suffer from incomplete separation of distinct fragments and from incomplete recovery of fragments.
Many procedures for investigating or evaluating genetic materials require enzymatic cleavage of the materials and isolating a particular DNA fragment or range of DNA fragments from the DNA fragment mixture produced by the cleavage. These isolations are particularly important in the early diagnosis of certain diseases, especially cancer. In the case of cancer and other diseases of genetic origin, early detection often depends on the availability of an appropriate analytical method which can accurately and reliably detect a mutation in DNA samples.
As described in copending Sklar et al U.S. patent application Attorney Docket No. P-217 filed May 13, 1999, this problem is exacerbated by the fact that such samples may contain a very small population of cells containing mutant DNA in the presence of a very large predominantly normal cell population containing, for example, wild type DNA. Before the development of this invention, any separation techniques which were theoretically capable of detecting mutant DNA in the presence of wild type would fail because the concentration of mutant DNA was simply too low to be detected relative to wild type. That is to say, the concentration of mutant DNA may be too low to detect in absolute terms. Alternatively, the concentration of mutant DNA may be sufficient to detect, but will be completely obscured because of the very large relative amount of wild type in the sample.
Increasing the amount of mutant DNA by PCR amplification of the sample would not solve the problem described above. The mutant and wild type DNA in the sample are very similar. In fact, their sequence may differ by only a single base pair. Therefore, the primers used to amplify the mutant DNA would also amplify the wild type since both are present in the sample. As a result, the relative amounts of mutant and wild type DNA would not change.
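The argument above can be checked with simple arithmetic. The sketch below assumes an idealized amplification in which mutant and wild type templates are copied with the same per-cycle efficiency, which is the premise stated in the text (the same primers amplify both). The function names and the example copy numbers are hypothetical.

```python
from typing import Tuple


def amplify(mutant_copies: float, wild_type_copies: float,
            cycles: int, efficiency: float = 1.0) -> Tuple[float, float]:
    """Return (mutant, wild type) copy numbers after idealized PCR.

    Assumption: both templates grow by the same factor (1 + efficiency)
    per cycle, where efficiency = 1.0 corresponds to perfect doubling.
    """
    factor = (1.0 + efficiency) ** cycles
    return mutant_copies * factor, wild_type_copies * factor


def mutant_fraction(mutant_copies: float, wild_type_copies: float) -> float:
    """Fraction of all copies that carry the mutation."""
    return mutant_copies / (mutant_copies + wild_type_copies)
```

Starting from, say, 1 mutant copy among 999 wild type copies, the mutant fraction is 0.001 before amplification and remains 0.001 after any number of cycles, because both copy numbers are multiplied by the same factor. The absolute amount of mutant DNA rises, but its amount relative to wild type does not.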
For example, as described in the copending Sklar et al application (supra), following radiation or chemotherapy, cancer patients are monitored for the presence of residual cancer cells to determine whether the patients are in remission. The effectiveness of these treatments could be monitored if small levels of residual cancer cells could be detected in a predominantly large wild type population. Traditionally, the remission status is assessed by a pathologist who conducts histological examination of tissue samples. However, these visual methods are largely qualitative, time-consuming, and costly. At best, the sensitivity of these methods permits detection of about 1 cancerous cell in 100 cells.
Analysis of DNA samples has historically been done using gel electrophoresis. Capillary electrophoresis has also been used to separate and analyze mixtures of DNA. However, these methods cannot distinguish point mutations from homoduplexes having the same base pair length.
In addition to the deficiencies of denaturing gel methods mentioned above, these techniques are not always reproducible or accurate since the preparation of a gel slab and running an analysis can be highly variable from one operator to another. As a result, the mobility of a DNA fragment is different on different gel slabs and even in one lane, compared to another on the same gel slab. The problems and deficiencies of gel-based DNA separation methods are well known in the art and are described in the published literature, e.g., G. R. Taylor, editor, LABORATORY METHODS FOR THE DETECTION OF MUTATIONS AND POLYMORPHISMS, CRC Press (1997).