DNA molecules are polymers comprising sub-units called deoxynucleotides. The four deoxynucleotides found in DNA comprise a common cyclic sugar, deoxyribose, which is covalently bonded to any of the four bases, adenine (a purine), guanine (a purine), cytosine (a pyrimidine), and thymine (a pyrimidine), referred to herein as A, G, C, and T respectively. A phosphate group links a 3′-hydroxyl of one deoxynucleotide with the 5′-hydroxyl of another deoxynucleotide to form a polymeric chain. In double stranded DNA, two strands are held together in a helical structure by hydrogen bonds between what are called complementary bases. The complementarity of bases is determined by their chemical structures. In double stranded DNA, each A pairs with a T and each G pairs with a C, i.e., a purine pairs with a pyrimidine. Ideally, DNA is replicated in exact copies by DNA polymerases during cell division in the human body or in other living organisms. DNA strands can also be replicated in vitro by means of the Polymerase Chain Reaction (PCR). Sometimes, exact replication fails and an incorrect base pairing occurs. Further replication of the new strand produces double stranded DNA offspring containing a heritable difference in the base sequence from that of the parent. Such heritable changes in base pair sequence are called mutations.
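The pairing rule described above (A with T, G with C) can be sketched in a few lines of Python. This is a minimal illustration only; the names `PAIRS`, `complement`, and `reverse_complement` are hypothetical and are not part of the original text.

```python
# Watson-Crick pairing as stated in the text: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a strand, position by position."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the complementary strand written in its own 5'-to-3' direction."""
    return complement(strand)[::-1]
```

For example, `complement("AGCT")` gives `"TCGA"`. Each purine (A, G) is matched with a pyrimidine (T, C), as the text states.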
As used herein, double stranded DNA is referred to as a duplex. When a base sequence of one strand is entirely complementary to a base sequence of the other strand, the duplex is called a homoduplex. When a duplex contains at least one base pair which is not complementary, the duplex is called a heteroduplex. A heteroduplex is formed during DNA replication when an error is made by a DNA polymerase enzyme and a non-complementary base is added to a polynucleotide chain being replicated. Further replications of a heteroduplex will, ideally, produce homoduplexes which are heterozygous, i.e., these homoduplexes will have an altered sequence compared to the original parent DNA strand. When the parent DNA has a sequence which predominates in a naturally occurring population, the sequence is generally referred to as a “wild type.”
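The homoduplex and heteroduplex definitions above can be expressed as a simple check. The following sketch assumes the two strands are supplied already aligned position by position (the second strand written 3′ to 5′); the function names are hypothetical.

```python
# Complementary bases as defined in the text.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_positions(top: str, bottom: str) -> list:
    """Return the 0-based positions at which the aligned bases are not complementary."""
    if len(top) != len(bottom):
        raise ValueError("strands must be aligned and of equal length")
    return [
        i
        for i, (a, b) in enumerate(zip(top.upper(), bottom.upper()))
        if PAIRS.get(a) != b
    ]


def classify_duplex(top: str, bottom: str) -> str:
    """A duplex with at least one non-complementary pair is a heteroduplex."""
    return "heteroduplex" if mismatch_positions(top, bottom) else "homoduplex"
```

With `top = "AGCT"`, the aligned strand `"TCGA"` is fully complementary, whereas `"TCAA"` has a C opposite an A at position 2 and is therefore classified as a heteroduplex.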
Many different types of DNA mutations are known. Examples of DNA mutations include, but are not limited to, “point mutations” or “single base pair mutations”, in which an incorrect base pairing occurs. The most common point mutations comprise “transitions”, in which one purine is replaced by another purine or one pyrimidine by another pyrimidine, and “transversions”, in which a purine is substituted for a pyrimidine (or vice versa). Point mutations also comprise mutations in which a base is added to or deleted from a DNA chain. Such “insertions” or “deletions” are also known as “frameshift mutations”. Although they occur with less frequency than point mutations, larger mutations affecting multiple base pairs can also occur and may be important. A more detailed discussion of mutations can be found in U.S. Pat. No. 5,459,039 to Modrich (1995), and U.S. Pat. No. 5,698,400 to Cotton (1997).
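The distinction between the two kinds of base substitution can be illustrated with a short Python sketch. The function name `classify_substitution` is hypothetical; the purine and pyrimidine assignments are those given earlier in the text.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, G, C, T")
    if ref == alt:
        return "no change"
    # Same chemical class (purine->purine or pyrimidine->pyrimidine): transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"
```

For example, A to G and C to T are transitions, while A to C and G to T are transversions.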
The sequence of base pairs in DNA is a code for the production of proteins. In particular, a DNA sequence in the exon portion of a DNA chain codes for a corresponding amino acid sequence in a protein. Therefore, a mutation in a DNA sequence may result in an alteration in the amino acid sequence of a protein. Such an alteration in the amino acid sequence may be completely benign or may inactivate a protein or alter its function to be life threatening or fatal. On the other hand, mutations in an intron portion of a DNA chain would not be expected to have a biological effect since an intron section does not contain code for protein production. Nevertheless, mutation detection in an intron section may be important, for example, in a forensic investigation.
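How a single base change in a coding sequence may or may not alter the encoded amino acid can be sketched as follows. The codon table below is a deliberately small subset of the standard genetic code, chosen for illustration; the function name and the labels "silent", "missense", and "nonsense" are conventional terms that do not appear in the text above.

```python
# A small subset of the standard genetic code (DNA coding-strand codons).
CODON_TABLE = {
    "GAG": "Glu",
    "GAA": "Glu",
    "GTG": "Val",
    "TAG": "Stop",
}


def effect_of_codon_change(ref_codon: str, alt_codon: str) -> str:
    """Describe the effect of replacing one codon with another."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "silent"    # same amino acid, protein sequence unchanged
    if alt_aa == "Stop":
        return "nonsense"  # premature stop, protein truncated
    return "missense"      # a different amino acid is incorporated
```

In this sketch, GAG to GAA leaves the amino acid unchanged, GAG to GTG replaces Glu with Val, and GAG to TAG introduces a stop codon.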
Detection of mutations is therefore of great importance in diagnosing diseases, understanding the origins of disease, and the development of potential treatments. Detection of mutations and identification of similarities or differences in DNA samples is also of critical importance in increasing the world food supply by developing disease resistant and/or higher yielding crop strains, in forensic science, in the study of evolution and populations, and in scientific research in general (Guyer, et al., Proc. Natl. Acad. Sci. USA 92:10841 (1995); Cotton, TIG 13:43 (1997)).
Alterations in a DNA sequence which are benign or have no negative consequences are sometimes called “polymorphisms”. For the purposes of this application, all alterations in the DNA sequence, whether they have negative consequences or not, are defined herein as “mutations”. For the sake of simplicity, the term “mutation” is used herein to mean an alteration in the base sequence of a DNA strand compared to a reference strand (generally, but not necessarily, a wild type). As used herein, the term “mutation” includes the term “polymorphism” or any other similar or equivalent term of art.
In the prior art, size based analysis of DNA samples is accomplished by standard gel electrophoresis (GEP). Capillary gel electrophoresis (CGE) has also been used to separate and analyze mixtures of DNA fragments having different lengths, e.g., the digests produced by restriction enzyme cleavage of DNA samples. However, these methods cannot distinguish DNA fragments which have the same base pair length but have a differing base sequence. This is a serious limitation of GEP.
Mutations in heteroduplex DNA strands under “partially denaturing” conditions can be detected by gel based analytical methods such as denaturing gradient gel electrophoresis (DGGE) and denaturing gradient gel capillary electrophoresis (DGGC). The term “partially denaturing” is defined to be the separation of a mismatched base pair (caused by temperature, pH, solvent, or other factors) in a DNA double strand while other portions of the double strand remain intact, that is, are not separated. The phenomenon of “partial denaturation” occurs because a heteroduplex will denature at the site of base pair mismatch at a lower temperature than is required to denature the remainder of the strand.
These gel-based techniques are difficult and require highly skilled laboratory scientists. In addition, each analysis requires a lengthy setup and separation. A denaturing capillary gel electrophoresis analysis can only be made of relatively small fragments. A separation of a 90 base pair fragment takes more than 30 minutes. A gradient denaturing gel runs overnight and requires about a day of set up time. Additional deficiencies of gradient gels are the difficulty of adapting these procedures to isolate separated DNA fragments (which requires specialized techniques and equipment), and establishing the conditions required for the isolation. The conditions must be experimentally developed for each fragment (Laboratory Methods for the Detection of Mutations and Polymorphisms, ed. G. R. Taylor, CRC Press, 1997). The long analysis time of the gel methodology is further exacerbated by the fact that the movement of DNA fragments in a gel is inversely proportional, in a geometric relationship, to the length of the DNA fragments. Therefore, the analysis time of longer DNA fragments can often be untenable.
In addition to the deficiencies of denaturing gel methods mentioned above, these techniques are not always reproducible or accurate since the preparation of a gel and running an analysis can be highly variable from one operator to another.
Separation of double stranded nucleic acid fragment mixtures by GEP or DGGE produces a linear array of bands, each band in the array representing a separated double stranded nucleic acid component of that mixture. Since many mixtures are typically separated and analyzed simultaneously in separate lanes on the same gel slab, a parallel series of such linear arrays of bands is produced. Bands are often curved rather than straight, their mobility and shape can change across the width of the gel, and lanes and bands can mix with each other. Such inaccuracies stem from the lack of uniformity and homogeneity of the gel bed, electroendosmosis, thermal gradient and diffusion effects, as well as a host of other factors. Inaccuracies of this sort are well known in the GEP art and can lead to serious distortions and inaccuracies in the display of the separation results.
In addition, the band display data obtained from GEP separations is not quantitative or accurate because of the uncertainties related to the shape and integrity of the bands. True quantitation of linear band array displays produced by GEP separations cannot be achieved, even when the linear band arrays are scanned with a detector and the resulting data is integrated, because the linear band arrays are scanned only across the center of the bands. Since the detector only sees a small portion of any given band and the bands are not uniform, the results produced by the scanning method are not accurate and can even be misleading.
Methods for visualizing GEP and DGGE separations, such as staining or autoradiography, are also cumbersome and time consuming. In addition, separation data is in hard copy form and cannot be electronically stored for easy retrieval and comparison, nor can it be enhanced to improve the visualization of close separations.
Separation of double-stranded deoxyribonucleic acid (dsDNA) fragments and detection of DNA mutations is of great importance in medicine, in the physical and social sciences, and in forensic investigations. The Human Genome Project is providing an enormous amount of genetic information and yielding new information for evaluating the links between mutations and human disorders (Guyer, et al., Proc. Natl. Acad. Sci. USA 92:10841 (1995)). For example, the ultimate source of disease is described by genetic code that differs from the wild type (Cotton, TIG 13:43 (1997)). Understanding the genetic basis of disease can be the starting point for a cure. Similarly, determination of differences in genetic code can provide powerful and perhaps definitive insights into the study of evolution and populations (Cooper, et al., Human Genetics 69:201 (1985)). Understanding these and other issues related to genetic coding requires the ability to identify anomalies, i.e., mutations, in a DNA fragment relative to the wild type.
Traditional chromatography is a separation process based on partitioning of mixture components between a “stationary phase” and a “mobile phase”. The stationary phase is provided by the surface of solid materials which can comprise many different materials in the form of particles or passageway surfaces of cellulose, silica gel, coated silica gel, polymer beads, polysaccharides, and the like. These materials can be supported on solid surfaces such as on glass plates or packed in a column. The mobile phase can be a liquid or a gas in gas chromatography. This invention relates to liquid mobile phases.
The separation principles are generally the same regardless of the materials used, the form of the materials, or the apparatus used. The different components of a mixture have different respective degrees of solubility in the stationary phase and in the mobile phase. Therefore, as the mobile phase flows over the stationary phase, there is an equilibrium in which the sample components are partitioned between the stationary phase and the mobile phase. As the mobile phase passes through the column, the equilibrium is constantly shifted in favor of the mobile phase. This occurs because the equilibrium mixture, at any time, sees fresh mobile phase and partitions into the fresh mobile phase. As the mobile phase is carried down the column, the mobile phase sees fresh stationary phase and partitions into the stationary phase. Eventually, at the end of the column, there is no more stationary phase and the sample simply leaves the column in the mobile phase.
A separation of mixture components occurs because the mixture components have slightly different affinities for the stationary phase and/or solubilities in the mobile phase, and therefore have different partition equilibrium values. Therefore, the mixture components pass down the column at different rates.
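The relationship between partition equilibrium and migration rate can be illustrated with the standard retention relationship of partition chromatography, t_R = t_0 (1 + k), where k is the retention factor. The sketch below is illustrative only; it assumes k is the product of a partition coefficient and a phase ratio defined as stationary phase volume over mobile phase volume, and all parameter names and numeric values are hypothetical.

```python
def retention_time(t0_min: float, partition_coefficient: float, phase_ratio: float) -> float:
    """Retention time of a component in partition chromatography.

    t0_min: time for an unretained component to pass through the column.
    partition_coefficient: concentration in stationary phase / concentration in mobile phase.
    phase_ratio: stationary phase volume / mobile phase volume (assumed definition).
    """
    k = partition_coefficient * phase_ratio  # retention factor
    return t0_min * (1.0 + k)


# Two components with different partition coefficients leave the column at different times.
t_a = retention_time(1.0, 8.0, 0.25)   # k = 2 -> 3.0 minutes
t_b = retention_time(1.0, 16.0, 0.25)  # k = 4 -> 5.0 minutes
```

The component with the greater affinity for the stationary phase (the larger partition coefficient) is retained longer, which is the basis of the separation described above.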
Since chromatographic separations depend on interactions with the stationary phase, it is known that a separation can be improved by increasing the surface area of the stationary phase.
In traditional liquid chromatography, a glass column is packed with stationary phase particles and mobile phase passes through the column, pulled only by gravity. However, when smaller stationary phase particles are used in the column, the pull of gravity alone is insufficient to cause the mobile phase to flow through the column. Instead, pressure must be applied. However, glass columns can only withstand about 200 psi. Passing a mobile phase through a column packed with 5 micron particles requires a pressure of about 2000 psi or more to be applied to the column. (5 to 10 micron particles are standard today. Particles smaller than 5 microns are used for especially difficult separations or certain special cases.) This process is denoted by the term “high pressure liquid chromatography” or HPLC.
HPLC has enabled the use of a far greater variety of types of particles used to separate a greater variety of chemical structures than was possible with large particle gravity columns. The separation principle, however, is still the same.
An HPLC-based ion pairing chromatographic method was recently introduced to effectively separate mixtures of double stranded polynucleotides in general, and DNA in particular, wherein the separations are based on base pair length (U.S. Pat. No. 5,585,236 to Bonn (1996); Huber, et al., Chromatographia 37:653 (1993); Huber, et al., Anal. Biochem. 212:351 (1993)). These references and the references contained therein are incorporated herein in their entireties. The term “Matched Ion Polynucleotide Chromatography” (MIPC) is defined herein and applied to this method because the mechanism of separation was found to be based on binding and release of the DNA from the separation surfaces rather than traditional partitioning. MIPC separates DNA fragments on the basis of base pair length and is not limited by the deficiencies associated with gel based separation methods.
Matched Ion Polynucleotide Chromatography, as used herein, is defined as a process for separating single and double stranded polynucleotides using non-polar separation media, wherein the process uses a counter-ion agent and an organic solvent to release the polynucleotides from the separation media. MIPC separations can be complete in less than 10 minutes, and frequently in less than 5 minutes.
The MIPC separation process differs from the traditional HPLC separation processes in that the separation is not achieved by a series of equilibrium separations between the mobile phase and the stationary phase as the liquids pass through the column. Instead, the sample is fed into the column using a solvent strength which permits the sample dsDNA to bind to the separation media surface. Strands of a specific base pair length are removed from the stationary phase surface and are carried down the column by a specific solvent concentration. By passing an increasing gradient of solvent through the sample, successively larger base pair lengths are removed in succession and passed through the column. The separation is not column length or stationary phase area dependent.
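The bind-and-release behavior described above can be sketched with a simple model of a linear solvent gradient. This is an illustration under stated assumptions, not a description of any actual instrument: each fragment length is assigned a hypothetical solvent concentration at which it is released, the column transit time is ignored, and the function name and all numeric values are invented for the example.

```python
def elution_times(release_pct: dict, start_pct: float, end_pct: float, duration_min: float) -> dict:
    """Time at which a linear gradient reaches each fragment's release concentration.

    release_pct maps fragment length (bp) to the solvent percentage at which
    fragments of that length are released from the separation media
    (hypothetical values supplied by the caller).
    """
    slope = (end_pct - start_pct) / duration_min  # percent solvent per minute
    if slope <= 0:
        raise ValueError("the gradient must be increasing")
    times = {
        length: (pct - start_pct) / slope
        for length, pct in release_pct.items()
        if start_pct <= pct <= end_pct  # fragments outside the gradient range never elute
    }
    # Report fragments in order of release.
    return dict(sorted(times.items(), key=lambda item: item[1]))


# Hypothetical release concentrations that increase with base pair length.
times = elution_times({400: 60.0, 100: 40.0, 200: 50.0}, 35.0, 65.0, 6.0)
```

Under these assumptions, the fragments are released in order of increasing length (100, 200, then 400 base pairs) as the solvent concentration rises, consistent with the behavior described in the text.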
This MIPC process is temperature sensitive, and precise temperature control is particularly important in the MIPC separation processes.
As the use and understanding of MIPC developed, it was discovered that when MIPC analyses were carried out at a partially denaturing temperature, i.e., a temperature sufficient to denature a heteroduplex at the site of base pair mismatch, homoduplexes could be separated from heteroduplexes having the same base pair length (U.S. Pat. No. 5,795,976; Hayward-Lester, et al., Genome Research 5:494 (1995); Underhill, et al., Proc. Natl. Acad. Sci. USA 93:193 (1996); Doris, et al., DHPLC Workshop, Stanford University, (1997)). These references and the references contained therein are incorporated herein in their entireties. Thus, the use of Denaturing HPLC (DHPLC) was applied to mutation detection (Underhill, et al., Genome Research 7:996 (1997); Liu, et al., Nucleic Acid Res. 26:1396 (1998)).
DHPLC can separate heteroduplexes that differ by as little as one base pair. However, separations of homoduplexes and heteroduplexes can be poorly resolved. Artifacts and impurities can also interfere with the interpretation of DHPLC separation chromatograms in the sense that it may be difficult to distinguish between an artifact or impurity and a putative mutation (Underhill, et al., Genome Res. 7:996 (1997)). The presence of mutations may even be missed entirely (Liu, et al., Nucleic Acid Res. 26:1396 (1998)).
Important aspects of DNA separation and mutation detection by HPLC and DHPLC include the treatment of materials comprising chromatography system components; the treatment of materials comprising separation media; solvent pre-selection to minimize methods development time; optimum temperature pre-selection to effect partial denaturation of a heteroduplex during MIPC; and optimization of DHPLC for automated high throughput mutation detection screening assays. These factors are essential in order to achieve unambiguous, accurate, reproducible and high throughput DNA separations and mutation detection results.
The application of Matched Ion Polynucleotide Chromatography (MIPC) under the partially denaturing conditions used for separating heteroduplexes from homoduplexes in mutation detection is hereafter referred to as DMIPC. In DMIPC, precise temperature control is required for maintaining both mobile and stationary phases at a partially denaturing temperature, that is, a temperature at which mismatched DNA present at the mutation site of a heteroduplex strand will denature but at which the matched DNA will remain bound into the double strand.
Certain components and operations of HPLC separation systems have been partially automated to facilitate the traditional partition-based separations. An example is the HSM control system provided by Hitachi with their HPLC chromatography apparatus. In using these controls, a chromatography expert manually inputs detailed instructions to an autosampler to obtain a specific sample for separation, detailed instructions to proportioning valves to effect a desired solvent gradient, and specific temperature instructions to a column oven. The control system automatically implements these instructions to effect an HPLC separation.
The MIPC systems have introduced system operation requirements which cannot be satisfied with existing control systems. Sample trays with increased numbers of wells have been introduced, requiring corresponding detailed autosampler instructions for extracting a separation aliquot of each of the samples. More complex and varied solvent concentration and gradient instructions are required. More precise temperature control is essential, and in some operations, aliquots from the same sample are to be separated at different preset temperatures.
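The kind of instruction set these requirements imply can be sketched as a simple data structure: one record per separation, combining the sample well, the oven temperature, and the solvent gradient. The sketch below is hypothetical throughout; the class name, field names, well labels, temperatures, and gradient values are assumptions made for illustration and do not describe any existing control system.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class SeparationRun:
    """One separation: which well to sample, at what temperature, with what gradient."""
    well: str
    oven_temp_c: float
    gradient: Tuple[Tuple[float, float], ...]  # (time in minutes, percent organic solvent)


def plan_runs(
    wells: Sequence[str],
    temps_c: Sequence[float],
    gradient: Sequence[Tuple[float, float]],
) -> List[SeparationRun]:
    """Schedule one aliquot from every well at every preset temperature."""
    return [
        SeparationRun(well, temp, tuple(gradient))
        for well in wells
        for temp in temps_c
    ]


# Two wells, each separated at two hypothetical preset temperatures.
runs = plan_runs(["A1", "A2"], [56.0, 58.0], [(0.0, 35.0), (6.0, 65.0)])
```

Generating the run list programmatically, rather than entering each instruction by hand, reflects the need described above for per-well autosampler instructions and for separating aliquots of the same sample at different temperatures.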
The MIPC system can be used to isolate pure fractions, each having a single base pair size; these are needed for PCR or cloning amplification techniques. This requires use of a fragment collector operating in coordination with the MIPC separation process. Furthermore, the expanding application of MIPC separation processes requires the system be operable by a trained technician rather than a chromatography expert.
In addition, a need exists for an HPLC system which can separate DNA fragments based on size differences, and can also separate DNA having the same length but differing in base pair sequence (mutations from wild type), in an accurate, reproducible, reliable manner. Such a system should be automated and efficient, should be adaptable to routine high throughput sample screening applications, and should provide high throughput sample screening with a minimum of operator attention.
These new requirements are not satisfied by the currently available equipment and associated control systems.