1. Field of the Invention
This invention relates to novel solvents for the anion-exchange separation of nucleic acid fragments. These solvents result in the marked improvement of chromatographic methods for the analysis of nucleic acids in molecular biology, analytical biochemistry, clinical chemistry, industrial and environmental microbiology, and molecular genetics.
2. Description of Related Art
A single strand of DNA consists of a chain of deoxyribose-phosphate monomers covalently linked via phosphodiester bonds and also contains a single aromatic heterocyclic "base" covalently attached to each deoxyribose ring. In aqueous solvents of pH greater than about 2, the very hydrophilic sugar-phosphate polymer backbone contributes one negative charge for each phosphodiester plus one or two negative charges for every terminal phosphomonoester. Therefore, DNA is a polyanion; the net negative charge is almost perfectly proportional to chain length. However, the bases are very hydrophobic, so that a single strand has mixed hydrophilic-hydrophobic character. Single-stranded DNA adopts a random-coil conformation which fluctuates constantly and rapidly and has a time-averaged spherical shape. Within the sphere, random hydrophobic aromatic stacking interactions and base pairing hydrogen bonds tend to draw together different parts of the chain, and electrostatic repulsion of the phosphate groups tends to drive apart different parts of the structure. The balance between these opposing tendencies and therefore the average spherical diameter depend on temperature and solvent composition, especially ionic strength.
Most naturally occurring DNA is double-stranded, consisting of two single strands of similar or identical length which interact noncovalently to form a double helix in which the four commonly occurring bases, adenine (A), thymine (T), guanine (G), and cytosine (C), exist in complementary sequences on the two interacting strands, such that each A on one strand is hydrogen-bonded to a T on the other strand (and vice versa) and each G on one strand is hydrogen-bonded to a C on the other strand (and vice versa). This base-paired structure sequesters the hydrophobic groups inside the double helix along its axis and away from the solvent; the two helical sugar-phosphate chains spiral down the outside of the double-stranded structure, presenting a hydrophilic, polyanionic face to the solvent. Two helical grooves of different width, "major" and "minor," separate the two sugar-phosphate chains. The grooves are large enough to bind water molecules and solvent cations. The double-helical structure also stiffens double-stranded DNA so that segments less than several hundred base pairs are effectively rigid and linear rather than flexibly coiled into a sphere. On the length scale of many hundreds to thousands of base pairs, double-stranded DNA also is coiled but much more loosely than single-stranded DNA. Recently it has become clear that certain sequences in double-stranded DNA induce curvature in the double helix so that it no longer is linear on the length scale of tens to hundreds of base pairs (reviewed by Hagerman, 1990, Annual Reviews of Biochemistry 59:755-781).
Double-stranded DNA can be reversibly "melted" to yield two chains of single-stranded DNA by heating to temperatures in the approximate range of 50°-100° C. G-C base pairs tend to melt at higher temperatures than A-T base pairs because their base pairing interactions are stronger. Solvent composition also affects double-helix stability; adding an organic cosolvent or lowering the salt concentration in an aqueous solvent lowers the Tm (the temperature at which half of the DNA has dissociated into single strands) of any DNA, regardless of base composition.
RNA structure resembles but is more complex than DNA structure. The single strand is almost identical to the DNA single strand, differing only in the replacement of deoxyribose by ribose and of thymine by uracil (which still can base pair to adenine). However, double-stranded RNA is rare, although single strands often contain relatively short self-paired double-stranded regions because adjoining base sequences are complementary. RNA has the same polyanionic properties as DNA, but the abundance of single-stranded regions renders it more hydrophobic, and the mixture of single-stranded and double-stranded regions destroys the shape regularity (spherical or linear) seen in single-stranded and double-stranded DNA.
The most common methods for separating different DNA molecules, whether for preparative or for analytical purposes, exploit the strictly length-dependent polyanionic properties and the considerable shape regularity and flexibility of both single-stranded and double-stranded structures. Although most electrophoretic and chromatographic DNA separations depend directly or indirectly on polymer net charge, the near proportionality between charge and polymer length results in size-dependent differences in displacement of different DNA species along the separation axis (commonly expressed as distance in gel electrophoresis, time in capillary electrophoresis, and time or volume in chromatography). Larger molecules migrate more slowly and therefore travel less distance in any given electrophoretic separation, because the gel acts as a sieve to exert viscous drag on charged solute molecules; this drag varies directly with solute size. Most chromatographic separations of DNA entail gradient elution, wherein an eluting solute in the solvent is systematically and usually continuously increased with time of elution and volume of continuously flowing solvent; larger molecules are bound more tightly to the chromatographic resin than smaller molecules are, and therefore require higher concentrations of the chromatographic eluting solute to be displaced from the resin. Under ideal separation conditions electrophoretic migration rate and distance or chromatographic elution time and volume depend monotonically on DNA molecular size and can be used to identify specific DNA fragments according to size. In fully optimized separations, electrophoretic displacement or chromatographic elution time is a linear function of the logarithm of molecular size.
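The log-linear calibration described above can be illustrated with a minimal sketch. It assumes an idealized separation in which retention time is exactly linear in the logarithm of fragment size; the function names and the size standards are hypothetical values chosen for illustration, not data from any cited study.

```python
import math


def fit_log_size_calibration(standards):
    """Least-squares fit of: retention_time = intercept + slope * log10(size_bp).

    `standards` is a list of (size_in_base_pairs, retention_time) pairs.
    """
    xs = [math.log10(size) for size, _ in standards]
    ys = [time for _, time in standards]
    n = len(standards)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def estimate_size(retention_time, intercept, slope):
    """Invert the calibration line to estimate fragment size in base pairs."""
    return 10 ** ((retention_time - intercept) / slope)


# Hypothetical size standards: (base pairs, retention time in minutes).
standards = [(100, 4.0), (200, 5.0), (400, 6.0), (800, 7.0)]
intercept, slope = fit_log_size_calibration(standards)

# Estimate the size of an unknown fragment eluting at 5.5 minutes.
unknown_size = estimate_size(5.5, intercept, slope)
print(round(unknown_size, 1))
```

A fragment that falls off such a line, for example because of curvature or base composition, would be assigned an incorrect size by this inversion.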
The identification value of a size-dependent nucleic acid separation depends on four performance characteristics: size range, size resolution, precision of movement, and size accuracy. Few separations give linear log size calibration curves over a range of more than one order of magnitude of molecular size. As many analyte systems include DNA species ranging over several orders of size magnitude, several different electrophoretic gels (e.g., employing different gel densities) or chromatographic elutions must be run to characterize the system fully. Size resolution concerns how small a difference in DNA length results in distinguishable electrophoretic bands or chromatographic peaks; high-resolution systems usually access the narrowest size ranges. Faster separations usually sacrifice size resolution. Precision of movement is the most important performance criterion for DNA identification. How reproducibly a particular fragment migrates a given distance or elutes at a given time absolutely determines confidence that a fragment has a particular size and is not a completely different species. Precision limitations of both electrophoretic and chromatographic separations are reduced by frequent running of external molecular size standards in adjacent gel lanes or in a consecutive chromatographic separation and optimally by running internal molecular size standards together with the test sample, taking care that the standards do not interfere with the analyte bands or peaks. Molecular size accuracy refers to how exactly analyte species fall on a smooth calibration curve of migration distance or elution time versus molecular size (or its logarithm). Species which fall off the consensus calibration curve for the majority of size standards or analyte molecules may be assigned incorrect molecular size values, although they still can be correctly and precisely identified in test samples as long as the separation anomaly has been characterized previously with known samples. 
The major risk of size inaccuracy is mischaracterization of new analytes during research and discovery activities.
The total value of a separation method depends on performance in other ways besides the quality of the size information. Other important analytical properties include analyte quantitation (precision, accuracy, dynamic range, and detection limit), ease of recovery of separated species (for post-separation study such as DNA sequencing), reliability (freedom from interferences, equipment or reagent malfunction, and operator error), speed (time per sample), throughput (samples per hour, day or work week), and cost (in equipment, reagents, and labor, including labor quantity and quality).
In recent years, gel electrophoresis has become the standard method for size-dependent nucleic acid separations. Two gel matrices are commonly used: agarose, which is easier to use but which gives lower size resolution, and polyacrylamide, which is harder and more hazardous to use and which gives the best size resolution, up to the maximum possible performance (in sufficiently long gels) of resolving single nucleotide differences, especially for single-stranded DNA. Any given gel density provides no more than about one order of magnitude of practical DNA size range. Size precision is rarely measured or expected; gel or electric field inhomogeneity often results in inconsistent migration among lanes or within a single lane in a slab gel. Species generally are identified by approximate movement relative to other species, and absolute analyte identification is based on nucleic acid probing, most commonly by dot blotting or Southern analysis, which gives a positive signal only if the separated species contains a base sequence complementary to a polynucleotide or oligonucleotide of known sequence. Identification by blotting generally is slow, labor-intensive, and relatively unreliable, and often is hazardous because the analytical signal commonly is created by radioactive tags on probes.
Gel electrophoresis is unreliable with respect to double-stranded DNA size accuracy, because the molecular curvature described above can retard electrophoretic migration sufficiently to imply that a molecule is twice as long as it really is (Koo and Crothers, 1988, Proc. Natl. Acad. Sci. USA 85:1763-1767; Hagerman, 1985, Biochemistry 24:7033-7037; and Shore et al., 1981, Proc. Natl. Acad. Sci. USA 78:4833-4837). Gel electrophoresis has other performance limitations. Bands are most commonly visualized by staining with ethidium, which shows strong fluorescence enhancement when it binds to double-stranded (but not single-stranded) DNA. Such staining is hazardous because ethidium is a cancer-suspect agent; it is very insensitive to single-stranded DNA; and it is unreliable for quantitating electrophoretically separated species, because the reversible dye binding reaction is very sensitive to experimental conditions and fluorescence requires careful calibration. Gel electrophoresis is labor-intensive and vulnerable to operator variability or error. It is relatively slow (at least several hours per run, including staining or other post-electrophoretic detection) but has acceptable throughput because several tens of samples can be run simultaneously. Recovery of separated species from the gel is slow and labor-intensive.
Capillary electrophoresis has recently evolved to provide fast, high-resolution, size-dependent DNA separations which are very sensitive to low amounts (in mass units) of DNA. However, size precision is poor; quantitation of individual species is difficult and insensitive (with respect to DNA concentration, which is more important to molecular biologists and clinical chemists than DNA mass; DNA usually is abundantly obtainable, but often at relatively low concentrations); recovery of separated species in useful quantities is difficult because the mass of DNA processed in each separation is very small.
Liquid chromatography, and high-pressure liquid chromatography (HPLC) in particular, is still maturing as a DNA separation method (reviewed by Thompson, 1986, BioChromatography 1:16-20, 22-32, and 68-80; 1987, BioChromatography 2:4-18). The separation chemistry is independent of separation pressure; the latter varies inversely with the particle size of the chromatographic matrix and the time required for a separation. Separations that take many hours and hundreds of ml of solvent when run at atmospheric pressure on large-particle adsorbents can be completed in 3-30 minutes, consuming 3-30 ml of solvent, at pressures of 5-10 atmospheres on 2-10 µm-diameter particles. Quantitative sensitivity is inversely proportional to separation volume, and HPLC is a highly automated procedure which makes few demands on labor quality or quantity. Liquid chromatography normally detects DNA via its high ultraviolet (UV) absorbance at a wavelength of maximum absorbance (λmax) near 260 nm. HPLC is so automated that a single computer-controlled instrument can run the separation, measure eluted absorbance as a function of elution time, analyze the resulting elution profile to quantitate the absorbance in each peak (corresponding to a different DNA species), identify each peak in terms of elution time and even (by comparison to a calibration curve stored in the computer) molecular size, and collect each peak in a separate container for further analysis. The extreme sensitivity of HPLC UV absorbance detectors gives confident quantitation of peaks no higher than 10⁻⁴ absorbance units, containing approximately 10⁻¹⁰ g of nucleic acid in less than 0.1 ml of chromatographic solvent. Spectrophotometrically monitored HPLC has a lower DNA detection limit for double-stranded DNA than ethidium-stained gel electrophoresis. 
The broad dynamic range of HPLC UV absorbance detectors, measuring absorbances up to about 1 absorbance unit, allows HPLC to quantitate DNA ranging over 4 orders of magnitude in concentration, with none of the calibration difficulty of fluorescence measurements.
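The arithmetic behind the detection figures quoted above can be checked with a short sketch. The function name is hypothetical; the inputs are the approximate values stated in the text (a lowest quantifiable peak height of about 10^-4 absorbance units, an upper limit of about 1 absorbance unit, and about 10^-10 g of nucleic acid in at most 0.1 ml).

```python
import math


def dynamic_range_decades(lowest_quantifiable, highest_measurable):
    """Number of orders of magnitude spanned by a detector's usable range."""
    return math.log10(highest_measurable / lowest_quantifiable)


# Approximate absorbance limits quoted in the text.
decades = dynamic_range_decades(1e-4, 1.0)

# Mass detection limit re-expressed as a concentration (g per ml).
# The text gives "less than 0.1 ml", so this is a lower bound on the
# concentration present in the peak.
min_concentration_g_per_ml = 1e-10 / 0.1

print(round(decades, 2))
print(min_concentration_g_per_ml)
```

Under these assumptions the detector spans about four decades, and the smallest quantifiable peak corresponds to a concentration on the order of one nanogram per ml.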
Two major liquid chromatographic separation chemistries are used for DNA: anion exchange and ion-paired reverse-phase. In anion exchange, the solid chromatographic matrix contains on its surface abundant fixed positive charges which bind the DNA polyanion with a strength related directly to DNA length. As the concentration of an eluting salt is increased, usually continuously with elution time and the volume of solvent passed through a cylindrical column of the densely packed matrix, DNA fragments are eluted in approximate order of increasing size, because dissolved salt weakens the binding of polyanion to matrix. In ion-paired reverse-phase separations, the solid chromatographic matrix contains on its surface abundant fixed hydrophobic groups, and the solvent contains a hydrophobic tetraalkylammonium or trialkylammonium chloride, bromide, or acetate salt. The alkylammonium cations bind weakly to the DNA polyanion to render it approximately electrically neutral and hydrophobic, and the hydrophobic DNA-alkylammonium complex binds to the hydrophobic matrix. As the concentration of a low-polarity organic cosolvent in the aqueous solvent is increased, usually continuously, over elution time, the hydrophobic interactions between DNA and matrix are weakened. Longer DNA molecules bind more tightly to the matrix than shorter ones, so that elution order again approximately parallels molecular size; larger DNA molecules require higher organic cosolvent concentrations to be eluted.
Despite the previously described advantages of HPLC as a reliable, economical, sensitive, quantitative method of analyzing DNA, neither separation chemistry is optimal with respect to size resolution, range, precision, or accuracy. Ion-paired reverse-phase separations tend to be slow (requiring several hours), often result in relatively low recoveries of eluted fragments, and give occasional inversions in retention time as a function of molecular size, jeopardizing their value for size-accurate identification (Erikson et al., 1986, J. Chromatography 359:265-274). The aqueous solvents used for these separations contain relatively low (10⁻³-10⁻¹ M) concentrations of trialkylammonium (e.g., triethylammonium) or tetraalkylammonium (e.g., tetrabutylammonium) salts and vary the concentration of an eluting organic cosolvent such as acetonitrile over the range of 5-50%.
Prior to the present invention, anion-exchange HPLC separation of double-stranded DNA on a variety of different ion-exchange solids has suffered from occasional to frequent inversions in retention time as a function of molecular size, preventing its use for size-accurate fragment identification (Kato et al., 1983, J. Chromatog. 265:342-346; Merion et al., 1988, BioTechniques 6:246-251; Kato et al., 1989, J. Chromatog. 478:264-268; Maa et al., 1990, J. Chromatog. 508:61-73; Muller, 1986, Eur. J. Biochem. 155:203-212; Hecker et al., 1985, J. Chromatog. 326:251-261; and Westman et al., 1987, Anal. Biochem. 166:158-171). In almost every case, it has been remarked that the DNA fragments most likely to have retention times longer than predicted on the basis of molecular size also have abnormally high A-T content. However, no remedy was proposed or demonstrated for this effect, and only one theory, curvature of A-T-rich DNA, has been suggested to explain it (Hecker et al., supra). In almost every case, the eluting salt was NaCl, generally varied in the concentration range of 0.3-1.2M. The other alkali metal chloride salts have been tried without evidence of improved performance over NaCl, and some eluting salt anions, such as acetate, trichloroacetate, chlorate, and sulfate, gave greatly reduced fragment resolution or no separation at all (Westman et al. and Hecker et al., supra).
It generally has been observed that the salt gradient must be rendered increasingly shallow to resolve fragments of increasing size, up to the point that for one anion-exchange solid, very little size resolution occurred above 500 base pairs (Westman et al., supra). It has been suggested that this phenomenon occurs because DNA elution is controlled by salt activity rather than concentration (Muller, supra). Salts, including NaCl, show a drop in activity coefficient as salt concentration is increased in the 0.1M range. This drop renders salt activity less than proportional to salt concentration, increasing the salt concentration rise needed to obtain a given activity rise and therefore reducing the salt concentration sensitivity of retention time. However, in the 0.5-1.0M salt concentration range, activity coefficient becomes increasingly salt concentration-independent, rendering the salt activity more strongly salt concentration-dependent and therefore increasing the salt concentration-sensitivity of retention time.
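The activity argument above can be made explicit with a short derivation. The symbols are assumptions introduced here for illustration and do not appear in the cited work: c is the salt concentration, γ(c) is the activity coefficient, and a is the salt activity.

```latex
\begin{align}
a(c) &= \gamma(c)\, c \\
\frac{da}{dc} &= \gamma(c) + c\,\frac{d\gamma}{dc}
\end{align}
% Where the activity coefficient falls with concentration:
%   d\gamma/dc < 0  =>  da/dc < \gamma(c),
%   so activity rises less than proportionally to concentration.
% Where the activity coefficient is nearly concentration-independent:
%   d\gamma/dc \approx 0  =>  da/dc \approx \gamma(c),
%   so activity tracks concentration more closely.
```

If elution is governed by activity, as suggested, the second regime is the one in which a given rise in salt concentration produces the larger rise in activity, which is consistent with the need for shallower gradients at the higher salt concentrations that elute larger fragments.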
In only one case (Muller, supra) has the effect of temperature on anion-exchange HPLC retention time been observed. Increasing temperature increased the strength of DNA binding to the anion-exchange solid, but no effort was made to quantitate the phenomenon or relate it to technical requirements for retention-time precision. Retention-time precision has not been a concern in the prior art.
The preceding review has focused on the HPLC of double-stranded DNA, expected to be simpler than that of single-stranded DNA and RNA, because partly- or completely single-stranded nucleic acid exposes bases to the solvent and the chromatographic matrix and therefore should show strongly sequence- and composition-dependent retention times. However, these complicating interactions can be reduced by including organic cosolvents in the eluting solvent, by choosing a chromatographic matrix which interacts minimally with the bases, or by operating at such a high pH, generally above 10, that some of the bases become anionic. When the bases bear negative charges, base-stacking and hydrogen bonding interactions among them are weakened, and the nucleic acid resembles a purely random coil, the radius and net charge of which are directly related to polymer length. Organic cosolvents promote this behavioral simplification in two ways, by weakening base-stacking and hydrogen bonding and by weakening base-matrix interactions.
Given the observation that high A-T content tends to cause double-stranded DNA to bind to anion-exchange solids more tightly than expected simply on the basis of molecular size, the following question arises: could some simple change in eluting conditions, such as temperature or solvent composition, ablate whatever structural difference between A-T-rich and G-C-rich DNA is responsible for the phenomenon?
Melchior and von Hippel, 1973, Proc. Natl. Acad. Sci. USA 70:292-302, showed that tetraalkylammonium halide salts (especially tetramethylammonium chloride and tetraethylammonium chloride) and at least one trialkylammonium salt (triethylammonium chloride) greatly reduced and could even eliminate the differences in melting behavior between G-C-rich and A-T-rich DNA. However, tetramethylammonium ion and tetraethylammonium ion had dramatically opposite effects on double-stranded DNA stability; the former increased Tm whereas the latter decreased Tm. These effects were most evident at very high (greater than 2M) salt concentrations, one to two orders of magnitude higher than the concentration ranges in which these salts are used in ion-paired reverse-phase HPLC of nucleic acids. Shapiro et al., 1969, Biochemistry 9:3219-3232, showed that polylysine preferentially binds to and precipitates A-T-rich DNA and that adding tetraalkylammonium salts at very high concentration destroys and even reverses this preference.
The second phenomenon implies that the tetraalkylammonium ions also bind preferentially to A-T-rich DNA; this inference explains the results of Melchior and von Hippel as well. Shapiro et al., 1969, Biochemistry 9:3233-3241, showed directly that several tetraalkylammonium ions bind more tightly to A-T-rich than to G-C-rich DNA and suggested that this phenomenon, not seen for the alkali metal cations, arose from the tightness of steric fit of the tetraalkylammonium ions in the double helix major groove. Orosz and Wetmur, 1977, Biopolymers 16:1183-1199, explored the steric interpretation by probing the effects on double-stranded DNA stability of a variety of tetraalkylammonium ions containing different combinations of methyl, ethyl, propyl, butyl, pentyl, and hexyl groups. Increasing alkyl group size tended to render the cation more helix-destabilizing and, for alkyl groups larger than ethyl, tended to reduce the ability to stabilize A-T-rich regions preferentially.
The question raised by these indications of preferential binding of tetraalkylammonium and trialkylammonium salts to A-T-rich regions of double-stranded DNA is whether such a preference could reduce the tendency of A-T-rich double-stranded DNA to bind especially tightly to anion-exchange solids. If alkylammonium cations could operate in this fashion in the concentration range, presumably near 1M, where they might elute DNA from anion-exchange matrices, then they might render anion-exchange HPLC an accurate method of estimating DNA fragment size from chromatographic retention time. Furthermore, the observation that double-stranded DNA affinity for at least one anion-exchange solid increases with increasing temperature (Muller, supra) suggests that interactions of the eluting cation with DNA and of the eluting anion with the anion-exchange solid may control the temperature dependence. If the DNA-anion-exchange-solid interaction alone controlled the temperature dependence, affinity would fall as the temperature is increased. Therefore, alkylammonium eluting salts might change the temperature sensitivity of chromatographic retention time. The lower this temperature sensitivity, the easier it would be to attain high retention-time precision, improving the ability of HPLC to identify DNA fragments solely on the basis of retention time.
Although the choice of eluting salt cation has the best chance of influencing the size accuracy of anion-exchange HPLC of double-stranded DNA, the choice of salt anion has strong effects, positive or negative, on the quality of the separation. As noted above, some anions reduce peak resolution on at least some anion-exchange solids. Anions have profound effects on double-stranded DNA stability (Robinson and Grant, 1966, J. Biol. Chem. 241:4030-4042); salt anions which tend to melt DNA might increase retention-time sensitivity to DNA A-T content or sequence. Salt anion interaction with the anion-exchange solid will affect retention-time temperature dependence, because it contributes to the total enthalpy change of the anion-exchange process. In this regard, anions which bind weakly to the anion-exchange solid are preferred because they are likely to contribute the smallest enthalpy changes. Finally, the chloride anion, almost universally used in the eluting buffers for DNA anion-exchange chromatography, is well known to promote the corrosion of stainless steel, commonly used in HPLC pumps, fittings, columns, and tubing. Almost any other buffer anion would be preferred in the interest of improving HPLC hardware durability and minimizing contamination of columns and analytes with Fe(III).
The combination of cation and anion in the eluting salt can affect HPLC pump durability and maintenance in still another way. Small solvent leaks deposit elution solvent on moving parts. After the water evaporates, the buffer salts crystallize to form abrasive solids which scratch the pistons and seals. The especially high eluting salt concentrations of anion-exchange chromatography of DNA are particularly damaging to HPLC pumps and valves. However, eluting salts differ in crystalline hardness and shape and therefore in abrasive potential; NaCl is particularly abrasive whereas salts of alkylammonium cations and of carboxylate anions should form softer crystals. Some salts, like those between dialkylamines or trialkylamines and short-chain aliphatic carboxylic acids (for example, formic and acetic acids) have the additional advantage of being volatile, because the component acids and bases are volatile (tetraalkylammonium salts do not share this property). Volatile salts are less likely to abrade moving parts and also are easier to remove from recovered samples of chromatographed DNA if they interfere with post-HPLC processing.
Clearly, optimizing solvent composition for the anion-exchange HPLC of DNA involves multiple criteria, some of which may be mutually incompatible. It is equally clear that the conventional eluting salt, NaCl, is suboptimal for multiple reasons: (1) retention-time sensitivity to DNA A-T content, which reduces size accuracy, (2) a very high retention-time temperature sensitivity which reduces size precision, (3) an escalating retention-time sensitivity to salt concentration as DNA fragment size increases, resulting in a reduced practical size range, and (4) a propensity to damage HPLC hardware chemically and physically. The present invention provides improved HPLC solvents which address all of these concerns.