Hybridization between complementary nucleic acids is an implicit feature in the Watson-Crick model for DNA structure that is exploited for many applications of the biological and biomedical arts. For example, virtually all methods for replicating and/or amplifying nucleic acid molecules are initiated by a step in which a complementary oligonucleotide (typically referred to as a “primer”) hybridizes to some portion of a “target” nucleic acid molecule. A polymerase then synthesizes a complementary nucleic acid from the primer, using the target nucleic acid as a “template”. See, Kleppe et al., J. Mol. Biol. 1971, 56:341-361.
One particular application, known as the polymerase chain reaction (PCR), is widely used in a variety of biological and medical arts. For a description, see Saiki et al., Science 1985, 230:1350-1354. In PCR, two or more primers are used that hybridize to separate regions of a target nucleic acid and its complementary sequence. The sample is then subjected to multiple cycles of heating and cooling, repeatedly hybridizing and dissociating the complementary strands so that multiple replications of the target nucleic acid and its complement are performed. As a result, even very small initial quantities of a target nucleic acid may be enormously increased or “amplified” for subsequent uses (e.g., for detection, sequencing, etc.).
Multiplex PCR is a particular version of PCR in which several different primers are used to amplify and detect a plurality of different nucleic acids in a sample, usually ten to a hundred different target nucleic acids. Thus, the technique allows a user to amplify and evaluate large numbers of different nucleic acids simultaneously in a single sample. The enormous benefits of high throughput, speed and efficiency offered by this technique have made multiplex PCR increasingly popular. However, successful multiplex PCR usually requires empirical testing, because existing computer programs that pick and/or design PCR primers have errors. In multiplex PCR, the errors become additive, and therefore good results are seldom achieved without some amount of trial and error. See Markoulatos et al., J. Clin. Lab. Anal. 2002, 16(1):47-51; Henegariu et al., Biotechniques 1997, 23(3):504-11.
Other techniques that are widely used in the biological and medical arts exploit nucleic acid hybridization to detect target nucleic acid sequences in a sample. See, for example, Southern, J. Mol. Biol. 1975, 98:503-517; Denhardt, Biochem. Biophys. Res. Commun. 1966, 23:641-646; Meinkoth & Wahl, Anal. Biochem. 1984, 138:267-284. For instance, Southern blotting and similar techniques have long been used in which nucleic acid molecules from a sample are immobilized onto a solid surface or support (e.g., a membrane support). A target nucleic acid molecule of interest may then be detected by contacting the surface or support with one or more complementary nucleic acids (often referred to as nucleic acid “probes”) and detecting their hybridization to nucleic acid molecules on the surface or support (for example, through a signal generated by some detectable label on the probes).
Similar techniques are also known in which one or more nucleic acid probes are immobilized onto a solid surface or support, and a sample of nucleic acid molecules is hybridized thereto. Nucleic acid arrays, for example, are known and have become increasingly popular in the art. See, e.g., DeRisi et al., Science 1997, 278:680-686; Schena et al., Science 1995, 270:467-470; and Lockhart et al., Nature Biotech. 1996, 14:1675. See also, U.S. Pat. No. 5,510,270 issued Apr. 23, 1996 to Fodor et al. Nucleic acid arrays typically comprise a plurality (often many hundreds or even thousands) of different probes, each immobilized at a defined location on the surface or support. A sample of nucleic acids (for example, an mRNA sample, or a sample of cDNA or cRNA derived therefrom) that are preferably detectably labeled may then be contacted to the array, and hybridization of those nucleic acids to the different probes may be assessed, e.g., by detecting labeled nucleic acids at each probe's location on the array. Thus, hybridization techniques using nucleic acid arrays have the potential for simultaneously detecting a large number of different nucleic acid molecules in a sample, by simultaneously detecting their hybridization to the different probes of the array.
The successful implementation of all techniques involving nucleic acid hybridization (including the exemplary techniques described, supra) is dependent upon the use of nucleic acid probes and primers that specifically hybridize with complementary nucleic acids of interest while, at the same time, avoiding non-specific hybridization with other nucleic acid molecules that may be present. For a review, see Wetmur, Critical Reviews in Biochemistry and Molecular Biology 1991, 26:227-259. These properties are even more critical in techniques, such as multiplex PCR and microarray hybridization, where a plurality of different probes or primers is used, each of which is preferably specific for a different target nucleic acid.
Duplex stability between complementary nucleic acid molecules is frequently expressed by the duplex's “melting temperature” (Tm). Roughly speaking, the Tm indicates the temperature at which a duplex nucleic acid dissociates into single-stranded nucleic acids. Preferably, nucleic acid hybridization is performed at a temperature slightly below the Tm, so that hybridization between a probe or primer and its target nucleic acid is optimized, while minimizing non-specific hybridization of the probe or primer to other, non-target nucleic acids. Duplex stability and Tm are also important in applications, such as PCR, where thermocycling may be involved. During such thermocycling steps, it is important that the sample temperature be raised sufficiently above the Tm so that duplexes of the target nucleic acid and its complement are dissociated. In subsequent steps of reannealing, however, the temperature must be brought sufficiently below the Tm that duplexes of the target nucleic acid and primer are able to form, while still remaining high enough to avoid non-specific hybridization events. For a general discussion, see Rychlik et al., Nucleic Acids Research 1990, 18:6409-6412.
Traditionally, theoretical or empirical models that relate duplex stability to nucleotide sequence have been used to predict or estimate melting temperatures for particular nucleic acids. For example, Breslauer et al. (Proc. Natl. Acad. Sci. U.S.A. 1986, 83:3746-3750) describe a model for predicting melting temperatures that is widely used in the art, known as the “nearest neighbor model”. See also, SantaLucia et al., Biochemistry 1996, 35:3555-3562; and SantaLucia, Proc. Natl. Acad. Sci. U.S.A. 1998, 95:1460-1465. Such models are usually calibrated or optimized for particular salt conditions, typically 1 M Na+. However, applications that exploit nucleic acid hybridization may be implemented in a variety of different salt conditions, with cation concentrations typically being on the order of magnitude of 10-100 mM. Thus, melting temperatures for particular probes or primers in an assay are typically predicted by predicting a melting temperature at a first salt concentration using the nearest neighbor or other model, and then using another theoretical or empirical model to predict what effect(s) the salt conditions of the particular assay will have on that melting temperature.
Most, if not all, of the existing models used to estimate Tm treat the effects of salt concentration as being separate from and independent of the nucleotide sequence. For example, Schildkraut et al. (Biopolymers 1965, 3:195-208) proposed the following formula to estimate nucleic acid melting temperatures at different sodium ion concentrations, [Na+]:

$$T_m([\mathrm{Na}^+]) = T_m^0 + 16.6 \times \log[\mathrm{Na}^+] \qquad \text{(Equation 1.1)}$$

where $T_m^0$ is the melting temperature of the DNA duplex in 1 M sodium ions. Equation 1.1, above, is based on empirical data from the specific study of Escherichia coli genomic DNA in buffer of between 0.01-0.2 M Na+. Nevertheless, the use of this equation has been routinely generalized to model any DNA duplex oligomer pair. See, for example, Rychlik et al., Nucleic Acids Res. 1990, 18:6409-6412; Ivanov & AbouHaidar, Analytical Biochemistry 1995, 232:249-251; Wetmur, Critical Reviews in Biochemistry and Molecular Biology 1991, 26:227-259.
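By way of illustration only, the sequence-independent correction of Equation 1.1 can be sketched in a few lines of Python. The sketch assumes that the logarithm is base 10 and that temperatures are in degrees Celsius. The function name `salt_corrected_tm` and the 70.0 °C starting value are hypothetical and chosen only for the example.

```python
import math


def salt_corrected_tm(tm_1m_na, na_molar):
    """Equation 1.1: Tm([Na+]) = Tm0 + 16.6 * log10([Na+]).

    tm_1m_na: melting temperature in 1 M Na+ (degrees Celsius, assumed).
    na_molar: sodium ion concentration in mol/L.
    """
    if na_molar <= 0:
        raise ValueError("sodium concentration must be positive")
    # Assumption: "log" in Equation 1.1 is the base-10 logarithm.
    return tm_1m_na + 16.6 * math.log10(na_molar)


if __name__ == "__main__":
    # Hypothetical duplex with a predicted Tm of 70.0 C in 1 M Na+.
    for na in (1.0, 0.1, 0.05, 0.01):
        tm = salt_corrected_tm(70.0, na)
        print(f"[Na+] = {na:.2f} M -> estimated Tm = {tm:.1f} C")
```

Note that the correction term depends only on [Na+]: every duplex, whatever its sequence, is shifted by the same number of degrees at a given salt concentration.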
There is evidence, however, indicating that the effects of salt concentration on the melting temperature of nucleotide duplexes are not sequence independent but, rather, depend substantially on sequence composition of the particular nucleic acids. For a review, see Bloomfield et al., Nucleic Acids: Structure, Properties, and Functions (University Science Books, Sausalito, California 2000): pages 307-308. For example, Owen et al. (Biopolymers 1969, 7:503-516) have proposed one empirical formula, based on melting experiments of bacterial DNA, that relates melting temperature (Tm) of long polymeric DNAs to log[Na+] and the nucleic acid's G-C content, ƒ(G-C):

$$f(\text{G-C}) = \tan\left(70.077 + 3.32 \times \log[\mathrm{Na}^+]\right) \times \left(T_m - 175.95\right) + 260.34 \qquad \text{(Equation 1.2)}$$

Still others (Frank-Kamenetskii, Biopolymers 1971, 10:2623-2624) have reanalyzed the same experimental data and suggested simplified equations, purportedly reflecting the linear dependence of melting temperature on log[Na+]:

$$T_m = 176.0 - \left(2.60 - f(\text{G-C})\right) \times \left(36.0 - 7.04 \times \log[\mathrm{Na}^+]\right) \qquad \text{(Equation 1.3)}$$
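As a rough numerical illustration, Equations 1.2 and 1.3 can be evaluated side by side. The sketch below rests on several assumptions that are not stated in the text above: the logarithm is taken as base 10, the tangent argument in Equation 1.2 is read in degrees with ƒ(G-C) in percent, and ƒ(G-C) in Equation 1.3 is read as a fraction between 0 and 1. The function names are hypothetical.

```python
import math


def tm_owen(gc_percent, na_molar):
    """Equation 1.2 solved for Tm.

    Assumptions: G-C content in percent, tangent argument in degrees,
    base-10 logarithm.
    """
    angle_deg = 70.077 + 3.32 * math.log10(na_molar)
    return 175.95 + (gc_percent - 260.34) / math.tan(math.radians(angle_deg))


def tm_frank_kamenetskii(gc_fraction, na_molar):
    """Equation 1.3. Assumption: G-C content as a fraction (0 to 1)."""
    return 176.0 - (2.60 - gc_fraction) * (36.0 - 7.04 * math.log10(na_molar))


def salt_slope(gc_fraction):
    """d(Tm)/d(log10[Na+]) obtained by differentiating Equation 1.3."""
    return 7.04 * (2.60 - gc_fraction)


if __name__ == "__main__":
    na = 0.1  # mol/L
    for gc in (0.3, 0.5, 0.7):
        print(
            f"f(G-C) = {gc:.1f}: "
            f"Eq. 1.2 Tm = {tm_owen(100 * gc, na):.1f}, "
            f"Eq. 1.3 Tm = {tm_frank_kamenetskii(gc, na):.1f}, "
            f"slope = {salt_slope(gc):.2f} per log10 unit"
        )
```

Under these assumptions, differentiating Equation 1.3 gives a slope of 7.04 × (2.60 − ƒ(G-C)) degrees per tenfold change in [Na+], so the salt sensitivity falls as G-C content rises, in contrast to the fixed coefficient of 16.6 in Equation 1.1.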
Doktycz et al. (Biopolymers 1992, 32:849-864) have applied Equation 1.3, above, to estimate the salt dependence of Tm for average G-C and A-T base pairs in a DNA duplex, and conclude that the dependence is governed by different equations for each type of base pair. Blake & Delcourt (Nucl. Acids Res. 1998, 26:3323-3332; Corrigendum, Nucl. Acids Res. 1999, 27, No.3) also report that the rate at which Tm changes as a linear function of log[Na+] varies with each nearest neighbor, based on melting curves of synthetic tandemly repeating nucleic acid inserts in recombinant pN/MCS plasmids. However, their experiments were conducted in the narrow range of Na+ concentrations from 34 mM to 114 mM.
Rouzina & Bloomfield (Biophysical Journal 1999, 77:3242-3251) have also analyzed melting data from large, polymeric DNA molecules and propose an alternative interpretation for the salt dependence of melting temperatures. In particular, the publication suggests a new explanation of Frank-Kamenetskii's empirical relationship (Equation 1.3), namely that the salt dependence of Tm may be due to small differences between the heat capacities of duplex and single-stranded nucleic acid molecules in solution. The publication suggests that this effect may be at least partially sequence dependent. Yet, no new relationship between nucleotide sequence and the effect is proposed or suggested.
Finally, Owczarzy et al., Biopolymers 1997, 44:217-239 describe experiments evaluating melting temperatures for oligonucleotide duplexes with various G-C content, ƒ(G-C). However, melting temperatures were evaluated at only two concentrations of sodium ions, 1 M and 115 mM. Consequently, the publication provides an equation that only predicts Tm values between those two conditions.
Despite the existence of such data, sequence-independent formulas such as Equation 1.1, supra, are still used in the art to estimate salt-corrected melting temperatures. For instance, as recently as 1998 SantaLucia et al. (Proc. Natl. Acad. Sci. U.S.A. 1998, 95:1460-1465) have advocated formulas that estimate salt dependence of a melting temperature by assuming the effects are sequence independent. Thus, even though there may be data suggesting that the effects of salt on a nucleic acid's melting temperature depend on the nucleotide sequence, the available data is incomplete and, in many instances, obtained under conditions which are, at best, remote from those of biological or biomedical techniques that involve nucleic acid hybridization. Specifically, effects of sodium ions on Tm have been systematically studied only for long DNA polymers and DNA dumbbells. See Blake & Delcourt, Nucl. Acids Res. 1998, 26:3323-3332 and Doktycz et al., Biopolymers 1992, 32:849-864. As a result, the exact effect salt conditions will have on a probe or primer's melting temperature in such assays remains poorly characterized. Consequently, currently available methods for estimating melting temperatures of particular probe or primer sequences in hybridization assays are inaccurate and unreliable.
Yet, given the prevalence and importance of such assays in the biological and biomedical arts, there is a significant need for methods of estimating and predicting melting temperatures with improved accuracy. In particular, there is a need for methods which predict or estimate the melting temperature for a nucleic acid, particularly for an oligonucleotide (e.g., an oligonucleotide probe or primer) in a PCR or other assay that involves nucleic acid hybridization. There exists, moreover, a need for reliable and accurate methods that estimate effects of changing salt concentration on the melting temperature of particular nucleic acid sequences. There further exists a need for methods of designing oligonucleotides, e.g., as probes or primers for a particular hybridization, PCR or other method, in which the melting temperature of each oligonucleotide is optimized for the particular method or assay.
The citation or discussion of any reference in this section or elsewhere in the specification is made only to clarify the description of the present invention and is not an admission that any such reference is “prior art” against any invention described herein.