Research in biotechnology directed to sequencing nucleic acids is of major importance and has been used to sequence the human genome, make diagnostic probes, treat diseases and for other research applications. Primer extension plays a major role in conducting this research.
Primer extension has diverse applications, such as, for example, various DNA and RNA sequencing methods, polymerase chain reaction (PCR) applications, cloning, enzymatic synthesis of nucleic acids, polymorphism identification, haplotyping, and a multitude of other nucleic acid analysis methods. Over the last decade, minisequencing, or single base extension (SBE), technologies have been used to identify nucleic acids at particular sites, especially in the field of genomic analysis. SBE has been developed into high throughput processes through multiplexing, which allows multiple nucleic acids to be identified simultaneously. Multiplexing has been extended with the advent of new immobilization chemistries to the scale of nucleic acid microchips, or arrays. Thus, SBE can now be performed where the primers are extended and immobilized on a chip, for example, by tag capture methods where extended primers are captured at known positions on the surface of addressable arrays to identify nucleic acids at particular sites. Many other methods are known in the art that use primer extension to identify nucleic acids. Each of these methods relies on the fidelity, efficiency, and accuracy of the primer extension reaction.
There has been tremendous interest in identifying polymorphic sites across the genome. Of particular interest are single nucleotide polymorphisms (SNPs). SNPs are the most common variations in the human genome, occurring at a frequency of about one in 1,000 base pairs. One method of detecting SNPs is by SBE. In one embodiment of SBE, primers are immobilized on solid supports, such as, for example, chips, beads or microspheres. These solid supports may have detectable signals or bear tags that allow easy SNP identification. These detectable signals or tags allow genetic information to be easily sorted by any number of methods, including, for example, electrophoresis, mass, volume, or by applying the tagged beads or microspheres to complementarily tagged arrays. Accordingly, large amounts of information can be analyzed in a single experimental run. However, difficulties can occur in analyzing this information when the signal is not clearly distinguishable from background noise. The magnitude of noise relative to the expected signal should be low in order to determine the identity of a polymorphic site with a high degree of precision and accuracy. Noise in such a primer extension assay is often sequence-dependent and primarily results from template-independent primer extension, particularly in platforms employing primers immobilized in close proximity to one another, such as on arrays or chips.
Unwanted primer-primer interactions are a common source of template-independent noise. These interactions may occur between two separate but identical primer molecules, between two portions of a single primer molecule, or between two separate but not identical primer molecules. Types of interactions that commonly generate significant noise are those that allow hybridization to occur between two primers, or within a single primer, so that primer extension occurs in the absence of template or target nucleic acid(s). For example, sometimes a single primer molecule can fold into a hairpin structure, resulting in the outcome that the primer also acts as its own template. Alternatively, two primers may interact sufficiently well so that one is capable of serving as template for the other. This results in spurious signal generation because extension of the primer occurs independent of the target nucleic acid sequence.
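The situation in which one primer serves as template for another can be illustrated with a minimal Python sketch. It is not a method described in this document. The function names, the 4-base default for `k`, and the example sequences are all hypothetical, and the check considers only perfect, ungapped pairing of the primer's 3′ end, with no thermodynamic modeling. Passing the same sequence as both arguments screens for pairing between two copies of one primer; the sketch does not separately model hairpin loop geometry.

```python
# Hypothetical sketch: find places where the 3' end of one primer could
# anneal to another primer (or a copy of itself) and be extended without target.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def extendable_sites(primer: str, template: str, k: int = 4) -> list[int]:
    """Return 0-based positions in `template` where the last `k` bases of
    `primer` can pair perfectly AND at least one template base lies 5' of
    the paired site, so a polymerase would have something to copy."""
    primer, template = primer.upper(), template.upper()
    if k < 1 or len(primer) < k:
        raise ValueError("k must be between 1 and the primer length")
    # The primer's 3' tail pairs antiparallel with this site on the template.
    site = reverse_complement(primer[-k:])
    positions = []
    start = template.find(site)
    while start != -1:
        if start > 0:  # template bases 5' of the site act as the overhang
            positions.append(start)
        start = template.find(site, start + 1)
    return positions


# Made-up sequences for illustration only.
print(extendable_sites("CCAAGGTTGTCC", "AATTGGACTTAA"))  # one primer on another
print(extendable_sites("AAAACCGCGG", "AAAACCGCGG"))      # primer on its own copy
```

A non-empty result flags a primer pair whose 3′ end could prime template-independent extension under these simplified assumptions; real design tools typically also weigh mismatches and duplex stability.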
Attempts have been made to reduce or eliminate unwanted primer interactions by conducting hybridization at elevated temperatures. However, on micro-arrays, primer interactions are more difficult to overcome because primers are immobilized to the surface of the array or chip in very close proximity to one another and array-based protocols may not allow for elevated temperature. Further, temperature effects would tend to disperse primers spatially, but array technologies are based on localized immobilization of the primers, and so spatial separation is contrary to the very definition of an array.
In the case of primer extension in thermal cycling, such as by a thermostable polymerase in PCR, multiple factors are optimized in order to obtain a desired outcome and reduce the likelihood of undesirable primer-primer interactions. Approaches aimed at minimizing unfavorable primer-primer interactions, or mispriming events, include varying magnesium chloride concentrations; employing stabilizing additives such as dimethyl sulfoxide, glycerol, formamide, betaine, ammonium sulfate, or commercial or proprietary enhancers, stabilizers, and hybridizing agents; and manipulating reaction conditions such as pH, buffers, ionic and non-ionic detergents and surfactants. Optimization kits are commercially available that aid users in manipulating these factors to some extent. However, multiple factors must be optimized, which can result in costly and time-consuming optimization trials, and certain optimization conditions may not be compatible with array technologies and methods. Further, higher multiplexing levels will only exacerbate sequence-dependent and sequence-independent noise issues, rendering such optimization efforts increasingly difficult, if not impossible, to achieve by biochemical means alone.
Other attempts to avoid primer-primer interactions involve hot start PCR methods. Some hot start methods include inhibition of polymerase by low temperature, or adding polymerase after the melting and annealing step. Hot start PCR improves specificity, sensitivity and yield of the PCR reaction. Other methods avoid nonspecific, or target-independent, primer extension by omitting, inhibiting, or sequestering the thermostable polymerase in the initial denaturation and annealing reaction, such as by reversibly binding an inactivating antibody to the polymerase.
Primer self-complementarity also adds to unwanted target-independent extension resulting in inaccurate primer extension and noise. Self-complementarity is the ability of a primer to self-anneal or anneal with another copy of itself. Although self-complementarity can be highly undesirable, it may not be possible to avoid every instance of self-complementarity in primer design for a variety of reasons. For example, the target nucleic acid may contain regions or repetitive stretches of self-complementarity, and thus amplification or primer extension might require primer structure reflecting these repetitive regions.
There are many computer programs used in primer design that limit use of primers that exhibit self-complementarity, or limit the degree of self-complementarity. However, such programs may not be useful in the situation where the target sequence for which the primer is being synthesized bears a region of self-complementarity. For example, if a target nucleic acid molecule has a sequence that requires the use of primers having a palindromic sequence at their 3′ termini, such as, for example, a restriction site, self-complementarity might not be avoided.
Palindromes are sequences that are symmetric. That is, they read the same in the 5′ to 3′ direction on one strand as they read in the 5′ to 3′ direction on the complementary strand. One characteristic of a palindromic sequence is that it is self-complementary, and thus generally able to hybridize to itself, or to other copies of itself. In the case that a palindrome is at the 3′ terminus of the primer, it can hybridize with the 3′ terminus of another primer molecule that is acting as template, confounding attempts to extend the primer employing the target nucleic acid as a template. Generally, the closer a region of self-complementarity is to the 3′ end of a primer, the more likely it is to interfere with primer extension. Efforts can be made to minimize the likelihood of interference, such as maintaining short primers 30 to 40 nucleotides in length, and matching thermodynamic parameters for primers. However, this is not always a viable solution due to constraints set by the sequences of the target nucleic acid, as discussed above.
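The palindrome property described above can be expressed as a short Python check: a sequence is palindromic in this sense when it equals its own reverse complement. The sketch below is illustrative only. The function names, the 4-base minimum length, and the example primers are hypothetical; the EcoRI recognition site GAATTC is used because it is a well-known palindromic restriction site.

```python
# Hypothetical sketch: detect a palindromic (self-complementary) stretch
# at the 3' terminus of a primer.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindrome(seq: str) -> bool:
    """True if the sequence reads the same 5'->3' on both strands."""
    seq = seq.upper()
    return len(seq) > 0 and seq == reverse_complement(seq)


def three_prime_palindrome(primer: str, min_len: int = 4) -> str:
    """Return the longest palindromic suffix (3' terminus) of the primer
    that is at least `min_len` bases long, or an empty string if none."""
    primer = primer.upper()
    for length in range(len(primer), min_len - 1, -1):
        suffix = primer[-length:]
        if is_palindrome(suffix):
            return suffix
    return ""


# Made-up primers for illustration only.
print(is_palindrome("GAATTC"))                 # EcoRI site
print(three_prime_palindrome("TTGACCGAATTC"))  # primer ending in that site
print(three_prime_palindrome("ACGTACCCCC"))    # no 3'-terminal palindrome
```

A primer for which `three_prime_palindrome` returns a non-empty string can pair 3′ end to 3′ end with another copy of itself, which is the configuration the paragraph above identifies as most likely to interfere with extension.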
Physical separation of an essential reaction component prior to the first denaturation step can often reduce target-independent priming. Denaturation of target nucleic acid before the addition of a polymerizing agent or magnesium chloride can improve the specificity and sensitivity of the primer extension reaction, but requires an added step where tubes are reopened to add the missing component. This added step becomes unwieldy when running a multitude of samples.
Other attempts to reduce target-independent primer extension employ reversible inhibition of the polymerizing agent by, for example, an antibody, which can be effective in some circumstances. However, this approach is more expensive and time consuming and requires an added step, which is undesirable when running a multitude of samples.
Enhancers can be employed to increase yield, specificity and the likelihood of overcoming high GC content or long templates. Such enhancers include compounds which often can increase yield, such as nonionic detergents like, for example, Tween-20, or enzyme-stabilizing agents such as bovine serum albumin, or other compounds such as betaine, glycerol, dimethyl sulfoxide, polyethylene glycol, or salts. Improvements in specificity can be achieved through addition of formamide to the reaction mixture, which tends to destabilize mismatched primers.
Target-independent primer extension may occur not only with heat-stable polymerases in a polymerase chain reaction, but in any application employing primer extension on a target nucleic acid. Such applications include those for nucleic acid sequencing, genotyping, and a host of other primer extension methods.
Based on the foregoing, there is a need for methods and compositions that allow primer extension reactions to perform optimally with respect to key parameters, such as efficiency, fidelity, accuracy, and signal-to-noise ratio. The present invention addresses this need and provides methods and compositions that reduce target-independent primer extension or enhance template-dependent primer extension. The methods and compositions of the present invention are applicable not only in PCR, but also in nucleic acid sequencing, genotyping, and other applications employing extension of a primer in a target-dependent manner.