Nucleic acid analysis techniques that identify alterations or polymorphisms within known sequences are useful in many aspects of scientific, medical and forensic fields. For example, these techniques can be used in the genotyping of individuals in order to diagnose hereditary diseases or provide prognosis based on known genetic lesions. These techniques can also be used for clinical purposes such as tissue typing for histocompatibility or for forensic purposes such as identity or paternity determination. Furthermore, nucleic acid analysis techniques can be used for the identification of organisms or to distinguish or identify pathogenic organisms or infectious agents. In addition, these techniques are useful in the identification and monitoring of genetically modified agricultural organisms such as crops and livestock. As the genomic sequences of organisms from bacteria to humans become known, the need for nucleic acid analysis techniques that are rapid and inexpensive increases.
Probe- or primer-based assays are useful in the detection, quantitation and analysis of nucleic acids. Nucleic acid probes are used to analyze samples for the presence of nucleic acid sequences from bacteria, fungi, viruses or other organisms and are also useful in examining genotypes, genetically-based disease states or clinical conditions of interest. Genotypes of interest include, for example, point mutations, deletions, insertions and inversions. Furthermore, these assays are useful to detect and monitor polymorphisms within nucleic acid sequences of interest.
Unfortunately, there are a number of potential and actual sources of error in hybridization-based assays. For example, pseudogenes (gene duplication) may cause errors in identifying the appropriate target during PCR. Moreover, mispriming of primers and undesired hybridization between primers and other primers and between primers and unintended targets are also concerns.
As described herein, it has been discovered that the phenomenon of “foldback” within a primer is a source of error in hybridization-based assays, and particularly in single-base extension (SBE) assays. SBE assays for, e.g., identifying and analyzing polymorphisms, are based on single-base extension of a primer with a fluorescently-labeled dideoxynucleotide. Single-base extension occurs in a template-dependent manner, and each distinct nucleotide can be labeled with a distinct detectable fluorescent label. Once the labeled dideoxynucleotide is incorporated onto the primer, the fluorescence of the primer is assessed by any of several methods to determine which dideoxynucleotide was incorporated, and, accordingly, which nucleotide was present at the position of interest. Thus, it will be appreciated that the primer must hybridize to the appropriate locus on the target sequence in order to generate an accurate signal by SBE. The phenomenon of primer foldback interferes with this accuracy.
As shown in FIG. 1, foldback occurs when a portion of the primer is the reverse complement of another portion of the primer. The intra-molecular interaction within the primer is stronger than the inter-molecular interaction between the primer and the target, and the primer folds back on itself to create a hairpin structure in the primer. The primer can then undergo single-base extension based on the sequence of the primer adjacent to the intra-molecular hybridization rather than on the sequence of the intended target.
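The consequence of foldback for the SBE readout can be illustrated with a minimal sketch. The function name `foldback_extension_base`, its parameters, and the example primer are hypothetical and do not come from the specification; the sketch assumes perfect Watson-Crick pairing over the hairpin stem and sequences containing only A, C, G and T. It reports which dideoxynucleotide would be incorporated if the 3′ end used the primer itself as template.

```python
# Illustrative only: names and the example primer are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def foldback_extension_base(primer: str, stem_len: int, homolog_start: int) -> str:
    """Base that SBE would add if the 3' end folds back onto the primer.

    primer        -- primer sequence, 5'->3'
    stem_len      -- number of 3'-terminal bases involved in the hairpin stem
    homolog_start -- 0-based index where the paired internal region begins
    """
    seq = primer.upper()
    three_prime = seq[-stem_len:]
    homolog = seq[homolog_start:homolog_start + stem_len]
    if reverse_complement(three_prime) != homolog:
        raise ValueError("3' end does not pair with the given region")
    if homolog_start == 0:
        raise ValueError("no template base 5' of the paired region")
    # The 3'-terminal base pairs with primer[homolog_start]; the next template
    # base read by the polymerase is the one immediately 5' of that position.
    template_base = seq[homolog_start - 1]
    return template_base.translate(COMPLEMENT)


# Hypothetical primer: the 3' end ACGG pairs with CCGT at positions 2-5.
print(foldback_extension_base("TACCGTAATTCAGACGG", stem_len=4, homolog_start=2))
```

In this example the incorporated base is determined by the A at position 1 of the primer, so the signal reflects the primer's own sequence and carries no information about the target.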
Thus, the present invention relates to a method for selecting primers which have a reduced incidence of undesired interactions. The method comprises subjecting one or more primers to analysis to identify primers in which the 3′ end or the 5′ end of the primer hybridizes to another portion of the primer sequence and excluding identified primers, wherein the remaining primers have a reduced incidence of undesired interactions. In a particular embodiment, the method comprises identifying primers in which: (1) at least 4 bases (nucleotides) (the reverse complementary region) from either the 3′ or the 5′ end of the primer exhibit reverse complementarity to 4 bases elsewhere on the primer (the homologous sequence); and (2) a 2-base span exists between the reverse complementary region and the homologous sequence.
In one embodiment, the primers are selected for use in SBE-based assays. In the method according to this embodiment, the method comprises subjecting one or more primers to analysis to identify primers in which the 3′ end (where single-base extension occurs) hybridizes to another portion of the primer sequence. More specifically, the method comprises identifying primers in which: (1) at least 4 bases (the reverse complementary region) from the 3′ end exhibit reverse complementarity to 4 bases elsewhere on the primer (the homologous sequence); (2) at least 1 base 5′ to the reverse complementary region of the primer is exposed to allow SBE to occur; and (3) there exists a 2-base span between the reverse complementary region and the homologous region.
In a particularly preferred embodiment, the analysis is carried out using a processing means (e.g., a computer) to assess the nucleotide sequence of one or more primers. A variety of data processor programs and formats can be used to store the nucleotide sequence information of the primers on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention. This computer readable form can then be subjected to processing steps suitable for carrying out the analysis of the method.
For example, with respect to primers for use in SBE, the processing means can be programmed to determine all possible matches within a given nucleotide sequence, wherein the shortest possible match sequence extends from the 3′ end and includes 4 bases and the longest possible match sequence extends from the 3′ end to the halfway point of the nucleotide sequence. For each possible match, from longest to shortest, the processing means queries whether there is a homologous sequence in the rest of the primer, from the second 5′ base to 2 bases from the possible match sequence. If so, the processing means reports the primer and match information and proceeds to test the nucleotide sequence of the next primer. Primers reported by the processing means are excluded from use in the SBE assay.
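The screening procedure described above can be sketched in Python as follows. This is one possible reading and should not be taken as a definitive implementation: the names `find_foldback`, `select_primers` and `FoldbackHit` are hypothetical, the "halfway point" is taken as the integer half of the primer length, "2 bases from the possible match sequence" is taken to mean that at least 2 unpaired bases separate the two regions, and only exact complementarity over A, C, G and T is considered.

```python
from typing import List, NamedTuple, Optional

# Illustrative only: all names below are assumptions, not from the specification.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


class FoldbackHit(NamedTuple):
    match_length: int   # number of 3'-terminal bases in the match
    homolog_start: int  # 0-based start of the homologous sequence
    match: str          # the 3'-terminal match sequence
    homolog: str        # its reverse complement found within the primer


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_foldback(primer: str, min_match: int = 4,
                  min_loop: int = 2) -> Optional[FoldbackHit]:
    """Report the longest 3'-end foldback in a primer, or None if there is none."""
    seq = primer.upper()
    n = len(seq)
    # Candidate matches run from the 3' end; try the longest (half the primer)
    # first and work down to the shortest (min_match bases).
    for k in range(n // 2, min_match - 1, -1):
        match = seq[n - k:]
        homolog = reverse_complement(match)
        # Search window: from the second 5' base (index 1) up to the point
        # that leaves at least min_loop bases before the match sequence.
        pos = seq.find(homolog, 1, n - k - min_loop)
        if pos != -1:
            return FoldbackHit(k, pos, match, homolog)
    return None


def select_primers(primers: List[str]) -> List[str]:
    """Keep only primers for which no foldback is reported."""
    return [p for p in primers if find_foldback(p) is None]


candidates = ["TACCGTAATTCAGACGG", "AAGAAGAAGAAGAAGAAG"]
for p in candidates:
    print(p, find_foldback(p))
print(select_primers(candidates))
```

Starting the search window at the second 5′ base means that a homologous sequence beginning at the very first base is not reported, since no template base would then be exposed for single-base extension. A production screen would likely also need to handle ambiguity codes and near-complementary stems, which this sketch does not attempt.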
Primers selected by the methods of the present invention retain all of the advantages and uses of traditionally-selected primers but have the advantage of a reduced incidence of unwanted interactions (e.g., foldback).
The invention also pertains to methods of improving the accuracy of SBE-fluorescence polarization (SBE-FP) methods. These methods include decreasing the dideoxynucleotide and primer concentrations used, and/or utilizing SBE primers of a length sufficient to reduce formation of secondary structures in the primers (e.g., hairpin structures resulting from foldback).