1. Field of the Invention
The present invention relates to oligonucleotide compositions containing cleavable primers and diagnostic and analytical methods employing such primers.
2. Description of Related Art
DNA, the primary genetic material, is a complex molecule consisting of two intertwined polynucleotide chains, each nucleotide containing a deoxyribose unit, a phosphate group and a nitrogenous heterocyclic base. The two polynucleotide strands are held together via hydrogen bonding interactions between complementary base pairs. A normal human being possesses 23 pairs of chromosomes containing a total of about 100,000 genes. The length of DNA contained within the human chromosomes totals about 3.3 billion base pairs, with a typical gene containing about 30,000 base pairs.
Approximately 4,000 human disorders are attributed to genetic causes. Hundreds of genes responsible for various disorders have been mapped, and sequence information is being accumulated rapidly. A principal goal of the Human Genome Project is to find all genes associated with each disorder. The definitive diagnostic test for any specific genetic disease (or predisposition to disease) will be the identification of polymorphic variations in the DNA sequence of affected cells that result in alterations of gene function. Furthermore, response to specific medications may depend on the presence of polymorphisms. Developing DNA (or RNA) screening as a practical tool for medical diagnostics requires a method that is inexpensive, accurate, expeditious, and robust.
Due to the vast amount of genetic information yet to be gathered in both human and non-human genomes, intense efforts are underway to develop new and faster methods of DNA analysis, including DNA detection, sizing, quantification, sequencing, gene identification, and the mapping of human disease genes. Efforts to analyze DNA have been greatly aided by the development of a process for in vitro amplification of DNA, namely, the polymerase chain reaction (PCR™). PCR™ provides the ability to amplify and obtain direct sequence information from as little as one copy of a target DNA sequence.
Typically, PCR™ amplification is carried out by placing a mixture of target double-stranded DNA, a mixture of deoxynucleotide triphosphates, buffer, two primers (one phosphate-labeled) and DNA polymerase (e.g., heat stable Taq polymerase) in a thermocycler which cycles between temperatures for denaturation, annealing, and synthesis. The selection of primers defines the region to be amplified. In the first stage of the cycle, the temperature is raised to separate the double-stranded DNA into the single-stranded templates for amplification. The temperature is then lowered so that the primers anneal, generating the primed templates for DNA polymerase. In a third stage, the temperature is raised to promote Taq-catalyzed DNA synthesis, and the cycle of strand separation, annealing of primers, and synthesis is repeated for about 30 to 60 cycles. Standard detection, sizing, and sequencing methods as described above, while providing useful information, are often tedious and costly. Many of the commonly employed techniques involve multiple handling steps. Further, the most common method of fragment analysis, gel electrophoresis, is a relatively time-consuming process.
Oligonucleotide sizing and sequence analysis is typically carried out by first utilizing either the enzymatic method developed by Sanger and Coulson or the chemical degradation method developed by Maxam and Gilbert. The Sanger method uses enzymatic chain extension coupled with chain-terminating dideoxy precursors to produce randomly terminated DNA fragments. The Maxam and Gilbert technique involves four different base-specific reactions carried out on portions of the DNA target to produce four sets of radiolabeled fragments. Both techniques utilize gel electrophoresis to separate resultant DNA fragments of differing lengths.
In conventional DNA analysis, the DNA fragments are labeled with radioisotopes. After separation on sequencing gels, the fragments are visualized by the image they generate upon a piece of film applied to the gel.
Other methods of DNA analysis have been described which eliminate the use of radioisotopes. One example of such a method uses fluorophores or fluorescent tags. In general, four different fluorophores, each having a different absorption and emission spectrum, are attached to the DNA primers using chemical DNA synthesis techniques. Primers with different fluorescent labels are used in each of the four enzymatic sequencing reactions. In an alternate approach to the four dye fluorescence-based detection, a dye is chemically attached to a chain-terminating base analog after enzymatic extension. In this approach, synthesis of the different dye-primers is avoided. Mono- and polyfunctional intercalator compounds have also been developed as reagents for high-sensitivity fluorescence detection (Glazer et al., 1992). These planar aromatic fluorophores (e.g., ethidium homodimer, thiazole orange homodimer, oxazole yellow homodimer) insert between adjacent base pairs of double-stranded DNA.
Although the efficiency of these processes has been improved by automation, faster and cheaper methods must still be developed to efficiently carry out large-scale DNA analyses. The advantages of using mass spectrometry for analyzing DNA include a dramatic increase in both the speed of analysis (a few seconds per sample) and the accuracy of direct mass measurements. In contrast, electrophoretic methods require significantly longer lengths of time (minutes to hours) and can only measure the size of DNA fragments as a function of relative mobility to comigrating standards. Gel-based separation systems also suffer from a number of artifacts that reduce the accuracy of size measurements. These mobility artifacts are related to the specific sequences of DNA fragments and the persistence of secondary and tertiary structural elements even under highly denaturing conditions.
The inventors have performed significant work in developing time-of-flight mass spectrometry (“TOF-MS”) as a means for separating and sizing DNA molecules, although other forms of mass spectrometry can be used and are within the scope of this invention. Balancing the throughput and high mass accuracy advantages of TOF-MS is the limited size range for which the accuracy and resolution necessary for characterizing DNA by mass spectrometry is available. Current state of the art for TOF-MS offers single nucleotide resolution up to ~100 nucleotides in size and four nucleotide resolution up to ~160 nucleotides in size. These numbers are expected to grow as new improvements are developed in the mass spectrometric field.
Existing gel-based protocols for the analysis of DNA often do not work with TOF-MS because the PCR™ product size range, typically between 100 and 800 nucleotides, is outside the current resolution capabilities of TOF-MS. Application of DNA analysis to TOF-MS requires the development of new primer sets that produce small PCR™ products 50 to 160 nucleotides in size, preferably 50 to 100 nucleotides in size.
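The size limits discussed above can be expressed as a simple classification. The following is a minimal sketch, assuming the approximate thresholds quoted in the text (~100 and ~160 nucleotides); the function name, constant names, and return labels are hypothetical and chosen for illustration only.

```python
# Approximate TOF-MS limits quoted in the text; illustrative constants,
# not instrument specifications.
SINGLE_NT_LIMIT = 100  # single-nucleotide resolution up to ~100 nt
FOUR_NT_LIMIT = 160    # four-nucleotide resolution up to ~160 nt


def tof_ms_resolution(product_size_nt: int) -> str:
    """Classify a PCR product size by the resolution the text attributes to TOF-MS."""
    if product_size_nt <= 0:
        raise ValueError("product size must be positive")
    if product_size_nt <= SINGLE_NT_LIMIT:
        return "single-nucleotide"
    if product_size_nt <= FOUR_NT_LIMIT:
        return "four-nucleotide"
    return "out-of-range"


# A typical gel-oriented product (e.g., 400 nt) falls outside the range,
# while products of 80 nt or 140 nt fall inside it.
for size in (80, 140, 400):
    print(size, tof_ms_resolution(size))
```

Under these assumptions, a primer set designed for gel analysis would generally need to be redesigned to bring its products under the upper threshold.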
Gel-based systems are capable of multiplexing the analysis of 2 or more DNA sequences contained at multiple loci using two approaches. The first approach is to size partition the different PCR™ product loci. Size partitioning involves designing the PCR™ primers used to amplify different loci so that the allele PCR™ product size range for each locus covers a different and separable part of the gel size spectrum. As an example, the PCR™ primers for Locus A might be designed so that the allele size range is from 250 to 300 nucleotides, while the primers for Locus B are designed to produce an allele size range from 340 to 410 nucleotides.
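The requirement that size-partitioned loci occupy separable parts of the size spectrum amounts to checking that no two allele size ranges overlap. A minimal sketch of such a check follows, using the Locus A and Locus B ranges from the example above; the function name and data layout are hypothetical.

```python
from itertools import combinations


def overlapping_loci(size_ranges):
    """Return pairs of loci whose inclusive allele size ranges (in nt) overlap."""
    clashes = []
    for (name_a, (lo_a, hi_a)), (name_b, (lo_b, hi_b)) in combinations(
        sorted(size_ranges.items()), 2
    ):
        # Two inclusive intervals overlap when each starts before the other ends.
        if lo_a <= hi_b and lo_b <= hi_a:
            clashes.append((name_a, name_b))
    return clashes


loci = {"Locus A": (250, 300), "Locus B": (340, 410)}
print(overlapping_loci(loci))  # → []
```

An empty result indicates that the loci can be multiplexed by size alone; any reported pair would need either redesigned primers or a second means of discrimination.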
The second approach to multiplexing 2 or more DNA sequences contained at different loci on gel-based systems is the use of spectroscopic partitioning. Current state of the art for gel-based systems involves the use of fluorescent dyes as specific spectroscopic markers for different PCR™ amplified loci. Different chromophores that emit light at different wavelengths provide the means for differential detection of two different PCR™ products even if they are exactly the same size; thus, 2 or more loci can produce PCR™ products with allele size ranges that overlap. For example, Locus A with a green fluorescent tag produces an allele size range from 250 to 300 nucleotides, while Locus B with a red fluorescent tag produces an allele size range of 270 to 330 nucleotides. A scanning, laser-excited fluorescence detection device monitors the wavelength of emissions and assigns different PCR™ product sizes, and their corresponding allele values, to their specific loci based on their fluorescent color.
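The assignment step performed by the detection device can be illustrated with a short sketch. It assumes the green and red tags and the size ranges from the example above; the lookup tables and function name are hypothetical and do not describe any particular instrument's software.

```python
# Hypothetical dye-to-locus map and allele size ranges (in nt),
# taken from the example in the text.
LOCUS_BY_COLOR = {"green": "Locus A", "red": "Locus B"}
SIZE_RANGE = {"Locus A": (250, 300), "Locus B": (270, 330)}


def assign_locus(color, size_nt):
    """Assign a detected fragment to a locus by dye color, then validate its size.

    Returns the locus name, or None if the color is unknown or the size
    falls outside the expected allele range for that locus.
    """
    locus = LOCUS_BY_COLOR.get(color)
    if locus is None:
        return None
    low, high = SIZE_RANGE[locus]
    return locus if low <= size_nt <= high else None


# Two fragments of identical size are still resolved by their color.
print(assign_locus("green", 280))
print(assign_locus("red", 280))
```

Because color is consulted before size, the overlapping region from 270 to 300 nucleotides causes no ambiguity in this sketch.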
In contrast, mass spectrometry directly detects the molecule, eliminating the need for optical spectroscopic partitioning as a means for multiplexing. Because of this direct detection, MS also allows for improved accuracy in DNA analysis. MS also lends itself to high-throughput, highly automated processes for analyzing DNA. Therefore, new methods and primers are needed in order to employ mass spectrometry for the analysis of DNA and for multiplexed analysis.