1. Field of the Invention
The field of this invention is multiplex sequence detection, exemplified by single nucleotide polymorphisms and gene expression, using electrophoretic separation.
2. Background of the Invention
As the human genome is elucidated, there will be numerous opportunities for performing assays to determine the presence of specific sequences, distinguishing between alleles in homozygotes and heterozygotes, determining the presence of mutations, evaluating cellular expression patterns, etc. In many of these cases one will wish to determine, in a single reaction, a number of different characteristics of the same sample. Also, there will be an interest in determining the presence of one or more pathogens, their antibiotic resistance genes, and the like.
In many assays, there will be an interest in determining the presence of specific sequences, whether genomic or cDNA. These sequences may be associated with particular genes, regulatory sequences, repeats, multimeric regions, expression patterns, and the like.
Comparisons of the sequences of different individuals are being made and will continue to be made. It is believed that there will be about one polymorphism per 1,000 bases, so that one may anticipate that there will be an extensive number of differences between individuals. By single nucleotide polymorphism (snp) is intended that there will be a prevalent nucleotide at the site, with one or more of the remaining bases being present in a substantially smaller percentage of the population.
For the most part, the snp's will be in non-coding regions, primarily between genes, but will also be present in exons and introns. In addition, the great proportion of the snp's will not affect the phenotype of the individual, but will clearly affect the genotype. The snp's have a number of properties of interest. Since the snp's will be inherited, individual snp's and/or snp patterns may be related to genetic defects, such as deletions, insertions and mutations involving one or more bases in genes. Rather than isolating and sequencing the target gene, it will be sufficient to identify the snp's involved.
In addition, the snp's may be used in forensic medicine to identify individuals. While other genetic markers are available, the large number of snp's and their extensive distribution in the chromosomes make the snp's an attractive target. Also, by determining a plurality of snp's associated with a specific phenotype, one may use the snp pattern as an indication of the phenotype, rather than requiring a determination of the genes associated with the phenotype.
The need to determine many analytes or nucleic acid sequences (for example multiple pathogens or multiple genes or multiple genetic variants) in blood or other biological fluids has become increasingly apparent in many branches of medicine. The need to study differential expression of multiple genes to determine toxicologically-relevant outcomes or the need to screen transfused blood for viral contaminants with high sensitivity is clearly evident.
Most multi-analyte assays, or assays which detect multiple nucleic acid sequences, involve multiple steps, have poor sensitivity and poor dynamic range (2 to 100-fold differences in concentration of the analytes are determined), and some require sophisticated instrumentation.
Some of the known classical methods for multianalyte assays include the following:
a. The use of two different radioisotope labels to distinguish two different analytes.
b. The use of two or more different fluorescent labels to distinguish two or more analytes.
c. The use of lanthanide chelates where both lifetime and wavelength are used to distinguish two or more analytes.
d. The use of fluorescent and chemiluminescent labels to distinguish two or more analytes.
e. The use of two different enzymes to distinguish two or more analytes.
f. The use of enzyme and acridinium esters to distinguish two or more analytes.
g. Spatial resolution of different analytes, for example in arrays, to identify and quantify multiple analytes.
h. The use of acridinium ester labels where lifetime or dioxetanone formation is used to quantify two different viral targets.
Thus an assay that has higher sensitivity, a large dynamic range (10³ to 10⁴-fold differences in target nucleic acid levels), and fewer and more stable reagents would increase the simplicity and reliability of multianalyte assays.
The need to identify and quantify a large number of bases or sequences distributed over potentially centimorgans of DNA offers a major challenge. Any method should be accurate, be reasonably economical in limiting the amount of reagents required, and provide for a single assay that allows for differentiation of the different snp's or differentiation and quantitation of multiple genes.
Holland (Proc. Natl. Acad. Sci. USA (1991) 88:7276) discloses use of the exonuclease activity of the thermostable enzyme Thermus aquaticus DNA polymerase in PCR amplification to generate a specific detectable signal concomitantly with amplification.
The TAQMAN assay is discussed by Lee in Nucleic Acids Research (1993) 21(16):3761.
White (Trends Biotechnology (1996) 14(12):478-483) discusses the problems of multiplexing in the TAQMAN assay.
Marino, Electrophoresis (1996) 17:1499 describes low-stringency-sequence specific PCR (LSSP-PCR). A PCR amplified sequence is subjected to single primer amplification under conditions of low stringency to produce a range of different length amplicons. Different patterns are obtained when there are differences in sequence. The patterns are unique to an individual and of possible value for identity testing.
Single strand conformational polymorphism (SSCP) yields similar results. In this method the PCR amplified DNA is denatured and sequence dependent conformations of the single strands are detected by their differing rates of migration during gel electrophoresis. As with LSSP-PCR above, different patterns are obtained that signal differences in sequence. However, neither LSSP-PCR nor SSCP gives specific sequence information and both depend on the questionable assumption that any base that is changed in a sequence will give rise to a conformational change that can be detected.
Pastinen, Clin. Chem. (1996) 42:1391 amplifies the target DNA and immobilizes the amplicons. Multiple primers are then allowed to hybridize to sites 3′ and contiguous to an SNP site of interest. Each primer has a different size that serves as a code. The hybridized primers are extended by one base using a fluorescently labeled dideoxynucleoside triphosphate. The size of each of the fluorescent products that is produced, determined by gel electrophoresis, indicates the sequence and, thus, the location of the SNP. The identity of the base at the SNP site is defined by the triphosphate that is used. A similar approach is taken by Haff, Nucleic Acids Res. (1997) 25:3749 except that the sizing is carried out by mass spectroscopy and thus avoids the need for a label. However, both methods have the serious limitation that screening for a large number of sites will require large, very pure primers that can have troublesome secondary structures and be very expensive to synthesize.
Hacia, Nat. Genet. (1996) 14:441 uses a high density array of oligonucleotides. Labeled DNA samples were allowed to bind to 96,600 20-base oligonucleotides and the binding patterns produced from different individuals were compared. The method is attractive in that SNP's can be directly identified but the cost of the arrays is high.
Fan (Oct. 6-8, 1997 IBC, Annapolis Md.) has reported results of a large scale screening of human sequence-tagged sites. The accuracy of single nucleotide polymorphism screening was determined by conventional ABI resequencing.
Allele specific oligonucleotide hybridization along with mass spectroscopy has been discussed by Ross in Anal. Chem. (1997) 69:4197.
Holland, et al., PNAS USA (1991) 88, 7276-7280, describes use of DNA polymerase 5′-3′ exonuclease activity for detection of PCR products.
Multiplexed sequence detection, exemplified by snp's and gene expression analysis, is provided by employing a combination of a primer and a labeled detector sequence probe in the presence of primer extension reagents, where the polymerase includes 5′-3′ exonuclease activity. In the case of snp's, the labeled detector sequence probe has at least one nucleotide that is substituted with an electrophoretic tag. One combines the target nucleic acid, which will usually have been processed, with the primer extension reagents and at least one primer and probe pair for each nucleic acid sequence of interest under conditions for primer extension. After sufficient time for primer extension to occur with degradation of detector sequences bound to target nucleic acid, the electrophoretic tag labeled nucleotides are separated and detected. By having a different electrophoretic tag, with a different electrophoretic mobility, for each nucleic acid sequence of interest, one can readily determine the snp's or measure the multiple sequences that are present in a sample; further treatment may be required depending on the total number of snp's or target sequences to be detected.
Electrophoretic tags are small molecules (molecular weight of 150 to 10,000), usually other than oligonucleotides, which can be used in any measurement technique that permits identification by mass, e.g. mass spectrometry, and/or by mass/charge ratio, as in mobility in electrophoresis. Simple variations in the mass and/or mobility of the electrophoretic tag lead to generation of a library of electrophoretic tags that can then be used to detect multiple snp's or multiple target sequences. The electrophoretic tags are easily and rapidly separated in free solution without the need for a polymeric separation medium. Quantitation is achieved using internal controls. Enhanced separation of the electrophoretic tags in electrophoresis is achieved by modifying the tags with positively charged moieties.
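By way of illustration only, the one-tag-per-target bookkeeping described above may be sketched in Python as follows. The sketch is not part of the disclosed method. It assumes, as a deliberate simplification, that tags are distinguished by their charge/mass ratio; the names ETag, build_library and decode_peaks, the tag masses and charges, and the min_gap and tolerance thresholds are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ETag:
    """A hypothetical electrophoretic tag; all values are illustrative."""
    name: str
    mass: float   # molecular weight in daltons (the text gives 150 to 10,000)
    charge: int   # net charge, a made-up value for this example

    @property
    def mobility_proxy(self) -> float:
        # Simplifying assumption: tags are told apart by charge/mass ratio.
        return self.charge / self.mass


def build_library(assignments, min_gap=1e-4):
    """Assign one tag per target and check that the tags are distinguishable.

    assignments: dict mapping target name -> ETag
    min_gap: hypothetical minimum spacing in mobility proxy between tags
    """
    ordered = sorted(assignments.items(), key=lambda kv: kv[1].mobility_proxy)
    for (t1, a), (t2, b) in zip(ordered, ordered[1:]):
        if b.mobility_proxy - a.mobility_proxy < min_gap:
            raise ValueError(f"tags for {t1} and {t2} are too close to resolve")
    return dict(ordered)


def decode_peaks(library, observed, tolerance=5e-5):
    """Return the targets whose tag matches one of the observed peaks."""
    detected = []
    for target, tag in library.items():
        if any(abs(tag.mobility_proxy - peak) <= tolerance for peak in observed):
            detected.append(target)
    return detected


# Three hypothetical targets, each with its own released tag.
library = build_library({
    "snp_A": ETag("tag1", mass=400.0, charge=1),
    "snp_B": ETag("tag2", mass=500.0, charge=1),
    "gene_C": ETag("tag3", mass=600.0, charge=2),
})

# Peaks observed after separation (expressed in the same proxy units).
peaks = [1 / 400.0, 2 / 600.0]
detected = decode_peaks(library, peaks)
print(detected)
```

In this sketch a peak is attributed to a target only when it falls within the stated tolerance of that target's tag, and the library check rejects any two tags that would sit too close together to be told apart. A real separation would depend on the actual tag chemistry and instrument, which the sketch does not model.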