The detection of specific nucleic acids is an important tool for diagnostic medicine and molecular biology research. Gene probe assays currently play roles in identifying infectious organisms such as bacteria and viruses, in probing the expression of normal and mutant genes and identifying mutant genes such as oncogenes, in typing tissue for compatibility preceding tissue transplantation, in matching tissue or blood samples for forensic medicine, and for exploring homology among genes from different species.
Ideally, a gene probe assay should be sensitive, specific and easily automatable (for a review, see Nickerson, Current Opinion in Biotechnology 4:48-51 (1993)). The requirement for sensitivity (i.e. low detection limits) has been greatly alleviated by the development of the polymerase chain reaction (PCR) and other amplification technologies, which allow researchers to amplify exponentially a specific nucleic acid sequence before analysis (for a review, see Abramson et al., Current Opinion in Biotechnology 4:41-47 (1993)).
Sensitivity, i.e. detection limits, remains a significant obstacle in nucleic acid detection systems, and a variety of techniques have been developed to address this issue. Briefly, these techniques can be classified as either target amplification or signal amplification. Target amplification involves the amplification (i.e. replication) of the target sequence to be detected, resulting in a significant increase in the number of target molecules. Target amplification strategies include the polymerase chain reaction (PCR), strand displacement amplification (SDA), and nucleic acid sequence based amplification (NASBA).
Alternatively, rather than amplify the target, alternate techniques use the target as a template to replicate a signalling probe, allowing a small number of target molecules to result in a large number of signalling probes that can then be detected. Signal amplification strategies include the ligase chain reaction (LCR), cycling probe technology (CPT), invasive cleavage techniques such as Invader™ technology, Q-Beta replicase (QβR) technology, and the use of “amplification probes” such as “branched DNA” that result in multiple label probes binding to a single target sequence.
The polymerase chain reaction (PCR) is widely used and described, and involves the use of primer extension combined with thermal cycling to amplify a target sequence; see U.S. Pat. Nos. 4,683,195 and 4,683,202, and PCR Essential Data, J. W. Wiley & Sons, Ed. C. R. Newton, 1995, all of which are incorporated by reference. In addition, there are a number of variations of PCR which also find use in the invention, including “quantitative competitive PCR” or “QC-PCR”, “arbitrarily primed PCR” or “AP-PCR”, “immuno-PCR”, “Alu-PCR”, “PCR single strand conformational polymorphism” or “PCR-SSCP”, allelic PCR (see Newton et al., Nucl. Acid Res. 17:2503 (1989)), “reverse transcriptase PCR” or “RT-PCR”, “biotin capture PCR”, “vectorette PCR”, “panhandle PCR”, and “PCR select cDNA subtraction”, among others.
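As a rough illustration of why exponential target amplification relaxes the sensitivity requirement, the following Python sketch computes an idealized copy number after a given number of thermal cycles. The function name `copies_after_cycles` and the per-cycle `efficiency` parameter are illustrative assumptions and are not taken from the cited references; actual reactions plateau and rarely double perfectly in every cycle.

```python
def copies_after_cycles(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized exponential amplification: N = N0 * (1 + E) ** n.

    efficiency (E) is an assumed per-cycle yield between 0 (no amplification)
    and 1 (perfect doubling); plateau effects are deliberately ignored.
    """
    if initial_copies < 0 or cycles < 0:
        raise ValueError("initial_copies and cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Ten starting molecules, thirty cycles, perfect doubling (an assumption).
    print(copies_after_cycles(10, 30))
```

Under these assumptions, even a handful of starting molecules yields billions of copies after thirty cycles, which is why the detection limit of the downstream assay becomes less critical.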
Strand displacement amplification (SDA) is generally described in Walker et al., in Molecular Methods for Virus Detection, Academic Press, Inc., 1995, and U.S. Pat. Nos. 5,455,166 and 5,130,238, all of which are hereby incorporated by reference.
Nucleic acid sequence based amplification (NASBA) is generally described in U.S. Pat. No. 5,409,818 and “Profiting from Gene-based Diagnostics”, CTB International Publishing Inc., N.J., 1996, both of which are incorporated by reference.
Cycling probe technology (CPT) is a nucleic acid detection system based on signal or probe amplification rather than target amplification, such as is done in polymerase chain reactions (PCR). Cycling probe technology relies on a molar excess of labeled probe which contains a scissile linkage of RNA. Upon hybridization of the probe to the target, the resulting hybrid contains a portion of RNA:DNA. This area of RNA:DNA duplex is recognized by RNAseH and the RNA is excised, resulting in cleavage of the probe. The probe now consists of two smaller sequences which may be released, thus leaving the target intact for repeated rounds of the reaction. The unreacted probe is removed and the label is then detected. CPT is generally described in U.S. Pat. Nos. 5,011,769, 5,403,711, 5,660,988, and 4,876,187, and PCT published applications WO 95/05480, WO 95/1416, and WO 95/00667, all of which are specifically incorporated herein by reference.
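The contrast between this kind of signal amplification and exponential target amplification can be sketched numerically. The following Python fragment is a deliberately simplified model, not a description of any cited method: it assumes that each target molecule causes exactly one probe to be cleaved per round and that the reaction stops only when the probe is exhausted. The function name `cleaved_probes` and all parameters are hypothetical.

```python
def cleaved_probes(target_copies: int, probe_copies: int, rounds: int) -> int:
    """Idealized cycling-probe signal: linear in targets and rounds.

    Assumes one cleavage event per target per round (an illustrative
    simplification) and caps the signal at the available probe.
    """
    if min(target_copies, probe_copies, rounds) < 0:
        raise ValueError("all arguments must be non-negative")
    return min(probe_copies, target_copies * rounds)


if __name__ == "__main__":
    # 100 targets, a large molar excess of probe, 50 rounds of cleavage.
    print(cleaved_probes(100, 1_000_000, 50))
```

In this model the signal grows linearly with the number of rounds because the target itself is never copied, which is the reason a molar excess of probe is needed.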
The oligonucleotide ligation assay (OLA; sometimes referred to as the ligation chain reaction (LCR)) involves the ligation of at least two smaller probes into a single long probe, using the target sequence as the template for the ligase. See generally U.S. Pat. Nos. 5,185,243, 5,679,524 and 5,573,907; EP 0 320 308 B1; EP 0 336 731 B1; EP 0 439 182 B1; WO 90/01069; WO 89/12696; and WO 89/09835, all of which are incorporated by reference.
Invader™ technology is based on structure-specific polymerases that cleave nucleic acids in a site-specific manner. Two probes are used: an “invader” probe and a “signalling” probe, that adjacently hybridize to a target sequence with a non-complementary overlap. The enzyme cleaves at the overlap due to its recognition of the “tail”, and releases the “tail” with a label. This can then be detected. The Invader™ technology is described in U.S. Pat. Nos. 5,846,717; 5,614,402; 5,719,028; 5,541,311; and 5,843,669, all of which are hereby incorporated by reference.
“Rolling circle amplification” is based on extension of a circular probe that has hybridized to a target sequence. A polymerase is added that extends the probe sequence. As the circular probe has no terminus, the polymerase repeatedly extends the circular probe resulting in concatamers of the circular probe. As such, the probe is amplified. Rolling-circle amplification is generally described in Baner et al. (1998) Nuc. Acids Res. 26:5073-5078; Barany, F. (1991) Proc. Natl Acad. Sci. USA 88:189-193; and Lizardi et al. (1998) Nat. Genet. 19:225-232, all of which are incorporated by reference in their entirety.
“Branched DNA” signal amplification relies on the synthesis of branched nucleic acids, containing a multiplicity of nucleic acid “arms” that function to increase the amount of label that can be put onto one probe. This technology is generally described in U.S. Pat. Nos. 5,681,702, 5,597,909, 5,545,730, 5,594,117, 5,591,584, 5,571,670, 5,580,731, 5,624,802, 5,635,352, 5,594,118, 5,359,100, 5,124,246 and 5,681,697, all of which are hereby incorporated by reference.
Similarly, dendrimers of nucleic acids serve to vastly increase the amount of label that can be added to a single molecule, using a similar idea but different compositions. This technology is as described in U.S. Pat. No. 5,175,270 and Nilsen et al., J. Theor. Biol. 187:273 (1997), both of which are incorporated herein by reference.
Specificity, in contrast, remains a problem in many currently available gene probe assays. The extent of molecular complementarity between probe and target defines the specificity of the interaction. In a practical sense, the degree of similarity between the target and other sequences in the sample also has an impact on specificity. Variations in the concentrations of probes, of targets and of salts in the hybridization medium, in the reaction temperature, and in the length of the probe may alter or influence the specificity of the probe/target interaction.
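To make the dependence on probe length and base composition concrete, the following Python sketch applies the widely used Wallace rule of thumb for short oligonucleotides, which estimates the melting temperature as 2 °C per A or T plus 4 °C per G or C. This rule is brought in here purely for illustration and does not appear in the references cited above; it ignores salt and probe concentration and is only a coarse approximation. The function name `wallace_tm` is hypothetical.

```python
def wallace_tm(probe: str) -> int:
    """Estimate the melting temperature (degrees C) of a short DNA probe.

    Wallace rule: Tm = 2 * (A + T) + 4 * (G + C). Salt concentration and
    probe concentration are not modeled (a known limitation of the rule).
    """
    seq = probe.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("probe must be a non-empty string of A, C, G, T")
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count


if __name__ == "__main__":
    # A 16-mer with eight A/T and eight G/C positions.
    print(wallace_tm("ACGTACGTACGTACGT"))
```

Under this approximation, a longer or more GC-rich probe tolerates a higher hybridization temperature, so the same reaction conditions can be stringent for one probe and permissive for another.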
It may be possible under some circumstances to distinguish targets with perfect complementarity from targets with mismatches; this is generally very difficult using traditional technology such as filter hybridization, in situ hybridization etc., since small variations in the reaction conditions will alter the hybridization, although this may not be a problem if appropriate mismatch controls are provided. New experimental techniques for mismatch detection with standard probes include DNA ligation assays where single point mismatches prevent ligation and probe digestion assays in which mismatches create sites for probe cleavage.
Recent focus has been on the analysis of the relationship between genetic variation and phenotype by making use of polymorphic DNA markers. Previous work utilized short tandem repeats (STRs) as polymorphic positional markers; however, recent focus is on the use of single nucleotide polymorphisms (SNPs), which occur at an average frequency of more than 1 per kilobase in human genomic DNA. Some SNPs, particularly those in and around coding sequences, are likely to be the direct cause of therapeutically relevant phenotypic variants and/or disease predisposition. There are a number of well known polymorphisms that cause clinically important phenotypes; for example, the apoE2/3/4 variants are associated with different relative risk of Alzheimer's and other diseases (see Corder et al., Science 261 (1993)). Multiplex PCR amplification of SNP loci with subsequent hybridization to oligonucleotide arrays has been shown to be an accurate and reliable method of simultaneously genotyping at least hundreds of SNPs; see Wang et al., Science, 280:1077 (1998); see also Schafer et al., Nature Biotechnology 16:33-39 (1998). The compositions of the present invention may easily be substituted for the arrays of the prior art.
There are a variety of particular techniques that are used to detect sequence, including mutations and SNPs. These include, but are not limited to, ligation based assays, cleavage based assays (mismatch and invasive cleavage such as Invader™), single base extension methods (see WO 92/15712, EP 0371437 B1, EP 0317 074 B1; Pastinen et al., Genome Res. 7:606-614 (1997); Syvänen, Clinica Chimica Acta 226:225-236 (1994); and WO 91/13075), and competitive probe analysis (e.g. competitive sequencing by hybridization; see below).
In addition, DNA sequencing is a crucial technology in biology today, as the rapid sequencing of genomes, including the human genome, is both a significant goal and a significant hurdle. Thus there is a significant need for robust, high-throughput methods. Traditionally, the most common method of DNA sequencing has been based on polyacrylamide gel fractionation to resolve a population of chain-terminated fragments (Sanger et al., Proc. Natl. Acad. Sci. USA 74:5463 (1977); Maxam & Gilbert). The population of fragments, terminated at each position in the DNA sequence, can be generated in a number of ways. Typically, DNA polymerase is used to incorporate dideoxynucleotides that serve as chain terminators.
Several alternative methods have been developed to increase the speed and ease of DNA sequencing. For example, sequencing by hybridization has been described (Drmanac et al., Genomics 4:114 (1989); Koster et al., Nature Biotechnology 14:1123 (1996); U.S. Pat. Nos. 5,525,464; 5,202,231 and 5,695,940, among others). Similarly, sequencing by synthesis is an alternative to gel-based sequencing. These methods add and read only one base (or at most a few bases, typically of the same type) prior to polymerization of the next base. This can be referred to as “time resolved” sequencing, in contrast to “gel-resolved” sequencing. Sequencing by synthesis has been described in U.S. Pat. No. 4,971,903 and Hyman, Anal. Biochem. 174:423 (1988); Rosenthal, International Patent Application Publication 761107 (1989); Metzker et al., Nucl. Acids Res. 22:4259 (1994); Jones, Biotechniques 22:938 (1997); Ronaghi et al., Anal. Biochem. 242:84 (1996), Nyren et al., Anal. Biochem. 151:504 (1985). Detection of ATP sulfurylase activity is described in Karamohamed and Nyren, Anal. Biochem. 271:81 (1999). Sequencing using reversible chain terminating nucleotides is described in U.S. Pat. Nos. 5,902,723 and 5,547,839, and Canard and Arzumanov, Gene 11:1 (1994), and Dyatkina and Arzumanov, Nucleic Acids Symp Ser 18:117 (1987). Reversible chain termination with DNA ligase is described in U.S. Pat. No. 5,403,708. Time resolved sequencing is described in Johnson et al., Anal. Biochem. 136:192 (1984). Single molecule analysis is described in U.S. Pat. No. 5,795,782 and Eigen and Rigler, Proc. Natl Acad Sci USA 91(13):5740 (1994), all of which are hereby expressly incorporated by reference in their entirety.
One promising sequencing by synthesis method is based on the detection of the pyrophosphate (PPi) released during the DNA polymerase reaction. As nucleotide triphosphates are added to a growing nucleic acid chain, they release PPi. This release can be quantitatively measured by the conversion of PPi to ATP by the enzyme ATP sulfurylase, and the subsequent production of visible light by firefly luciferase.
Several assay systems have been described that capitalize on this mechanism. See for example WO 93/23564, WO 98/28440 and WO 98/13523, all of which are expressly incorporated by reference. A preferred method is described in Ronaghi et al., Science 281:363 (1998). In this method, the four deoxynucleotides (dATP, dGTP, dCTP and dTTP; collectively dNTPs) are added stepwise to a partial duplex comprising a sequencing primer hybridized to a single stranded DNA template and incubated with DNA polymerase, ATP sulfurylase, luciferase, and optionally a nucleotide-degrading enzyme such as apyrase. A dNTP is only incorporated into the growing DNA strand if it is complementary to the base in the template strand. The synthesis of DNA is accompanied by the release of PPi equal in molarity to the incorporated dNTP. The PPi is converted to ATP and the light generated by the luciferase is directly proportional to the amount of ATP. In some cases the unincorporated dNTPs and the produced ATP are degraded between each cycle by the nucleotide degrading enzyme.
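The stepwise logic of this method can be sketched in a few lines of Python. The sketch below is an idealized model under stated assumptions, not the procedure of the cited references: the template is given in the order in which the polymerase traverses it, every complementary dNTP is incorporated completely, the light signal is taken to be exactly proportional to the number of bases incorporated in a flow, and leftover nucleotides are assumed to be fully degraded between flows. The names `simulate_flowgram`, `read_sequence` and `dispensation_order` are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def simulate_flowgram(template: str, dispensation_order: str = "ACGT", max_flows: int = 100):
    """Return a list of (dispensed nucleotide, signal) pairs.

    template is read in the direction the polymerase moves along it.
    The signal is the number of bases incorporated in that flow, standing
    in for the PPi released (and hence the light produced).
    """
    template = template.upper()
    dispensation_order = dispensation_order.upper()
    if set(template) - set(COMPLEMENT):
        raise ValueError("template must contain only A, C, G, T")
    if not dispensation_order or set(dispensation_order) - set(COMPLEMENT):
        raise ValueError("dispensation_order must be a non-empty string of A, C, G, T")

    position = 0
    flows = []
    for flow_index in range(max_flows):
        if position >= len(template):
            break  # the whole template has been copied
        nucleotide = dispensation_order[flow_index % len(dispensation_order)]
        incorporated = 0
        # A dNTP is incorporated only while it pairs with the next template base;
        # a homopolymer run is extended within a single flow.
        while position < len(template) and COMPLEMENT[template[position]] == nucleotide:
            incorporated += 1
            position += 1
        flows.append((nucleotide, incorporated))
    return flows


def read_sequence(flows) -> str:
    """Reconstruct the newly synthesized strand from the flow signals."""
    return "".join(nucleotide * signal for nucleotide, signal in flows)


if __name__ == "__main__":
    flows = simulate_flowgram("TTGCA")
    print(flows)
    print(read_sequence(flows))
```

In this model a flow with zero signal means the dispensed dNTP was not complementary to the next template base, and a signal of two or more indicates a homopolymer run; real instruments must resolve such runs from noisy light intensities.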
In some cases the DNA template is associated with a solid support. To this end, there are a wide variety of known methods of attaching DNAs to solid supports. Recent work has focused on the attachment of binding ligands, including nucleic acid probes, to microspheres that are randomly distributed on a surface, including a fiber optic bundle, to form high density arrays. See for example PCTs US98/21193, PCT US99/14387 and PCT US98/05025; WO98/50782; and U.S. Ser. Nos. 09/287,573, 09/151,877, 09/256,943, 09/316,154, 60/119,323, 09/315,584; all of which are expressly incorporated by reference.
An additional technique utilizes sequencing by hybridization. For example, sequencing by hybridization has been described (Drmanac et al., Genomics 4: 114 (1989); U.S. Pat. Nos. 5,525,464; 5,202,231 and 5,695,940, among others, all of which are hereby expressly incorporated by reference in their entirety).
In addition, sequencing using mass spectrometry techniques has been described; see Koster et al., Nature Biotechnology 14: 1123 (1996).
Finally, the use of adapter-type sequences that allow the use of universal arrays has been described in limited contexts; see for example Chee et al., Nucl. Acid Res. 19:3301 (1991); Shoemaker et al., Nature Genetics 14:450 (1998); Barany, F. (1991) Proc. Natl. Acad. Sci. USA 88:189-193; EP 0 799897 A1; WO 97/31256, all of which are expressly incorporated by reference.
PCTs US98/21193, PCT US99/14387 and PCT US98/05025; WO98/50782; and U.S. Ser. Nos. 09/287,573, 09/151,877, 09/256,943, 09/316,154, 60/119,323, 09/315,584; all of which are expressly incorporated by reference, describe novel compositions utilizing substrates with microsphere arrays, which allow for novel detection methods of nucleic acid hybridization.
Accordingly, it is an object of the present invention to provide detection and quantification methods for a variety of nucleic acid reactions, including genotyping, amplification reactions and sequencing reactions, utilizing microsphere arrays.