The following description provides a summary of information relevant to the present invention. It is not an admission that any of the information provided herein is prior art to the presently claimed invention, nor that any of the publications specifically or implicitly referenced are prior art to that invention.
Molecular biology comprises a wide variety of techniques for the analysis of nucleic acid and protein sequences. Many of these techniques and procedures form the basis of clinical diagnostic assays and tests. These techniques include nucleic acid hybridization analysis, restriction enzyme analysis, genetic sequence analysis, and the separation and purification of nucleic acids and proteins (see, e.g., J. Sambrook, E. F. Fritsch, and T. Maniatis, Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989).
Most of these techniques involve carrying out numerous operations (e.g., pipetting, centrifugation, and electrophoresis) on a large number of samples. They are often complex and time-consuming, and generally require a high degree of accuracy. Many techniques are limited in their application by a lack of sensitivity, specificity, or reproducibility.
For example, the complete process for carrying out a DNA hybridization analysis for a genetic or infectious disease is very involved. Broadly speaking, the complete process may be divided into a number of steps and sub-steps. In the case of genetic disease diagnosis, the first step involves obtaining the sample (e.g., saliva, blood, or tissue). Depending on the type of sample, various pre-treatments would be carried out. The second step involves disrupting or lysing the cells, which releases the crude DNA material along with other cellular constituents.
Generally, several sub-steps are necessary to remove cell debris and to further purify the DNA from the crude sample. At this point several options exist for further processing and analysis. One option involves denaturing the DNA and carrying out a direct hybridization analysis in one of many formats (dot blot, microbead, microplate, etc.). A second option, called Southern blot hybridization, involves cleaving the DNA with restriction enzymes, separating the DNA fragments on an electrophoretic gel, blotting the DNA to a membrane filter, and then hybridizing the blot with specific DNA probe sequences. This procedure effectively reduces the complexity of the genomic DNA sample, and thereby helps to improve the hybridization specificity and sensitivity. Unfortunately, this procedure is long and arduous. A third option is to carry out an amplification procedure such as the polymerase chain reaction (PCR) or the strand displacement amplification (SDA) method. These procedures amplify (increase) the number of target DNA sequences relative to non-target sequences. Amplification of target DNA helps to overcome problems related to complexity and sensitivity in genomic DNA analysis. After these sample preparation and DNA processing steps, the actual hybridization reaction is performed. Finally, detection and data analysis convert the hybridization event into an analytical result.
Nucleic acid hybridization analysis generally involves the detection of a very small number of specific target nucleic acids (DNA or RNA) with an excess of probe DNA, among a relatively large amount of complex non-target nucleic acids. A reduction in the complexity of the nucleic acid in a sample is helpful to the detection of low copy numbers (i.e., 10,000 to 100,000) of nucleic acid targets. DNA complexity reduction is achieved to some degree by amplification of target nucleic acid sequences (see M. A. Innis et al., PCR Protocols: A Guide to Methods and Applications, Academic Press, 1990; and, in regard to SDA amplification, Spargo et al., 1996, Molecular & Cellular Probes). This is because amplification of target nucleic acids results in an enormous number of target nucleic acid sequences relative to non-target sequences, thereby improving the subsequent target hybridization step.
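The effect of amplification on the ratio of target to non-target sequences can be illustrated with simple arithmetic. The following is a minimal sketch only, assuming an idealized exponential amplification in which each cycle multiplies the target copy number by (1 + efficiency) and leaves non-target material unchanged. The function name, the per-cycle efficiency parameter, and the example copy numbers are hypothetical and are not taken from the cited references.

```python
def enrichment_after_cycles(target_copies, non_target_copies, cycles, efficiency=1.0):
    """Idealized model of exponential target amplification (an assumption,
    not a description of any particular PCR or SDA protocol).

    Each cycle multiplies the target copy number by (1 + efficiency);
    non-target sequences are assumed not to amplify at all.
    Returns (amplified target copies, fraction of all copies that are target).
    """
    amplified = target_copies * (1.0 + efficiency) ** cycles
    fraction = amplified / (amplified + non_target_copies)
    return amplified, fraction


if __name__ == "__main__":
    # Hypothetical starting point: 10,000 target copies among 1e9
    # non-target sequence segments of comparable size.
    for n_cycles in (0, 10, 20, 30):
        copies, frac = enrichment_after_cycles(10_000, 1_000_000_000, n_cycles)
        print(n_cycles, copies, frac)
```

Under these assumptions the target fraction rises from a trace amount to nearly all of the sample within a few tens of cycles, which is the sense in which amplification reduces the effective complexity seen by the hybridization step.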
The actual hybridization reaction represents one of the most important and central steps in the whole process. The hybridization step involves placing the prepared DNA sample in contact with a specific reporter probe, under set optimal conditions, so that hybridization occurs between the target DNA sequence and the probe.
Hybridization may be performed in any one of a number of formats. For example, multiple sample nucleic acid hybridization analysis has been conducted in a variety of filter and solid support formats (see G. A. Beltz et al., in Methods in Enzymology, Vol. 100, Part B, R. Wu, L. Grossman, K. Moldave, Eds., Academic Press, New York, Chapter 19, pp. 266-308, 1985). One format, the so-called “dot blot” hybridization, involves the non-covalent attachment of target DNAs to a filter followed by hybridization to one or more radioisotope-labeled probes. “Dot blot” hybridization gained widespread use over the past two decades, during which time many versions were developed (see M. L. M. Anderson and B. D. Young, in Nucleic Acid Hybridization: A Practical Approach, B. D. Hames and S. J. Higgins, Eds., IRL Press, Washington, D.C., Chapter 4, pp. 73-111, 1985). For example, the dot blot method has been developed for multiple analyses of genomic mutations (D. Nanibhushan and D. Rabin, in EPA 0228075, Jul. 8, 1987) and for the detection of overlapping clones and the construction of genomic maps (G. A. Evans, in U.S. Pat. No. 5,219,726, Jun. 15, 1993).
New techniques are being developed for carrying out multiple sample nucleic acid hybridization analysis on micro-formatted multiplex or matrix devices (e.g., DNA chips) (see M. Barinaga, 253 Science, p. 1489, 1991; W. Bains, 10 Bio/Technology, pp. 757-758, 1992). These methods usually attach specific DNA sequences to very small specific areas of a solid support, such as micro-wells of a DNA chip. These hybridization formats are micro-scale versions of the conventional “dot blot” and “sandwich” hybridization systems.
The micro-formatted hybridization can be used to carry out “sequencing by hybridization” (SBH) (see M. Barinaga, 253 Science, p. 1489, 1991; W. Bains, 10 Bio/Technology, pp. 757-758, 1992). SBH makes use of all possible n-nucleotide oligomers (n-mers) to identify n-mers in an unknown DNA sample, which are subsequently aligned by algorithm analysis to produce the DNA sequence (see R. Drmanac and R. Crkvenjakov, Yugoslav Patent Application #570/87, 1987; R. Drmanac et al., 4 Genomics, 114, 1989; Strezoska et al., 88 Proc. Natl. Acad. Sci. USA 10089, 1992; and R. Drmanac and R. B. Crkvenjakov, U.S. Pat. No. 5,202,231, Apr. 13, 1993).
There are two formats for carrying out SBH. The first format involves creating an array of all possible n-mers on a support, which is then hybridized with the target sequence. The second format involves attaching the target sequence to a support, which is sequentially probed with all possible n-mers. Both formats have the fundamental problems of direct probe hybridizations and additional difficulties related to multiplex hybridizations.
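The principle of SBH described above, identifying the n-mers present in a target and aligning them by their overlaps, can be sketched in a few lines. This is an illustrative sketch only and not the algorithm of any cited reference: it assumes error-free hybridization data, that each n-mer occurs once in the target, and that the starting n-mer is known. The function names all_nmers, spectrum, and reconstruct are hypothetical.

```python
from itertools import product


def all_nmers(n, alphabet="ACGT"):
    """Every possible n-mer over the alphabet (4**n probes for DNA)."""
    return ["".join(p) for p in product(alphabet, repeat=n)]


def spectrum(sequence, n):
    """Set of n-mers that an ideal, error-free hybridization would detect."""
    return {sequence[i:i + n] for i in range(len(sequence) - n + 1)}


def reconstruct(nmers, start):
    """Chain n-mers that overlap by n-1 bases, using each n-mer exactly once.

    Depth-first search from the assumed-known starting n-mer. Returns one
    sequence consistent with the n-mer set, or None if no chain uses them all.
    """
    def extend(sequence, current, remaining):
        if not remaining:
            return sequence
        for candidate in sorted(remaining):
            if candidate[:-1] == current[1:]:
                result = extend(sequence + candidate[-1], candidate,
                                remaining - {candidate})
                if result is not None:
                    return result
        return None

    return extend(start, start, set(nmers) - {start})


if __name__ == "__main__":
    target = "ATGCAGT"               # hypothetical short target
    detected = spectrum(target, 3)   # 3-mers an ideal array would report
    print(len(all_nmers(3)), sorted(detected))
    print(reconstruct(detected, start="ATG"))
```

When an (n-1)-base overlap occurs more than once in the target, more than one chain can fit the same set of n-mers, so the sketch returns only one consistent sequence rather than a guaranteed unique answer.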
Southern (United Kingdom Patent Application GB 8810400, 1988; E. M. Southern et al., 13 Genomics 1008, 1992) proposed using the first format to analyze or sequence DNA. Southern identified a known single point mutation using PCR-amplified genomic DNA. Southern also described a method for synthesizing an array of oligonucleotides on a solid support for SBH. However, Southern did not address how to achieve optimal stringency conditions for each oligonucleotide on an array.
Drmanac et al. (260 Science 1649-1652, 1993) used the second format to sequence several short (116 bp) DNA sequences. Target DNAs were attached to membrane supports (“dot blot” format). Each filter was sequentially hybridized with 272 labeled 10-mer and 11-mer oligonucleotides. Wide ranges of stringency conditions were used to achieve specific hybridization for each n-mer probe. Washing times varied from 5 minutes to overnight, using temperatures from 0° C. to 16° C. Most probes required 3 hours of washing at 16° C. The filters had to be exposed from 2 to 18 hours in order to detect hybridization signals. The overall false positive hybridization rate was 5% in spite of the simple target sequences, the reduced set of oligomer probes, and the use of the most stringent conditions available.
Currently, a variety of methods are available for detection and analysis of the hybridization events. Depending on the reporter group (fluorophore, enzyme, radioisotope, etc.) used to label the DNA probe, detection and analysis are carried out fluorimetrically, colorimetrically, or by autoradiography. By observing and measuring emitted radiation, such as fluorescent radiation or particle emission, information may be obtained about the hybridization events. Even when detection methods have very high intrinsic sensitivity, detection of hybridization events is difficult because of the background presence of non-specifically bound materials. Thus, detection of hybridization events is dependent upon how specific and sensitive hybridization can be made. Concerning genetic analysis, several methods have been developed that have attempted to increase specificity and sensitivity.
One form of genetic analysis is analysis centered on elucidation of single nucleotide polymorphisms (“SNPs”). Factors favoring the usage of SNPs are their high abundance in the human genome (especially compared to short tandem repeats (STRs)), their frequent location within coding or regulatory regions of genes (which can affect protein structure or expression levels), and their stability when passed from one generation to the next (Landegren et al., Genome Research, Vol. 8, pp. 769-776, 1998).
A SNP is defined as any position in the genome that exists in two variants, where the most common variant occurs less than 99% of the time. In order to use SNPs as widespread genetic markers, it is crucial to be able to genotype them easily, quickly, accurately, and cost-effectively. It is of great interest to type both large sets of SNPs in order to investigate complex disorders where many loci factor into one disease (Risch and Merikangas, Science, Vol. 273, pp. 1516-1517, 1996), as well as small subsets of SNPs previously demonstrated to be associated with known afflictions.
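The frequency criterion in the definition above can be expressed as a short check. This is a minimal sketch under stated assumptions: the function name is_snp, the dictionary of observed allele counts, and the example counts are hypothetical illustrations, and the 0.99 threshold is simply the 99% figure given in the definition.

```python
def is_snp(allele_counts, threshold=0.99):
    """Apply the definition above to observed allele counts at one position.

    allele_counts maps each variant (e.g. "A", "G") to the number of times
    it was observed in a population sample (hypothetical input format).
    The position qualifies when at least two variants are observed and the
    most common variant accounts for less than `threshold` of observations.
    """
    observed = {allele: count for allele, count in allele_counts.items() if count > 0}
    total = sum(observed.values())
    if len(observed) < 2 or total == 0:
        return False
    major_frequency = max(observed.values()) / total
    return major_frequency < threshold


if __name__ == "__main__":
    print(is_snp({"A": 980, "G": 20}))   # major allele at 98%
    print(is_snp({"A": 995, "G": 5}))    # major allele at 99.5%
```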
Numerous techniques are currently available for typing SNPs (for review, see Landegren et al., Genome Research, Vol. 8, pp. 769-776, 1998), all of which require target amplification. They include direct sequencing (Carothers et al., BioTechniques, Vol. 7, pp. 494-499, 1989), single-strand conformation polymorphism (Orita et al., Proc. Natl. Acad. Sci. USA, Vol. 86, pp. 2766-2770, 1989), allele-specific amplification (Newton et al., Nucleic Acids Research, Vol. 17, pp. 2503-2516, 1989), restriction digestion (Day and Humphries, Analytical Biochemistry, Vol. 222, pp. 389-395, 1994), and hybridization assays. In their most basic form, hybridization assays function by discriminating short oligonucleotide reporters against matched and mismatched targets. Due to difficulty in determining optimal denaturation conditions, many adaptations to the basic protocol have been developed. These include ligation chain reaction (Wu and Wallace, Gene, Vol. 76, pp. 245-254, 1989) and minisequencing (Syvänen et al., Genomics, Vol. 8, pp. 684-692, 1990). Other enhancements include the use of the 5′-nuclease activity of Taq DNA polymerase (Holland et al., Proc. Natl. Acad. Sci. USA, Vol. 88, pp. 7276-7280, 1991), molecular beacons (Tyagi and Kramer, Nature Biotechnology, Vol. 14, pp. 303-308, 1996), heat denaturation curves (Howell et al., Nature Biotechnology, Vol. 17, pp. 87-88, 1999) and DNA “chips” (Wang et al., Science, Vol. 280, pp. 1077-1082, 1998). While each of these assays is functional, they are limited in their practical application in a clinical setting.
An additional phenomenon found to be useful in distinguishing SNPs is that of nucleic acid interaction energies, or base-stacking energies, derived from the hybridization of multiple target-specific probes to a single target (see R. Ornstein et al., “An Optimized Potential Function for the Calculation of Nucleic Acid Interaction Energies”, in Biopolymers, Vol. 17, 2341-2360 (1978); J. Norberg and L. Nilsson, Biophysical Journal, Vol. 74, pp. 394-402, (1998); and J. Pieters et al., Nucleic Acids Research, Vol. 17, no. 12, pp. 4551-4565 (1989)). This base-stacking phenomenon is used in a unique format in the current invention to provide highly sensitive Tm differentials allowing the direct detection of SNPs in a nucleic acid sample.
Prior to the format of the current invention, other methods have been used to distinguish nucleic acid sequences in related organisms or to sequence DNA. For example, U.S. Pat. No. 5,030,557 by Hogan et al. disclosed that the secondary and tertiary structure of a single stranded target nucleic acid may be affected by binding “helper” oligonucleotides in addition to “probe” oligonucleotides, causing a higher Tm to be exhibited between the probe and target nucleic acid. That application, however, was limited in its approach to using hybridization energies only for altering the secondary and tertiary structure of self-annealing RNA strands, which if left unaltered would tend to prevent the probe from hybridizing to the target.
With regard to DNA sequencing, K. Khrapko et al., Federation of European Biochemical Societies Letters, Vol. 256, no. 1,2, pp. 118-122 (1989), for example, disclosed that continuous stacking hybridization resulted in duplex stabilization. Additionally, J. Kieleczawa et al., Science, Vol. 258, pp. 1787-1791 (1992), disclosed the use of contiguous strings of hexamers to prime DNA synthesis wherein the contiguous strings appeared to stabilize priming. Likewise, L. Kotler et al., Proc. Natl. Acad. Sci. USA, Vol. 90, pp. 4241-4245, (1993) disclosed sequence specificity in the priming of DNA sequencing reactions by use of hexamer and pentamer oligonucleotide modules. Further, S. Parinov et al., Nucleic Acids Research, Vol. 24, no. 15, pp. 2998-3004, (1996), disclosed the use of base-stacking oligomers for DNA sequencing in association with passive DNA sequencing microchips. Moreover, G. Yershov et al., Proc. Natl. Acad. Sci. USA, Vol. 93, pp. 4913-4918 (1996), disclosed the application of base-stacking energies in SBH on a passive microchip. In Yershov's example, 10-mer DNA probes were anchored to the surface of the microchip and hybridized to target sequences in conjunction with additional short probes, the combination of which appeared to stabilize binding of the probes. In that format, short segments of nucleic acid sequence could be elucidated for DNA sequencing. Yershov further noted that in their system the destabilizing effect of mismatches was increased using shorter probes (e.g., 5-mers). Use of such short probes in DNA sequencing provided the ability to discern the presence of mismatches along the sequence being probed rather than just a single mismatch at one specified location of the probe/target hybridization complex. Use of longer probes (e.g., 8-mer, 10-mer, and 13-mer oligos) was less functional for such purposes.
An additional example of methodologies that have used base-stacking in the analysis of nucleic acids includes U.S. Pat. No. 5,770,365 by Lane et al., wherein is disclosed a method of capturing nucleic acid targets using a unimolecular capture probe having a single stranded loop and a double stranded region which acts in conjunction with a binding target to stabilize duplex formation by stacking energies.
Despite knowledge of the base-stacking phenomenon, the applications described above have not resulted in commercially acceptable methods or protocols for either DNA sequencing or the detection of SNPs for clinical purposes. We provide herein such a commercially useful method for making such distinctions in numerous genetic and medical applications by combining the use of base-stacking principles and electronically addressable microchip formats.