Molecular biology comprises a wide variety of techniques for the analysis of nucleic acids and proteins. Many of these techniques and procedures form the basis of clinical diagnostic assays and tests. These techniques include nucleic acid hybridization analysis, restriction enzyme analysis, genetic sequence analysis, and the separation and purification of nucleic acids and proteins (see, e.g., J. Sambrook, E. F. Fritsch, and T. Maniatis, Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989).
Most of these techniques involve carrying out numerous operations (e.g., pipetting, centrifugation, electrophoresis) on a large number of samples. They are often complex and time consuming, and generally require a high degree of accuracy. Many techniques are limited in their application by a lack of sensitivity, specificity, or reproducibility. For example, these problems have limited many diagnostic applications of nucleic acid hybridization analysis.
The complete process for carrying out a DNA hybridization analysis for a genetic or infectious disease is very involved. Broadly speaking, the complete process may be divided into a number of steps and substeps. In the case of genetic disease diagnosis, the first step involves obtaining the sample (blood or tissue). Depending on the type of sample, various pre-treatments are carried out. The second step involves disrupting or lysing the cells, which then release the crude DNA material along with other cellular constituents. Generally, several substeps are necessary to remove cell debris and to further purify the crude DNA. At this point several options exist for further processing and analysis. One option involves denaturing the purified sample DNA and carrying out a direct hybridization analysis in one of many formats (dot blot, microbead, microplate, etc.). A second option, called Southern blot hybridization, involves cleaving the DNA with restriction enzymes, separating the DNA fragments on an electrophoretic gel, blotting to a membrane filter, and then hybridizing the blot with specific DNA probe sequences. This procedure effectively reduces the complexity of the genomic DNA sample, and thereby helps to improve the hybridization specificity and sensitivity. Unfortunately, this procedure is long and arduous. A third option is to carry out an amplification procedure such as polymerase chain reaction (PCR), strand displacement amplification, or another method. These procedures amplify (increase) the number of target DNA sequences relative to non-target sequences. Amplification of target DNA helps to overcome problems related to complexity and sensitivity in genomic DNA analysis. After these sample preparation and DNA processing steps, the actual hybridization reaction is performed. Finally, detection and data analysis convert the hybridization event into an analytical result.
Nucleic acid hybridization analysis generally involves the detection of a very small number of specific target nucleic acids (DNA or RNA) with an excess of probe DNA, among a relatively large amount of complex non-target nucleic acids. The substeps of DNA complexity reduction in sample preparation have been utilized to help detect low copy numbers (i.e., 10,000 to 100,000) of nucleic acid targets. DNA complexity is overcome to some degree by amplification of target nucleic acid sequences using polymerase chain reaction (PCR) and other methods. (See M. A. Innis et al., PCR Protocols: A Guide to Methods and Applications, Academic Press, 1990; and, in regard to strand displacement amplification, Spargo et al., 1996, Molecular & Cellular Probes.) Amplification results in an enormous number of target nucleic acid sequences, which improves the subsequent direct probe hybridization step.
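The effect of amplification on the ratio of target to non-target sequences can be illustrated with a short calculation. The following is a minimal sketch only, assuming an idealized reaction in which each cycle multiplies the target count by (1 + efficiency) and the non-target background is not amplified. The function names, the 30-cycle figure, and the background of 1e9 non-target sequences are hypothetical values chosen for illustration and are not taken from the cited references.

```python
def amplified_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Target copies after `cycles` rounds of an idealized amplification.

    Assumption: every cycle multiplies the count by (1 + efficiency),
    so efficiency = 1.0 means perfect doubling per cycle.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


def target_fraction(target: float, non_target: float) -> float:
    """Fraction of all sequences in the sample that are target sequences."""
    return target / (target + non_target)


if __name__ == "__main__":
    non_target = 1e9      # hypothetical non-target background (not amplified)
    start = 10_000        # low copy number of the order mentioned in the text
    after = amplified_copies(start, cycles=30)  # hypothetical cycle count

    print(f"target fraction before amplification: {target_fraction(start, non_target):.2e}")
    print(f"target fraction after amplification:  {target_fraction(after, non_target):.6f}")
```

Under these assumptions the target goes from a small minority of the sample to the overwhelming majority, which is the sense in which amplification reduces the complexity problem for the subsequent hybridization step. Real reactions fall short of perfect doubling, so the figures are an upper bound.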
The actual hybridization reaction represents one of the most important and central steps in the whole process. The hybridization step involves placing the prepared DNA sample in contact with a specific reporter probe, at a set of optimal conditions for hybridization to occur to the target DNA sequence. Hybridization may be performed in any one of a number of formats. For example, multiple sample nucleic acid hybridization analysis has been conducted on a variety of filter and solid support formats (see G. A. Beltz et al., in Methods in Enzymology, Vol. 100, Part B, R. Wu, L. Grossman, K. Moldave, Eds., Academic Press, New York, Chapter 19, pp. 266-308, 1985). One format, the so-called "dot blot" hybridization, involves the non-covalent attachment of target DNAs to a filter, which are subsequently hybridized with a radioisotope-labeled probe(s). "Dot blot" hybridization gained widespread use, and many versions were developed (see M. L. M. Anderson and B. D. Young, in Nucleic Acid Hybridization - A Practical Approach, B. D. Hames and S. J. Higgins, Eds., IRL Press, Washington, D.C., Chapter 4, pp. 73-111, 1985). It has been developed for multiple analysis of genomic mutations (D. Nanibhushan and D. Rabin, in EPA 0228075, Jul. 8, 1987) and for the detection of overlapping clones and the construction of genomic maps (G. A. Evans, in U.S. Pat. No. 5,219,726, Jun. 15, 1993).
New techniques are being developed for carrying out multiple sample nucleic acid hybridization analysis on micro-formatted multiplex or matrix devices (e.g., DNA chips) (see M. Barinaga, 253 Science, pp. 1489, 1991; W. Bains, 10 Bio/Technology, pp. 757-758, 1992). These methods usually attach specific DNA sequences to very small specific areas of a solid support, such as micro-wells of a DNA chip. These hybridization formats are micro-scale versions of the conventional "dot blot" and "sandwich" hybridization systems.
Micro-formatted hybridization can be used to carry out "sequencing by hybridization" (SBH) (see M. Barinaga, 253 Science, pp. 1489, 1991; W. Bains, 10 Bio/Technology, pp. 757-758, 1992). SBH makes use of all possible n-nucleotide oligomers (n-mers) to identify n-mers in an unknown DNA sample, which are subsequently aligned by algorithm analysis to produce the DNA sequence (R. Drmanac and R. Crkvenjakov, Yugoslav Patent Application #570/87, 1987; R. Drmanac et al., 4 Genomics, 114, 1989; Strezoska et al., 88 Proc. Natl. Acad. Sci. USA 10089, 1992; and R. Drmanac and R. B. Crkvenjakov, U.S. Pat. No. 5,202,231, Apr. 13, 1993).
There are two formats for carrying out SBH. The first format involves creating an array of all possible n-mers on a support, which is then hybridized with the target sequence. The second format involves attaching the target sequence to a support, which is sequentially probed with all possible n-mers. Both formats have the fundamental problems of direct probe hybridizations and additional difficulties related to multiplex hybridizations.
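The principle of SBH, in which the n-mers detected in a sample are aligned by their overlaps to recover the sequence, can be sketched in a few lines of Python. This is an idealized illustration and not the algorithm of any cited reference. It assumes error-free hybridization (no false positives or false negatives) and a target in which every (n-1)-base overlap occurs only once, so that the chain of n-mers is unambiguous. All function names and the example target are hypothetical.

```python
from itertools import product


def all_nmers(n: int, alphabet: str = "ACGT") -> list[str]:
    """Every possible n-mer, i.e. the full probe set of an SBH array."""
    return ["".join(p) for p in product(alphabet, repeat=n)]


def spectrum(target: str, n: int) -> set[str]:
    """The n-mers present in the target: what an ideal, error-free array reports."""
    return {target[i:i + n] for i in range(len(target) - n + 1)}


def reconstruct(spec: set[str], n: int) -> str:
    """Rebuild a sequence by chaining n-mers that overlap by n-1 bases.

    Raises ValueError when the spectrum does not determine a single linear chain.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    by_prefix: dict[str, list[str]] = {}
    for nmer in spec:
        by_prefix.setdefault(nmer[:-1], []).append(nmer)
    suffixes = {nmer[1:] for nmer in spec}

    # The first n-mer is the only one whose prefix is not another n-mer's suffix.
    starts = [nmer for nmer in spec if nmer[:-1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("spectrum has no unique starting n-mer")

    sequence = starts[0]
    used = {starts[0]}
    while True:
        candidates = by_prefix.get(sequence[-(n - 1):], [])
        if not candidates:
            break
        if len(candidates) > 1 or candidates[0] in used:
            raise ValueError("ambiguous overlap; sequence cannot be resolved")
        used.add(candidates[0])
        sequence += candidates[0][-1]

    if used != spec:
        raise ValueError("some n-mers could not be placed in the chain")
    return sequence


if __name__ == "__main__":
    target = "ATGGCGTGCA"                      # hypothetical 10-base target
    n = 4
    probes = all_nmers(n)                      # 4**4 = 256 probes on the array
    detected = {p for p in probes if p in spectrum(target, n)}
    print(len(probes), "probes,", len(detected), "positive")
    print(reconstruct(detected, n))
```

The sketch also makes the practical difficulty visible: the number of probes grows as 4^n, and a single false positive or false negative hybridization breaks or forks the chain, which is why stringency control for every probe matters.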
Southern (United Kingdom Patent Application GB 8810400, 1988; E. M. Southern et al., 13 Genomics 1008, 1992) proposed using the first format to analyze or sequence DNA. Southern identified a known single point mutation using PCR-amplified genomic DNA. Southern also described a method for synthesizing an array of oligonucleotides on a solid support for SBH. However, Southern did not address how to achieve optimal stringency conditions for each oligonucleotide on an array.
Concurrently, Drmanac et al., 260 Science 1649-1652, 1993, used the second format to sequence several short (116 bp) DNA sequences. Target DNAs were attached to membrane supports ("dot blot" format). Each filter was sequentially hybridized with 272 labeled 10-mer and 11-mer oligonucleotides. A wide range of stringency conditions was used to achieve specific hybridization for each n-mer probe; washing times varied from 5 minutes to overnight, and temperatures from 0° C to 16° C. Most probes required 3 hours of washing at 16° C. The filters had to be exposed for 2 to 18 hours in order to detect hybridization signals. The overall false positive hybridization rate was 5% in spite of the simple target sequences, the reduced set of oligomer probes, and the use of the most stringent conditions available.
A variety of methods exist for detection and analysis of the hybridization events. Depending on the reporter group (fluorophore, enzyme, radioisotope, etc.) used to label the DNA probe, detection and analysis are carried out fluorimetrically, colorimetrically, or by autoradiography. By observing and measuring emitted radiation, such as fluorescent radiation or particle emission, information may be obtained about the hybridization events. Even when detection methods have very high intrinsic sensitivity, detection of hybridization events is difficult because of the background presence of non-specifically bound materials. A number of other factors also reduce the sensitivity and selectivity of DNA hybridization assays.
One form of genetic analysis consists of determining the nature of relatively short repeating sequences within a gene sequence. Short tandem repeats (STRs) have been identified as a useful tool in forensics and in other areas such as paternity testing, tumor detection (D. Sidransky), genetic disease, and animal breeding. Indeed, the United States Federal Bureau of Investigation has announced that it is considering the use of short tandem repeat sequences for forensic purposes (Dr. Bruce Budowle, DNA Forensics, Science, Evidence and Future Prospects, McLean, Va., Nov. 1997).
Various proposals have been made for identifying, amplifying, detecting and using polymorphic repeat sequences. For example, Tautz, PCT WO90/04040-PCT/EP98/01203, in an application entitled "Process for the Analysis of Length Polymorphisms in DNA Regions" (translated from German), discloses a process for the analysis of length polymorphisms in regions of simple or cryptically simple DNA sequences. Tautz discloses a method which includes the steps of: adding at least one primer pair to the DNA that is to be analyzed, wherein one of the molecules of the primer pair is substantially complementary to the complementary strands of the 5' respectively 3' flank of a simple or cryptically simple DNA sequence, and wherein the addition takes place in an orientation such that the synthesis products obtained from a primer-controlled polymerization reaction with one of the two primers can be used, following denaturation, as matrices for the addition of the other primer; performing a primer-controlled polymerization reaction; and separating the products, such as by normal gel electrophoresis, and analyzing the polymerase chain reaction products.
Caskey et al. at the Baylor College of Medicine also detected polymorphisms in a short tandem repeat by performing DNA profiling assays. Caskey et al., U.S. Pat. No. 5,364,759, issued Nov. 15, 1994, entitled "DNA Typing With Short Tandem Repeat Polymorphisms and Identification of Polymorphic Short Tandem Repeats", discloses a method including the steps of extracting DNA from a sample to be tested, amplifying the extracted DNA, and identifying the amplified extension products for each different sequence. Caskey required that each different sequence be differentially labeled. A physical separation was performed utilizing electrophoresis.
C. R. Cantor and others more recently disclosed a technique for scoring short tandem DNA repeats. The method is disclosed in Yarr, R. et al., "In Situ Detection of Tandem DNA Repeat Length", Genetic Analysis: Biomolecular Engineering, 13 (1996) 113-118, and PCT Application WO96/36731, PCT/US96/06527, entitled "Nucleic Acid Detection Methods". These disclose hybridization of an oligonucleotide target containing tandem repeats embedded in a unique sequence with a set of complementary probes containing tandem repeats of known lengths. Single-stranded loop structures result in duplexes containing a mismatched (defined there to be a different) number of tandem repeats. When a matched (defined there to be identical) number of tandem repeats existed on the duplex, no loop structure formed. The loop structures were digested with a single-strand-specific nuclease. Differential wavelengths, such as from differentially colored fluorophores on the probes of various lengths, identified where matched sites existed. No express use of electrophoretic separation was required in accordance with this method.
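The scoring logic of this loop-based approach can be modeled very simply. The sketch below is an abstraction for illustration only and assumes an idealized assay: a duplex forms a single-stranded loop whose size is the difference in repeat count times the repeat unit length, the nuclease removes every duplex that carries a loop, and only the probe with a matched repeat count remains to report its label. The `Probe` class, the function names, the colors, and the repeat counts are hypothetical and do not come from the cited publications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    repeats: int   # known number of tandem repeats in the probe
    color: str     # hypothetical fluorophore label distinguishing this probe


def loop_bases(target_repeats: int, probe: Probe, unit_len: int) -> int:
    """Unpaired bases looped out of the duplex; 0 means the repeat counts match."""
    return abs(target_repeats - probe.repeats) * unit_len


def surviving_probes(target_repeats: int, probes: list[Probe], unit_len: int) -> list[Probe]:
    """Probes left after an idealized nuclease digests every looped duplex."""
    return [p for p in probes if loop_bases(target_repeats, p, unit_len) == 0]


if __name__ == "__main__":
    probe_set = [
        Probe(5, "blue"),
        Probe(6, "green"),
        Probe(7, "red"),
        Probe(8, "yellow"),
    ]
    # Hypothetical target allele with 7 copies of a 4-base repeat unit.
    for probe in surviving_probes(7, probe_set, unit_len=4):
        print(f"matched repeat count: {probe.repeats} (signal color: {probe.color})")
```

In this model the observed color identifies the repeat count directly, which is why no electrophoretic separation is needed. A real assay would also have to contend with incomplete digestion and non-specific binding, which the sketch ignores.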
Although polymorphism in repeat units has been known for approximately 15 years, and its desirability for application in forensics and genetic testing is well recognized, commercially acceptable implementations have yet to be achieved.