The methods and apparatus of these inventions relate to systems for genetic identification for disease state identification. More particularly, the methods and apparatus relate to systems for the detection of repeat unit states, such as the number of short tandem repeat units for the identification of individuals such as in a forensic or paternity sense, or for determination of disease states, such as for clonal tumor detection.
Molecular biology comprises a wide variety of techniques for the analysis of nucleic acid and protein. Many of these techniques and procedures form the basis of clinical diagnostic assays and tests. These techniques include nucleic acid hybridization analysis, restriction enzyme analysis, genetic sequence analysis, and the separation and purification of nucleic acids and proteins (See, e.g., J. Sambrook, E. F. Fritsch, and T. Maniatis, Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989).
Most of these techniques involve carrying out numerous operations (e.g., pipetting, centrifugation, electrophoresis) on a large number of samples. They are often complex and time consuming, and generally require a high degree of accuracy. Many a technique is limited in its application by a lack of sensitivity, specificity, or reproducibility. For example, these problems have limited many diagnostic applications of nucleic acid hybridization analysis.
The complete process for carrying out a DNA hybridization analysis for a genetic or infectious disease is very involved. Broadly speaking, the complete process may be divided into a number of steps and substeps. In the case of genetic disease diagnosis, the first step involves obtaining the sample (blood or tissue). Depending on the type of sample, various pre-treatments would be carried out. The second step involves disrupting or lysing the cells, which then release the crude DNA material along with other cellular constituents. Generally, several sub-steps are necessary to remove cell debris and to purify further the crude DNA. At this point several options exist for further processing and analysis. One option involves denaturing the purified sample DNA and carrying out a direct hybridization analysis in one of many formats (dot blot, microbead, microplate, etc.). A second option, called Southern blot hybridization, involves cleaving the DNA with restriction enzymes, separating the DNA fragments on an electrophoretic gel, blotting to a membrane filter, and then hybridizing the blot with specific DNA probe sequences. This procedure effectively reduces the complexity of the genomic DNA sample, and thereby helps to improve the hybridization specificity and sensitivity. Unfortunately, this procedure is long and arduous. A third option is to carry out an amplification procedure such as polymerase chain reaction (PCR), strand displacement amplification or other method. These procedures amplify (increase) the number of target DNA sequences relative to non-target sequences. Amplification of target DNA helps to overcome problems related to complexity and sensitivity in genomic DNA analysis. After these sample preparation and DNA processing steps, the actual hybridization reaction is performed. Finally, detection and data analysis convert the hybridization event into an analytical result.
Nucleic acid hybridization analysis generally involves the detection of a very small number of specific target nucleic acids (DNA or RNA) with an excess of probe DNA, among a relatively large amount of complex non-target nucleic acids. The substeps of DNA complexity reduction in sample preparation have been utilized to help detect low copy numbers (i.e. 10,000 to 100,000) of nucleic acid targets. DNA complexity is overcome to some degree by amplification of target nucleic acid sequences using polymerase chain reaction (PCR) and other methods. (See M. A. Innis et al., PCR Protocols: A Guide to Methods and Applications, Academic Press, 1990; Spargo et al., 1996, Molecular and Cellular Probes, in regard to SDA amplification). Amplification yields an enormous number of target nucleic acid sequences, which improves the subsequent direct probe hybridization step.
The actual hybridization reaction represents one of the most important and central steps in the whole process. The hybridization step involves placing the prepared DNA sample in contact with a specific reporter probe, at a set of optimal conditions for hybridization to occur to the target DNA sequence. Hybridization may be performed in any one of a number of formats. For example, multiple sample nucleic acid hybridization analysis has been conducted on a variety of filter and solid support formats (See G. A. Beltz et al., in Methods in Enzymology, Vol. 100, Part B, R. Wu, L. Grossman, K. Moldave, Eds., Academic Press, New York, Chapter 19, pp. 266-308, 1985). One format, the so-called "dot blot" hybridization, involves the non-covalent attachment of target DNAs to a filter, which are subsequently hybridized with a radioisotope labeled probe(s). "Dot blot" hybridization gained wide-spread use, and many versions were developed (see M. L. M. Anderson and B. D. Young, in Nucleic Acid Hybridization: A Practical Approach, B. D. Hames and S. J. Higgins, Eds., IRL Press, Washington, D.C. Chapter 4, pp. 73-111, 1985). It has been developed for multiple analysis of genomic mutations (D. Nanibhushan and D. Rabin, in EPA 0228075, Jul. 8, 1987) and for the detection of overlapping clones and the construction of genomic maps (G. A. Evans, in U.S. Pat. No. 5,219,726, Jun. 15, 1993).
New techniques are being developed for carrying out multiple sample nucleic acid hybridization analysis on micro-formatted multiplex or matrix devices (e.g., DNA chips) (see M. Barinaga, 253 Science, pp. 1489, 1991; W. Bains, 10 Bio/Technology, pp. 757-758, 1992). These methods usually attach specific DNA sequences to very small specific areas of a solid support, such as micro-wells of a DNA chip. These hybridization formats are micro-scale versions of the conventional "dot blot" and "sandwich" hybridization systems.
The micro-formatted hybridization can be used to carry out "sequencing by hybridization" (SBH) (see M. Barinaga, 253 Science, pp. 1489, 1991; W. Bains, 10 Bio/Technology, pp. 757-758, 1992). SBH makes use of all possible n-nucleotide oligomers (n-mers) to identify n-mers in an unknown DNA sample, which are subsequently aligned by algorithm analysis to produce the DNA sequence (R. Drmanac and R. Crkvenjakov, Yugoslav Patent Application #570/87, 1987; R. Drmanac et al., 4 Genomics, 114, 1989; Strezoska et al., 88 Proc. Natl. Acad. Sci. USA 10089, 1992; and R. Drmanac and R. B. Crkvenjakov, U.S. Pat. No. 5,202,231, Apr. 13, 1993).
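By way of illustration only, the SBH principle described above (identify the n-mers present in a sample, then align them by overlap) may be sketched in Python as follows. This is a simplified sketch under stated assumptions, not the algorithm of any cited reference: the function names `spectrum` and `reconstruct` are hypothetical, hybridization is assumed to be error-free, the starting n-mer is assumed known, n is assumed to be at least 2, and every (n-1)-base overlap is assumed to be unique.

```python
def spectrum(seq, n):
    """Return the set of all n-mers present in seq (idealized, error-free hybridization)."""
    return {seq[i:i + n] for i in range(len(seq) - n + 1)}


def reconstruct(nmers, start):
    """Rebuild a sequence by repeatedly extending with the one n-mer that
    overlaps the current end by n-1 bases. Raises ValueError when the
    overlap is missing or ambiguous (hypothetical helper; assumes n >= 2)."""
    n = len(start)
    remaining = set(nmers) - {start}
    seq = start
    while remaining:
        suffix = seq[-(n - 1):]
        candidates = [k for k in remaining if k.startswith(suffix)]
        if len(candidates) != 1:
            raise ValueError("missing or ambiguous overlap; cannot align n-mers")
        seq += candidates[0][-1]          # append the single new base
        remaining.remove(candidates[0])
    return seq


if __name__ == "__main__":
    nmers = spectrum("ATGCGTAC", 3)
    print(sorted(nmers))
    print(reconstruct(nmers, "ATG"))
```

In this idealized form the set of n-mers records only which n-mers are present, not how many times each occurs.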
There are two formats for carrying out SBH. The first format involves creating an array of all possible n-mers on a support, which is then hybridized with the target sequence. The second format involves attaching the target sequence to a support, which is sequentially probed with all possible n-mers. Both formats have the fundamental problems of direct probe hybridizations and additional difficulties related to multiplex hybridizations.
Southern, United Kingdom Patent Application GB 8810400, 1988; E. M. Southern et al., 13 Genomics 1008, 1992, proposed using the first format to analyze or sequence DNA. Southern identified a known single point mutation using PCR amplified genomic DNA. Southern also described a method for synthesizing an array of oligonucleotides on a solid support for SBH. However, Southern did not address how to achieve optimal stringency condition for each oligonucleotide on an array.
Concurrently, Drmanac et al., 260 Science 1649-1652, 1993, used the second format to sequence several short (116 bp) DNA sequences. Target DNAs were attached to membrane supports ("dot blot" format). Each filter was sequentially hybridized with 272 labeled 10-mer and 11-mer oligonucleotides. A wide range of stringency conditions was used to achieve specific hybridization for each n-mer probe; washing times varied from 5 minutes to overnight, and temperatures from 0° C to 16° C. Most probes required 3 hours of washing at 16° C. The filters had to be exposed for 2 to 18 hours in order to detect hybridization signals. The overall false positive hybridization rate was 5% in spite of the simple target sequences, the reduced set of oligomer probes, and the use of the most stringent conditions available.
A variety of methods exist for detection and analysis of the hybridization events. Depending on the reporter group (fluorophore, enzyme, radioisotope, etc.) used to label the DNA probe, detection and analysis are carried out fluorimetrically, colorimetrically, or by autoradiography. By observing and measuring emitted radiation, such as fluorescent radiation or particle emission, information may be obtained about the hybridization events. Even when detection methods have very high intrinsic sensitivity, detection of hybridization events is difficult because of the background presence of non-specifically bound materials. A number of other factors also reduce the sensitivity and selectivity of DNA hybridization assays.
One form of genetic analysis consists of determining the nature of relatively short repeating sequences within a gene sequence. Short tandem repeats (STRs) have been identified as a useful tool in both forensics and in other areas (paternity testing, tumor detection, D. Sidransky, genetic disease, animal breeding). Indeed, the United States Federal Bureau of Investigation has announced that it is considering the use of short tandem repeat sequences for forensic purposes. (Dr. Bruce Budowle, DNA Forensics, Science, Evidence and Future Prospects, McLean, Va. November, 1997).
Various proposals have been made for identifying, amplifying, detecting and using polymorphic repeat sequences. For example, Tautz, PCT WO90/04040-PCT/EP98/01203, in an application entitled "Process for the Analysis of Length Polymorphisms in DNA Regions" (translated from German), discloses a process for the analysis of length polymorphisms in regions of simple or cryptically simple DNA sequences. Tautz discloses a method which includes the steps of adding at least one primer pair to the DNA that is to be analyzed, wherein one of the molecules of the primer pair is substantially complementary to the complementary strands of the 5′ or, respectively, the 3′ flank of a simple or cryptically simple DNA sequence, and wherein the addition takes place in an orientation such that the synthesis products obtained from a primer-controlled polymerization reaction with one of the two primers can be used, following denaturation, as matrices for the addition of the other primer; performing a primer-controlled polymerization reaction; and separating the products, such as by normal gel electrophoresis, and analyzing the polymerase chain reaction products.
Caskey et al. at the Baylor College of Medicine also detected polymorphisms in a short tandem repeat by performing DNA profiling assays. Caskey et al., U.S. Pat. No. 5,364,759, issued Nov. 15, 1994, entitled "DNA Typing With Short Tandem Repeat Polymorphisms and Identification of Polymorphic Short Tandem Repeats", discloses a method including the steps of extracting DNA from a sample to be tested, amplifying the extracted DNA and identifying the amplified extension products for each different sequence. Caskey required that each different sequence be differentially labeled. A physical separation was performed utilizing electrophoresis.
C. R. Cantor and others more recently disclosed a technique for scoring short tandem DNA repeats. The method is disclosed in Yaar, R. et al., "In Situ Detection of Tandem DNA Repeat Length", Genetic Analysis: Biomolecular Engineering, 13 (1996) 113-118, and PCT Application WO96/36731, PCT/US96/06527, entitled "Nucleic Acid Detection Methods". These disclose hybridization of an oligonucleotide target containing tandem repeats embedded in a unique sequence with a set of complementary probes containing tandem repeats of known lengths. Single-stranded loop structures result in duplexes containing a mismatched (defined there to be a different) number of tandem repeats. When a matched (defined there to be identical) number of tandem repeats existed on the duplex, no loop structure formed. The loop structures were digested with a single-stranded nuclease. Differential wavelength, such as through differentially colored fluorophores on the various length probes, identified where matched sites existed. No express use of electrophoretic separation was required in accordance with this method.
Despite the knowledge of the existence of polymorphism in repeat units now for approximately 15 years, as well as their known desirability for application in forensics and genetic testing, commercially acceptable implementations have yet to be achieved.
Methods and apparatus are provided for the analysis and determination of the nature of repeat units in a genetic target. In one method of this invention, the nature of the repeat units in the genetic target is determined by the steps of providing a plurality of hybridization complex assays arrayed on a plurality of test sites, where the hybridization complex assay includes at least a nucleic acid target containing a simple repetitive DNA sequence, a capture probe having a first unique flanking sequence and n repeat units, where n=0, 1, 2 . . . , being complementary to the target sequence, and a reporter probe having a selected sequence complementary to the same target sequence strand wherein the selected sequence of the reporter includes a second unique flanking sequence and m repeat units, where m=0, 1, 2 . . . , but where the sum of repeat units in the capture probe plus reporter probe is greater than 0 (n+m greater than 0). In accordance with this method, the sequence of the capture probe differs at at least two test sites. The hybridization complex assays are then monitored to determine concordance and discordance among the hybridization complex assays at the test sites as determined at least in part by hybridization stability. Ultimately, the nature of the repeat units in the target sequence may be determined based upon the concordant/discordant determination coupled with knowledge of the probes located in the hybridization complex at that site.
By way of example, in implementation of this method, assume that a target contains six repeat units. In a system simplified merely for expository convenience, the plurality of hybridization complex assays might be three assays arrayed on an APEX type bioelectronic system, wherein a first assay includes a capture probe having four repeat units (n=4), the second assay has a capture probe with five repeat units (n=5) and the third assay has capture probes with six repeat units (n=6). If the reporter probe is selected to have one repeat unit (m=1), the total number of repeat units at the first assay will be five (n+m=4+1=5), the total number of repeat units at the second assay will equal six (n+m=5+1=6), and the total number of repeat units at the third assay will equal seven (n+m=6+1=7). The second test site will be the concordant test site since the number of repeat units in the target in this case equals the number of repeat units in the capture plus the reporter probes, that is, it is the test site with six repeat units both in the target and in the combination of the capture probe and the reporter probe. Utilizing the knowledge regarding probe placement, the second test site is known to include a capture probe having five repeat units (n=5), such that when coupled with the knowledge of the reporter probe including one repeat unit, the total number of six repeat units in the target is determined.
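The arithmetic of the foregoing example may be sketched in Python as follows. This is an illustrative sketch only: the function names `find_concordant_sites` and `infer_target_repeats` are hypothetical, test sites are numbered from 1, and the first function merely simulates which site would be concordant for an assumed target, whereas in practice the concordant site is determined experimentally from hybridization stability.

```python
def find_concordant_sites(target_repeats, capture_repeats, reporter_repeats):
    """Simulate which test sites (numbered from 1) satisfy n + m == target repeats.

    capture_repeats lists n for each test site in order; reporter_repeats is m.
    """
    return [
        site
        for site, n in enumerate(capture_repeats, start=1)
        if n + reporter_repeats == target_repeats
    ]


def infer_target_repeats(capture_repeats, reporter_repeats, concordant_site):
    """Given the observed concordant site, recover the target repeat number as n + m."""
    return capture_repeats[concordant_site - 1] + reporter_repeats


if __name__ == "__main__":
    captures = [4, 5, 6]   # n at the first, second and third test sites
    reporter = 1           # m
    sites = find_concordant_sites(6, captures, reporter)
    print(sites)
    print(infer_target_repeats(captures, reporter, sites[0]))
```

With capture probes of four, five and six repeat units and a one-repeat reporter, only the second site sums to six, and reading the probe identities back from that site recovers the six repeat units of the target.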
In the preferred embodiment of these inventions, electronically aided hybridization or concordance and discordance determination, or both, are utilized in the process. In one aspect, during the hybridization of the nucleic acid target with the capture probe and/or the reporter probe, electronic stringent conditions may be utilized, preferably along with other stringency affecting conditions, to aid in the hybridization. This technique is particularly advantageous to reduce or eliminate slippage hybridization among repeat units, and to promote more effective hybridization. In yet another aspect, electronic stringency conditions may be varied during the hybridization complex stability determination so as to more accurately or quickly determine the state of concordance or discordance.
In yet another aspect of this invention, a method is provided for the determination of the nature of the repeat units in a genetic target by providing a bioelectronic device including a set of probes arrayed at a set of test sites, the probes having a first unique flanking sequence, a second unique flanking sequence, and an intervening repeat unit series having variable numbers of repeat units. The target is hybridized with the set of probes at the set of test sites, under electronic stringency hybridization conditions, and the concordance/discordance at the test sites is then determined. In the preferred embodiment, the concordance/discordance is determined at least in part through the use of electronic hybridization stability determinations. The concordant test site indicates which probe includes the number of repeat units identical to that in the target. In a variation of this embodiment, electronic stringency control is utilized only during the concordance/discordance determination.
In yet another aspect of this invention, methods and apparatus are provided for the determination of target alleles which vary in size in a sample. A platform is provided for the identification of target alleles which includes probes selected from the group consisting of (i) a probe having a first unique flanking sequence, an intervening repeat region and a second unique flanking sequence, and (ii) a sandwich assay comprising a capture probe having a first unique flanking sequence and 0,1,2 . . . repeat units and a reporter probe having 0,1,2 . . . repeat units in sequence with a second unique flanking sequence. Thereafter, the target is hybridized with the probes, preferably under electronic stringent conditions so as to aid in proper indexing, or alternatively, utilizing electronic stringency conditions during subsequent steps, or using electronic stringency both during hybridization and at later steps, thereafter determining concordance and discordance at the test sites as determined at least in part by hybridization stability.
In one aspect of the inventions, the location of the concordant test site represents the nature of the target sequence repeat units by the number of repeat units present in the target, and that in turn is based upon the knowledge of the probes located at that test site. Namely, the particular probes associated with a given physical test site typically will be known in terms of their sequence, especially including the number of repeat units, and the physical position of those test sites results in a knowledge for the concordant sites of the nature of the target, especially the number of repeat units. Typically, at a concordant test site, the number of repeat units in the target equals the sum of the number of repeat units in the capture probe and the number of repeat units in the reporter probe.
One advantageous aspect of the inventions is that the methods and apparatus are effective in determining the presence of microvariants in the target sequence. Such microvariants may include one or more deletions, insertions, transitions and/or transversions. These may be for a single base or for more than a single base. Deletions or insertions within repeat units can be detected by gel separation methods when using highly controlled conditions. This requires single base resolution and is near the limit of detection for most gel separation techniques. For transitional or transversional mutations, the size of the allele does not change, even though the sequence has become altered. Conventional gel sieving methods have a very difficult time detecting these types of mutations, and recent findings by other investigators (Sean Walsh, Dennis Roeder, DNA Forensics: Science, Evidence and Future Prospects, McLean, Va. November, 1997) suggest that transitional and transversional mutations can cause subtle anomalies that make gel analysis difficult and sometimes obscure STR analysis. Our method is a hybridization technique and is quite adept at reliably detecting single nucleotide polymorphisms as described above. Additionally, by designing specific capture and reporter oligonucleotides, these assays can be done on the same platform used to discriminate the nature of STR alleles by repeat unit number. The general strategy of designing capture oligos for microvariant analysis is the same as it is for integral repeat units; however, reporter oligos may differ in that they may or may not contain unique flanking sequence. The condition of effectively determining concordance by maximizing the hybridization complex stability remains, since oligo design parameters which yield base stacking (as described above) are still followed.
In yet another aspect of these inventions, various additional steps may be utilized in order to promote distinguishing concordant and discordant test sites. One mode of concordance may be that in which there is a complementary match of bases in the hybridization complex including the capture, reporter and target in the sandwich assay format. In yet another highly advantageous arrangement, the use of juxtaposed terminal nucleotides of the reporter and capture may be utilized, wherein their contiguous nature permits interaction, such as base stacking. Advantageously, the juxtaposed terminal nucleotide identities may be selected, as allowed by the existing repeat unit or otherwise relevant sequence, so as to increase the energy difference between concordance and discordance. It has been reported that base stacking between different bases varies in stability through an approximately 4-fold range (Saenger, Principles of Nucleic Acid Structure, 1984, Springer-Verlag, New York, N.Y.). Experimental results have shown at least a ten-fold, and often more than twenty-fold, improvement in discrimination ratios for the pairings 5′G-A3′ versus 5′T-A3′, when analyzed in our system. While this result is generally in concert with the published findings that 5′G-A3′ base stacking provides greater stability than 5′T-A3′ pairs, the differential stability increase seen with our assay greatly exceeds the reported values. It is highly beneficial that this invention exploits this natural condition to provide a superior assay advantage. In yet other embodiments, the terminal nucleotides may be modified to increase base stacking effects, such as with the addition of propynyl groups, methyl groups or cholesterol groups. In yet another related aspect, ligation techniques may be utilized, such as enzyme ligation or chemical ligation, so as to increase the energy difference between a concordant and discordant site.
Discordance may be manifested in various ways, such as in the sandwich assay format wherein a gap or overlap exists, or in the loop out method where a loop out exists. Further, discordance may exist in the repeat region where there is a base variation, such as a deletion, insertion, transition and/or transversion.
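For the sandwich assay format, the distinction between concordance, a gap and an overlap may be sketched in Python as follows. The sketch rests on an assumption not stated expressly above, namely that a gap arises when the capture and reporter probes together carry fewer repeat units than the target, and an overlap when they carry more. The function name `classify_site` is hypothetical, and base variations within the repeat region (deletions, insertions, transitions, transversions) are not modeled.

```python
def classify_site(target_repeats, capture_repeats, reporter_repeats):
    """Classify one sandwich-assay test site by comparing n + m with the target.

    Assumption for illustration: n + m < target leaves a gap between the
    probes on the target; n + m > target forces the probes to overlap.
    """
    total = capture_repeats + reporter_repeats
    if total == target_repeats:
        return "concordant"
    if total < target_repeats:
        return "discordant (gap)"
    return "discordant (overlap)"


if __name__ == "__main__":
    # Target with six repeat units, reporter with one repeat unit (m=1).
    for n in (4, 5, 6):
        print(n, classify_site(6, n, 1))
```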
In distinguishing concordant and discordant test sites, the distinction is preferably drawn in part based on hybridization stability. Hybridization stability may be influenced by numerous factors, including thermoregulation, chemical regulation, as well as electronic stringency control, either alone or in combination with the other listed factors. Through the use of electronic stringency conditions, in either or both of the target hybridization step or the reporter oligonucleotide stringency step, rapid completion of the process may be achieved. Electronic stringency hybridization of the target is one distinctive aspect of this method since it is amenable with double stranded DNA and results in rapid and precise hybridization of the target to the capture. This is desirable to achieve properly indexed hybridization of the target DNA to attain the maximum number of molecules at a test site with an accurate hybridization complex. By way of example, with the use of electronic stringency, the initial hybridization step may be completed in ten minutes or less, more preferably five minutes or less, and most preferably one minute or less. Overall, the analytical process may be completed in less than half an hour.
As to detection of the hybridization complex, it is preferred that the complex is labeled. Typically, in the step of determining concordance and discordance, there is a detection of the amount of labeled hybridization complex at the test site or a portion thereof. Any mode or modality of detection consistent with the purpose and functionality of the invention may be utilized, such as optical imaging, electronic imaging, use of charge coupled devices or other methods of quantification. Labeling may be of the target, capture or reporter. Labeling may be by fluorescent labeling, colorimetric labeling or chemiluminescent labeling. In yet another implementation, detection may be via energy transfer between molecules in the hybridization complex. In yet another aspect, the detection may be via fluorescence perturbation analysis. In another aspect the detection may be via conductivity differences between concordant and discordant sites.
In yet another aspect of these inventions, a redundant assay may be conveniently performed. In one implementation, a serial redundant assay may be utilized, such as where after an initial hybridization complex assay is performed, the stringency conditions are increased so as to effect denaturation, thereby removing the reporter from the first hybridization complex assay. A second reporter may then be hybridized to the remaining complex target and capture probe, wherein the second reporter includes a number of repeat units which differs from the number or type of repeat units in the first reporter. In this way, through the practice of the other steps as described for other applications, the physical test site at which concordance exists will have moved. The result is that a redundant assay has been performed on the same device and sample material.
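The way in which the concordant test site moves in a serial redundant assay, while the inferred repeat number stays the same, may be sketched in Python as follows. This is a sketch under stated assumptions: the function names `concordant_site` and `serial_redundant_call` are hypothetical, test sites are numbered from 1, the two reporters are assumed to differ only in their number of repeat units, and the concordant sites are simulated rather than measured.

```python
def concordant_site(target_repeats, capture_repeats, reporter_repeats):
    """Simulate the test site (numbered from 1) at which n + m equals the target, or None."""
    for site, n in enumerate(capture_repeats, start=1):
        if n + reporter_repeats == target_repeats:
            return site
    return None


def serial_redundant_call(capture_repeats, observations):
    """Combine (reporter_repeats, concordant_site) pairs from successive rounds.

    Returns the target repeat number if every round agrees, otherwise None.
    """
    calls = {capture_repeats[site - 1] + m for m, site in observations}
    return calls.pop() if len(calls) == 1 else None


if __name__ == "__main__":
    captures = [4, 5, 6]
    first = concordant_site(6, captures, 1)    # first reporter, m=1
    second = concordant_site(6, captures, 2)   # second reporter, m=2
    print(first, second)
    print(serial_redundant_call(captures, [(1, first), (2, second)]))
```

In this simulation the concordant site shifts from the second site to the first when the reporter is changed from one repeat unit to two, and both rounds yield the same repeat number for the target.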
Yet another redundant assay may be performed wherein multiple, e.g., two or more, independent sets of assays exist. A first reporter is hybridized to a first set of assays, and a second reporter is hybridized to a second set of assays, wherein the number of repeat units in the first reporter differs from the number or nature of repeat units in the second reporter. Determination of concordance/discordance at the test sites of the arrays, when coupled with the knowledge of the probes located at those test sites, provides two complexes from the hybridization assays for confirmation of the target repeat number or nature.
The systems and methods of these inventions are particularly useful for determining the nature of complex samples, such as heterozygous samples, and mixed samples such as those from multiple sources or donors. The methods and systems of these inventions may be utilized for a broad array of applications. Among them is identification, such as for paternity testing or for other forensic use. Yet another application is in disease diagnostics, such as for the identification of the existence of a clonal tumor, where the tumor includes repeat units of a nature or number different than the patient's undiseased genetic state.
Accordingly, it is an object of this invention to provide methods and systems for the rapid identification of the nature and/or number of repeat units in a polymorphic system.
It is yet a further object of this invention to provide methods and apparatus which may effectively provide for genetic identification.
It is yet a further object of this invention to provide systems and methods for the accurate detection of diseased states, especially clonal tumor disease states, neurological disorders and predisposition to genetic disease.
It is yet a further object of this invention to provide a rapid and effective system and methods for identification, such as in forensics and paternity applications.