The present invention relates to methods and compositions for treating nucleic acid, and in particular, methods and compositions for detection and characterization of nucleic acid sequences and sequence changes.
The detection and characterization of specific nucleic acid sequences and sequence changes have been utilized to detect the presence of viral or bacterial nucleic acid sequences indicative of an infection, the presence of variants or alleles of mammalian genes associated with disease and cancers, and the identification of the source of nucleic acids found in forensic samples, as well as in paternity determinations.
Various methods are known in the art which may be used to detect and characterize specific nucleic acid sequences and sequence changes. Nonetheless, as nucleic acid sequence data of the human genome, as well as the genomes of pathogenic organisms accumulates, the demand for fast, reliable, cost-effective and user-friendly tests for specific sequences continues to grow. Importantly, these tests must be able to create a detectable signal from a very low copy number of the sequence of interest. The following discussion examines three levels of nucleic acid detection currently in use: I. Signal Amplification Technology for detection of rare sequences; II. Direct Detection Technology for detection of higher copy number sequences; and III. Detection of Unknown Sequence Changes for rapid screening of sequence changes anywhere within a defined DNA fragment.
The “Polymerase Chain Reaction” (PCR) comprises the first generation of methods for nucleic acid amplification. However, several other methods have been developed that employ the same basis of specificity, but create signal by different amplification mechanisms. These methods include the “Ligase Chain Reaction” (LCR), “Self-Sustained Synthetic Reaction” (3SR/NASBA), and “Qβ-Replicase” (Qβ).
Polymerase Chain Reaction (PCR)
The polymerase chain reaction (PCR), as described in U.S. Pat. Nos. 4,683,195 and 4,683,202 to Mullis and Mullis et al., is a method for increasing the concentration of a segment of target sequence in a mixture of genomic DNA without cloning or purification. This technology provides one approach to the problems of low target sequence concentration. PCR can be used to directly increase the concentration of the target to an easily detectable level. This process for amplifying the target sequence involves introducing, to the DNA mixture containing the desired target sequence, a molar excess of two oligonucleotide primers which are complementary to their respective strands of the double-stranded target sequence. The mixture is denatured and then allowed to hybridize. Following hybridization, the primers are extended with polymerase so as to form complementary strands. The steps of denaturation, hybridization, and polymerase extension can be repeated as often as needed, in order to obtain relatively high concentrations of a segment of the desired target sequence.
The length of the segment of the desired target sequence is determined by the relative positions of the primers with respect to each other, and, therefore, this length is a controllable parameter. Because the desired segments of the target sequence become the dominant sequences (in terms of concentration) in the mixture, they are said to be “PCR-amplified.”
Ligase Chain Reaction (LCR or LAR)
The ligase chain reaction (LCR; sometimes referred to as “Ligase Amplification Reaction” [LAR]), described by Barany, Proc. Natl. Acad. Sci., 88:189 (1991); Barany, PCR Methods and Applic., 1:5 (1991); and Wu and Wallace, Genomics 4:560 (1989), has developed into a well-recognized alternative method for amplifying nucleic acids. In LCR, four oligonucleotides (two adjacent oligonucleotides which uniquely hybridize to one strand of target DNA, and a complementary set of adjacent oligonucleotides which hybridize to the opposite strand) are mixed, and DNA ligase is added to the mixture. Provided that there is complete complementarity at the junction, ligase will covalently link each set of hybridized molecules. Importantly, in LCR, two probes are ligated together only when they base-pair with sequences in the target sample, without gaps or mismatches. Repeated cycles of denaturation, hybridization and ligation amplify a short segment of DNA. LCR has also been used in combination with PCR to achieve enhanced detection of single-base changes. Segev, PCT Public. No. WO9001069 A1 (1990). However, because the four oligonucleotides used in this assay can pair to form two short ligatable fragments, there is the potential for the generation of target-independent background signal. The use of LCR for mutant screening is limited to the examination of specific nucleic acid positions.
Self-Sustained Synthetic Reaction (3SR/NASBA)
The self-sustained sequence replication reaction (3SR) (Guatelli et al., Proc. Natl. Acad. Sci., 87:1874-1878 [1990], with an erratum at Proc. Natl. Acad. Sci., 87:7797 [1990]) is a transcription-based in vitro amplification system (Kwok et al., Proc. Natl. Acad. Sci., 86:1173-1177 [1989]) that can exponentially amplify RNA sequences at a uniform temperature. The amplified RNA can then be utilized for mutation detection (Fahy et al., PCR Meth. Appl., 1:25-33 [1991]). In this method, an oligonucleotide primer is used to add a phage RNA polymerase promoter to the 5′ end of the sequence of interest. In a cocktail of enzymes and substrates that includes a second primer, reverse transcriptase, RNase H, RNA polymerase and ribo- and deoxyribonucleoside triphosphates, the target sequence undergoes repeated rounds of transcription, cDNA synthesis and second-strand synthesis to amplify the area of interest. The use of 3SR to detect mutations is kinetically limited to screening small segments of DNA (e.g., 200-300 base pairs).
Q-Beta (Qβ) Replicase
In this method, a probe which recognizes the sequence of interest is attached to the replicatable RNA template for Qβ replicase. A previously identified major problem with false positives resulting from the replication of unhybridized probes has been addressed through use of a sequence-specific ligation step. However, available thermostable DNA ligases are not effective on this RNA substrate, so the ligation must be performed by T4 DNA ligase at low temperatures (37° C.). This prevents the use of high temperature as a means of achieving specificity, as in the LCR. The ligation event can be used to detect a mutation at the junction site, but not elsewhere.
Table 1 below lists some of the features desirable for systems useful in sensitive nucleic acid diagnostics, and summarizes the abilities of each of the major amplification methods (See also, Landgren, Trends in Genetics 9:199 [1993]).
A successful diagnostic method must be very specific. A straightforward method of controlling the specificity of nucleic acid hybridization is by controlling the temperature of the reaction. While the 3SR/NASBA and Qβ systems are all able to generate a large quantity of signal, one or more of the enzymes involved in each cannot be used at high temperature (i.e., greater than 55° C.). Therefore the reaction temperatures cannot be raised to prevent non-specific hybridization of the probes. If probes are shortened in order to make them melt more easily at low temperatures, the likelihood of having more than one perfect match in a complex genome increases. For these reasons, PCR and LCR currently dominate the research field in detection technologies.
The basis of the amplification procedure in the PCR and LCR is the fact that the products of one cycle become usable templates in all subsequent cycles, consequently doubling the population with each cycle. The final yield of any such doubling system can be expressed as: (1+X)^n = y, where “X” is the mean efficiency (percent copied in each cycle), “n” is the number of cycles, and “y” is the overall efficiency, or yield of the reaction (Mullis, PCR Methods Applic., 1:1 [1991]). If every copy of a target DNA is utilized as a template in every cycle of a polymerase chain reaction, then the mean efficiency is 100%. If 20 cycles of PCR are performed, then the yield will be 2^20, or 1,048,576 copies of the starting material. If the reaction conditions reduce the mean efficiency to 85%, then the yield in those 20 cycles will be only 1.85^20, or 220,513 copies of the starting material. In other words, a PCR running at 85% efficiency will yield only 21% as much final product, compared to a reaction running at 100% efficiency. A reaction that is reduced to 50% mean efficiency will yield less than 1% of the possible product.
In practice, routine polymerase chain reactions rarely achieve the theoretical maximum yield, and PCRs are usually run for more than 20 cycles to compensate for the lower yield. At 50% mean efficiency, it would take 34 cycles to achieve the million-fold amplification theoretically possible in 20, and at lower efficiencies, the number of cycles required becomes prohibitive. In addition, any background products that amplify with a better mean efficiency than the intended target will become the dominant products.
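The yield arithmetic above can be checked with a few lines of Python. This is only a minimal sketch of the stated relationship, in which the yield y equals (1 + X) raised to the power n; the function names `pcr_yield` and `cycles_for_fold` are illustrative and do not come from the cited reference.

```python
import math


def pcr_yield(mean_efficiency: float, cycles: int) -> float:
    """Fold amplification y = (1 + X)^n, with X given as a fraction (1.0 = 100%)."""
    return (1.0 + mean_efficiency) ** cycles


def cycles_for_fold(mean_efficiency: float, target_fold: float) -> float:
    """Number of cycles (possibly fractional) at which (1 + X)^n reaches target_fold."""
    return math.log(target_fold) / math.log(1.0 + mean_efficiency)


if __name__ == "__main__":
    ideal = pcr_yield(1.0, 20)
    for x in (1.0, 0.85, 0.50):
        y = pcr_yield(x, 20)
        print(f"X = {x:.0%}: {y:,.0f} copies ({y / ideal:.1%} of the ideal yield)")
    # Cycles required for a million-fold amplification at 50% mean efficiency.
    print(f"{cycles_for_fold(0.50, 1e6):.1f} cycles")
```

At 50% mean efficiency the million-fold mark is crossed just after cycle 34, which is consistent with the approximate figure quoted in the text.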
Also, many variables can influence the mean efficiency of PCR, including target DNA length and secondary structure, primer length and design, primer and dNTP concentrations, and buffer composition, to name but a few. Contamination of the reaction with exogenous DNA (e.g., DNA spilled onto lab surfaces) or cross-contamination is also a major consideration. Reaction conditions must be carefully optimized for each different primer pair and target sequence, and the process can take days, even for an experienced investigator. The laboriousness of this process, including numerous technical considerations and other factors, presents a significant drawback to using PCR in the clinical setting. Indeed, PCR has yet to penetrate the clinical market in a significant way. The same concerns arise with LCR, as LCR must also be optimized to use different oligonucleotide sequences for each target sequence. In addition, both methods require expensive equipment, capable of precise temperature cycling.
Many applications of nucleic acid detection technologies, such as in studies of allelic variation, involve not only detection of a specific sequence in a complex background, but also the discrimination between sequences with few, or single, nucleotide differences. One method for the detection of allele-specific variants by PCR is based upon the fact that it is difficult for Taq polymerase to synthesize a DNA strand when there is a mismatch between the template strand and the 3′ end of the primer. An allele-specific variant may be detected by the use of a primer that is perfectly matched with only one of the possible alleles; the mismatch to the other allele acts to prevent the extension of the primer, thereby preventing the amplification of that sequence. This method has a substantial limitation in that the base composition of the mismatch influences the ability to prevent extension across the mismatch, and certain mismatches do not prevent extension or have only a minimal effect (Kwok et al., Nucl. Acids Res., 18:999 [1990]).
A similar 3′-mismatch strategy is used with greater effect to prevent ligation in the LCR (Barany, PCR Meth. Applic., 1:5 [1991]). Any mismatch effectively blocks the action of the thermostable ligase, but LCR still has the drawback of target-independent background ligation products initiating the amplification. Moreover, the combination of PCR with subsequent LCR to identify the nucleotides at individual positions is also a clearly cumbersome proposition for the clinical laboratory.
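The primer placement used in allele-specific PCR can be illustrated with a short sketch. The sequences, the function name `classify_primer` and the position index below are all hypothetical, chosen only for illustration. The code reports where a primer mismatches an allele; it does not predict whether the polymerase would extend across that mismatch, which, as noted above, depends on the base composition of the mismatch.

```python
def classify_primer(primer: str, allele: str, snp_index: int) -> str:
    """Compare a forward primer (5'->3') with the sense strand of an allele.

    The primer is positioned so that its 3'-terminal base lies at snp_index
    (0-based) of the allele. Returns 'matched', '3prime_mismatch', or
    'internal_mismatch'.
    """
    start = snp_index - len(primer) + 1
    if start < 0:
        raise ValueError("primer extends past the start of the allele sequence")
    site = allele[start:snp_index + 1].upper()
    primer = primer.upper()
    if site == primer:
        return "matched"
    if site[:-1] == primer[:-1]:
        return "3prime_mismatch"
    return "internal_mismatch"


if __name__ == "__main__":
    # Hypothetical alleles differing by a single base at index 14 (G versus A).
    wild_type = "GATTACAGGCTTACGATCCA"
    mutant = "GATTACAGGCTTACAATCCA"
    primer = "CAGGCTTACG"  # 3'-terminal G matches the wild-type allele only
    print(classify_primer(primer, wild_type, 14))
    print(classify_primer(primer, mutant, 14))
```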
When a sufficient amount of a nucleic acid to be detected is available, there are advantages to detecting that sequence directly, instead of making more copies of that target (e.g., as in PCR and LCR). Most notably, a method that does not amplify the signal exponentially is more amenable to quantitative analysis. Even if the signal is enhanced by attaching multiple dyes to a single oligonucleotide, the correlation between the final signal intensity and amount of target is direct. Such a system has an additional advantage that the products of the reaction will not themselves promote further reaction, so contamination of lab surfaces by the products is not as much of a concern. Traditional methods of direct detection, including Northern and Southern blotting and RNase protection assays, usually require the use of radioactivity and are not amenable to automation. Recently devised techniques have sought to eliminate the use of radioactivity and/or improve the sensitivity in automatable formats. Two examples are the “Cycling Probe Reaction” (CPR), and “Branched DNA” (bDNA).
The cycling probe reaction (CPR) (Duck et al., BioTech., 9:142 [1990]), uses a long chimeric oligonucleotide in which a central portion is made of RNA while the two termini are made of DNA. Hybridization of the probe to a target DNA and exposure to a thermostable RNase H causes the RNA portion to be digested. This destabilizes the remaining DNA portions of the duplex, releasing the remainder of the probe from the target DNA and allowing another probe molecule to repeat the process. The signal, in the form of cleaved probe molecules, accumulates at a linear rate. While the repeating process increases the signal, the RNA portion of the oligonucleotide is vulnerable to RNases that may be carried through sample preparation.
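The practical difference between linear and exponential signal generation can be sketched numerically. The model below is a simplified assumption rather than a description of any particular assay: the function names and the parameter `turnovers_per_cycle` are hypothetical. It shows that a 10% loss in per-cycle performance costs a linear system 10% of its signal, whereas the same loss compounds over 20 cycles in an exponential system.

```python
def linear_signal(target_copies: float, turnovers_per_cycle: float, cycles: int) -> float:
    """Signal that grows by a fixed amount per target per cycle (linear accumulation)."""
    return target_copies * turnovers_per_cycle * cycles


def exponential_signal(target_copies: float, mean_efficiency: float, cycles: int) -> float:
    """Signal in which each cycle's products act as templates: copies * (1 + X)^n."""
    return target_copies * (1.0 + mean_efficiency) ** cycles


if __name__ == "__main__":
    copies, cycles = 1000.0, 20
    # Effect of a 10% drop in per-cycle performance on each kind of system.
    lin_ratio = linear_signal(copies, 0.9, cycles) / linear_signal(copies, 1.0, cycles)
    exp_ratio = exponential_signal(copies, 0.9, cycles) / exponential_signal(copies, 1.0, cycles)
    print(f"linear signal retained:      {lin_ratio:.1%}")
    print(f"exponential signal retained: {exp_ratio:.1%}")
```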
Branched DNA (bDNA), described by Urdea et al., Gene 61:253-264 (1987), involves oligonucleotides with branched structures that allow each individual oligonucleotide to carry 35 to 40 labels (e.g., alkaline phosphatase enzymes). While this enhances the signal from a hybridization event, signal from non-specific binding is similarly increased.
The demand for tests which allow the detection of specific nucleic acid sequences and sequence changes is growing rapidly in clinical diagnostics. As nucleic acid sequence data for genes from humans and pathogenic organisms accumulates, the demand for fast, cost-effective, and easy-to-use tests for as yet unknown mutations within specific sequences is rapidly increasing.
A handful of methods have been devised to scan nucleic acid segments for mutations. One option is to determine the entire gene sequence of each test sample (e.g., a bacterial isolate). For sequences under approximately 600 nucleotides, this may be accomplished using amplified material (e.g., PCR reaction products). This avoids the time and expense associated with cloning the segment of interest. However, specialized equipment and highly trained personnel are required, and the method is too labor-intensive and expensive to be practical and effective in the clinical setting.
In view of the difficulties associated with sequencing, a given segment of nucleic acid may be characterized on several other levels. At the lowest resolution, the size of the molecule can be determined by electrophoresis by comparison to a known standard run on the same gel. A more detailed picture of the molecule may be achieved by cleavage with combinations of restriction enzymes prior to electrophoresis, to allow construction of an ordered map. The presence of specific sequences within the fragment can be detected by hybridization of a labeled probe, or the precise nucleotide sequence can be determined by partial chemical degradation or by primer extension in the presence of chain-terminating nucleotide analogs.
For detection of single-base differences between like sequences, the requirements of the analysis are often at the highest level of resolution. For cases in which the position of the nucleotide in question is known in advance, several methods have been developed for examining single base changes without direct sequencing. For example, if a mutation of interest happens to fall within a restriction recognition sequence, a change in the pattern of digestion can be used as a diagnostic tool (e.g., restriction fragment length polymorphism [RFLP] analysis).
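The RFLP idea can be made concrete with a small sketch. The sequences below are invented for illustration; the recognition site GAATTC, cut after the first G, is that of EcoRI, and the helper names are hypothetical. A single base change inside one recognition site removes that cut and changes the fragment pattern.

```python
def cut_positions(seq: str, site: str, cut_offset: int) -> list[int]:
    """Positions at which a linear sequence is cut, for every occurrence of `site`."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + cut_offset)
        start = seq.find(site, start + 1)
    return positions


def fragment_lengths(seq: str, site: str, cut_offset: int) -> list[int]:
    """Lengths of the fragments produced by complete digestion of a linear sequence."""
    cuts = cut_positions(seq.upper(), site.upper(), cut_offset)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Hypothetical 31-base sequence containing two GAATTC sites.
    wild_type = "ATGCCGAATTCGGTACCTTAGGAATTCAAGT"
    # A single A->G change at index 23 destroys the second site (GAATTC -> GAGTTC).
    mutant = wild_type[:23] + "G" + wild_type[24:]
    print(fragment_lengths(wild_type, "GAATTC", 1))
    print(fragment_lengths(mutant, "GAATTC", 1))
```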
Single point mutations have also been detected by the creation or destruction of RFLPs. Mutations are detected and localized by the presence and size of the RNA fragments generated by cleavage at the mismatches. Single nucleotide mismatches in DNA heteroduplexes are also recognized and cleaved by some chemicals, providing an alternative strategy to detect single base substitutions, generically named the “Mismatch Chemical Cleavage” (MCC) (Gogos et al., Nucl. Acids Res., 18:6807-6817 [1990]). However, this method requires the use of osmium tetroxide and piperidine, two highly noxious chemicals which are not suited for use in a clinical laboratory.
RFLP analysis suffers from low sensitivity and requires a large amount of sample. When RFLP analysis is used for the detection of point mutations, it is, by its nature, limited to the detection of only those single base changes which fall within a restriction sequence of a known restriction endonuclease. Moreover, the majority of the available enzymes have 4 to 6 base-pair recognition sequences, and cleave too frequently for many large-scale DNA manipulations (Eckstein and Lilley (eds.), Nucleic Acids and Molecular Biology, vol. 2, Springer-Verlag, Heidelberg [1988]). Thus, it is applicable only in a small fraction of cases, as most mutations do not fall within such sites.
A handful of rare-cutting restriction enzymes with 8 base-pair specificities have been isolated and these are widely used in genetic mapping, but these enzymes are few in number, are limited to the recognition of G+C-rich sequences, and cleave at sites that tend to be highly clustered (Barlow and Lehrach, Trends Genet., 3:167 [1987]). Recently, endonucleases encoded by group I introns have been discovered that might have greater than 12 base-pair specificity (Perlman and Butow, Science 246:1106 [1989]), but again, these are few in number.
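The relationship between recognition-site length and cutting frequency can be estimated with simple arithmetic. The sketch below assumes a random sequence with equal frequencies of the four bases. This is an idealization, since the text notes that real sites can be highly clustered. The function name is illustrative.

```python
def expected_site_spacing(site_length: int) -> int:
    """Average spacing, in bases, between occurrences of one specific site of the
    given length in a random sequence with equal base frequencies (4^n)."""
    return 4 ** site_length


if __name__ == "__main__":
    for n in (4, 6, 8, 12):
        print(f"{n:>2}-base site: about one occurrence per {expected_site_spacing(n):,} bases")
```

Under this assumption a 6 base-pair site occurs roughly once every 4,096 bases, while an 8 base-pair site occurs roughly once every 65,536 bases.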
If the change is not in a recognition sequence, then allele-specific oligonucleotides (ASOs) can be designed to hybridize in proximity to the unknown nucleotide, such that a primer extension or ligation event can be used as the indicator of a match or a mismatch. Hybridization with radioactively labeled allele-specific oligonucleotides (ASOs) also has been applied to the detection of specific point mutations (Conner et al., Proc. Natl. Acad. Sci., 80:278-282 [1983]). The method is based on the differences in the melting temperature of short DNA fragments differing by a single nucleotide. Stringent hybridization and washing conditions can differentiate between mutant and wild-type alleles. The ASO approach applied to PCR products also has been extensively utilized by various researchers to detect and characterize point mutations in ras genes (Vogelstein et al., N. Eng. J. Med., 319:525-532 [1988]; and Farr et al., Proc. Natl. Acad. Sci., 85:1629-1633 [1988]), and gsp/gip oncogenes (Lyons et al., Science 249:655-659 [1990]). Because of the presence of various nucleotide changes in multiple positions, the ASO method requires the use of many oligonucleotides to cover all possible oncogenic mutations.
With either of the techniques described above (i.e., RFLP and ASO), the precise location of the suspected mutation must be known in advance of the test. That is to say, they are inapplicable when one needs to detect the presence of a mutation of an unknown character and position within a gene or sequence of interest.
Two other methods rely on detecting changes in electrophoretic mobility in response to minor sequence changes. One of these methods, termed “Denaturing Gradient Gel Electrophoresis” (DGGE), is based on the observation that slightly different sequences will display different patterns of local melting when electrophoretically resolved on a gradient gel. In this manner, variants can be distinguished, as differences in melting properties of homoduplexes versus heteroduplexes differing in a single nucleotide can detect the presence of mutations in the target sequences because of the corresponding changes in their electrophoretic mobilities. The fragments to be analyzed, usually PCR products, are “clamped” at one end by a long stretch of G-C base pairs (30-80) to allow complete denaturation of the sequence of interest without complete dissociation of the strands. The attachment of a GC “clamp” to the DNA fragments increases the fraction of mutations that can be recognized by DGGE (Abrams et al., Genomics 7:463-475 [1990]). Attaching a GC clamp to one primer is critical to ensure that the amplified sequence has a low dissociation temperature (Sheffield et al., Proc. Natl. Acad. Sci., 86:232-236 [1989]; and Lerman and Silverstein, Meth. Enzymol., 155:482-501 [1987]). Modifications of the technique have been developed, using temperature gradients (Wartell et al., Nucl. Acids Res., 18:2699-2701 [1990]), and the method can be also applied to RNA:RNA duplexes (Smith et al., Genomics 3:217-223 [1988]).
Limitations on the utility of DGGE include the requirement that the denaturing conditions must be optimized for each type of DNA to be tested. Furthermore, the method requires specialized equipment to prepare the gels and maintain the needed high temperatures during electrophoresis. The expense associated with the synthesis of the clamping tail on one oligonucleotide for each sequence to be tested is also a major consideration. In addition, long running times are required for DGGE. The long running time of DGGE was shortened in a modification of DGGE called constant denaturant gel electrophoresis (CDGE) (Borrensen et al., Proc. Natl. Acad. Sci. USA 88:8405 [1991]). CDGE requires that gels be performed under different denaturant conditions in order to reach high efficiency for the detection of unknown mutations.
A technique analogous to DGGE, termed temperature gradient gel electrophoresis (TGGE), uses a thermal gradient rather than a chemical denaturant gradient (Scholz, et al., Hum. Mol. Genet. 2:2155 [1993]). TGGE requires the use of specialized equipment which can generate a temperature gradient perpendicularly oriented relative to the electrical field. TGGE can detect mutations in relatively small fragments of DNA; therefore, scanning of large gene segments requires the use of multiple PCR products prior to running the gel.
Another common method, called “Single-Strand Conformation Polymorphism” (SSCP), was developed by Hayashi, Sekiya and colleagues (reviewed by Hayashi, PCR Meth. Appl., 1:34-38, [1991]) and is based on the observation that single strands of nucleic acid can take on characteristic conformations in non-denaturing conditions, and these conformations influence electrophoretic mobility. The complementary strands assume sufficiently different structures that one strand may be resolved from the other. Changes in sequences within the fragment will also change the conformation, consequently altering the mobility and allowing this to be used as an assay for sequence variations (Orita, et al., Genomics 5:874-879, [1989]).
The SSCP process involves denaturing a DNA segment (e.g., a PCR product) that is labelled on both strands, followed by slow electrophoretic separation on a non-denaturing polyacrylamide gel, so that intra-molecular interactions can form and not be disturbed during the run. This technique is extremely sensitive to variations in gel composition and temperature. A serious limitation of this method is the relative difficulty encountered in comparing data generated in different laboratories, under apparently similar conditions.
Dideoxy fingerprinting (ddF) is another technique developed to scan genes for the presence of unknown mutations (Liu and Sommer, PCR Methods Appli., 4:97 [1994]). The ddF technique combines components of Sanger dideoxy sequencing with SSCP. A dideoxy sequencing reaction is performed using one dideoxy terminator and then the reaction products are electrophoresed on nondenaturing polyacrylamide gels to detect alterations in mobility of the termination segments as in SSCP analysis. While ddF is an improvement over SSCP in terms of increased sensitivity, ddF requires the use of expensive dideoxynucleotides and this technique is still limited to the analysis of fragments of the size suitable for SSCP (i.e., fragments of 200-300 bases for optimal detection of mutations).
In addition to the above limitations, all of these methods are limited as to the size of the nucleic acid fragment that can be analyzed. For the direct sequencing approach, sequences of greater than 600 base pairs require cloning, with the consequent delays and expense of either deletion sub-cloning or primer walking, in order to cover the entire fragment. SSCP and DGGE have even more severe size limitations. Because of reduced sensitivity to sequence changes, these methods are not considered suitable for larger fragments. Although SSCP is reportedly able to detect 90% of single-base substitutions within a 200 base-pair fragment, the detection drops to less than 50% for 400 base pair fragments. Similarly, the sensitivity of DGGE decreases as the length of the fragment reaches 500 base-pairs. The ddF technique, as a combination of direct sequencing and SSCP, is also limited by the relatively small size of the DNA that can be screened.
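The practical cost of these size limits can be estimated by counting how many overlapping fragments are needed to cover a region of interest. The sketch below is illustrative only: the 3,000 base-pair region, the 20 base-pair overlap and the function name `amplicons_needed` are assumptions, not values from the text.

```python
import math


def amplicons_needed(region_length: int, max_fragment: int, overlap: int) -> int:
    """Minimum number of fragments of at most max_fragment bases, each overlapping
    its neighbor by `overlap` bases, required to cover region_length bases."""
    if not 0 <= overlap < max_fragment:
        raise ValueError("overlap must be non-negative and smaller than max_fragment")
    if region_length <= max_fragment:
        return 1
    step = max_fragment - overlap  # new bases covered by each additional fragment
    return 1 + math.ceil((region_length - max_fragment) / step)


if __name__ == "__main__":
    region = 3000  # hypothetical gene length in base pairs
    for limit in (200, 600):
        count = amplicons_needed(region, limit, overlap=20)
        print(f"{limit} bp fragments: {count} needed")
```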
Clearly, there remains a need for a method that is less sensitive to size so that entire genes, rather than gene fragments, may be analyzed. Such a tool must also be robust, so that data from different labs, generated by researchers of diverse backgrounds and skills will be comparable. Ideally, such a method would be compatible with “multiplexing” (i.e., the simultaneous analysis of several molecules or genes in a single reaction or gel lane, usually resolved from each other by differential labelling or probing). Such an analytical procedure would facilitate the use of internal standards for subsequent analysis and data comparison, and increase the productivity of personnel and equipment. The ideal method would also be easily automatable.
The present invention relates to methods and compositions for treating nucleic acid, and in particular, methods and compositions for detection and characterization of nucleic acid sequences and sequence changes in microbial gene sequences. The present invention provides means for cleaving a nucleic acid cleavage structure in a site-specific manner. In one embodiment, the means for cleaving is an enzyme capable of cleaving cleavage structures on a nucleic acid substrate, forming the basis of a novel method of detection of specific nucleic acid sequences. The present invention contemplates use of the novel detection method for, among other uses, clinical diagnostic purposes, including but not limited to detection and identification of pathogenic organisms.
In one embodiment, the present invention contemplates a DNA sequence encoding a DNA polymerase altered in sequence (i.e., a “mutant” DNA polymerase) relative to the native sequence such that it exhibits altered DNA synthetic activity from that of the native (i.e., “wild type”) DNA polymerase. With regard to the polymerase, a complete absence of synthesis is not required; it is desired that cleavage reactions occur in the absence of polymerase activity at a level where it interferes with the method. It is preferred that the encoded DNA polymerase is altered such that it exhibits reduced synthetic activity from that of the native DNA polymerase. In this manner, the enzymes of the invention are nucleases and are capable of cleaving nucleic acids in a structure-specific manner. Importantly, the nucleases of the present invention are capable of cleaving cleavage structures to create discrete cleavage products.
The present invention contemplates nucleases from a variety of sources, including nucleases that are thermostable. Thermostable nucleases are contemplated as particularly useful, as they are capable of operating at temperatures where nucleic acid hybridization is extremely specific, allowing for allele-specific detection (including single-base mismatches). In one embodiment, the thermostable 5′ nucleases are selected from the group consisting of altered polymerases derived from the native polymerases of various Thermus species, including, but not limited to Thermus aquaticus, Thermus flavus and Thermus thermophilus.
The present invention utilizes such enzymes in methods for detection and characterization of nucleic acid sequences and sequence changes. The present invention relates to means for cleaving a nucleic acid cleavage structure in a site-specific manner. Nuclease activity is used to screen for known and unknown mutations, including single base changes, in nucleic acids.
In one embodiment, the present invention contemplates a process or method for identifying strains of microorganisms comprising the steps of providing a cleavage means and a nucleic acid substrate containing sequences derived from one or more microorganisms; treating the nucleic acid substrate under conditions such that the substrate forms one or more cleavage structures; and reacting the cleavage means with the cleavage structures so that one or more cleavage products are produced. In one embodiment of this invention, the cleavage means is an enzyme. In one preferred embodiment, the enzyme is a nuclease. In an alternative preferred embodiment, the nuclease is selected from the group consisting of Cleavase™ BN enzyme, Thermus aquaticus DNA polymerase, Thermus thermophilus DNA polymerase, Escherichia coli Exo III, and the Saccharomyces cerevisiae Rad1/Rad10 complex. It is also contemplated that the enzyme may have a portion of its amino acid sequence that is homologous to a portion of the amino acid sequence of a thermostable DNA polymerase derived from a eubacterial thermophile, the latter being selected from the group consisting of Thermus aquaticus, Thermus flavus and Thermus thermophilus.
It is contemplated that the nucleic acid substrate comprise a nucleotide analog, including but not limited to the group comprising 7-deaza-dATP, 7-deaza-dGTP and dUTP. In one embodiment, the nucleic acid substrate is substantially single-stranded. It is not intended that the nucleic acid substrate be limited to any particular form; indeed, it is contemplated that the nucleic acid substrate is single-stranded or double-stranded RNA or DNA.
In one embodiment of the present invention, the treating step comprises rendering double-stranded nucleic acid substantially single-stranded, and exposing the single-stranded nucleic acid to conditions such that the single-stranded nucleic acid assumes a secondary or characteristic folded structure. In one preferred embodiment, double-stranded nucleic acid is rendered substantially single-stranded by increased temperature.
In an alternative embodiment, the method of the present invention further comprises the step of detecting one or more cleavage products.
It is contemplated that the microorganism(s) of the present invention be selected from a variety of microorganisms. It is not intended that the present invention be limited to any particular type of microorganism. Rather, it is intended that the present invention be used with organisms including, but not limited to, bacteria, fungi, protozoa, ciliates, and viruses. It is not intended that the microorganisms be limited to a particular genus, species, strain, or serotype. Indeed, it is contemplated that the bacteria be selected from the group including, but not limited to members of the genera Campylobacter, Escherichia, Mycobacterium, Salmonella, Shigella, and Staphylococcus. In one preferred embodiment, the microorganism(s) comprise strains of multi-drug resistant Mycobacterium tuberculosis. It is also contemplated that the present invention be used with viruses, including but not limited to hepatitis C virus and simian immunodeficiency virus.
Another embodiment of the present invention contemplates a method for detecting and identifying strains of microorganisms, comprising the steps of extracting nucleic acid from a sample suspected of containing one or more microorganisms; and contacting the extracted nucleic acid with a cleavage means under conditions such that the extracted nucleic acid forms one or more secondary structures, and the cleavage means cleaves the secondary structures to produce one or more cleavage products.
In one embodiment, the method further comprises the step of separating the cleavage products. In yet another embodiment, the method further comprises the step of detecting the cleavage products.
In one preferred embodiment, the present invention further comprises comparing the detected cleavage products generated from cleavage of the extracted nucleic acid isolated from the sample with separated cleavage products generated by cleavage of nucleic acids derived from one or more reference microorganisms. In such a case the sequence of the nucleic acids from one or more reference microorganisms may be related but different (e.g., a wild type control for a mutant sequence or a known or previously characterized mutant sequence).
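The comparison of a sample pattern with reference patterns can be illustrated with a minimal sketch, assuming each set of separated cleavage products has already been reduced to a list of fragment sizes in nucleotides. The function names, the sizing tolerance, the similarity score, and the fragment sizes below are all hypothetical and chosen only for illustration; the present description does not prescribe any particular scoring scheme.

```python
def match_bands(sample, reference, tol=1):
    """Count sample bands that pair with an unused reference band within tol nucleotides."""
    remaining = sorted(reference)
    matched = 0
    for size in sorted(sample):
        for i, ref_size in enumerate(remaining):
            if abs(size - ref_size) <= tol:
                matched += 1
                del remaining[i]  # each reference band may be matched only once
                break
    return matched


def similarity(sample, reference, tol=1):
    """Jaccard-style score: matched bands divided by all distinct bands in either pattern."""
    matched = match_bands(sample, reference, tol)
    union = len(sample) + len(reference) - matched
    return matched / union if union else 1.0


def best_reference(sample, references, tol=1):
    """Return (name, score) for the reference pattern most similar to the sample."""
    return max(
        ((name, similarity(sample, bands, tol)) for name, bands in references.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical fragment sizes (nucleotides); illustrative values, not measured data.
references = {
    "wild_type": [35, 62, 88, 120, 157],
    "mutant_A": [35, 62, 95, 120, 157],
}
sample = [35, 63, 95, 121, 157]
name, score = best_reference(sample, references)
```

In this illustration the sample shares every band with the hypothetical "mutant_A" reference (within the assumed tolerance of one nucleotide) and differs from the "wild_type" reference by one band, which is the kind of related-but-different relationship described above.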
In an alternative preferred embodiment, the present invention further comprises the step of isolating a polymorphic locus from the extracted nucleic acid after the extraction step, so as to generate a nucleic acid substrate, wherein the substrate is contacted with the cleavage means. In one embodiment, the isolation of a polymorphic locus is accomplished by polymerase chain reaction amplification. In an alternate embodiment, the polymerase chain reaction is conducted in the presence of a nucleotide analog, including but not limited to the group comprising 7-deaza-dATP, 7-deaza-dGTP and dUTP. It is contemplated that the polymerase chain reaction amplification will employ oligonucleotide primers matching or complementary to consensus gene sequences derived from the polymorphic locus. In one embodiment, the polymorphic locus comprises a ribosomal RNA gene. In a particularly preferred embodiment, the ribosomal RNA gene is a 16S ribosomal RNA gene.
In one embodiment of this method, the cleavage means is an enzyme. In one preferred embodiment, the enzyme is a nuclease. In a particularly preferred embodiment, the nuclease is selected from the group including, but not limited to, Cleavase™ BN enzyme, Thermus aquaticus DNA polymerase, Thermus thermophilus DNA polymerase, Escherichia coli Exo III, and the Saccharomyces cerevisiae Rad1/Rad10 complex. It is also contemplated that the enzyme may have a portion of its amino acid sequence that is homologous to a portion of the amino acid sequence of a thermostable DNA polymerase derived from a eubacterial thermophile, the latter being selected from the group consisting of Thermus aquaticus, Thermus flavus and Thermus thermophilus.
It is contemplated that the nucleic acid substrate of this method will comprise a nucleotide analog, including but not limited to the group comprising 7-deaza-dATP, 7-deaza-dGTP and dUTP. In one embodiment, the nucleic acid substrate is substantially single-stranded. It is not intended that the nucleic acid substrate be limited to any particular form; indeed, it is contemplated that the nucleic acid substrate is single-stranded or double-stranded RNA or DNA.
In another embodiment of the present invention, the treating step of the method comprises rendering double-stranded nucleic acid substantially single-stranded, and exposing the single-stranded nucleic acid to conditions such that the single-stranded nucleic acid has secondary structure. In one preferred embodiment, double-stranded nucleic acid is rendered substantially single-stranded by increased temperature.
It is contemplated that the microorganism(s) of the present invention be selected from a variety of microorganisms; it is not intended that the present invention be limited to any particular type of microorganism. Rather, it is intended that the present invention will be used with organisms including, but not limited to, bacteria, fungi, protozoa, ciliates, and viruses. It is not intended that the microorganisms be limited to a particular genus, species, strain, or serotype. Indeed, it is contemplated that the bacteria be selected from the group comprising, but not limited to members of the genera Campylobacter, Escherichia, Mycobacterium, Salmonella, Shigella, and Staphylococcus. In one preferred embodiment, the microorganism(s) comprise strains of multi-drug resistant Mycobacterium tuberculosis. It is also contemplated that the present invention be used with viruses, including but not limited to hepatitis C virus and simian immunodeficiency virus.
In yet another embodiment, the present invention contemplates a method for treating nucleic acid comprising an oligonucleotide containing microbial gene sequences, comprising providing a cleavage means in a solution containing manganese and nucleic acid substrate containing microbial gene sequences; treating the nucleic acid substrate with increased temperature such that the substrate is substantially single-stranded; reducing the temperature under conditions such that the single-stranded substrate forms one or more cleavage structures; reacting the cleavage means with the cleavage structures so that one or more cleavage products are produced; and detecting the one or more cleavage products produced by the method.
The present invention also contemplates a process for creating a record reference library of genetic fingerprints characteristic (i.e., diagnostic) of one or more alleles of the various microorganisms, comprising the steps of providing a cleavage means and nucleic acid substrate derived from microbial gene sequences; contacting the nucleic acid substrate with a cleavage means under conditions such that the extracted nucleic acid forms one or more secondary structures and the cleavage means cleaves the secondary structures, resulting in the generation of multiple cleavage products; separating the multiple cleavage products; and maintaining a testable record reference of the separated cleavage products.
By the term “genetic fingerprint” it is meant that changes in the sequence of the nucleic acid (e.g., a deletion, insertion or a single point substitution) alter the structures formed, thus changing the banding pattern (i.e., the “fingerprint” or “bar code”) to reflect the difference in the sequence, allowing rapid detection and identification of variants.
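A record reference library of such fingerprints might be kept, in its simplest form, as a lookup of banding patterns keyed by strain or allele name. The following is a minimal sketch under stated assumptions: the class name `FingerprintLibrary`, its methods, the one-nucleotide sizing tolerance, and the band sizes are hypothetical and serve only to illustrate how a single altered band yields a distinguishable record.

```python
from dataclasses import dataclass, field


@dataclass
class FingerprintLibrary:
    """In-memory record of cleavage patterns keyed by strain or allele name."""

    tol: int = 1  # assumed sizing tolerance, in nucleotides
    records: dict = field(default_factory=dict)

    def add(self, name, band_sizes):
        """Store a pattern as a sorted tuple of fragment sizes."""
        self.records[name] = tuple(sorted(band_sizes))

    def same_pattern(self, a, b):
        """Two patterns agree if they have equal band counts and each band is within tol."""
        return len(a) == len(b) and all(abs(x - y) <= self.tol for x, y in zip(a, b))

    def identify(self, band_sizes):
        """Return the names of all stored records whose pattern agrees with the query."""
        query = tuple(sorted(band_sizes))
        return [name for name, bands in self.records.items()
                if self.same_pattern(query, bands)]


# Hypothetical band sizes (nucleotides); illustrative values, not measured data.
library = FingerprintLibrary()
library.add("reference_strain", [40, 75, 110])
library.add("point_variant", [40, 82, 110])  # one band shifted by a sequence change

hits = library.identify([41, 82, 110])
no_hits = library.identify([40, 60])
```

Here the query pattern agrees only with the hypothetical "point_variant" record, because the shifted middle band distinguishes it from "reference_strain"; a pattern with a different number of bands agrees with neither record.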
The methods of the present invention allow for simultaneous analysis of both strands (e.g., the sense and antisense strands) and are ideal for high-level multiplexing. The products produced are amenable to qualitative, quantitative and positional analysis. The methods may be automated and may be practiced in solution or in the solid phase (e.g., on a solid support). The methods are powerful in that they allow for analysis of longer fragments of nucleic acid than current methodologies.