Genetic mutations are the primary cause of heritable disease and cancer. The genetic basis of disease, however, is complex and diverse; for example, more than 700 presumed disease-causing mutations have been identified in the cystic fibrosis gene alone. Multiple mutations may be present in a single affected individual and may lie within a few base pairs of each other, and each may or may not be pathogenic. Thus, the ability to precisely locate and identify mutations is important for disease diagnosis, prediction, prevention and treatment.
Assays that detect the existence of nucleic acid mutations have been developed using various molecular biological techniques. One of the earliest methods involved the detection of restriction fragment length polymorphisms (RFLPs) using the Southern blotting technique (Southern, E. M., J. Mol. Biol. 98:503-517 (1975)). RFLP analysis reveals genetic variation in certain DNA fragments by cleaving the fragments with a type II restriction endonuclease. The differences in fragment length are due to the presence or absence of one or more specific endonuclease recognition sites and are detected by hybridization with DNA probes after separation by gel electrophoresis.
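The principle behind RFLP detection can be illustrated with a minimal sketch. The sequences below are hypothetical, the function name `digest` is an assumption made for illustration, and only the top strand of a linear fragment is modeled. EcoRI (recognition site GAATTC, cut after the first G) is used as an example of a type II restriction endonuclease.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting a linear sequence at every
    occurrence of `site`; `cut_offset` is the cut position within the site."""
    cuts = []
    start = 0
    while True:
        idx = sequence.find(site, start)
        if idx == -1:
            break
        cuts.append(idx + cut_offset)
        start = idx + 1
    boundaries = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(boundaries, boundaries[1:])]


# Hypothetical 30-base reference fragment containing one EcoRI site.
reference = "ACGT" * 3 + "GAATTC" + "ACGT" * 3
# Hypothetical variant in which a point mutation destroys the site.
variant = "ACGT" * 3 + "GAGTTC" + "ACGT" * 3

print(digest(reference, "GAATTC", 1))  # [13, 17]
print(digest(variant, "GAATTC", 1))    # [30]
```

The change from two fragments to one is the length polymorphism that would be seen on the gel. A mutation that does not create or destroy a recognition site produces no such difference and goes undetected.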
Methods of detecting mutations that make use of the polymerase chain reaction (PCR) have also been developed. In instances where the particular mutation has been identified, labeled primers can be used to determine whether a sample contains the known mutation. PCT/US93/04160 describes a method which allows perfectly matched DNA molecules to be separated from imperfectly matched molecules. The molecules can also be labeled to provide probes for identifying regions of heterozygosity in the genome.
In U.S. Pat. No. 5,217,863, Cotton et al. claim a method of detecting point mutations in sample DNA by hybridizing it to known DNA (without mutations) and subjecting the heteroduplex to hydroxylamine or osmium tetroxide and piperidine treatment. Hydroxylamine reacts with mismatched C and osmium tetroxide reacts with mismatched T (and to a lesser extent mismatched C), resulting in cleavage at the point of mismatch on addition of piperidine. The resulting material is then separated, for instance, by electrophoresis. If cleavage has occurred at one or more sites, this will be apparent from the result of the separation treatment, the number of fragments indicating the number of cleavages and hence the number of mutations of the type under consideration. However, the identity of the sequence(s) cannot be determined.
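The reagent specificity described above can be sketched as a toy model. The function name, the sequences, and the reagent table are assumptions made for illustration. The model considers only one labeled strand, compares it with the other allele written in the same sense, and includes only the primary reactivity of each reagent (the weaker reaction of osmium tetroxide with mismatched C is omitted).

```python
# Primary reactivities as described in the text; a simplification.
REACTIVE_BASES = {
    "hydroxylamine": {"C"},
    "osmium_tetroxide": {"T"},
}


def cleavage_sites(labeled: str, other: str, reagent: str) -> list[int]:
    """Return 0-based positions on the labeled strand that are mismatched in
    the heteroduplex and carry a base reactive to `reagent`."""
    if len(labeled) != len(other):
        raise ValueError("sequences must be aligned and of equal length")
    reactive = REACTIVE_BASES[reagent]
    return [
        i
        for i, (a, b) in enumerate(zip(labeled, other))
        if a != b and a in reactive
    ]


labeled_strand = "ACGTCAGT"  # hypothetical reference allele (labeled)
other_allele = "ACGTTAGA"    # hypothetical test allele, same sense

print(cleavage_sites(labeled_strand, other_allele, "hydroxylamine"))     # [4]
print(cleavage_sites(labeled_strand, other_allele, "osmium_tetroxide"))  # [7]
```

The output gives where the labeled strand would be cut and therefore how many fragments to expect, but it says nothing about which base replaced the original one. That is the limitation noted above.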
More recently, mutation-detecting assays have been developed that utilize proteins that recognize and bind to mismatched DNA heteroduplexes. (See, e.g., Modrich, Science 266:1959-1960 (1994) and U.S. Pat. No. 5,459,039). These proteins have been found in a variety of organisms in addition to E. coli. They act in concert to recognize and repair mismatches. In the simplest embodiment, heteroduplexes formed between reference and test DNAs are contacted with a mismatch recognition protein, such as MutS. The mixture is then passed over a nitrocellulose filter which binds the protein and any protein:DNA complexes. The presence of a mismatch in the contacted DNA is indicated by retention of the DNA:protein complex on nitrocellulose. However, this method indicates only the presence or absence of a mismatch, and does not directly allow for identification of the specific mutation(s).
Similarly, WO 95/12689, assigned to GeneCheck, Inc., describes contacting labeled heteroduplexed DNA with a labeled immobilized mismatch binding protein ("MBP") such as MutS. Binding, detected by direct or indirect methods, is indicative of a mismatch. This method likewise indicates only the presence or absence of a mismatch, and does not directly allow for identification of the specific mutation(s). In the same vein, WO 93/02216, assigned to Upstate Biotechnology, Inc., describes how mutations can be detected using labeled antibodies specific for MBPs to determine whether a mismatch is present. Again, the identity of the mismatch is not determined.
Methods have also been described which determine the general location of a mismatch using mismatch binding proteins (see WO 95/29258). Here, a test strand of nucleic acid potentially containing a mutation is hybridized to a reference strand known not to have a mutation. The duplex is contacted with an MBP and the complex is then treated with an exonuclease. The digestion of the nucleic acid terminates at the position of any bound MBP. The relative sizes of the resulting degradation products are analyzed, for example by electrophoresis, to determine the presence and approximate location of the mismatch.
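Why this approach yields only an approximate location can be seen in a deliberately simplified model. Everything here is an assumption made for illustration: the function name, the digestion direction (from the 3' end of one strand toward lower positions), and the `footprint` parameter, which stands for a hypothetical number of base pairs shielded by a bound MBP centered on the mismatch.

```python
from typing import Optional


def exo_stop_length(
    strand_length: int,
    mismatch_positions: list[int],
    footprint: int = 20,  # hypothetical protected region, in bases
) -> Optional[int]:
    """Length of the strand remaining after digestion from the 3' end halts
    at the first bound MBP. Returns None when no MBP is bound, i.e. there is
    no MBP-dependent stop."""
    if not mismatch_positions:
        return None
    # The exonuclease meets the mismatch nearest the 3' end first.
    nearest = max(mismatch_positions)
    # Digestion stops at the 3' edge of the protein footprint, not at the
    # mismatched base itself.
    return min(strand_length, nearest + footprint // 2 + 1)


print(exo_stop_length(300, [120]))      # 131
print(exo_stop_length(300, [50, 120]))  # 131 (upstream mismatch is masked)
print(exo_stop_length(300, []))         # None
```

In this model the product size is offset from the mismatch by about half the footprint, and a second mismatch further from the 3' end is not seen on the same strand. This is consistent with the method reporting a general location rather than the identity of the mutation.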
U.S. Pat. No. 5,459,039 to Modrich et al. describes a method for detecting base sequence differences between homologous regions of two DNA molecules. In this method, the two strands are annealed and a protein which recognizes mismatches is added to form a DNA:protein complex. Modrich describes several labor-intensive methods of "localizing" the mismatch. For example, single-stranded gaps near the mismatch can be generated by contacting the DNA:protein complex with a defined mismatch correction system. The DNA is then cleaved with a single-strand-specific endonuclease and at least one restriction enzyme. The electrophoretic mobilities of the fragments are then compared. Alternatively, heteroduplexed DNA containing at least one GATC sequence may be contacted with a mixture of MutS, MutL, and MutH. Cleavage of the DNA indicates the presence of a mismatch. However, the position of the mismatch is not determined.
Alternatively, the location of the mismatch can be identified by chemically modifying at least one strand of the DNA duplex in the vicinity of the bound mismatch recognition protein. Modrich et al. describe how chemical modification, such as hydroxyl radical cleavage, can be accomplished by modifying the MutS protein to create a binding site for a metal ion. The bound metal ion can catalyze the formation of hydroxyl radicals, which in turn attack and cleave at least one strand of the bound DNA in the vicinity of the mismatch.
Other methods of mismatch detection utilize chemical rather than enzymatic means, relying on chemicals that are known to cleave at mismatched bases. Osmium tetroxide, for instance, modifies mispaired thymidines while hydroxylamine modifies unpaired cytosines. Co-owned U.S. Pat. No. 5,217,863 describes how these chemically modified mismatches can be treated with piperidine, which results in elimination of the mismatched, modified nucleotide and breakage of one strand at the mismatch. Adapter-primer oligonucleotides are then ligated to the newly created terminus, followed by sequencing to identify the nucleotide sequence adjacent to the mismatch. However, the identity of the mismatched nucleotide(s) is not determined.
None of these methods, however, directly identifies the precise sequence of a mutation. Moreover, none of these methods provides a high-throughput system for identifying unknown mutations. Currently, PCR amplification may be utilized to amplify region(s) of DNA, followed by sequencing of the PCR product(s). However, genes which are the loci of known disease-causing mutations may cover many kilobases of DNA. The cost and labor required to sequence every patient DNA sample over these important regions would make the detection of pathogenic mutations extremely slow and prohibitively expensive. Thus, one or more "mutation scanning" methodologies, such as those described above, are typically applied to detect the presence of mutations and limit the regions to be sequenced to those containing the potential alterations. This process is still time-consuming and laborious, since the scanning process does not aid in the subsequent process of sequence determination. Sequence determination itself may pose separate and unique difficulties associated with template quality and quantity, as well as the inherent limitation of current methods, which cannot provide sequence beyond a certain number of nucleotides from a primer (typically 600). Thus, a need exists for a method which both indicates the presence of unknown mutations and directly provides the sequence of the alteration(s). This invention satisfies this need and provides related advantages as well.
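The scale of the sequencing burden discussed above can be made concrete with a short calculation. The 600-nucleotide read length is taken from the text. The 10-kilobase gene size, the patient count, and the function name are hypothetical values chosen only for illustration, and overlap between adjacent reads is ignored.

```python
import math

READ_LENGTH = 600  # typical nucleotides obtained per primer, per the text


def reads_required(region_length: int, read_length: int = READ_LENGTH) -> int:
    """Minimum number of single-strand sequencing reads needed to cover a
    region, assuming contiguous, non-overlapping reads."""
    return math.ceil(region_length / read_length)


gene_length = 10_000  # hypothetical gene spanning 10 kb
patients = 1_000      # hypothetical number of samples

per_patient = reads_required(gene_length)
print(per_patient)             # 17
print(per_patient * patients)  # 17000

# If scanning narrows the search to a single 600-nucleotide window,
# one read per patient suffices for that window.
print(reads_required(600))     # 1
```

Even under these optimistic assumptions, full sequencing scales with gene length times sample count. This is why scanning methods are used to narrow the region first, at the cost of a separate sequencing step afterward.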