The present invention generally relates to detection of DNA variations. More specifically, the present invention relates to methods for automatic detection of DNA mutations and polymorphisms using human sequencing trace data.
Mutation detection is increasingly used as a tool across a wide spectrum of research in disease diagnostics, especially in cancer research. Many pharmaceutical companies spend billions of dollars to locate the mutated genes associated with a particular disease. Many technologies are available to detect a mutation indirectly. The following are examples of the indirect methods available to detect DNA variation in a specific region of DNA from multiple samples. One series of indirect methods is referred to as mutation discovery methods. Mutation discovery methods detect the relative peak shift when a mutant sample is compared to wild-type reference DNA. The mutation discovery methods include denaturing gradient gel electrophoresis (DGGE), denaturing high performance liquid chromatography (DHPLC), temperature gradient capillary electrophoresis (TGCE), heteroduplex analysis (HD), the analysis of single stranded DNA conformation polymorphism (SSCP), and chemical or enzyme cleavage of the mismatch (CECM). The mutation discovery methods are “blind” in that they operate without sequence-level information about any specific location in the DNA. Therefore, the mutation discovery methods cannot tell where a mutation has taken place or what type of mutation is present in a DNA sample. All of these mutation discovery methods are indirect and require confirmation of mutations by DNA sequencing. Another series of indirect methods is referred to as mutation genotyping. An example of mutation genotyping is the single base extension method, which detects the mutation type when the DNA sequence is known. The above two series of indirect methods involve comparing two peaks in the electropherogram.
A more direct series of methods is referred to as DNA sequencing, which detects the mutation location and mutation type in the sample and provides accurate mutation information. However, DNA sequencing involves a large amount of calculation and extensive data manipulation to find the mutations. Mutation detection from DNA sequence data is cumbersome and time consuming and is currently based on visual inspection. Visual inspection is required because the base-calling error percentage (1 to 1.5%) is much higher than the mutation percentage (0.05 to 0.2%). A software program running on computer hardware for automatic detection of mutations from sequence trace data appears to be the most prudent way to hunt for mutations in disease and cancer genes. There are academic software programs for mutation detection using trace data. However, the academic software programs can detect only a specific type of mutation with a specific chemistry. None of them is capable of detecting all kinds of mutations with all chemistries. Other drawbacks of the available academic software programs are errors, lack of flexibility, the need for visual inspection of final results because of those errors, and cumbersome operation. Also, comparison methods that find heterozygous mutations from DNA sequence traces with a linear trace subtraction method have been discussed in a few scientific papers. However, there is no known paper discussing detection of insertion and deletion mutations, especially heterozygous insertions and deletions.
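By way of illustration only, the general idea of a linear trace subtraction comparison mentioned above may be sketched as follows. The sketch is not taken from any of the cited papers or programs. It assumes that the sample trace and the reference trace are intensity values from a single dye channel, are already aligned point for point, and have equal length. The function names and the threshold value of 0.3 are hypothetical and chosen only for the example.

```python
from typing import List, Sequence


def normalize(trace: Sequence[float]) -> List[float]:
    """Scale a trace so that its largest value is 1.0.

    A trace with no positive values is returned unchanged (as floats).
    """
    peak = max(trace, default=0.0)
    if peak <= 0:
        return [float(v) for v in trace]
    return [v / peak for v in trace]


def subtract_traces(sample: Sequence[float],
                    reference: Sequence[float]) -> List[float]:
    """Return the point-by-point difference (sample minus reference).

    Assumption: both traces are pre-aligned and have the same length.
    """
    if len(sample) != len(reference):
        raise ValueError("traces must have the same length")
    s = normalize(sample)
    r = normalize(reference)
    return [a - b for a, b in zip(s, r)]


def flag_positions(difference: Sequence[float],
                   threshold: float = 0.3) -> List[int]:
    """Return indices where the absolute difference reaches the threshold.

    The threshold is a hypothetical value, not one taken from the source.
    """
    return [i for i, d in enumerate(difference) if abs(d) >= threshold]


if __name__ == "__main__":
    # Toy single-channel traces: the second peak of the sample is halved,
    # as might be expected where only one allele carries the reference base.
    reference = [0.0, 10.0, 0.0, 10.0, 0.0]
    sample = [0.0, 10.0, 0.0, 5.0, 0.0]
    print(flag_positions(subtract_traces(sample, reference)))
```

A comparison of this kind depends on the two traces remaining aligned point for point. An insertion or deletion shifts every downstream peak, so a simple subtraction of this form would not be expected to handle such mutations without further processing.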
It is an object of the present invention to provide a method which can be implemented with software for the automatic detection of DNA mutation from sequence trace data.
It is another object of the present invention to provide a method which can be used for the detection of insertion and deletion DNA mutations from sequence trace data.