Genomic DNA varies significantly from individual to individual, except in identical siblings. Many human diseases arise from genomic variations and mutations. The genetic diversity amongst humans and other life forms explains the heritable variations observed in disease susceptibility. Diseases arising from such genetic mutations include Huntington's disease, cystic fibrosis and Duchenne muscular dystrophy. Each of these diseases is associated with mutation of a single gene. Diseases such as multiple sclerosis, diabetes, Parkinson's disease, Alzheimer's disease, hypertension and cancer (e.g., breast cancer) are much more complex. These diseases may be due to polygenic (multiple gene influences) or multifactorial (multiple gene and environmental influences) causes. Many of the variations in the genome do not result in a disease trait. However, as described above, a single mutation can result in a disease trait. The ability to scan the human genome to identify the location of genes that underlie or are associated with the pathology of such diseases is an enormously powerful tool in medicine and human biology.
Although substantial progress has been made in identifying the genetic basis of many human diseases, current methodologies used to develop this information are limited by prohibitive costs and the extensive amount of work required to obtain genotype information from large sample populations. These limitations make identification of complex gene mutations contributing to disorders such as diabetes extremely difficult. Techniques for scanning the human genome to identify the locations of genes involved in disease processes began in the early 1980s with the use of restriction fragment length polymorphism (RFLP) analysis (Botstein et al. (1980), Am. J. Hum. Genet., 32:314-31; Nakamura et al. (1987), Science, 235:1616-22). RFLP analysis involves Southern blotting and other techniques. Southern blotting is both expensive and time-consuming when performed on large numbers of samples, such as those required to identify a complex genotype associated with a particular phenotype. Some of these problems were avoided with the development of polymerase chain reaction (PCR)-based microsatellite marker analysis. Microsatellite markers are simple sequence length polymorphisms (SSLPs) consisting of di-, tri-, and tetra-nucleotide repeats.
Oncology is another field that relies heavily on the discovery of DNA alterations. The discovery of mutations in DNA leads to the identification of mutated proteins in tumors. The specific lesions in the proteins from tumor samples are used to develop drugs that attack and destroy only the cancer cells that contain those DNA mutations. The goal is personalized therapy with reduced toxicity that may one day cure cancer. It is imperative to identify these targets by genome-scale analysis of every protein-coding region.
Other methods for detecting mutations are also prohibitively expensive and impractical. One of these methods involves DNA sequencing. A complete analysis by DNA sequencing may involve sequencing 6 billion bases of nucleic acid per diagnosis. While such an analysis is possible with current technology, the cost associated with the analysis of 6 billion bases is at this time prohibitive for routine diagnosis outside of the research environment.
Another mutation analysis method involves DNA chips. A number of oligonucleotides can be fixed onto a solid glass surface and selectively hybridized with a test DNA fragment to detect a signal and determine a nucleic acid sequence. While this technique is feasible for common variants (single nucleotide polymorphisms, or SNPs) and known mutations, it is not economically feasible for unknown DNA mutations that may occur in a gene. It is also not practical for the detection of insertions and deletions in DNA. For example, in cystic fibrosis, a recessive disorder affecting 1 in 2000-2500 live births in the United States, more than 225 presumed disease-causing mutations have been identified in one gene alone. Any one of these 225 mutations may cause a person to be a carrier or a sufferer of cystic fibrosis.
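The incidence figure above can be turned into a rough estimate of how many people carry a disease allele, which indicates the scale of screening involved. The following is only a minimal sketch under the Hardy-Weinberg assumptions (random mating, and all disease-causing mutations pooled into a single allele class); neither assumption comes from the text above, and the function name is hypothetical.

```python
import math


def carrier_frequency(incidence: float) -> float:
    """Estimate the heterozygous carrier frequency (2pq) for a recessive
    disorder from its incidence (q^2), assuming Hardy-Weinberg equilibrium.

    Assumption: all disease-causing mutations are pooled into one allele class.
    """
    if not 0.0 < incidence < 1.0:
        raise ValueError("incidence must be a fraction between 0 and 1")
    q = math.sqrt(incidence)  # pooled disease-allele frequency
    p = 1.0 - q               # normal-allele frequency
    return 2.0 * p * q


if __name__ == "__main__":
    # Incidence range quoted in the text: 1 in 2000 to 1 in 2500 live births.
    for births in (2000, 2500):
        freq = carrier_frequency(1.0 / births)
        print(f"1 in {births} births: carrier frequency ~ {freq:.4f} "
              f"(about 1 in {1.0 / freq:.0f})")
```

Because carriers may harbor any one of the many known mutations, a screen that tests for only a few fixed sequences would miss part of this group.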
Another approach to mutation scanning relies on the fact that DNA molecules change in structure depending on the exact content of base pairs. The structural defects in the DNA can be measured by looking for altered mobility of the DNA fragment through a polyacrylamide gel or a capillary filled with flowable linear polyacrylamide, or by denaturing liquid chromatography. The analysis can be done on single-stranded molecules (Single Strand Conformational Analysis) or on double-stranded molecules (Heteroduplex Analysis, Temperature Sensitive Gradient Electrophoresis, Conformation Specific Gel Electrophoresis or Denaturing High Performance Liquid Chromatography).
Each of the methods described above, and variations thereof, is limited in its applicability to complex mutational analysis. Specifically, tumor samples are contaminated with normal cells (lymphocytes, stroma, etc.), which results in heterogeneous, impure DNA populations and makes it quite difficult to find mutations with the current gold standard of DNA sequencing. Moreover, DNA sequencing suffers from high cost and a lack of sensitivity in impure DNA, while the structural-analysis-based methods suffer from low sensitivity. Additionally, all of these procedures suffer from low throughput, which limits the utility and scope of analysis. Furthermore, multiple mutations may be present in a single affected individual, and may be spaced within a few base pairs of each other. These phenomena present unique difficulties in designing clinical screening methods that can accommodate large numbers of sample DNAs.
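The effect of normal-cell contamination on sensitivity can be illustrated with simple arithmetic. The sketch below is a hedged illustration, not a method from this document: it assumes a heterozygous mutation (one mutant copy) in diploid tumor cells mixed with diploid normal cells, and it uses a purely hypothetical detection limit of 20% mutant alleles. The function name, the purity values, and the threshold are all assumptions chosen for illustration.

```python
def expected_mutant_fraction(tumor_purity: float,
                             mutant_copies: int = 1,
                             tumor_ploidy: int = 2,
                             normal_ploidy: int = 2) -> float:
    """Expected fraction of alleles carrying the mutation in a mixed sample.

    tumor_purity is the fraction of cells in the sample that are tumor cells.
    Defaults assume a heterozygous mutation in a diploid tumor (assumption).
    """
    if not 0.0 <= tumor_purity <= 1.0:
        raise ValueError("tumor_purity must be between 0 and 1")
    mutant_alleles = tumor_purity * mutant_copies
    total_alleles = (tumor_purity * tumor_ploidy
                     + (1.0 - tumor_purity) * normal_ploidy)
    return mutant_alleles / total_alleles


# Hypothetical detection limit, for illustration only (not from the text).
ASSUMED_DETECTION_LIMIT = 0.20

if __name__ == "__main__":
    for purity in (1.0, 0.6, 0.3, 0.1):
        fraction = expected_mutant_fraction(purity)
        status = ("above" if fraction >= ASSUMED_DETECTION_LIMIT
                  else "below")
        print(f"purity {purity:.0%}: mutant allele fraction {fraction:.1%} "
              f"({status} the assumed limit)")
```

Under these assumptions even a pure tumor sample yields a mutant signal of only half the alleles, and the signal falls in direct proportion to tumor purity, which is why contaminated samples are difficult to analyze.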
Thus, there is a need in the art for relatively low-cost methods that allow the efficient screening of large numbers of target DNAs, such as disease-associated genes and exons, for genetic variation, and the rapid identification of the variant sequence. This is extremely important both in risk analysis from blood samples and in drug target identification and design from tumor samples.