In recent years, the molecular biology of a number of human genetic diseases has been elucidated by the application of recombinant DNA technology. More than 3000 diseases are currently known to be of genetic origin (Human Genome Mutations, D. N. Cooper and M. Krawczak, BIOS Publishers, 1993). These include hemophilias, thalassemias, Duchenne Muscular Dystrophy (DMD), Huntington's Disease (HD), Alzheimer's Disease, Cystic Fibrosis (CF), and various cancers, e.g., breast cancer. In addition to mutated genes, which result in genetic disease, certain birth defects are the result of chromosomal abnormalities such as Trisomy 21 (Down's Syndrome), Trisomy 13 (Patau Syndrome), Trisomy 18 (Edward's Syndrome), Monosomy X (Turner's Syndrome) and other sex chromosome aneuploidies such as Klinefelter's Syndrome (XXY).
Other genetic diseases are caused by an abnormal number of trinucleotide repeats in a gene. These diseases include Huntington's disease, prostate cancer, Spinocerebellar Ataxia (SCA), Fragile X syndrome (Kremer et al., Science 252:1711-14 (1991); Fu et al., Cell 67:1047-58 (1991); Hirst et al., J. Med. Genet. 28:824-29 (1991)), Myotonic Dystrophy (MD) type I (Mahadevan et al., Science 255:1253-55 (1992); Brook et al., Cell 68:799-808 (1992)), Kennedy's disease, also termed Spinal and Bulbar Muscular Atrophy (La Spada et al., Nature 352:77-79 (1991)), Machado-Joseph disease, and Dentatorubral-Pallidoluysian Atrophy. The aberrant number of triplet repeats can be located in any region of a gene, including the coding regions, non-coding regions of exons, introns, and the promoter. In certain of these diseases, e.g., prostate cancer, the number of triplet repeats is positively correlated with prognosis of the disease. All available evidence suggests that amplification of a trinucleotide repeat is involved in the molecular pathology of each of these disorders. Although some of these trinucleotide repeats appear to be in non-coding DNA, they clearly are involved with perturbations of genomic regions that ultimately affect gene expression. Perturbations of various di- and trinucleotide repeats resulting from somatic mutation in tumor cells could also affect gene expression and/or gene regulation.
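As an illustration of what "number of trinucleotide repeats" means at the sequence level, the following is a minimal sketch that counts the longest uninterrupted run of a repeat unit in a DNA string. The function name `longest_repeat_run` and the example sequence are hypothetical and are not taken from any cited work; actual genotyping must also deal with interrupted repeats and with two alleles per individual, which this sketch ignores.

```python
import re


def longest_repeat_run(sequence: str, unit: str = "CAG") -> int:
    """Return the largest number of consecutive copies of `unit` in `sequence`.

    Hypothetical helper for illustration only: it looks for perfect,
    uninterrupted tandem copies of the repeat unit on the given strand.
    """
    seq = sequence.upper()
    unit = unit.upper()
    # One or more back-to-back copies of the repeat unit.
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(seq)]
    # default=0 covers sequences that contain no copy of the unit at all.
    return max(runs, default=0)


if __name__ == "__main__":
    example = "ATGCAGCAGCAGTTCAGCAG"  # made-up sequence, not real data
    print(longest_repeat_run(example, "CAG"))
```

The regular expression only matches perfect copies of the unit, so a single base substitution inside a repeat tract splits it into two shorter runs.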
Further, there is growing evidence that certain DNA sequences may predispose an individual to any of a number of other diseases such as diabetes, arteriosclerosis, obesity, various autoimmune diseases and cancer (e.g., colorectal, breast, ovarian, lung). The knowledge of the genetic lesion causing or contributing to a genetic disease allows one to predict whether a person has or is at risk of developing a disease or condition and also, at least in some cases, to determine the prognosis of the disease.
Furthermore, numerous genes have polymorphic regions. Since individuals have any one of several allelic variants of a polymorphic region, individuals can be identified based on the type of allelic variants of polymorphic regions of genes. This can be used, e.g., for forensic purposes. In other situations, it is crucial to know the identity of the allelic variants that an individual has. For example, allelic differences in certain genes, e.g., major histocompatibility complex (MHC) genes, are involved in graft rejection or graft versus host disease in bone marrow transplantation. Accordingly, it is highly desirable to develop rapid, sensitive, and accurate methods for determining the identity of allelic variants of polymorphic regions of genes and/or genetic lesions.
Several methods for detecting the identity of allelic variants or genetic lesions are currently in use. For example, the identity of an allelic variant or the presence of a genetic lesion can be determined by comparing the mobility of an amplified nucleic acid fragment with a known standard by gel electrophoresis, or by hybridization with a probe that is complementary to the sequence to be identified. Identification, however, can only be accomplished if the nucleic acid fragment is labeled with a sensitive reporter function (e.g., radioactive (³²P, ³⁵S), fluorescent or chemiluminescent). Radioactive labels can be hazardous, and the signals they produce decay over time. Non-isotopic labels (e.g., fluorescent) suffer from a lack of sensitivity and from fading of the signal when high-intensity lasers are used. Additionally, labeling, electrophoresis and subsequent detection are laborious, time-consuming and error-prone procedures. Electrophoresis is particularly error-prone, since the size or the molecular weight of the nucleic acid cannot be directly correlated to its mobility in the gel matrix: sequence-specific effects, secondary structures and interactions with the gel matrix are known to cause artifacts.
Other detection methods involve mass spectrometry. In general, mass spectrometry provides a means of "weighing" individual molecules by ionizing the molecules in vacuo and making them "fly" by volatilization. Under the influence of electric and/or magnetic fields, the ions follow trajectories depending on their individual mass (m) and charge (z). In the range of molecules with low molecular weight, mass spectrometry has long been part of the routine physical-organic repertoire for analysis and characterization of organic molecules by the determination of the mass of the parent molecular ion. In addition, by arranging collisions of this parent molecular ion with other particles (e.g., argon atoms), the molecular ion is fragmented, forming secondary ions by so-called collisionally activated dissociation (CAD). The fragmentation pattern/pathway very often allows the derivation of detailed structural information. Many applications of mass spectrometric methods are known in the art, particularly in the biosciences, and can be found summarized in Methods in Enzymology, Vol. 193: "Mass Spectrometry" (J. A. McCloskey, editor), 1990, Academic Press, New York; McLafferty et al., (1994) Acc. Chem. Res. 27:397-386; Chait & Kent (1992) Science 257:1885-1894; Siuzdak, (1994) Proc. Natl. Acad. Sci. USA 91:11290-11297.
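The dependence on mass (m) and charge (z) described above can be made concrete with a short sketch that computes the mass-to-charge ratio of an ion from the mass of the neutral molecule. It assumes, purely for illustration, that the ion is charged only by gaining protons (positive charge) or losing protons (negative charge); other ionization routes, such as adduct formation, are not modeled. The function name `mass_to_charge` and the input masses are hypothetical.

```python
PROTON_MASS_DA = 1.007276  # approximate mass of a proton in daltons


def mass_to_charge(neutral_mass_da: float, charge: int) -> float:
    """Return m/z for a molecule charged by protonation or deprotonation.

    Assumption (not from the text): each unit of positive charge adds one
    proton and each unit of negative charge removes one proton.
    """
    if charge == 0:
        raise ValueError("a neutral molecule has no defined m/z")
    # charge > 0 adds protons, charge < 0 removes them.
    ion_mass = neutral_mass_da + charge * PROTON_MASS_DA
    return ion_mass / abs(charge)


if __name__ == "__main__":
    # Made-up neutral mass of 1000 Da observed at two charge states.
    print(mass_to_charge(1000.0, 1))
    print(mass_to_charge(1000.0, -2))
```

Because the ratio rather than the mass itself is measured, the same molecule can appear at several m/z values, one per charge state.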
Due to the apparent analytical advantages of mass spectrometry, namely high detection sensitivity, accurate mass measurements, speed, detailed structural information by CAD in conjunction with an MS/MS configuration, and on-line data transfer to a computer, there has been considerable interest in the use of mass spectrometry for the structural analysis of nucleic acids. Recent reviews summarizing this field include K. H. Schram, "Mass Spectrometry of Nucleic Acid Components," Biomedical Applications of Mass Spectrometry 34, 203-287 (1990); P. F. Crain, "Mass Spectrometric Techniques in Nucleic Acid Research," Mass Spectrometry Reviews 9, 505-554 (1990); Murray, K. (1996) J. Mass Spectrom. 31:1203; and Nordhoff et al. (1997) Mass Spectrom. Rev. 15:67.
However, the analysis of DNA molecules by mass spectrometry has certain limitations; in particular, nucleic acids are very polar biopolymers that are very difficult to volatilize.