In recent years, the molecular biology of a number of human genetic diseases has been elucidated by the application of recombinant DNA technology. More than 3000 diseases are known to be of genetic origin (Cooper and Krawczak, "Human Genome Mutations" (BIOS Publ. 1993)), including, for example, hemophilias, thalassemias, Duchenne muscular dystrophy, Huntington's disease, Alzheimer's disease and cystic fibrosis, as well as various cancers such as breast cancer. In addition to mutated genes that result in genetic disease, certain birth defects are the result of chromosomal abnormalities, including, for example, trisomy 21 (Down's syndrome), trisomy 13 (Patau syndrome), trisomy 18 (Edwards' syndrome), monosomy X (Turner's syndrome) and other sex chromosome aneuploidies such as Klinefelter's syndrome (XXY).
Other genetic diseases are caused by an abnormal number of trinucleotide repeats in a gene. These diseases include Huntington's disease, prostate cancer, spinocerebellar ataxia 1 (SCA-1), Fragile X syndrome (Kremer et al., Science 252:1711-14 (1991); Fu et al., Cell 67:1047-58 (1991); Hirst et al., J. Med. Genet. 28:824-29 (1991)), myotonic dystrophy type I (Mahadevan et al., Science 255:1253-55 (1992); Brook et al., Cell 68:799-808 (1992)), Kennedy's disease (also termed spinal and bulbar muscular atrophy; La Spada et al., Nature 352:77-79 (1991)), Machado-Joseph disease, and dentatorubral and pallidoluysian atrophy. The aberrant number of triplet repeats can be located in any region of a gene, including a coding region, a non-coding region of an exon, an intron, or a regulatory element such as a promoter. In certain of these diseases, for example, prostate cancer, the number of triplet repeats is positively correlated with prognosis of the disease.
Evidence indicates that amplification of a trinucleotide repeat is involved in the molecular pathology in each of the disorders listed above. Although some of these trinucleotide repeats appear to be in non-coding DNA, they clearly are involved with perturbations of genomic regions that ultimately affect gene expression. Perturbations of various dinucleotide and trinucleotide repeats resulting from somatic mutation in tumor cells also can affect gene expression or gene regulation.
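By way of illustration only, the number of consecutive copies of a triplet in a known nucleotide sequence can be counted computationally. The following minimal sketch assumes the sequence is already available as a text string; the function name longest_repeat_run, the default repeat unit "CAG" and the example sequence are hypothetical and are not taken from any method described herein.

```python
import re


def longest_repeat_run(sequence: str, unit: str = "CAG") -> int:
    """Return the largest number of consecutive copies of `unit` in `sequence`.

    Hypothetical helper for illustration; `unit` defaults to "CAG" only as
    an example of a trinucleotide repeat.
    """
    if not unit:
        raise ValueError("repeat unit must not be empty")
    seq = sequence.upper()
    # Match one or more back-to-back copies of the repeat unit.
    pattern = re.compile(f"(?:{re.escape(unit.upper())})+")
    longest = 0
    for match in pattern.finditer(seq):
        copies = len(match.group(0)) // len(unit)
        longest = max(longest, copies)
    return longest


# Made-up example sequence: a run of five CAG triplets and a run of two.
example = "AT" + "CAG" * 5 + "TT" + "CAG" * 2
print(longest_repeat_run(example))
```

Such a count presupposes that the sequence has already been determined; it does not by itself identify the repeat in a biological sample.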
Additional evidence indicates that certain DNA sequences predispose an individual to a number of other diseases, including diabetes, arteriosclerosis, obesity, various autoimmune diseases and cancers such as colorectal, breast, ovarian and lung cancer. Knowledge of the genetic lesion causing or contributing to a genetic disease allows one to predict whether a person has or is at risk of developing the disease or condition and also, at least in some cases, to determine the prognosis of the disease.
Numerous genes have polymorphic regions. Since individuals have any one of several allelic variants of a polymorphic region, each can be identified based on the type of allelic variants of polymorphic regions of genes. Such identification can be used, for example, for forensic purposes. In other situations, it is crucial to know the identity of allelic variants in an individual. For example, allelic differences in certain genes such as the major histocompatibility complex (MHC) genes are involved in graft rejection or graft versus host disease in bone marrow transplantation. Accordingly, it is highly desirable to develop rapid, sensitive, and accurate methods for determining the identity of allelic variants of polymorphic regions of genes or genetic lesions.
Several methods are used for identifying allelic variants or genetic lesions. For example, the identity of an allelic variant or the presence of a genetic lesion can be determined by comparing the mobility of an amplified nucleic acid fragment with a known standard by gel electrophoresis, or by hybridization with a probe that is complementary to the sequence to be identified. Identification, however, can be accomplished only if the nucleic acid fragment is labeled with a sensitive reporter function, for example, a radioactive (³²P, ³⁵S), fluorescent or chemiluminescent reporter. Radioactive labels can be hazardous and the signals they produce can decay substantially over time. Non-radioactive labels such as fluorescent labels can suffer from a lack of sensitivity and fading of the signal when high intensity lasers are used. Additionally, labeling, electrophoresis and subsequent detection are laborious, time-consuming and error-prone procedures. Electrophoresis is particularly error-prone, since the size or the molecular weight of the nucleic acid cannot be correlated directly to its mobility in the gel matrix because sequence specific effects, secondary structures and interactions with the gel matrix cause artifacts in its migration through the gel.
Mass spectrometry has been used for the sequence analysis of nucleic acids (see, for example, Schram, Mass Spectrometry of Nucleic Acid Components, Biomedical Applications of Mass Spectrometry 34:203-287 (1990); Crain, Mass Spectrom. Rev. 9:505-554 (1990); Murray, J. Mass Spectrom. 31:1203 (1996); Nordhoff et al., J. Mass Spectrom. 15:67 (1997)). In general, mass spectrometry provides a means of "weighing" individual molecules by ionizing the molecules in vacuo and making them "fly" by volatilization. Under the influence of electric and/or magnetic fields, the ions follow trajectories depending on their individual mass (m) and charge (z). For molecules with low molecular weight, mass spectrometry is part of the routine physical-organic repertoire for analysis and characterization of organic molecules by the determination of the mass of the parent molecular ion. In addition, by arranging collisions of this parent molecular ion with other particles such as argon atoms, the molecular ion is fragmented, forming secondary ions by collisionally activated dissociation (CAD); the fragmentation pattern/pathway very often allows the derivation of detailed structural information. Many applications of mass spectrometric methods are known in the art, particularly in the biosciences (see Meth. Enzymol., Vol. 193, "Mass Spectrometry" (McCloskey, ed.; Academic Press, NY 1990); McLafferty et al., Acc. Chem. Res. 27:297-386 (1994); Chait and Kent, Science 257:1885-1894 (1992); Siuzdak, Proc. Natl. Acad. Sci., USA 91:11290-11297 (1994)), including methods for producing and analyzing biopolymer ladders (see International PCT application No. WO 96/36732; U.S. Pat. No. 5,792,664). Despite the effort to apply mass spectrometry methods to the analysis of nucleic acid molecules, however, there are limitations, including those arising from the physical and chemical properties of nucleic acids. Nucleic acids are very polar biopolymers that are difficult to volatilize.
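The dependence on mass (m) and charge (z) noted above can be illustrated with a short calculation of the mass-to-charge ratio. This is a minimal sketch that assumes, purely for illustration, that an ion is formed by gain or loss of protons; the function name mass_to_charge and the 6000 Da example mass are hypothetical, and other ionization routes (for example, adduct formation) are not modeled.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton in daltons (rounded)


def mass_to_charge(neutral_mass_da: float, charge: int) -> float:
    """Return m/z for an ion formed by gaining or losing protons.

    `charge` is the signed charge number: positive for protonation,
    negative for deprotonation. Hypothetical helper for illustration.
    """
    if charge == 0:
        raise ValueError("a neutral molecule has no mass-to-charge ratio")
    # Each unit of charge adds (or removes) the mass of one proton.
    ion_mass = neutral_mass_da + charge * PROTON_MASS_DA
    return ion_mass / abs(charge)


# Made-up example: a 6000 Da molecule observed as singly and doubly
# deprotonated ions.
for z in (-1, -2):
    print(z, round(mass_to_charge(6000.0, z), 4))
```

Under this assumption, the same molecule appears at a different m/z value for each charge state, so the charge must be known or inferred before a molecular mass can be read from a spectrum.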
Accordingly, a need exists for methods to determine the identity of nucleic acid molecules, particularly genetic lesions in a nucleic acid molecule, using alternative methodologies. Therefore, it is an object herein to provide processes and compositions that satisfy this need and provide additional advantages.