The genetic information of all living organisms (e.g. animals, plants and microorganisms) is encoded in deoxyribonucleic acid (DNA). In humans, the complete genome comprises about 100,000 genes located on 24 chromosomes (The Human Genome, T. Strachan, BIOS Scientific Publishers, 1992). Each gene codes for a specific protein, which, after its expression via transcription and translation, fulfills a specific biochemical function within a living cell. Changes in a DNA sequence are known as mutations and can result in proteins with altered or, in some cases, even lost biochemical activities; this in turn can cause genetic disease. Mutations include nucleotide deletions, insertions or alterations (i.e. point mutations). Point mutations can be either "missense", resulting in a change in the amino acid sequence of a protein, or "nonsense", coding for a stop codon and thereby leading to a truncated protein.
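The distinction between missense and nonsense point mutations can be illustrated with a minimal Python sketch. It assumes the standard genetic code and DNA codons read on the coding strand; the function name `classify_point_mutation` and the extra labels "silent" and "other" are illustrative additions, not terms taken from the text above.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_point_mutation(ref_codon: str, alt_codon: str) -> str:
    """Classify a single-codon change by its effect on the encoded protein."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "silent"      # same amino acid (or still a stop codon)
    if alt_aa == "*":
        return "nonsense"    # new stop codon, truncated protein
    if ref_aa == "*":
        return "other"       # loss of a stop codon, outside the two classes above
    return "missense"        # different amino acid


print(classify_point_mutation("GAG", "GTG"))  # Glu to Val
print(classify_point_mutation("CAG", "TAG"))  # Gln to stop
```

Under these assumptions the first call is a missense change and the second a nonsense change.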
More than 3000 genetic diseases are currently known (Human Genome Mutations, D. N. Cooper and M. Krawczak, BIOS Publishers, 1993), including hemophilias, thalassemias, Duchenne Muscular Dystrophy (DMD), Huntington's Disease (HD), Alzheimer's Disease and Cystic Fibrosis (CF). In addition to mutated genes, which result in genetic disease, certain birth defects are the result of chromosomal abnormalities such as Trisomy 21 (Down's Syndrome), Trisomy 13 (Patau Syndrome), Trisomy 18 (Edwards' Syndrome), Monosomy X (Turner's Syndrome) and other sex chromosome aneuploidies such as Klinefelter's Syndrome (XXY). Further, there is growing evidence that certain DNA sequences may predispose an individual to any of a number of diseases such as diabetes, arteriosclerosis, obesity, various autoimmune diseases and cancer (e.g. colorectal, breast, ovarian, lung).
Viruses, bacteria, fungi and other infectious organisms contain distinct nucleic acid sequences, which are different from the sequences contained in the host cell. Therefore, infectious organisms can also be detected and identified based on their specific DNA sequences.
Since a sequence of about 16 nucleotides is specific on statistical grounds, even for a genome the size of the human genome, relatively short nucleic acid sequences can be used to detect normal and defective genes in higher organisms and to detect infectious microorganisms (e.g. bacteria, fungi, protists and yeast) and viruses. DNA sequences can even serve as a fingerprint for detection of different individuals within the same species (Thompson, J. S. and M. W. Thompson, eds., Genetics in Medicine, W. B. Saunders Co., Philadelphia, Pa. (1986)).
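The statistical argument for a length of about 16 nucleotides can be checked with a short calculation. The sketch below assumes a haploid human genome of roughly 3 x 10^9 base pairs and a random sequence in which all four bases are equally likely; both are simplifying assumptions, and the function name `min_unique_length` is hypothetical.

```python
def min_unique_length(genome_size_bp: int, alphabet_size: int = 4) -> int:
    """Smallest k for which the number of possible k-mers exceeds the genome size.

    Above this length, a given k-mer is expected to occur less than once
    by chance in a random sequence of the given size.
    """
    k = 1
    while alphabet_size ** k <= genome_size_bp:
        k += 1
    return k


HUMAN_GENOME_BP = 3_000_000_000  # assumed approximate haploid genome size

print(min_unique_length(HUMAN_GENOME_BP))  # 4**15 is about 1.07e9, 4**16 about 4.29e9
```

With these inputs the function returns 16, since 4^15 falls below the assumed genome size and 4^16 exceeds it.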
Several methods for detecting DNA are currently being used. For example, nucleic acid sequences can be identified by comparing the mobility of an amplified nucleic acid fragment with a known standard by gel electrophoresis, or by hybridization with a probe that is complementary to the sequence to be identified. Identification, however, can only be accomplished if the nucleic acid fragment is labeled with a sensitive reporter function (e.g. radioactive (³²P, ³⁵S), fluorescent or chemiluminescent). Radioactive labels can be hazardous, and the signals they produce decay over time. Non-isotopic labels (e.g. fluorescent) suffer from a lack of sensitivity and from fading of the signal when high intensity lasers are used. Additionally, labeling, electrophoresis and subsequent detection are laborious, time-consuming and error-prone procedures. Electrophoresis is particularly error-prone, since the size or the molecular weight of the nucleic acid cannot be directly correlated to the mobility in the gel matrix. Sequence-specific effects, secondary structures and interactions with the gel matrix are known to cause artifacts.
In general, mass spectrometry provides a means of "weighing" individual molecules by ionizing the molecules in vacuo and making them "fly" by volatilization. Under the influence of combinations of electric and magnetic fields, the ions follow trajectories depending on their individual mass (m) and charge (z). In the range of molecules with low molecular weight, mass spectrometry has long been part of the routine physical-organic repertoire for analysis and characterization of organic molecules by the determination of the mass of the parent molecular ion. In addition, by arranging collisions of this parent molecular ion with other particles (e.g., argon atoms), the molecular ion is fragmented forming secondary ions by the so-called collision induced dissociation (CID). The fragmentation pattern/pathway very often allows the derivation of detailed structural information. Many applications of mass spectrometric methods are known in the art, particularly in biosciences, and can be found summarized in Methods in Enzymology, Vol. 193: "Mass Spectrometry" (J. A. McCloskey, editor), 1990, Academic Press, New York.
Due to the apparent analytical advantages of mass spectrometry in providing high detection sensitivity, accuracy of mass measurements, detailed structural information by CID in conjunction with an MS/MS configuration and speed, as well as on-line data transfer to a computer, there has been considerable interest in the use of mass spectrometry for the structural analysis of nucleic acids. Recent reviews summarizing this field include K. H. Schram, "Mass Spectrometry of Nucleic Acid Components," Biomedical Applications of Mass Spectrometry 34, 203-287 (1990); and P. F. Crain, "Mass Spectrometric Techniques in Nucleic Acid Research," Mass Spectrometry Reviews 9, 505-554 (1990).
However, nucleic acids are very polar biopolymers that are very difficult to volatilize. Consequently, mass spectrometric detection has been limited to low molecular weight synthetic oligonucleotides. In these cases, the already known oligonucleotide sequence is confirmed either by determining the mass of the parent molecular ion or by generating secondary ions (fragment ions) via CID in an MS/MS configuration, using in particular fast atom bombardment (FAB mass spectrometry) or plasma desorption (PD mass spectrometry) for ionization and volatilization. As an example, the application of FAB to the analysis of protected dimeric blocks for chemical synthesis of oligodeoxynucleotides has been described (Koster et al. Biomedical Environmental Mass Spectrometry 14, 111-116 (1987)).
Two more recent ionization/desorption techniques are electrospray/ionspray (ES) and matrix-assisted laser desorption/ionization (MALDI). ES mass spectrometry was introduced by Fenn et al. (J. Phys. Chem. 88, 4451-59 (1984); PCT Application No. WO 90/14148) and current applications are summarized in recent review articles (R. D. Smith et al., Anal. Chem. 62, 882-89 (1990) and B. Ardrey, Electrospray Mass Spectrometry, Spectroscopy Europe, 4, 10-18 (1992)). The molecular weights of a tetradecanucleotide (Covey et al. "The Determination of Protein, Oligonucleotide and Peptide Molecular Weights by Ionspray Mass Spectrometry," Rapid Communications in Mass Spectrometry, 2, 249-256 (1988)), and of a 21-mer (Methods in Enzymology, 193, "Mass Spectrometry" (McCloskey, editor), p. 425, 1990, Academic Press, New York) have been published. As a mass analyzer, a quadrupole is most frequently used. The determination of molecular weights in femtomole amounts of sample is very accurate due to the presence of multiple ion peaks, all of which can be used for the mass calculation.
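How multiple ion peaks can each contribute to the mass calculation may be sketched as follows. The example assumes that two adjacent peaks in an ES spectrum come from the same molecule and differ by exactly one charge, with charging by gain or loss of protons only; the function name `charge_and_mass` and the peak values are hypothetical, and this is a textbook-style illustration rather than the procedure of any cited reference.

```python
PROTON_MASS = 1.007276  # Da


def charge_and_mass(mz_low_charge: float, mz_high_charge: float, polarity: int = -1):
    """Estimate charge and neutral mass from two adjacent multiply charged peaks.

    mz_low_charge  : m/z of the peak with charge n (larger m/z)
    mz_high_charge : m/z of the adjacent peak with charge n + 1 (smaller m/z)
    polarity       : +1 for [M + nH]^n+ ions, -1 for [M - nH]^n- ions
    """
    adduct = polarity * PROTON_MASS
    # From m/z = M/n + adduct for both peaks, solving for n gives:
    n = round((mz_high_charge - adduct) / (mz_low_charge - mz_high_charge))
    # Each peak yields an independent mass estimate; average them.
    mass_from_low = n * (mz_low_charge - adduct)
    mass_from_high = (n + 1) * (mz_high_charge - adduct)
    return n, (mass_from_low + mass_from_high) / 2


# Hypothetical negative-ion peaks of a 6000 Da oligonucleotide (charges 4 and 5).
n, mass = charge_and_mass(1498.992724, 1198.992724, polarity=-1)
print(n, round(mass, 3))
```

Averaging the estimates from every peak in the charge-state series, rather than from only two, is what reduces the error of the final mass value.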
MALDI mass spectrometry, in contrast, can be particularly attractive when a time-of-flight (TOF) configuration is used as a mass analyzer. MALDI-TOF mass spectrometry was introduced by Hillenkamp et al. ("Matrix Assisted UV-Laser Desorption/Ionization: A New Approach to Mass Spectrometry of Large Biomolecules," Biological Mass Spectrometry (Burlingame and McCloskey, editors), Elsevier Science Publishers, Amsterdam, pp. 49-60, 1990). Since, in most cases, no multiple molecular ion peaks are produced with this technique, the mass spectra, in principle, look simpler than those obtained with ES mass spectrometry.
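The way a TOF analyzer separates ions by mass can be illustrated with an idealized calculation. The sketch below assumes a linear instrument in which every ion starts at rest, is accelerated through the same potential and then drifts through a field-free tube; it ignores the time spent in the acceleration region and any initial velocity spread. The drift length and voltage used are hypothetical values chosen only for illustration.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms per dalton


def flight_time(mass_da: float, charge: int, accel_voltage: float, drift_length: float) -> float:
    """Idealized drift time in seconds for an ion in a linear TOF analyzer.

    Energy balance z*e*U = m*v**2 / 2 gives the drift velocity, so the
    flight time scales with the square root of m/z.
    """
    mass_kg = mass_da * DALTON
    velocity = math.sqrt(2 * charge * ELEMENTARY_CHARGE * accel_voltage / mass_kg)
    return drift_length / velocity


# Hypothetical setup: 20 kV acceleration, 1 m drift tube, singly charged ions.
for mass in (5_000, 20_000):
    t = flight_time(mass, charge=1, accel_voltage=20_000, drift_length=1.0)
    print(f"{mass} Da: {t * 1e6:.1f} microseconds")
```

Because flight time grows with the square root of m/z, quadrupling the mass only doubles the flight time in this model.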
Although DNA molecules up to a molecular weight of 410,000 daltons have been desorbed and volatilized (Williams et al., "Volatilization of High Molecular Weight DNA by Pulsed Laser Ablation of Frozen Aqueous Solutions," Science, 246, 1585-87 (1989)), this technique has so far only shown very low resolution (oligothymidylic acids up to 18 nucleotides, Huth-Fehre et al., Rapid Communications in Mass Spectrometry, 6, 209-13 (1992); DNA fragments up to 500 nucleotides in length, K. Tang et al., Rapid Communications in Mass Spectrometry, 8, 727-730 (1994); and a double-stranded DNA of 28 base pairs, Williams et al., "Time-of-Flight Mass Spectrometry of Nucleic Acids by Laser Ablation and Ionization from a Frozen Aqueous Matrix," Rapid Communications in Mass Spectrometry, 4, 348-351 (1990)).
Japanese Patent No. 59-131909 describes an instrument that detects nucleic acid fragments separated by electrophoresis, liquid chromatography or high speed gel filtration. Mass spectrometric detection is achieved by incorporating into the nucleic acids atoms that do not normally occur in DNA, such as S, Br, I or Ag, Au, Pt, Os, Hg.