Detection of Mutations
The genetic information of all living organisms (e.g., animals, plants and microorganisms) is encoded in deoxyribonucleic acid (DNA). In humans, the complete genome contains about 100,000 genes located on 24 chromosomes (The Human Genome, T. Strachan, BIOS Scientific Publishers, 1992). Each gene codes for a specific protein, which, after its expression via transcription and translation, fulfills a specific biochemical function within a living cell. Changes in a DNA sequence are known as mutations and can result in proteins with altered or, in some cases, even lost biochemical activities; this in turn can cause genetic disease. Mutations include nucleotide deletions, insertions or alterations (i.e., point mutations). Point mutations can be either “missense,” resulting in a change in the amino acid sequence of a protein, or “nonsense,” coding for a stop codon and thereby leading to a truncated protein.
More than 3000 genetic diseases are currently known (Human Genome Mutations, D. N. Cooper and M. Krawczak, BIOS Publishers, 1993), including hemophilias, thalassemias, Duchenne Muscular Dystrophy (DMD), Huntington's Disease (HD), Alzheimer's Disease and Cystic Fibrosis (CF). In addition to mutated genes, which result in genetic disease, certain birth defects are the result of chromosomal abnormalities such as Trisomy 21 (Down's Syndrome), Trisomy 13 (Patau Syndrome), Trisomy 18 (Edwards' Syndrome), Monosomy X (Turner's Syndrome) and other sex chromosome aneuploidies such as Klinefelter's Syndrome (XXY). Further, there is growing evidence that certain DNA sequences may predispose an individual to any of a number of diseases such as diabetes, arteriosclerosis, obesity, various autoimmune diseases and cancer (e.g., colorectal, breast, ovarian, lung).
Viruses, bacteria, fungi and other infectious organisms contain distinct nucleic acid sequences, which are different from the sequences contained in the host cell. Therefore, infectious organisms can also be detected and identified based on their specific DNA sequences.
Since a sequence of about 16 nucleotides is specific on statistical grounds even in a genome the size of the human genome, relatively short nucleic acid sequences can be used to detect normal and defective genes in higher organisms and to detect infectious microorganisms (e.g., bacteria, fungi, protists and yeast) and viruses. DNA sequences can even serve as a fingerprint for detection of different individuals within the same species (see, Thompson, J. S. and M. W. Thompson, eds., Genetics in Medicine, W.B. Saunders Co., Philadelphia, Pa. (1991)).
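The statistical argument for a length of about 16 nucleotides can be checked with a short calculation. The sketch below is illustrative only: it assumes a haploid human genome of roughly 3 × 10^9 base pairs (a figure not stated in the text) and treats the four bases as equally likely and independent at every position, which real genomes only approximate. All function names are hypothetical.

```python
# Assumption (not from the text): haploid human genome of ~3e9 base pairs,
# with the four bases equally likely and independent at each position.
GENOME_BP = 3_000_000_000


def distinct_sequences(n: int) -> int:
    """Number of possible DNA sequences of length n (4 choices per position)."""
    return 4 ** n


def expected_chance_matches(n: int, genome_bp: int = GENOME_BP) -> float:
    """Expected chance occurrences of one given n-mer along one strand."""
    positions = genome_bp - n + 1
    return positions / distinct_sequences(n)


def shortest_specific_length(genome_bp: int = GENOME_BP) -> int:
    """Smallest n for which the number of possible n-mers exceeds the genome size."""
    n = 1
    while distinct_sequences(n) <= genome_bp:
        n += 1
    return n


if __name__ == "__main__":
    n = shortest_specific_length()
    print(n, distinct_sequences(n), expected_chance_matches(n))
```

Under these assumptions a 15-mer is expected to occur by chance more than once, whereas a 16-mer is expected to occur less than once, which is consistent with the figure given above.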
Several methods for detecting DNA are currently being used. For example, nucleic acid sequences can be identified by comparing the mobility of an amplified nucleic acid fragment with a known standard by gel electrophoresis, or by hybridization with a probe that is complementary to the sequence to be identified. Identification, however, can only be accomplished if the nucleic acid fragment is labeled with a sensitive reporter function (e.g., radioactive (32P, 35S), fluorescent or chemiluminescent). Radioactive labels can be hazardous and the signals they produce decay over time. Non-isotopic labels (e.g., fluorescent) suffer from a lack of sensitivity and from fading of the signal when high intensity lasers are used. Additionally, labeling, electrophoresis and subsequent detection are laborious, time-consuming and error-prone procedures. Electrophoresis is particularly error-prone, since the size or the molecular weight of the nucleic acid cannot be directly correlated to the mobility in the gel matrix. Sequence-specific effects, secondary structure and interactions with the gel matrix are known to cause artifacts.
Use of Mass Spectrometry for Detection and Identification of Nucleic Acids
Mass spectrometry provides a means of “weighing” individual molecules by ionizing the molecules in vacuo and making them “fly” by volatilization. Under the influence of combinations of electric and magnetic fields, the ions follow trajectories that depend on their individual mass (m) and charge (z). For molecules of low molecular weight, mass spectrometry has long been part of the routine physical-organic repertoire for analysis and characterization of organic molecules by determination of the mass of the parent molecular ion. In addition, by arranging collisions of this parent molecular ion with other particles (e.g., argon atoms), the molecular ion is fragmented, forming secondary ions by so-called collision-induced dissociation (CID). The fragmentation pattern/pathway very often allows the derivation of detailed structural information. Many applications of mass spectrometric methods are known in the art, particularly in biosciences (see, e.g., Methods in Enzymol., Vol. 193: “Mass Spectrometry” (J. A. McCloskey, editor), 1990, Academic Press, New York).
Because of the apparent analytical advantages of mass spectrometry in providing high detection sensitivity, accuracy of mass measurements, detailed structural information by CID in conjunction with an MS/MS configuration and speed, as well as on-line data transfer to a computer, there has been interest in the use of mass spectrometry for the structural analysis of nucleic acids. Recent reviews summarizing this field include K. H. Schram, “Mass Spectrometry of Nucleic Acid Components,” Biomedical Applications of Mass Spectrometry 34, 203–287 (1990); and P. F. Crain, “Mass Spectrometric Techniques in Nucleic Acid Research,” Mass Spectrometry Reviews 9, 505–554 (1990) (see also U.S. Pat. No. 5,547,835 and U.S. Pat. No. 5,622,824).
Nucleic acids, however, are very polar biopolymers that are very difficult to volatilize. Consequently, mass spectrometric detection has been limited to low molecular weight synthetic oligonucleotides, where it is used to confirm an already known oligonucleotide sequence either by determining the mass of the parent molecular ion or by generating secondary ions (fragment ions) via CID in an MS/MS configuration. The methods used for ionization and volatilization are, in particular, fast atom bombardment (FAB mass spectrometry) and plasma desorption (PD mass spectrometry). As an example, the application of FAB to the analysis of protected dimeric blocks for chemical synthesis of oligodeoxynucleotides has been described (Köster et al. (1987) Biomed. Environ. Mass Spectrometry 14, 111–116).
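Confirming a known oligonucleotide sequence by the mass of its parent molecular ion amounts to comparing a measured mass with the mass calculated from the sequence. The following is a minimal sketch of that comparison, not a method taken from the references above. It assumes an unmodified, single-stranded DNA oligonucleotide with free 5'-OH and 3'-OH ends and uses commonly tabulated average residue masses; the function names and the tolerance value are hypothetical.

```python
# Average masses (Da) of 2'-deoxynucleotide residues within a DNA strand.
# Assumption: commonly tabulated average-mass values, not given in the text.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}

# Correction (Da) for a strand with 5'-OH and 3'-OH ends and no terminal
# phosphate. Assumption: unmodified synthetic oligonucleotide.
TERMINAL_CORRECTION = -61.96


def oligo_average_mass(sequence: str) -> float:
    """Calculated average mass (Da) of an unmodified DNA oligonucleotide."""
    total = 0.0
    for base in sequence.upper():
        if base not in RESIDUE_MASS:
            raise ValueError(f"unexpected base: {base!r}")
        total += RESIDUE_MASS[base]
    return total + TERMINAL_CORRECTION


def confirms_sequence(sequence: str, measured_mass: float,
                      tolerance: float = 1.0) -> bool:
    """True if the measured mass agrees with the calculated mass within
    the given tolerance (Da). The default tolerance is illustrative only."""
    return abs(oligo_average_mass(sequence) - measured_mass) <= tolerance


if __name__ == "__main__":
    print(oligo_average_mass("ACGT"))
    print(confirms_sequence("ACGT", 1173.9))
```

With these values, replacing a single A by G changes the calculated mass by about 16 Da, so a measurement of sufficient accuracy can distinguish the two sequences. A mass match alone does not establish the order of the bases, which is why the text speaks of confirming an already known sequence.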
Other ionization/desorption techniques include electrospray/ion-spray (ES) and matrix-assisted laser desorption/ionization (MALDI). ES mass spectrometry was introduced by Fenn et al. (J. Phys. Chem. 88:4451–59 (1984); PCT Application No. WO 90/14148) and current applications are summarized in review articles (see, e.g., Smith et al. (1990) Anal. Chem. 62:882–89 and Ardrey (1992) Electrospray Mass Spectrometry, Spectroscopy Europe 4:10–18). The molecular weights of a tetradecanucleotide (see, Covey et al. (1988) “The Determination of Protein, Oligonucleotide and Peptide Molecular Weights by Ionspray Mass Spectrometry,” Rapid Commun. in Mass Spectrometry 2:249–256), and of a 21-mer (Methods in Enzymol., 193, “Mass Spectrometry” (McCloskey, editor), p. 425, 1990, Academic Press, New York) have been published. A quadrupole is most frequently used as the mass analyzer. Because multiple ion peaks are present, all of which can be used for the mass calculation, the determination of molecular weights in femtomole amounts of sample is very accurate.
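The reason multiple ion peaks improve accuracy is that each peak yields an independent estimate of the same molecular mass, and the estimates can be averaged. The sketch below illustrates this for an idealized series of peaks that differ by one charge. It assumes that each ion carries z charges from gain or loss of protons only (no salt adducts), so that m/z = M/z + a, where a is the proton mass for positive ions and its negative for negative ions. The function names are hypothetical.

```python
PROTON_MASS = 1.00728  # Da


def charge_from_adjacent(mz_low: float, mz_high: float,
                         adduct: float = PROTON_MASS) -> int:
    """Charge z of the higher-m/z peak, given the neighboring peak at
    lower m/z that carries charge z + 1.

    Follows from m/z = M/z + adduct for both peaks.
    """
    return round((mz_low - adduct) / (mz_high - mz_low))


def mass_from_series(peaks: list[float], adduct: float = PROTON_MASS) -> float:
    """Average the mass estimates from every peak in a charge-state series.

    Assumption: peaks are consecutive charge states of one species.
    """
    ordered = sorted(peaks)
    estimates = []
    for low, high in zip(ordered, ordered[1:]):
        z = charge_from_adjacent(low, high, adduct)
        estimates.append(z * (high - adduct))        # peak with charge z
        estimates.append((z + 1) * (low - adduct))   # peak with charge z + 1
    return sum(estimates) / len(estimates)


if __name__ == "__main__":
    # Synthetic negative-ion series for a hypothetical 6000 Da species.
    series = [6000.0 / z - PROTON_MASS for z in (3, 4, 5, 6)]
    print(mass_from_series(series, adduct=-PROTON_MASS))
```

Passing `adduct=-PROTON_MASS` models deprotonated (negative) ions; the default models protonated (positive) ions. Real spectra contain noise and adduct peaks that this idealized model ignores.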
MALDI mass spectrometry, in contrast, can be attractive when a time-of-flight (TOF) configuration (see, Hillenkamp et al. (1990) “Matrix Assisted UV-Laser Desorption/Ionization: A New Approach to Mass Spectrometry of Large Biomolecules,” pp. 49–60 in Biological Mass Spectrometry, Burlingame and McCloskey, editors, Elsevier Science Publishers, Amsterdam) is used as a mass analyzer. Since, in most cases, no multiple molecular ion peaks are produced with this technique, the mass spectra, in principle, look simpler compared to ES mass spectrometry.
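In a time-of-flight analyzer, ions accelerated through the same potential arrive at the detector at times that depend on their mass-to-charge ratio. The sketch below illustrates the idealized relation t = L·sqrt(m / (2·z·e·U)) for a linear, field-free drift tube. The drift length and accelerating voltage are illustrative assumptions, not values from the text, and initial velocity spread, delayed extraction and reflectron effects are ignored. The function names are hypothetical.

```python
import math

DALTON_KG = 1.66053906660e-27          # kg per dalton
ELEMENTARY_CHARGE = 1.602176634e-19    # coulombs

# Assumed instrument parameters (illustrative only, not from the text).
DRIFT_LENGTH_M = 1.0
ACCEL_VOLTAGE_V = 20_000.0


def flight_time(mass_da: float, charge: int = 1,
                length_m: float = DRIFT_LENGTH_M,
                voltage_v: float = ACCEL_VOLTAGE_V) -> float:
    """Ideal flight time (s) of an ion through a field-free drift tube."""
    mass_kg = mass_da * DALTON_KG
    kinetic_energy = charge * ELEMENTARY_CHARGE * voltage_v  # z * e * U
    velocity = math.sqrt(2.0 * kinetic_energy / mass_kg)
    return length_m / velocity


def mass_from_flight_time(time_s: float, charge: int = 1,
                          length_m: float = DRIFT_LENGTH_M,
                          voltage_v: float = ACCEL_VOLTAGE_V) -> float:
    """Invert the ideal relation to recover the mass (Da) from a flight time."""
    velocity = length_m / time_s
    mass_kg = 2.0 * charge * ELEMENTARY_CHARGE * voltage_v / velocity ** 2
    return mass_kg / DALTON_KG


if __name__ == "__main__":
    t = flight_time(5000.0)
    print(t, mass_from_flight_time(t))
```

Because the flight time scales with the square root of m/z, a fourfold increase in mass doubles the flight time; with predominantly singly charged ions, each species gives essentially one peak, which is why the spectra look simpler than in ES mass spectrometry.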
Although DNA molecules up to a molecular weight of 410,000 daltons have been desorbed and volatilized (Williams et al., “Volatilization of High Molecular Weight DNA by Pulsed Laser Ablation of Frozen Aqueous Solutions,” Science 246, 1585–87 (1989)), this technique has shown only very low resolution (oligothymidylic acids up to 18 nucleotides, Huth-Fehre et al., Rapid Commun. in Mass Spectrom., 6, 209–13 (1992); DNA fragments up to 500 nucleotides in length, K. Tang et al., Rapid Commun. in Mass Spectrom., 8, 727–730 (1994); and a double-stranded DNA of 28 base pairs (Williams et al., “Time-of-Flight Mass Spectrometry of Nucleic Acids by Laser Ablation and Ionization from a Frozen Aqueous Matrix,” Rapid Commun. in Mass Spectrom., 4, 348–351 (1990))). Japanese Patent No. 59-131909 describes an instrument that detects nucleic acid fragments separated by electrophoresis, liquid chromatography or high speed gel filtration. Mass spectrometric detection is achieved by incorporating into the nucleic acids atoms that normally do not occur in DNA, such as S, Br, I, Ag, Au, Pt, Os or Hg.
Co-owned U.S. Pat. No. 5,622,824 describes methods for DNA sequencing based on mass spectrometric detection. To achieve this, the DNA is unilaterally degraded in a stepwise manner via exonuclease digestion, by means of protection, specificity of enzymatic activity, or immobilization, and the nucleotides or derivatives are detected by mass spectrometry. Prior to the enzymatic degradation, sets of ordered deletions that span a cloned DNA fragment can be created. In this manner, mass-modified nucleotides can be incorporated using a combination of exonuclease and DNA/RNA polymerase. This permits either multiplex mass spectrometric detection, or modulation of the activity of the exonuclease so as to synchronize the degradative process. Co-owned U.S. Pat. Nos. 5,605,798 and 5,547,835 provide methods for detecting a particular nucleic acid sequence in a biological sample. Depending on the sequence to be detected, the processes can be used, for example, in methods of diagnosis. These methods, while broadly useful and applicable to numerous embodiments, represent the first disclosure of such applications and can be improved upon.
Therefore, it is an object herein to provide improved methods for sequencing and detecting DNA molecules in biological samples. It is also an object herein to provide improved methods for diagnosis of genetic diseases, predispositions to certain diseases, cancers, and infections.