Mitochondrial DNA (mtDNA) is found in eukaryotes and differs from nuclear DNA in its location, its sequence, its quantity in the cell, and its mode of inheritance. The nucleus of the cell contains two sets of 23 chromosomes, one paternal set and one maternal set. However, cells may contain hundreds to thousands of mitochondria, each of which may contain several copies of mtDNA. Nuclear DNA has many more bases than mtDNA, but mtDNA is present in many more copies than nuclear DNA. This characteristic of mtDNA is useful in situations where the amount of DNA in a sample is very limited. Typical sources of DNA recovered from crime scenes include hair, bones, teeth, and body fluids such as saliva, semen, and blood.
In humans, mitochondrial DNA is inherited strictly from the mother (Case, J. T. and Wallace, D. C., Somatic Cell Genetics, 1981, 7, 103-108; Giles, R. E. et al. Proc. Natl. Acad. Sci. 1980, 77, 6715-6719; Hutchison, C. A. et al. Nature, 1974, 251, 536-538). Thus, the mtDNA sequences obtained from maternally related individuals, such as a brother and a sister or a mother and a daughter, will exactly match each other in the absence of a mutation. This characteristic of mtDNA is advantageous in missing persons cases, as reference mtDNA samples can be supplied by any maternal relative of the missing individual (Ginther, C. et al. Nature Genetics, 1992, 2, 135-138; Holland, M. M. et al. Journal of Forensic Sciences, 1993, 38, 542-553; Stoneking, M. et al. American Journal of Human Genetics, 1991, 48, 370-382).
The human mtDNA genome is approximately 16,569 bases in length and has two general regions: the coding region and the control region. The coding region is responsible for the production of various biological molecules involved in the process of energy production in the cell and includes about 37 genes (22 transfer RNAs, 2 ribosomal RNAs, and 13 peptides), with very little intergenic sequence and no introns. The control region is responsible for regulation of the mtDNA molecule. Two regions of mtDNA within the control region have been found to be highly polymorphic, or variable, within the human population (Greenberg, B. D. et al. Gene, 1983, 21, 33-49). These two regions are termed “hypervariable Region I” (HV1), which has an approximate length of 342 base pairs (bp), and “hypervariable Region II” (HV2), which has an approximate length of 268 bp. Forensic mtDNA examinations are performed using these two regions because of the high degree of variability found among individuals.
There exists a need for rapid identification of humans wherein human remains and/or biological samples are analyzed. Such remains or samples may be associated with war-related casualties, aircraft crashes, and acts of terrorism, for example. Analysis of mtDNA enables a rule-in/rule-out identification process for persons for whom DNA profiles from a maternal relative are available. Human identification by analysis of mtDNA can also be applied to human remains and/or biological samples obtained from crime scenes.
The process of human identification is a common objective of forensic investigations. As used herein, “forensics” is the study of evidence discovered at a crime or accident scene and used in a court of law. “Forensic science” is any science used for the purposes of the law, in particular the criminal justice system, and therefore provides impartial scientific evidence for use in courts of law and in criminal investigations and trials. Forensic science is a multidisciplinary subject, drawing principally from chemistry and biology, but also from physics, geology, psychology and social science, for example.
Forensic scientists generally use two highly variable regions of human mtDNA for analysis. These regions are designated “hypervariable regions 1 and 2” (HV1 and HV2, which contain approximately 342 and 268 base pairs, respectively). These hypervariable regions, or portions thereof, provide one non-limiting example of mitochondrial DNA identifying amplicons.
A typical mtDNA analysis begins when total genomic DNA is extracted from biological material, such as a tooth, blood sample, or hair. The polymerase chain reaction (PCR) is then used to amplify, or create many copies of, the two hypervariable portions of the non-coding region of the mtDNA molecule, using flanking primers. Care is taken to eliminate the introduction of exogenous DNA during both the extraction and amplification steps via methods such as the use of pre-packaged sterile equipment and reagents, aerosol-resistant barrier pipette tips, gloves, masks, and lab coats, separation of pre- and post-amplification areas in the lab using dedicated reagents for each, ultraviolet irradiation of equipment, and autoclaving of tubes and reagent stocks. In casework, questioned samples are always processed before known samples and they are processed in different laboratory rooms. When adequate amounts of PCR product are amplified to provide all the necessary information about the two hypervariable regions, sequencing reactions are performed. These chemical reactions use each PCR product as a template to create a new complementary strand of DNA in which some of the nucleotide residues that make up the DNA sequence are labeled with dye. The strands created in this stage are then separated according to size by an automated sequencing machine that uses a laser to “read” the sequence, or order, of the nucleotide bases. Where possible, the sequences of both hypervariable regions are determined on both strands of the double-stranded DNA molecule, with sufficient redundancy to confirm the nucleotide substitutions that characterize that particular sample. At least two forensic analysts independently assemble the sequence and then compare it to a standard, commonly used, reference sequence. The entire process is then repeated with a known sample, such as blood or saliva collected from a known individual. 
The sequences from both samples, about 780 bases long each, are compared to determine if they match. The analysts assess the results of the analysis and determine if any portions of it need to be repeated. Finally, in the event of an inclusion or match, the SWGDAM mtDNA database, which is maintained by the FBI, is searched for the mitochondrial sequence that has been observed for the samples. The analysts can then report the number of observations of this type based on the nucleotide positions that have been read. A written report can be provided to the submitting agency.
Approximately 610 bp of mtDNA are currently sequenced in forensic mtDNA analysis. Recording and comparing mtDNA sequences would be difficult and potentially confusing if all of the bases were listed. Thus, mtDNA sequence information is recorded by listing only the differences with respect to a reference DNA sequence. By convention, human mtDNA sequences are described using the first complete published mtDNA sequence as a reference (Anderson, S. et al., Nature, 1981, 290, 457-465). This sequence is commonly referred to as the Anderson sequence. It is also called the Cambridge reference sequence or the Oxford sequence. Each base pair in this sequence is assigned a number. Deviations from this reference sequence are recorded as the number of the position demonstrating a difference and a letter designation of the different base. For example, a transition from A to G at Position 263 would be recorded as 263 G. If deletions or insertions of bases are present in the mtDNA, these differences are denoted as well.
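The difference-only notation described above can be illustrated with a minimal sketch. It assumes the sample and reference fragments are already aligned and equal in length, so it reports substitutions only; insertions and deletions are not handled. The function name `record_differences`, the `start` parameter, and the short toy sequences are hypothetical and are not real mtDNA data.

```python
def record_differences(reference: str, sample: str, start: int = 1) -> list[str]:
    """List substitutions as '<position> <base>'.

    `start` is the reference-sequence number assigned to the first base
    of the fragment. Assumes pre-aligned fragments of equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("this sketch handles substitutions only; lengths must match")
    return [
        f"{start + offset} {sample_base}"
        for offset, (ref_base, sample_base) in enumerate(
            zip(reference.upper(), sample.upper())
        )
        if ref_base != sample_base
    ]


# Toy fragment (not real mtDNA) whose first base is numbered 259,
# so the fifth base corresponds to Position 263.
toy_reference = "ACGTAACGT"
toy_sample = "ACGTGACGT"
print(record_differences(toy_reference, toy_sample, start=259))  # ['263 G']
```

Two samples recorded this way can then be compared by comparing their difference lists over the positions that were read in both.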
In the United States, there are seven laboratories currently conducting forensic mtDNA examinations: the FBI Laboratory; Laboratory Corporation of America (LabCorp) in Research Triangle Park, North Carolina; Mitotyping Technologies in State College, Pa.; the Bode Technology Group (BTG) in Springfield, Va.; the Armed Forces DNA Identification Laboratory (AFDIL) in Rockville, Md.; BioSynthesis, Inc. in Lewisville, Tex.; and Reliagene in New Orleans, La.
Mitochondrial DNA analyses have been admitted in criminal proceedings from these laboratories in the following states as of April 1999: Alabama, Arkansas, Florida, Indiana, Illinois, Maryland, Michigan, New Mexico, North Carolina, Pennsylvania, South Carolina, Tennessee, Texas, and Washington. Mitochondrial DNA has also been admitted and used in criminal trials in Australia, the United Kingdom, and several other European countries.
Since 1996, the number of individuals performing mitochondrial DNA analysis at the FBI Laboratory has grown from 4 to 12, with more personnel expected in the near future. Over 150 mitochondrial DNA cases have been completed by the FBI Laboratory as of March 1999, and dozens more await analysis. Forensic courses are being taught by the FBI Laboratory personnel and other groups to educate forensic scientists in the procedures and interpretation of mtDNA sequencing. More and more individuals are learning about the value of mtDNA sequencing for obtaining useful information from evidentiary samples that are small, degraded, or both. Mitochondrial DNA sequencing is becoming known not only as an exclusionary tool but also as a complementary technique for use with other human identification procedures. Mitochondrial DNA analysis will continue to be a powerful tool for law enforcement officials in the years to come as other applications are developed, validated, and applied to forensic evidence.
Presently, the forensic analysis of mtDNA is rigorous and labor-intensive: an analyst can complete only one to two cases per month. Several molecular biological techniques are combined to obtain an mtDNA sequence from a sample. The steps of the mtDNA analysis process include primary visual analysis, sample preparation, DNA extraction, polymerase chain reaction (PCR) amplification, post-amplification quantification of the DNA, automated DNA sequencing, and data analysis. Another complicating factor in the forensic analysis of mtDNA is the occurrence of heteroplasmy, wherein the pool of mtDNAs in a given cell is heterogeneous due to mutations in individual mtDNAs. There are two forms of heteroplasmy found in mtDNA. Sequence heteroplasmy (also known as point heteroplasmy) is the occurrence of more than one base at a particular position or positions in the mtDNA sequence. Length heteroplasmy is the occurrence of more than one length of a stretch of the same base in an mtDNA sequence as a result of insertion of nucleotide residues.
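The two forms of heteroplasmy can be made concrete with a small sketch that works on an idealized pool of individual mtDNA copies. This only illustrates the definitions and is not a detection method: real analyses work from mixed sequencing signal, not from separately known copies. The function names and toy sequences are hypothetical.

```python
import re
from collections import Counter


def point_heteroplasmy(copies: list[str]) -> dict[int, Counter]:
    """Return 1-based positions at which equal-length copies show more than one base."""
    if len({len(copy) for copy in copies}) != 1:
        raise ValueError("copies differ in length; compare run lengths instead")
    mixed = {}
    for position, column in enumerate(zip(*copies), start=1):
        counts = Counter(column)
        if len(counts) > 1:
            mixed[position] = counts
    return mixed


def longest_run_lengths(copies: list[str], base: str) -> set[int]:
    """Return the set of longest homopolymer run lengths of `base` across copies.

    More than one value in the set corresponds to length heteroplasmy
    in this simplified picture.
    """
    pattern = re.compile(f"{re.escape(base)}+")
    return {
        max((len(run) for run in pattern.findall(copy)), default=0)
        for copy in copies
    }


# Toy pools (not real mtDNA).
print(point_heteroplasmy(["ACGTA", "ACGTA", "ACATA"]))  # position 3 shows both G and A
print(longest_run_lengths(["ACCCCCTA", "ACCCCCCTA", "ACCCCCTA"], "C"))  # runs of 5 and 6
```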
Heteroplasmy is a problem for forensic investigators since a sample from a crime scene can differ from a sample from a suspect by one base pair and this difference may be interpreted as sufficient evidence to eliminate that individual as the suspect. Hair samples from a single individual can contain heteroplasmic mutations at vastly different concentrations and even the root and shaft of a single hair can differ. The detection methods currently available to molecular biologists cannot detect low levels of heteroplasmy. Furthermore, if present, length heteroplasmy will adversely affect sequencing runs by resulting in an out-of-frame sequence that cannot be interpreted.
Mass spectrometry provides detailed information about the molecules being analyzed, including high mass accuracy. It is also a process that can be easily automated.
Several groups have described detection of PCR products using high-resolution electrospray ionization-Fourier transform-ion cyclotron resonance mass spectrometry (ESI-FT-ICR MS). Accurate measurement of exact mass combined with knowledge of the number of at least one nucleotide allowed calculation of the total base composition for PCR duplex products of approximately 100 base pairs (Aaserud et al., J. Am. Soc. Mass Spec., 1996, 7, 1266-1269; Muddiman et al., Anal. Chem., 1997, 69, 1543-1549; Wunschel et al., Anal. Chem., 1998, 70, 1203-1207; Muddiman et al., Rev. Anal. Chem., 1998, 17, 1-68). ESI-FT-ICR MS may also be used to determine the mass of double-stranded, 500 base-pair PCR products via the average molecular mass (Hurst et al., Rapid Commun. Mass Spec. 1996, 10, 377-382). The use of matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) mass spectrometry for characterization of PCR products has also been described (Muddiman et al., Rapid Commun. Mass Spec., 1999, 13, 1201-1204). However, the degradation of DNAs over about 75 nucleotides observed with MALDI limited the utility of this method.
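The idea of deriving a base composition from a measured mass plus the known count of one nucleotide can be sketched as a small search. This is a simplified illustration under stated assumptions, not the procedure of the cited references. It treats a single strand, uses approximate average masses of the nucleotide residues within a DNA chain, ignores terminal groups, adducts and charge states, and uses an arbitrary tolerance. The names `RESIDUE_MASS`, `strand_mass` and `candidate_compositions` are hypothetical.

```python
# Approximate average masses (Da) of deoxynucleotide residues within a chain.
# Terminal groups are ignored in this sketch (assumption).
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def strand_mass(composition: dict[str, int]) -> float:
    """Sum residue masses for a composition such as {'A': 10, 'C': 5, 'G': 7, 'T': 8}."""
    return sum(RESIDUE_MASS[base] * count for base, count in composition.items())


def candidate_compositions(
    measured_mass: float, known_base: str, known_count: int, tolerance: float = 0.5
) -> list[dict[str, int]]:
    """Enumerate compositions consistent with the mass and one known base count."""
    first, second, third = [b for b in "ACGT" if b != known_base]
    remaining = measured_mass - known_count * RESIDUE_MASS[known_base]
    hits = []
    for n1 in range(int(max(remaining, 0) // RESIDUE_MASS[first]) + 2):
        rest1 = remaining - n1 * RESIDUE_MASS[first]
        for n2 in range(int(max(rest1, 0) // RESIDUE_MASS[second]) + 2):
            rest2 = rest1 - n2 * RESIDUE_MASS[second]
            # Solve for the last count directly instead of looping over it.
            n3 = round(rest2 / RESIDUE_MASS[third])
            if n3 >= 0 and abs(rest2 - n3 * RESIDUE_MASS[third]) <= tolerance:
                hits.append({known_base: known_count, first: n1, second: n2, third: n3})
    return hits


true_composition = {"A": 10, "C": 5, "G": 7, "T": 8}
mass = strand_mass(true_composition)
candidates = candidate_compositions(mass, known_base="G", known_count=7)
print(true_composition in candidates)  # True
```

With a loose tolerance the search can return more than one candidate, because different combinations of residues can sum to nearly the same mass. This is one reason high mass accuracy matters for this kind of calculation.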
U.S. Pat. No. 5,849,492 reports a method for retrieval of phylogenetically informative DNA sequences which comprises searching for a highly divergent segment of genomic DNA surrounded by two highly conserved segments, designing universal primers for PCR amplification of the highly divergent region, amplifying the genomic DNA by PCR using the universal primers, and then sequencing the gene to determine the identity of the organism.
U.S. Pat. No. 5,965,363 reports methods for screening nucleic acids for polymorphisms by analyzing amplified target nucleic acids using mass spectrometric techniques, and procedures for improving the mass resolution and mass accuracy of these methods.
WO 99/14375 reports methods, PCR primers and kits for use in analyzing preselected DNA tandem nucleotide repeat alleles by mass spectrometry.
WO 98/12355 reports methods of determining the mass of a target nucleic acid by mass spectrometric analysis, by cleaving the target nucleic acid to reduce its length, making the target single-stranded and using MS to determine the mass of the single-stranded shortened target. Also reported are methods of preparing a double-stranded target nucleic acid for MS analysis comprising amplification of the target nucleic acid, binding one of the strands to a solid support, releasing the second strand and then releasing the first strand which is then analyzed by MS. Kits for target nucleic acid preparation are also reported.
PCT WO97/33000 reports methods for detecting mutations in a target nucleic acid by nonrandomly fragmenting the target into a set of single-stranded nonrandom length fragments and determining their masses by MS.
U.S. Pat. No. 5,605,798 reports a fast and highly accurate mass spectrometer-based process for detecting the presence of a particular nucleic acid in a biological sample for diagnostic purposes.
WO 98/20166 reports processes for determining the sequence of a particular target nucleic acid by mass spectrometry. Processes for detecting a target nucleic acid present in a biological sample by PCR amplification and mass spectrometry detection are disclosed, as are methods for detecting a target nucleic acid in a sample by amplifying the target with primers that contain restriction sites and tags, extending and cleaving the amplified nucleic acid, and detecting the presence of extended product, wherein the presence of a DNA fragment of a mass different from wild-type is indicative of a mutation. Methods of sequencing a nucleic acid via mass spectrometry methods are also described.
WO 97/37041, WO 99/31278 and U.S. Pat. No. 5,547,835 report methods of sequencing nucleic acids using mass spectrometry. U.S. Pat. Nos. 5,622,824, 5,872,003 and 5,691,141 report methods, systems and kits for exonuclease-mediated mass spectrometric sequencing.
There is a need for a mitochondrial DNA forensic analysis which is both specific and rapid, and in which no nucleic acid sequencing is required. The present invention addresses this need, among others.