A. Field of the Invention
The present invention is generally directed to the field of genetic identity detection including forensic identification and paternity testing as well as genetic mapping. The present invention is more specifically directed to the use of mass spectrometry to detect length variations in DNA nucleotide sequence repeats, often referred to as short tandem repeats ("STR"), microsatellite repeats or simple sequence repeats ("SSR"). The invention is also directed to DNA sequences provided for the analysis of STR polymorphisms at specific loci on specific chromosomes.
B. Description of Related Art
Polymorphic DNA tandem repeat loci are useful DNA markers for paternity testing, human identification, and genetic mapping. Higher organisms, including plants, animals and humans, contain segments of DNA sequence with variable sequence repeats. Commonly sized repeats include dinucleotides, trinucleotides, tetranucleotides and larger. The number of repeats occurring at a particular genetic locus varies from a few to hundreds, depending on the locus and the individual. The sequence and base composition of repeats can vary significantly, not even remaining constant within a particular nucleotide repeat locus. DNA nucleotide repeats are known by several different names including microsatellite repeats, simple sequence repeats, short tandem repeats and variable number tandem repeats. As used herein, the term "DNA tandem nucleotide repeat" ("DTNR") refers to all types of tandem repeat sequences.
Thousands of DTNR loci have been identified in the human genome and have been predicted to occur as frequently as once every 15 kb. Population studies, as well as extensive validation studies in forensic laboratories, have been undertaken on dozens of these STR markers. Specific primer sequences located in the regions flanking the DNA tandem repeat region have been used to amplify alleles from DTNR loci via the polymerase chain reaction ("PCR™"). Thus, the PCR™ products include the polymorphic repeat regions, which vary in length depending on the number of repeats or partial repeats, and the flanking regions, which are typically of constant length and sequence between samples.
The number of repeats present for a particular individual at a particular locus is described as the allele value for the locus. Because most chromosomes are present in pairs, PCR™ amplification of a single locus commonly yields two differently sized PCR™ products representing two different repeat numbers or allele values. The range of possible repeat numbers for a given locus, determined through experimental sampling of the population, is defined as the allele range, and may vary for each locus, e.g., 7 to 15 repeats. The allele PCR™ product size range (allele size range) for a given locus is defined by the placement of the two PCR™ primers relative to the repeat region and by the allele range. The sequences in the regions flanking each locus must be fairly well conserved in order for the primers to anneal effectively and initiate PCR™ amplification. For purposes of genetic analysis, di-, tri-, and tetranucleotide repeats with repeat numbers in the range of 5 to 50 are typically utilized in screens.
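The relationship between primer placement, allele range, and allele size range can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the flank lengths, and the example locus are hypothetical, each flank length is taken to include its primer, and partial repeats are ignored.

```python
def allele_size_range(flank_5_nt, flank_3_nt, repeat_unit_nt, min_repeats, max_repeats):
    """Return (smallest, largest) PCR product length in nucleotides.

    Assumes the product is two constant flanks (primers included)
    plus a whole number of repeat units; partial repeats are ignored.
    """
    if min_repeats > max_repeats:
        raise ValueError("min_repeats must not exceed max_repeats")
    constant_nt = flank_5_nt + flank_3_nt
    return (constant_nt + repeat_unit_nt * min_repeats,
            constant_nt + repeat_unit_nt * max_repeats)


# Hypothetical tetranucleotide locus with an allele range of 7 to 15 repeats
# and flanks of 30 and 25 nucleotides.
print(allele_size_range(30, 25, 4, 7, 15))  # (83, 115)
```

Moving either primer closer to the repeat region shrinks both ends of the range by the same amount, while the width of the range depends only on the repeat unit and the allele range.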
Many different primers have been designed for various DTNR loci and reported in the literature. These primers anneal to DNA sequences outside the DNA tandem repeat region to produce PCR™ products usually in the size range of 100-800 bp. These primers were designed with polyacrylamide gel electrophoretic separation in mind, because DNA separations have traditionally been performed by slab gel or capillary electrophoresis. However, with a mass spectrometry approach to DTNR typing and analysis, examining smaller DNA oligomers is advantageous because the sensitivity of detection and mass resolution are superior with smaller DNA oligomers.
The advantages of using mass spectrometry for characterizing DTNRs include a dramatic increase in both the speed of analysis (a few seconds per sample) and the accuracy of direct mass measurements. In contrast, electrophoretic methods require significantly more time (minutes to hours) and can only measure the size of DTNRs as a function of their mobility relative to comigrating standards. Gel-based separation systems also suffer from a number of artifacts that reduce the accuracy of size measurements. These mobility artifacts are related to the specific sequences of DNA fragments and the persistence of secondary and tertiary structural elements even under highly denaturing conditions.
The inventors have performed significant work in developing time-of-flight mass spectrometry ("TOF-MS") as a means for separating and sizing DNA molecules, although other forms of mass spectrometry can be used and are within the scope of this invention. Offsetting the throughput and high mass accuracy advantages of TOF-MS is the limited size range over which it provides the accuracy and resolution necessary for characterizing DTNRs. The current state of the art for TOF-MS offers single nucleotide resolution up to approximately 100 nucleotides in size and four nucleotide resolution up to approximately 160 nucleotides in size. These numbers are expected to grow as new improvements are developed in the mass spectrometric field.
Existing gel-based protocols for the analysis of DTNRs do not work with TOF-MS because the allele PCR™ product size range, typically between 100 and 800 nucleotides, is outside the current resolution capabilities of TOF-MS. Application of DTNR analysis to TOF-MS requires the development of new primer sets that produce small PCR™ products 50 to 160 nucleotides in length, preferably 50 to 100 nucleotides in length. Amplified DNA may also be used to generate single stranded DNA products that are in the preferred size range for TOF-MS analysis by extending a primer in the presence of a chain termination reagent. A typical class of chain termination reagent commonly used by those of skill in the art is the dideoxynucleotide triphosphates. Again, application of DTNR analysis to TOF-MS requires that the primer be extended to generate products of 50 to 160 nucleotides in length, and preferably 50 to 100 nucleotides in length.
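The size constraint described above can be expressed as a simple check. The sketch below is illustrative only: the function and constant names are hypothetical, the limits are the approximate figures quoted in the text, and treating any allele spacing below four nucleotides as requiring single nucleotide resolution is a conservative assumption, since the text gives no figures for two or three nucleotide spacing.

```python
# Approximate resolution limits quoted in the text (nucleotides).
SINGLE_NT_LIMIT = 100   # single nucleotide resolution up to about 100 nt
FOUR_NT_LIMIT = 160     # four nucleotide resolution up to about 160 nt


def resolvable_by_tof_ms(largest_product_nt, allele_spacing_nt):
    """Return True if adjacent alleles should be distinguishable.

    allele_spacing_nt is the smallest length difference between two
    alleles (4 for whole tetranucleotide repeats, 1 if partial repeats
    differing by a single nucleotide must be told apart).
    Assumption: spacings below 4 nt need single nucleotide resolution.
    """
    if allele_spacing_nt >= 4:
        return largest_product_nt <= FOUR_NT_LIMIT
    return largest_product_nt <= SINGLE_NT_LIMIT


print(resolvable_by_tof_ms(115, 4))  # True
print(resolvable_by_tof_ms(115, 1))  # False
print(resolvable_by_tof_ms(300, 4))  # False
```

A product range designed for gels, such as the hypothetical 300 nucleotide case, fails the check regardless of repeat unit, which is why new primer sets are needed.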
Gel-based systems are capable of multiplexing the analysis of 2 or more DTNR loci using two approaches. The first approach is to size-partition the PCR™ products of the different loci. Size partitioning involves designing the PCR™ primers used to amplify different loci so that the allele PCR™ product size range for each locus covers a different and separable part of the gel size spectrum. As an example, the PCR™ primers for Locus A might be designed so that the allele size range is from 250 to 300 nucleotides, while the primers for Locus B are designed to produce an allele size range from 340 to 410 nucleotides.
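Whether a set of loci is size partitioned reduces to checking that no two allele size ranges overlap. The following is a minimal sketch using the Locus A and Locus B figures from the example above; the function name and the dictionary layout are illustrative assumptions.

```python
def is_size_partitioned(allele_size_ranges):
    """Return True if no two allele size ranges overlap.

    allele_size_ranges maps a locus name to an inclusive
    (smallest_nt, largest_nt) tuple.
    """
    ordered = sorted(allele_size_ranges.values())
    # After sorting by lower bound, only neighbouring ranges can overlap.
    return all(prev_hi < next_lo
               for (_, prev_hi), (next_lo, _) in zip(ordered, ordered[1:]))


print(is_size_partitioned({"Locus A": (250, 300), "Locus B": (340, 410)}))  # True
print(is_size_partitioned({"Locus A": (250, 300), "Locus B": (270, 330)}))  # False
```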
The second approach to multiplexing 2 or more DTNR loci on gel-based systems is the use of spectroscopic partitioning. The current state of the art for gel-based systems involves the use of fluorescent dyes as specific spectroscopic markers for different PCR™ amplified loci. Different chromophores that emit light at different wavelengths provide the means for differential detection of two different PCR™ products even if they are exactly the same size; thus, 2 or more loci can produce PCR™ products with allele size ranges that overlap. For example, Locus A with a green fluorescent tag produces an allele size range from 250 to 300 nucleotides, while Locus B with a red fluorescent tag produces an allele size range of 270 to 330 nucleotides. A scanning, laser-excited fluorescence detection device monitors the wavelength of emissions and assigns different PCR™ product sizes, and their corresponding allele values, to their specific loci based on their fluorescent color.
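The assignment step can be illustrated with a small lookup keyed on dye color and product size. This is a sketch only, using the green and red example loci from the paragraph above; the table layout and function name are hypothetical and do not describe any particular instrument's software.

```python
# Example loci from the text: (dye color, smallest product nt, largest product nt).
LOCI = {
    "Locus A": ("green", 250, 300),
    "Locus B": ("red", 270, 330),
}


def assign_locus(color, size_nt):
    """Return the locus whose dye and allele size range match, else None."""
    matches = [name for name, (dye, lo, hi) in LOCI.items()
               if dye == color and lo <= size_nt <= hi]
    return matches[0] if len(matches) == 1 else None


# Two products of identical size are separated by color alone.
print(assign_locus("green", 280))  # Locus A
print(assign_locus("red", 280))    # Locus B
print(assign_locus("red", 250))    # None
```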
In contrast, mass spectrometry detects the molecule directly, which precludes the use of optical spectroscopic partitioning as a means for multiplexing. While limited use of size partitioning is possible with TOF-MS, the limited size range of high-resolution detection by TOF-MS makes it likely that only 2 different loci can be multiplexed and size partitioned. In many cases, it may not be possible to multiplex even 2 loci while keeping the 2 allele size ranges partitioned. Therefore, new methods are needed in order to employ mass spectrometry for the analysis of multiplexed DTNRs.
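A rough estimate shows why so few loci fit. The sketch below packs allele size ranges end to end into the 50 to 160 nucleotide window mentioned earlier. The function name, the 8 nucleotide gap between neighbouring ranges, and the example widths are all hypothetical assumptions, and the sketch ignores the minimum length needed for primers and flanking sequence.

```python
def count_partitioned_loci(range_widths_nt, window=(50, 160), gap_nt=8):
    """Count how many allele size ranges fit side by side in the window.

    range_widths_nt lists, per locus, the difference between its largest
    and smallest product. Narrow ranges are placed first (greedy), with
    an assumed gap_nt of unused space between neighbouring ranges.
    """
    lo, hi = window
    cursor = lo
    fitted = 0
    for width in sorted(range_widths_nt):
        if cursor + width > hi:
            break
        fitted += 1
        cursor += width + gap_nt
    return fitted


# Three hypothetical tetranucleotide loci, each spanning 8 repeats (32 nt).
print(count_partitioned_loci([32, 32, 32]))  # 2
```

Under these assumptions only two of the three loci fit, and a single locus with a wide allele range can consume most of the window by itself.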