A. Field of the Invention
The present invention is generally directed to the field of genetic identity detection, including forensic identification and paternity testing, as well as genetic mapping. The present invention is more specifically directed to the use of mass spectrometry to detect length variations in DNA nucleotide sequence repeats, often referred to as short tandem repeats (“STR”), microsatellite repeats or simple sequence repeats (“SSR”). The invention is also directed to DNA sequences provided for the analysis of STR polymorphisms at specific loci on specific chromosomes.
B. Description of Related Art
Polymorphic DNA tandem repeat loci are useful DNA markers for paternity testing, human identification, and genetic mapping. Higher organisms, including plants, animals and humans, contain segments of DNA sequence with variable sequence repeats. Commonly sized repeats include dinucleotides, trinucleotides, tetranucleotides and larger. The number of repeats occurring at a particular genetic locus varies, depending on the locus and the individual, from a few to hundreds. The sequence and base composition of repeats can vary significantly, not even remaining constant within a particular nucleotide repeat locus. DNA nucleotide repeats are known by several different names including microsatellite repeats, simple sequence repeats, short tandem repeats and variable nucleotide tandem repeats. As used herein, the term “DNA tandem nucleotide repeat” (“DTNR”) refers to all types of tandem repeat sequences.
Thousands of DTNR loci have been identified in the human genome and have been predicted to occur as frequently as once every 15 kb. Population studies have been undertaken on dozens of these STR markers, along with extensive validation studies in forensic laboratories. Specific primer sequences located in the regions flanking the DNA tandem repeat region have been used to amplify alleles from DTNR loci via the polymerase chain reaction (“PCR™”). Thus, the PCR™ products include the polymorphic repeat regions, which vary in length depending on the number of repeats or partial repeats, and the flanking regions, which are typically of constant length and sequence between samples.
The number of repeats present for a particular individual at a particular locus is described as the allele value for the locus. Because most chromosomes are present in pairs, PCR™ amplification of a single locus commonly yields two different-sized PCR™ products representing two different repeat numbers or allele values. The range of possible repeat numbers for a given locus, determined through experimental sampling of the population, is defined as the allele range, and may vary for each locus, e.g., 7 to 15 alleles. The allele PCR™ product size range (allele size range) for a given locus is defined by the placement of the two PCR™ primers relative to the repeat region and the allele range. The sequences in regions flanking each locus must be fairly conserved in order for the primers to anneal effectively and initiate PCR™ amplification. For purposes of genetic analysis, di-, tri-, and tetranucleotide repeats in the range of 5 to 50 are typically utilized in screens.
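The relationship between primer placement, allele range and allele size range can be illustrated with a minimal Python sketch. The function name and the flank lengths (30 and 28 nucleotides) are hypothetical values chosen for illustration; only the 7 to 15 allele range echoes the text above. The sketch assumes whole repeats only and ignores partial repeats.

```python
def allele_product_sizes(left_flank, right_flank, repeat_len, allele_range):
    """Map each repeat number to the expected product length in nucleotides.

    left_flank and right_flank count every nucleotide from the 5' end of each
    primer up to the edge of the repeat region (primer sequence included).
    """
    first, last = allele_range
    return {
        n: left_flank + n * repeat_len + right_flank
        for n in range(first, last + 1)
    }


# Hypothetical locus: tetranucleotide repeat, alleles 7 to 15,
# with assumed flanks of 30 and 28 nucleotides.
sizes = allele_product_sizes(30, 28, 4, (7, 15))
allele_size_range = (min(sizes.values()), max(sizes.values()))
print(allele_size_range)
```

Moving either primer closer to the repeat region lowers both ends of the allele size range by the same amount, while the width of the range depends only on the repeat length and the allele range.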
Many different primers have been designed for various DTNR loci and reported in the literature. These primers anneal to DNA sequences outside the DNA tandem repeat region to produce PCR™ products usually in the size range of 100-800 bp. These primers were designed with polyacrylamide gel electrophoretic separation in mind, because DNA separations have traditionally been performed by slab gel or capillary electrophoresis. However, with a mass spectrometry approach to DTNR typing and analysis, examining smaller DNA oligomers is advantageous because the sensitivity of detection and mass resolution are superior with smaller DNA oligomers.
The advantages of using mass spectrometry for characterizing DTNRs include a dramatic increase in both the speed of analysis (a few seconds per sample) and the accuracy of direct mass measurements. In contrast, electrophoretic methods require significantly longer lengths of time (minutes to hours) and can only measure the size of DTNRs as a function of relative mobility to comigrating standards. Gel-based separation systems also suffer from a number of artifacts that reduce the accuracy of size measurements. These mobility artifacts are related to the specific sequences of DNA fragments and the persistence of secondary and tertiary structural elements even under highly denaturing conditions.
The inventors have performed significant work in developing time-of-flight mass spectrometry (“TOF-MS”) as a means for separating and sizing DNA molecules, although other forms of mass spectrometry can be used and are within the scope of this invention. Balancing the throughput and high mass accuracy advantages of TOF-MS is the limited size range over which the accuracy and resolution necessary for characterizing DTNRs by mass spectrometry are available. Current state of the art for TOF-MS offers single nucleotide resolution up to ~100 nucleotides in size and four nucleotide resolution up to ~160 nucleotides in size. These numbers are expected to grow as new improvements are developed in the mass spectrometric field.
Existing gel-based protocols for the analysis of DTNRs do not work with TOF-MS because the allele PCR™ product size range, typically between 100 and 800 nucleotides, is outside the current resolution capabilities of TOF-MS. Application of DTNR analysis to TOF-MS requires the development of new primer sets that produce small PCR™ products 50 to 160 nucleotides in length, preferably 50 to 100 nucleotides in length. Amplified DNA may also be used to generate single-stranded DNA products that are in the preferred size range for TOF-MS analysis by extending a primer in the presence of a chain termination reagent. A typical class of chain termination reagent commonly used by those of skill in the art is the dideoxynucleotide triphosphates. Again, application of DTNR analysis to TOF-MS requires that the primer be extended to generate products of 50 to 160 nucleotides in size, and preferably 50 to 100 nucleotides in length.
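The size limits stated above imply a budget for how much flanking sequence a product can carry. The following Python sketch works that out under simple assumptions: the function name and the example locus (a tetranucleotide repeat with at most 15 repeats) are hypothetical, and only the 160 and 100 nucleotide limits come from the text.

```python
def flank_budget(repeat_len, max_repeats, size_limit):
    """Nucleotides available for both flanks (primers included).

    A negative result means the repeat region of the largest allele alone
    already exceeds the size limit.
    """
    return size_limit - repeat_len * max_repeats


# Assumed locus: tetranucleotide repeat whose largest allele has 15 repeats.
budget_160 = flank_budget(4, 15, 160)  # upper size limit from the text
budget_100 = flank_budget(4, 15, 100)  # preferred size limit from the text
print(budget_160, budget_100)
```

Under these assumptions the two primers and any intervening flanking sequence must together fit within the printed budget, which is why primers designed for gel-based sizing generally sit too far from the repeat region.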
Gel-based systems are capable of multiplexing the analysis of 2 or more DTNR loci using two approaches. The first approach is to size partition the different PCR™ product loci. Size partitioning involves designing the PCR™ primers used to amplify different loci so that the allele PCR™ product size range for each locus covers a different and separable part of the gel size spectrum. As an example, the PCR™ primers for Locus A might be designed so that the allele size range is from 250 to 300 nucleotides, while the primers for Locus B are designed to produce an allele size range from 340 to 410 nucleotides.
The second approach to multiplexing 2 or more DTNR loci on gel-based systems is the use of spectroscopic partitioning. Current state of the art for gel-based systems involves the use of fluorescent dyes as specific spectroscopic markers for different PCR™-amplified loci. Different chromophores that emit light at different color wavelengths provide the means for differential detection of two different PCR™ products even if they are exactly the same size; thus, 2 or more loci can produce PCR™ products with allele size ranges that overlap. For example, Locus A with a green fluorescent tag produces an allele size range from 250 to 300 nucleotides, while Locus B with a red fluorescent tag produces an allele size range of 270 to 330 nucleotides. A scanning, laser-excited fluorescence detection device monitors the wavelength of emissions and assigns different PCR™ product sizes, and their corresponding allele values, to their specific loci based on their fluorescent color.
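The difference between the two gel-based approaches comes down to whether the allele size ranges of the multiplexed loci overlap. A minimal Python sketch of that check follows, using the Locus A and Locus B ranges given as examples above; the function and variable names are illustrative only.

```python
def ranges_overlap(a, b):
    """Return True when two inclusive (low, high) size ranges share any value."""
    return a[0] <= b[1] and b[0] <= a[1]


# Size partitioning example: Locus A 250-300, Locus B 340-410 nucleotides.
size_partitioned = not ranges_overlap((250, 300), (340, 410))

# Spectroscopic example: Locus A 250-300 (green), Locus B 270-330 (red).
needs_distinct_dyes = ranges_overlap((250, 300), (270, 330))

print(size_partitioned, needs_distinct_dyes)
```

When the ranges do not overlap, size alone assigns each product to its locus; when they do, a gel-based system needs a second signal such as dye color.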
In contrast, mass spectrometry directly detects the molecule, which prevents the use of optical spectroscopic partitioning as a means for multiplexing. While limited use of size partitioning is possible with TOF-MS, the limited size range of high-resolution detection by TOF-MS makes it likely that only 2 different loci can be multiplexed and size partitioned. In many cases, it may not be possible even to multiplex 2 loci and maintain a partitioning of the 2 different allele size ranges. Therefore, new methods are needed in order to employ mass spectrometry for the analysis of multiplexed DTNRs.
It is, therefore, a goal of the present invention to provide newly designed PCR™ primers which are closer to the repeat regions than have previously been employed, providing for efficient analysis by TOF-MS. Specifically, the invention provides oligonucleotide primers designed to characterize various DTNR markers useful for human identity testing. The primers are for use in PCR™ amplification schemes; however, one of skill in the art could, in light of the present disclosure, employ them to generate appropriate size nucleic acid products for TOF-MS analysis using other methods of extending one or more of the disclosed primers. Additionally, these primers and their extension products are suitable for detection by mass spectrometry. Thus, applications of this invention include forensic and paternity testing and genetic mapping studies.
An embodiment of the present invention encompasses an oligonucleotide primer for use in analyzing alleles of a DNA tandem nucleotide repeat at a DNA tandem nucleotide repeat locus by mass spectrometry, which includes a nucleotide sequence that contains a flanking region of the locus where the primer upon extension generates a product that is capable of being analyzed by mass spectrometry. Preferably, the oligonucleotide primer's 3′ end will be complementary to a region flanking a DNA tandem repeat region immediately adjacent to the DNA tandem repeat region or may further extend up to one, two, three, four or five tandem repeats into the DNA tandem repeat region. As used in this context, “immediately adjacent” or “immediately flanking” means one, two, three, or four nucleotides away from the DNA tandem repeat region of the DNA tandem repeat locus.
The oligonucleotide primers of this invention are designed to generate extension products amenable to mass spectral analysis and containing a DTNR sequence, or region of interest, for which one is interested in determining the mass. The “flanking” regions of a DTNR locus are the portions of DNA sequence on either side of the DTNR region of interest. For embodiments employing PCR™ primers and polymerases to amplify a DTNR sequence, the primers are sufficiently complementary to a portion of one or more flanking regions of the DTNR locus to allow the primer to effectively anneal to the target nucleic acid and provide a site to extend a complement to the target nucleic acid via PCR™. For embodiments employing primer extension, a preferred method is to use a single primer that is sufficiently complementary to allow effective annealing to a portion of a target DTNR locus flanking region in conjunction with a chain termination reagent. The chain termination reagent allows the production of discrete, limited-size nucleic acid products for mass spectral analysis. Preferred chain termination reagents for use in the present invention are dideoxynucleotide triphosphates. Therefore, for the methods comprising any type of primer extension, it is preferred that at least one of the primers is sufficiently complementary to a portion of a flanking region that is preferably adjacent to or close to the DTNR region of interest, generally within about 40 nucleotides of the DNA tandem nucleotide repeat region. As used in this context, “about” means anywhere from ±1 to 40 nucleotides, and all the integers in between, for example, ±1, ±2, ±3, ±4, ±5, ±6, ±7, ±8, ±9, ±10, etc. nucleotides.
The primer extension products are preferably single-stranded and may be any size that can be adequately resolved by mass spectrometric analysis. Preferably, the final single-stranded target nucleic acid products that are detected are less than about 160 or 150 bases in length. More preferably, the extended nucleic acid products are from about 10 to 100 or 120 bases in length. As used in this context, “about” means anywhere from ±1 to 20 bases, and all the integers in between, for example, ±1, ±2, ±3, ±4, ±5, ±6, ±7, ±8, ±9, ±10, etc. bases.
As used herein, “a” will be understood to mean one or more. Thus, “a DNA tandem repeat marker” may refer, for example, to one, two, three, four, five or more DNA tandem repeat markers.
The present invention is also directed to new oligonucleotide primers which have been designed to match a portion of the flanking regions for various DTNR loci. Specific embodiments of this invention include oligonucleotide primers designed to amplify the following DTNR loci: CSF1PO, D3S1358, D5S818, D7S820, D8S1179, D13S317, D16S539, D18S51, D21S11, DYS19, F13A1, FES/FPS, FGA, HPRTB, TH01, TPOX, DYS388, DYS391, DYS392, DYS393, D2S1391, D18S535, D2S1338, D19S433, D6S477, D1S518, D14S306, D22S684, F13B, CD4, D12S391, D10S220 and D7S523. With the exception of D3S1358, sequences for the STR loci of this invention are accessible to the general public through GenBank using the accession numbers listed in Table 1. These oligonucleotide primers may preferably contain a cleavable site, such as a recognition site for Type II and IIS restriction endonucleases, an exonuclease blocking site, or a chemically cleavable site, for reducing the length of the amplified product and increasing the mass spectral resolution.
Examples of some oligonucleotide primers that may be employed for amplifying these loci are listed in SEQ ID NO:1 through SEQ ID NO:103. Preferred oligonucleotide primers that also contain a cleavable phosphorothioate linkage and a biotin moiety for immobilization on an avidin or streptavidin solid support are sequences according to SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:83, SEQ ID NO:84, SEQ ID NO:85, SEQ ID NO:86, SEQ ID NO:87, SEQ ID NO:88, SEQ ID NO:89, SEQ ID NO:90, SEQ ID NO:91, SEQ ID NO:92, SEQ ID NO:93, SEQ ID NO:94, SEQ ID NO:95, SEQ ID NO:96, SEQ ID NO:97, SEQ ID NO:98, SEQ ID NO:99, SEQ ID NO:100 and SEQ ID NO:103. These newly designed primers generate nucleic acid extension products which are smaller than those used previously with electrophoresis separation methods. Additionally, these primers may be used in other methods of primer extension known to those of skill in the art.
It will be apparent to one skilled in the art that some variations of these primers will also serve effectively, for example, adding or deleting one or a few bases from the primer and/or shifting the position of the primer relative to the DTNR sequence by one or a few bases. Thus, primers encompassed by the present invention include the primers specifically listed as well as modifications of these primers. Although these sequences are all biotinylated at the 5′ end and contain a phosphorothioate linkage at a particular location, one of skill in the art would recognize that similar primers having biotin moieties and the cleavable groups at other sites would also be encompassed by the present invention. Primers containing types of immobilization attachment sites other than biotin, for example, would also be encompassed. Typically, the placement of the cleavable group is not critical as long as it is close enough to the 3′ end to cleave the nucleic acid extension product to a reduced-length amplified product that is amenable to mass spectral analysis. These primers in pairs may also be combined to generate overlapping PCR™ product sizes which are all distinguishable by mass. However, for embodiments multiplexing multiple DTNR loci with overlapping allelic mass ranges, strategic placement of the cleavable group may effect a separation or an interleaving of mass spectral peaks.
Another embodiment of this invention encompasses a kit for analyzing alleles of a DTNR locus in a target nucleic acid, having a first strand and a second complementary strand, by mass spectrometry which includes a first primer complementary to the flanking region of a DNA tandem nucleotide repeat region and a second primer complementary to the opposite flanking region of a DNA tandem nucleotide repeat region. Preferred kits of this invention are kits for analyzing the following DTNR loci: CSF1PO, D3S1358, D5S818, D7S820, D8S1179, D13S317, D16S539, D18S51, D21S11, DYS19, F13A1, FES/FPS, FGA, HPRTB, TH01, TPOX, DYS388, DYS391, DYS392, DYS393, D2S1391, D18S535, D2S1338, D19S433, D6S477, D1S518, D14S306, D22S684, F13B, CD4, D12S391, D10S220 and D7S523.
Another embodiment of this invention encompasses a kit for analyzing alleles of multiple DTNR loci in a target nucleic acid by mass spectrometry, which includes a plurality of primers complementary to the flanking regions of DNA tandem nucleotide repeat regions. Preferred kits of this invention are kits for analyzing the following DTNR loci: CSF1PO, D3S1358, D5S818, D7S820, D8S1179, D13S317, D16S539, D18S51, D21S11, DYS19, F13A1, FES/FPS, FGA, HPRTB, TH01, TPOX, DYS388, DYS391, DYS392, DYS393, D2S1391, D18S535, D2S1338, D19S433, D6S477, D1S518, D14S306, D22S684, F13B, CD4, D12S391, D10S220 and D7S523.
The primers employed with these kits may preferably have cleavable sites, such as a recognition site for a restriction endonuclease, an exonuclease blocking site, or a chemically cleavable site. Preferred chemically cleavable sites encompass modified bases, modified sugars (e.g., ribose), and chemically cleavable groups incorporated into the phosphate backbone, such as dialkoxysilane, 3′-(S)-phosphorothioate, 5′-(S)-phosphorothioate, 3′-(N)-phosphoroamidate, or 5′-(N)-phosphoroamidate linkages. Another preferred embodiment is a kit employing a first primer that is capable of attaching to a solid support.
For primer extension by PCR amplification, it is preferable to employ these primers in pairs. Preferred pairs of primers include sequences according to the following pairs: SEQ ID NO:1 and SEQ ID NO:2; SEQ ID NO:3 and SEQ ID NO:4; SEQ ID NO:5 and SEQ ID NO:6; SEQ ID NO:7 and SEQ ID NO:8; SEQ ID NO:9 and SEQ ID NO:10; SEQ ID NO:11 and SEQ ID NO:12; SEQ ID NO:13 and SEQ ID NO:14; SEQ ID NO:15 and SEQ ID NO:16; SEQ ID NO:17 and SEQ ID NO:18; SEQ ID NO:19 and SEQ ID NO:20; SEQ ID NO:21 and SEQ ID NO:22; SEQ ID NO:23 and SEQ ID NO:24; SEQ ID NO:25 and SEQ ID NO:26; SEQ ID NO:27 and SEQ ID NO:28; SEQ ID NO:29 and SEQ ID NO:30; SEQ ID NO:31 and SEQ ID NO:32; SEQ ID NO:49 and SEQ ID NO:83; SEQ ID NO:52 and SEQ ID NO:84; SEQ ID NO:54 and SEQ ID NO:85; SEQ ID NO:56 and SEQ ID NO:86; SEQ ID NO:58 and SEQ ID NO:87; SEQ ID NO:59 and SEQ ID NO:88; SEQ ID NO:62 and SEQ ID NO:89; SEQ ID NO:63 and SEQ ID NO:90; SEQ ID NO:66 and SEQ ID NO:91; SEQ ID NO:67 and SEQ ID NO:92; SEQ ID NO:70 and SEQ ID NO:93; SEQ ID NO:72 and SEQ ID NO:94; SEQ ID NO:74 and SEQ ID NO:95; SEQ ID NO:76 and SEQ ID NO:96; SEQ ID NO:78 and SEQ ID NO:97; SEQ ID NO:80 and SEQ ID NO:98; SEQ ID NO:66 and SEQ ID NO:99; SEQ ID NO:33 and SEQ ID NO:100; and SEQ ID NO:101 and SEQ ID NO:103.
In one embodiment, at least one of the primers used to prepare the nucleic acid extension product contains a surface binding moiety, such as a biotin moiety, at the 5′-end and a cleavable moiety, such as a phosphorothioate linkage (see FIGS. 7A and 7B), near the 3′-end for a capture and release assay, such as one using streptavidin-coated magnetic beads for binding biotinylated primers, as described in PCT Patent Application No. WO 96/37630, which is incorporated herein by reference. These linkages are often referred to as thiophosphate linkages as well. Incorporation of a method for obtaining single-stranded PCR™ products, such as is possible with the primer modifications described above, is preferred. Removal of one of the two strands halves the number of DNA oligomers that will be visualized by TOF-MS and improves the likelihood of resolving all PCR™ product strands.
Another embodiment of this invention encompasses a method for analyzing DNA tandem nucleotide repeat alleles at a DNA tandem nucleotide repeat locus in a target nucleic acid by mass spectrometry which includes the steps of a) obtaining a target nucleic acid containing a DNA tandem nucleotide repeat region; b) extending the target nucleic acid using one or more primers to obtain a limited size range of nucleic acid extension products, wherein the primers are complementary to a sequence flanking the DNA tandem nucleotide repeat of said locus; and c) determining the mass of the nucleic acid extension products by mass spectrometry, where the target nucleic acid is normally double-stranded (i.e., it has a first strand and a second complementary strand). Nucleic acid extension products may be generated in this method by any means known to those of skill in the art, and particularly either by amplification, such as PCR amplification, or by primer extension in conjunction with a chain termination reagent. Preferred primers may immediately flank the DNA tandem repeat locus, or may further extend up to one, two, three, four or five tandem repeats into the DNA tandem repeat region. As used in this context, “immediately adjacent” or “immediately flanking” means one, two, three, or four nucleotides away from the DNA tandem repeat region of the DNA tandem repeat locus. Preferred primers may contain a cleavable site, such as a recognition site for a restriction endonuclease, an exonuclease blocking site, or a chemically cleavable site, and be capable of attaching to a solid support.
These primers may be capable of directly or indirectly attaching to a solid support via covalent or noncovalent binding. The primers may contain an immobilization attachment site (IAS) for attachment to a solid support. This site is usually upstream of the chemically cleavable site. A suitable immobilization attachment site is any site capable of being attached to a group on a solid support. These sites may be a substituent on a base or sugar of the primer. An IAS may be, for example, an antigen, biotin, or digoxigenin. This attachment allows for isolation of only one strand of an amplified product. Such isolation of either single-stranded or double-stranded amplified target nucleic acids generally occurs prior to the application of the nucleic acids to the matrix solution, resulting in well-defined mass spectral peaks and enhanced mass accuracy. The matrix solution can be any of the known matrix solutions used for mass spectrometric analysis, including 3-hydroxypicolinic acid (“3-HPA”), nicotinic acid, picolinic acid, 2,5-dihydroxybenzoic acid, and nitrophenol.
For example, in one embodiment, a strand of a target nucleic acid extension product may be bound or attached to a solid support to permit rigorous washing and concomitant removal of salt adducts, unwanted oligonucleotides and enzymes. Either a double-stranded or a single-stranded nucleic acid extension product may be isolated for mass spectrometric analysis. The single-stranded target nucleic acid extension product analyzed by MS may be either the strand bound or not bound to the solid support.
When an unbound strand is used for MS analysis, it is typically purified by first washing the bound strand and its attached complement under conditions not sufficiently rigorous to disrupt the strand's attachment to its bound complement. After unwanted biomolecules and salts are removed, the complement may then be released under more rigorous conditions. In contrast, when the bound strand is to be analyzed, it is typically washed under more rigorous conditions such that the interactions between the bound strand and its unbound complement, if present, are disrupted. This allows the unbound strand to be washed away with the other salts and unwanted biomolecules. Cleavable linkers or cleavable primers may be used to release the bound strand from the solid support prior to MS analysis.
Preferred primers for practicing this method include primers designed to amplify DTNR loci selected from the group consisting of CSF1PO, D3S1358, D5S818, D7S820, D8S1179, D13S317, D16S539, D18S51, D21S11, DYS19, F13A1, FES/FPS, FGA, HPRTB, TH01, TPOX, DYS388, DYS391, DYS392, DYS393, D2S1391, D18S535, D2S1338, D19S433, D6S477, D1S518, D14S306, D22S684, F13B, CD4, D12S391, D10S220 and D7S523. Preferred pairs of primers designed to amplify these loci include sequences according to the following pairs: SEQ ID NO:1 and SEQ ID NO:2; SEQ ID NO:3 and SEQ ID NO:4; SEQ ID NO:5 and SEQ ID NO:6; SEQ ID NO:7 and SEQ ID NO:8; SEQ ID NO:9 and SEQ ID NO:10; SEQ ID NO:11 and SEQ ID NO:12; SEQ ID NO:13 and SEQ ID NO:14; SEQ ID NO:15 and SEQ ID NO:16; SEQ ID NO:17 and SEQ ID NO:18; SEQ ID NO:19 and SEQ ID NO:20; SEQ ID NO:21 and SEQ ID NO:22; SEQ ID NO:23 and SEQ ID NO:24; SEQ ID NO:25 and SEQ ID NO:26; SEQ ID NO:27 and SEQ ID NO:28; SEQ ID NO:29 and SEQ ID NO:30; SEQ ID NO:31 and SEQ ID NO:32; SEQ ID NO:49 and SEQ ID NO:83; SEQ ID NO:52 and SEQ ID NO:84; SEQ ID NO:54 and SEQ ID NO:85; SEQ ID NO:56 and SEQ ID NO:86; SEQ ID NO:58 and SEQ ID NO:87; SEQ ID NO:59 and SEQ ID NO:88; SEQ ID NO:62 and SEQ ID NO:89; SEQ ID NO:63 and SEQ ID NO:90; SEQ ID NO:66 and SEQ ID NO:91; SEQ ID NO:67 and SEQ ID NO:92; SEQ ID NO:70 and SEQ ID NO:93; SEQ ID NO:72 and SEQ ID NO:94; SEQ ID NO:74 and SEQ ID NO:95; SEQ ID NO:76 and SEQ ID NO:96; SEQ ID NO:78 and SEQ ID NO:97; SEQ ID NO:80 and SEQ ID NO:98; SEQ ID NO:66 and SEQ ID NO:99; SEQ ID NO:33 and SEQ ID NO:100; and SEQ ID NO:101 and SEQ ID NO:103.
The present invention also focuses on an improved method of multiplexing the analysis of nucleic acid extension products derived from DNA nucleotide repeat loci. This method differs from known methods of multiplexing DTNR analysis in that mass spectrometry is employed and the range of possible nucleic acid extension products for the multiplexed loci, the allele nucleic acid extension product size ranges, may be specifically chosen to overlap in the mass scale yet be uniquely resolved and detected.
Thus, this invention encompasses methods for analyzing more than one target nucleic acid in which the target nucleic acids are used to produce more than one nucleic acid extension product and where each nucleic acid extension product may comprise a different DTNR sequence. A preferred embodiment encompasses simultaneously determining the mass of more than one DNA tandem nucleotide repeat allele at more than one DNA tandem nucleotide repeat locus. According to this embodiment, several amplification products containing various DTNR sequences from different DTNR loci may be analyzed in the same solution and spectrum.
Additionally, the DNA tandem nucleotide repeat loci may have overlapping allelic mass ranges (see FIGS. 4 and 5). The term “overlapping allelic mass ranges” is defined to mean that the alleles that may be present for a particular DTNR locus have masses that overlap, or coincide, as observed by mass spectrometry with the masses for alleles from another DTNR locus. The methods of the present invention allow one to resolve these alleles by mass spectrometry either by increasing the mass separation of these peaks or by modifying the mass of the amplified products containing the various DTNR sequences such that the amplification products have interleaving mass spectral peaks (see FIG. 6).
This novel interleaved multiplexing approach overcomes the TOF-MS limitations for size partitioning and takes advantage of the high mass accuracy of the method within the high-resolution mass range below about 160 nucleotides in size. One specific embodiment encompasses a method that involves the design of a specific primer or primers that produce nucleic acid extension products for a first locus with defined allele mass values. The primer or primers for a second locus are then selected so that, while the mass range for their predicted nucleic acid extension products overlaps with the mass range for the products of the first locus, the specific predicted nucleic acid extension product mass values differ from those of the first locus and therefore can be uniquely resolved by TOF-MS. Further loci may be added to the multiplex using the same method such that three, four, five, six, seven, eight, nine, ten or more loci may be analyzed simultaneously.
The basic limits for this multiplexing are defined by the ability to resolve all possible nucleic acid extension products within a mixture. It is not inconceivable that as many as 10 different loci might be interleaved and uniquely resolved. In addition to multiplexing two or more DTNRs it is also possible to use this invention to interleave mixtures of DTNRs with specific nucleic acid extension products arising from nonrepeat loci, e.g., a DTNR locus with allelic nucleic acid extension products 72, 76, 80, 84 and 88 nucleotides in size could be simultaneously analyzed with a nucleic acid extension product 82 nucleotides in size.
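The size example above can be checked with a short sketch. It works in nucleotide lengths only, as the example does, and the function name `closest_products` and the product labels are hypothetical names introduced here for illustration; they are not part of the described method.

```python
def closest_products(sizes):
    """Return (gap, label_a, label_b) for the two products closest in size.

    sizes: dict mapping a product label to its length in nucleotides.
    Assumes at least two products are supplied.
    """
    ordered = sorted(sizes.items(), key=lambda item: item[1])
    best = None
    # After sorting, the closest pair must be adjacent in the ordering.
    for (name_a, len_a), (name_b, len_b) in zip(ordered, ordered[1:]):
        gap = len_b - len_a
        if best is None or gap < best[0]:
            best = (gap, name_a, name_b)
    return best


# Allelic products of the DTNR locus from the example in the text.
mixture = {f"DTNR allele {n} nt": n for n in (72, 76, 80, 84, 88)}
# Product from a nonrepeat locus, interleaved between the 80 and 84 nt alleles.
mixture["nonrepeat product 82 nt"] = 82

gap, first, second = closest_products(mixture)
print(f"closest pair: {first} / {second}, {gap} nt apart")
```

A gap of zero would indicate two products of identical length, which would then need to be separated by one of the mass adjustments described below.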
The ability to interleave loci requires that the nucleic acid extension product mass values for all possible alleles preferably be known. These allele mass values may be determined empirically or, more likely, by calculation using the known loci sequences. In many cases it may be necessary to “fine tune” the allele mass values for one or more loci in a multiplexed mixture in order to prevent unresolvable overlap between two nucleic acid extension products. For example, allele 5 for Locus A may differ in mass from allele 9 for Locus B by only 5 Da, preventing resolution of those two nucleic acid extension products by mass spectrometry. Mass modifications to one or both loci may be used to increase this mass difference to 100 Da.
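The fine-tuning step can be illustrated with a minimal sketch. All mass values below are invented for illustration, and the 50 Da resolution threshold, the function names `unresolved_pairs` and `shift_locus`, and the uniform +95 Da shift are assumptions made here, not values taken from the specification. Only the 5 Da starting difference and the 100 Da target come from the example in the text.

```python
def unresolved_pairs(locus_a, locus_b, min_delta_da):
    """List (allele_a, allele_b, difference) for every pair of alleles whose
    masses differ by less than min_delta_da."""
    return [
        (allele_a, allele_b, abs(mass_a - mass_b))
        for allele_a, mass_a in locus_a.items()
        for allele_b, mass_b in locus_b.items()
        if abs(mass_a - mass_b) < min_delta_da
    ]


def shift_locus(locus, delta_da):
    """Model a mass modification that adds delta_da to every allele of a locus."""
    return {allele: mass + delta_da for allele, mass in locus.items()}


# Hypothetical allele masses in Da (allele number -> product mass).
locus_a = {4: 18750.0, 5: 20000.0, 6: 21250.0}
locus_b = {8: 18755.0, 9: 20005.0, 10: 21255.0}

MIN_DELTA_DA = 50.0  # assumed resolution threshold, for illustration only

before = unresolved_pairs(locus_a, locus_b, MIN_DELTA_DA)

# Add 95 Da to every Locus B product, e.g. via a modified primer.
locus_b_shifted = shift_locus(locus_b, 95.0)
after = unresolved_pairs(locus_a, locus_b_shifted, MIN_DELTA_DA)

print("unresolved before:", before)
print("unresolved after:", after)
```

In this sketch allele 5 of Locus A and allele 9 of Locus B start 5 Da apart and end 100 Da apart. A shift that resolves one pair can bring another pair together, so the check is repeated over all allele pairs after every modification.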
Adjusting the allele mass values for any given locus may be done by any number of methods including: increasing or decreasing the size of the nucleic acid extension products via altered sequences and placement of the primers; addition of nonhybridizing nucleotides to the 5′ ends of one or more primers; addition of nonnucleotide chemical modifications internally or to the ends of one or both primers; alterations in base composition within one or both primers, including the use of nonstandard nucleotides, that may or may not result in mismatches within the primers; incorporation of and specific placement of a chemically cleavable moiety within the primer backbone to reduce the length of the nucleic acid extension product by a selected amount; enzymatic cleavage of the nucleic acid extension products using a restriction endonuclease that recognizes a restriction site within one or both primers or within the nucleic acid extension product itself; use of a 5′ to 3′ exonuclease in concert with exonuclease blocking modified nucleotides contained within one or more primers; incorporation of nonstandard deoxyribonucleotides or chemically or isotopically modified nucleotides during polymerization; any number of methods of mass modifying by addition of chemical moieties post amplification; use of different chain termination reagents in conjunction with primer extension; or any number of other means that anyone skilled in the art would be able to identify.
Another embodiment encompasses a method of multiplexing amplification products containing DTNRs having overlapping allelic ranges where at least one amplification product contains a mass modified nucleotide. Mass modified nucleotides include nucleotides to which nonnucleotide moieties have been chemically attached; bases having altered compositions; nonstandard nucleotides, that may or may not result in mismatches within the primers; and any bases whose masses have been modified through the addition of chemical moieties after the amplification step.
Alternatively, the length of at least one extension product may be reduced by cleaving the extension product at a cleavable site such as a restriction endonuclease cleavage site, an exonuclease blocking site, or a chemically cleavable site. Preferred chemically cleavable sites for multiplexing include modified bases, modified sugars (e.g., ribose), or a chemically cleavable group incorporated into the phosphate backbone, such as a dialkoxysilane, 3′-(S)-phosphorothioate, 5′-(S)-phosphorothioate, 3′-(N)-phosphoroamidate, or 5′-(N)-phosphoroamidate. Preferred primers may also be capable of attaching to a solid support.
Another embodiment of this invention encompasses a method for multiplexing the detection of more than one amplified DNA tandem nucleotide repeat marker from more than one DNA tandem nucleotide repeat locus, including: determining the mass of more than one nucleic acid extension product by mass spectrometry, where the DNA tandem nucleotide repeat loci each comprise a DNA tandem repeat sequence and a flanking sequence and have overlapping allelic mass ranges. Typically, at least one of the target nucleic acid extension products may contain a mass modifying group.
“Mass modifying groups” may comprise any group that alters the mass of the amplified products to produce interleaving or otherwise resolvable mass spectral peaks. These groups, which may be incorporated during or after primer extension, may be mass modified nucleotides, nonstandard deoxyribonucleotides, or even cleavable sites, as cleaving such a site modifies the mass by reducing the length of the extension product. As used in this context, modified or nonstandard bases are generally understood to include bases not found in the DTNR locus flanking the DTNR sequence of the sample or target nucleic acid.