The present invention relates generally to organic chemistry, analytical chemistry, biochemistry, molecular biology, genetics, diagnostics and medicine. In particular, it relates to methods for genotyping diploid cells or organisms by comparing the masses of fragments of alleles that include the polymorphic locus.
The following is offered as background information only and is not intended nor admitted to be prior art to the present invention.
The ability to detect DNA sequence variances in an organism's genome has become an important tool in the diagnosis of diseases and disorders and in the prediction of response to therapeutic regimes. It is becoming increasingly possible, using early variance detection, to diagnose and treat, even prevent the occurrence of, a disease or disorder before it has physically manifested itself. Furthermore, variance detection can be a valuable research tool in that it may lead to the discovery of genetic bases for disorders the causes of which were hitherto unknown or thought to be other than genetic.
The most common type of sequence variance is the single nucleotide polymorphism or SNP. As the name suggests, a SNP involves the substitution of one nucleotide for another at a particular locus in a gene. While each SNP involves but one nucleotide, a single gene may contain numerous SNPs.
It is estimated that SNPs occur in human DNA at a frequency of about 1 in 100 nucleotides when 50 to 100 individuals are compared. Nickerson, D. A., Nature Genetics, 1998, 223-240. This translates to as many as 30 million SNPs in the human genome. However, very few SNPs have any effect on the physical well-being of humans. Detecting the 30 million SNPs and then determining which of them are relevant to human health will clearly be a formidable task.
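The 30 million figure follows from the stated frequency once a genome size is assumed. The short sketch below reproduces that arithmetic; the haploid genome size of roughly 3 billion nucleotides is an assumption supplied here for illustration and is not stated in the text above.

```python
# Assumption (not from the text): the haploid human genome is ~3 billion nucleotides.
GENOME_SIZE_NT = 3_000_000_000

# From the text: about 1 SNP per 100 nucleotides when 50 to 100 individuals are compared.
NUCLEOTIDES_PER_SNP = 100

estimated_snps = GENOME_SIZE_NT // NUCLEOTIDES_PER_SNP
print(f"{estimated_snps:,} SNPs")  # 30,000,000 SNPs
```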
Complete DNA sequencing is presently the definitive procedure for detecting sequence differences. However, current DNA sequencing technology is costly, time consuming and, in order to assure accuracy, highly redundant. Most sequencing projects require a 5- to 10-fold coverage of each nucleotide to achieve an acceptable error rate of 1 in 2,000 to 1 in 10,000 bases. In addition, DNA sequencing is an inefficient way to detect SNPs. While on the average a SNP occurs once in about every 100 nucleotides, when variance between two copies of a gene, for example those associated with two chromosomes, is being compared, a SNP may occur as infrequently as once in 1,000 or more bases. Thus, only a small segment of the gene in the vicinity of the SNP locus is really of interest. If full sequencing is employed, a tremendous number of nucleotides will have to be sequenced before any useful information is obtained. For example, to compare ten versions of a 3,000 nucleotide DNA sequence for the purpose of detecting four variances among them, even if only 2-fold redundancy is employed (each strand of the double-stranded 3,000 nucleotide DNA segment from each individual is sequenced once), 60,000 nucleotides would have to be sequenced (10 × 3,000 × 2). Furthermore, sequencing problems are often encountered that can require even more runs, often with new primers. Thus, as many as 100,000 nucleotides might have to be sequenced to detect four variances.
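The sequencing workload in the example above can be checked with a few lines of arithmetic. The function and variable names below are illustrative only; the numbers are those given in the text.

```python
def nucleotides_to_sequence(versions: int, segment_length: int, redundancy: int) -> int:
    """Total nucleotides read when each version of a segment is sequenced `redundancy` times."""
    return versions * segment_length * redundancy


# Ten versions of a 3,000 nucleotide segment, each strand sequenced once (2-fold redundancy).
total = nucleotides_to_sequence(versions=10, segment_length=3_000, redundancy=2)
variances_detected = 4

print(total)                        # 60000
print(total // variances_detected)  # 15000 nucleotides sequenced per variance detected
```

At 60,000 nucleotides for four variances, roughly 15,000 nucleotides are read for each variance found, before any repeat runs are counted.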
Determination of whether a particular gene of a species or of an individual of that species contains a SNP is called genotyping. Complete sequencing is, therefore, a method for accomplishing genotyping but, as is indicated above, it is slow, costly and extremely inefficient.
An alternative to complete sequencing is to compare the masses of fragments of two alleles of a gene known or suspected to contain a SNP. A mass difference between any of the fragments indicates that the alleles contained different nucleotides in the divergent fragments which, in turn, reveals that the gene is heterozygous. Generally, the procedure involves amplifying a segment of each allele using a modified nucleotide corresponding to one of the natural nucleotides involved in the polymorphism, which modified nucleotide imparts enhanced susceptibility to cleavage at its sites of incorporation. The modified nucleotide is incorporated into the amplicon in at least a portion of the points of occurrence of the natural nucleotide. The modified segments are then cleaved at the sites of incorporation of the modified nucleotide to give two sets of fragments, which are compared as indicated above. While providing a vast improvement in terms of speed, efficiency and cost over complete sequencing, this procedure is not free of potential shortcomings.
For example, large differences in assay signals between thermocyclers can limit the robustness of the procedure. A relatively high occurrence (as much as 25%) of allele-specific reactions in which only one diagnostic product is produced from a heterozygous mixture may confound the result. An amplification bias for the allele that has the site of incorporation of the modified nucleotide farthest from the extending primer terminus (called skewing) may occur. In addition, automated calling of genotype may be affected by the exponential decrease in mass spectrometric signal with linear increases in fragment size. Heterozygotes that give fragments differing in size by 5-10 nucleotides can produce peaks of very unequal intensity that are difficult for automated devices to recognize.
What is needed, then, is a method that retains the rapid, inexpensive, efficient, yet accurate characteristics of the mass comparison technique but which eliminates the above potential shortcomings. The present invention provides such a method.
Thus in one aspect the present invention relates to a method for genotyping a diploid organism. The method comprises taking two alleles of a target gene of a diploid organism suspected to contain a polymorphism and obtaining a segment of each that contains the suspected polymorphic locus. A natural nucleotide is replaced at greater than 90% of its points of occurrence in the two segments with a modified nucleotide to give two modified segments. In doing so, the natural nucleotide that is replaced is not a nucleotide involved in the polymorphism. Furthermore, replacing the natural nucleotide with a modified nucleotide comprises amplification using a primer that hybridizes to each segment such that, after amplification, a first modified nucleotide is incorporated between the end of the primer and the polymorphic locus and a second modified nucleotide is located from 5 to 20 nucleotides downstream from the first modified nucleotide. The modified segments are then cleaved at greater than 90% of the points of occurrence of the modified nucleotide to give two sets of fragments each of which includes a 5-20 nucleotide fragment. The masses of the 5-20 nucleotide fragments from the two modified segments are then compared to detect the presence or absence of the polymorphism.
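The cleave-and-compare step described above can be illustrated with a minimal sketch. It rests on several simplifying assumptions that are not part of the text: each segment is treated as a plain string of bases, cleavage is taken to be complete, the modified nucleotide itself is dropped from the resulting fragments, and fragment masses are approximated by summing average in-chain residue masses without terminal groups or the mass of the modification. The sequences and function names are hypothetical.

```python
# Approximate average masses (Da) of in-chain DNA nucleotide residues.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def cleave(segment: str, modified_base: str) -> list[str]:
    """Split a segment at every occurrence of the modified base.

    Simplification: cleavage is complete and the modified base is discarded.
    """
    return [frag for frag in segment.split(modified_base) if frag]


def diagnostic_fragments(segment: str, modified_base: str,
                         min_len: int = 5, max_len: int = 20) -> list[str]:
    """Keep only fragments whose length falls in the 5-20 nucleotide window."""
    return [f for f in cleave(segment, modified_base) if min_len <= len(f) <= max_len]


def fragment_mass(fragment: str) -> float:
    """Approximate mass of a fragment as the sum of its residue masses."""
    return round(sum(RESIDUE_MASS[base] for base in fragment), 2)


# Hypothetical segments downstream of the primer; they differ by one C/T substitution.
# "A" is the replaced nucleotide and is not involved in the polymorphism.
allele_1 = "GCATCGGCTGTACC"
allele_2 = "GCATCGGTTGTACC"

frags_1 = diagnostic_fragments(allele_1, "A")
frags_2 = diagnostic_fragments(allele_2, "A")
masses_1 = [fragment_mass(f) for f in frags_1]
masses_2 = [fragment_mass(f) for f in frags_2]

print(frags_1, masses_1)
print(frags_2, masses_2)
print("polymorphism detected:", masses_1 != masses_2)
```

In this toy case each allele yields a single 8 nucleotide fragment spanning the polymorphic locus, and the two fragments differ in mass by the difference between a T residue and a C residue, about 15 Da.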
In an aspect of this invention, the second modified nucleotide is from 7 to 20 nucleotides downstream of the first modified nucleotide and it is these 7-20 nucleotide fragments that are compared to detect the presence or absence of the polymorphism.
In an aspect of this invention, the second modified nucleotide is from 7 to 12 nucleotides downstream of the first modified nucleotide and it is these 7-12 nucleotide fragments that are compared to detect the presence or absence of the polymorphism.
In an aspect of this invention, if there would be less than 5 nucleotides between the first and second modified nucleotides, the method further comprises using a primer that contains a point mutation that removes the site of incorporation of one of the modified nucleotides.
In an aspect of this invention, if there would be less than 7 nucleotides between the first and the second modified nucleotides, the method further comprises using a primer that contains a point mutation that removes the site of incorporation of one of the modified nucleotides.
In an aspect of this invention, if there would be more than 20 nucleotides between the first and second modified nucleotides, the method further comprises using a primer that contains a point mutation that incorporates a modified nucleotide downstream of the first modified nucleotide or upstream of the second modified nucleotide.
In an aspect of this invention, if there would be more than 12 nucleotides between the first and the second modified nucleotides, the method further comprises using a primer that contains a point mutation that incorporates a modified nucleotide downstream of the first modified nucleotide or upstream of the second modified nucleotide.
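The primer choices in the preceding aspects amount to a simple decision on the spacing between the first and second modified nucleotides. The sketch below is one way to express that decision; the function name and the returned labels are hypothetical, and the window limits are passed as parameters so that the 5-20, 7-20 and 7-12 windows can all be represented.

```python
def primer_strategy(spacing: int, min_len: int = 5, max_len: int = 20) -> str:
    """Choose a primer type from the spacing between the two modified nucleotides.

    spacing: number of nucleotides that would lie between the first and second
             modified nucleotides with an unmodified primer.
    """
    if spacing < min_len:
        # Too short: a point mutation in the primer removes one incorporation site.
        return "primer_removing_site"
    if spacing > max_len:
        # Too long: a point mutation in the primer incorporates an additional
        # modified nucleotide between the first and second modified nucleotides.
        return "primer_adding_site"
    return "unmodified_primer"


print(primer_strategy(3))                          # primer_removing_site
print(primer_strategy(25))                         # primer_adding_site
print(primer_strategy(10))                         # unmodified_primer
print(primer_strategy(15, min_len=7, max_len=12))  # primer_adding_site
```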
An aspect of this invention is a method for genotyping a diploid organism, comprising, first, providing two alleles of a target gene of a diploid organism suspected to contain a polymorphism and then obtaining a segment from each allele wherein the segment contains the suspected polymorphic locus. A natural nucleotide in the segment is then replaced at greater than 90% of its points of occurrence in the two segments with a modified nucleotide to give a first and a second modified segment. In this aspect of the invention, the natural nucleotide that is replaced is a nucleotide involved in the polymorphism. Replacing the natural nucleotide with a modified nucleotide comprises amplification using a primer that hybridizes to each segment such that, after amplification, the suspected polymorphic locus is the first site of incorporation of a modified nucleotide after the end of the primer. Furthermore, a second modified nucleotide must be located from 5 to 20 nucleotides downstream of the first modified nucleotide. The first and second modified segments are cleaved at greater than 90% of the points of occurrence of the modified nucleotide to give a first and second set of fragments. Finally, the masses of the two sets of fragments are compared for the presence of the 5-20 nucleotide fragment wherein, if the fragment is present or absent in both sets, the gene is homozygous and if the fragment is present in only one set, the gene is heterozygous.
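The final comparison in this aspect reduces to a presence/absence rule, which can be sketched as follows. The mass lists, the expected fragment mass and the matching tolerance are all hypothetical values chosen for illustration; how a peak is matched to a fragment in practice is not specified in the text.

```python
def fragment_present(observed_masses: list[float], expected_mass: float,
                     tolerance: float = 1.0) -> bool:
    """True if any observed mass lies within `tolerance` Da of the expected mass."""
    return any(abs(m - expected_mass) <= tolerance for m in observed_masses)


def call_genotype(first_set: list[float], second_set: list[float],
                  expected_mass: float, tolerance: float = 1.0) -> str:
    """Apply the rule: present or absent in both sets -> homozygous; in only one -> heterozygous."""
    in_first = fragment_present(first_set, expected_mass, tolerance)
    in_second = fragment_present(second_set, expected_mass, tolerance)
    return "homozygous" if in_first == in_second else "heterozygous"


# Hypothetical fragment masses (Da) for the two modified segments.
expected = 2478.6
print(call_genotype([1200.4, 2478.5], [1200.4, 2478.7], expected))  # homozygous
print(call_genotype([1200.4, 2478.5], [1200.4, 3100.2], expected))  # heterozygous
print(call_genotype([1200.4, 3100.2], [1200.4, 3100.2], expected))  # homozygous
```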
In an aspect of this invention, a nucleotide known to be involved in the polymorphism is replaced with a mass-modified nucleotide.
In an aspect of this invention, comparing the masses of the fragments comprises using a mass spectrometer.
In an aspect of this invention, the mass spectrometer is a MALDI mass spectrometer.
In an aspect of this invention, the MALDI mass spectrometer is a MALDI-TOF mass spectrometer.
In an aspect of this invention, the mass spectrometer is an ESI mass spectrometer.
In an aspect of this invention, the percentage replacement of a natural nucleotide with a modified nucleotide, the percentage cleavage at a modified nucleotide, or both the percentage replacement and the percentage cleavage, is greater than 95%.
In an aspect of this invention, the percentage replacement of a natural nucleotide with a modified nucleotide, the percentage cleavage at a modified nucleotide, or both the percentage replacement and the percentage cleavage, is greater than 99%.