The following is offered as background information only and is not intended nor admitted to be prior art to the present invention.
The ability to detect DNA sequence variances in an organism's genome has become an important tool in the diagnosis of diseases and disorders and in the prediction of response to therapeutic regimens. It is becoming increasingly possible, using early variance detection, to diagnose and treat, or even prevent, a disease or disorder before it has physically manifested itself. Furthermore, variance detection can be a valuable research tool in that it may lead to the discovery of genetic bases for disorders whose causes were hitherto unknown or thought to be other than genetic.
The most common type of sequence variance is the single nucleotide polymorphism or SNP. As the name suggests, a SNP involves the substitution of one nucleotide for another at a particular locus in a gene. While each SNP involves a single nucleotide, a single gene may contain numerous SNPs.
It is estimated that SNPs occur in human DNA at a frequency of about 1 in 100 nucleotides when 50 to 100 individuals are compared. Nickerson, D. A., Nature Genetics, 1998, 223-240. Applied across the roughly 3 billion nucleotides of the human genome, this translates to as many as 30 million SNPs. However, very few SNPs have any effect on the physical well-being of humans. Detecting the 30 million SNPs and then determining which of them are relevant to human health will clearly be a formidable task.
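The estimate of 30 million SNPs can be reproduced with a short calculation. The following is a minimal sketch only; the genome length of approximately 3 billion nucleotides is an assumption supplied here for illustration and is not a figure taken from the cited reference, and all variable names are hypothetical.

```python
# Assumption: approximate length of the human genome in nucleotides.
GENOME_LENGTH_NT = 3_000_000_000

# From the text: about 1 SNP per 100 nucleotides when 50 to 100
# individuals are compared.
NUCLEOTIDES_PER_SNP = 100

# Integer division keeps the estimate exact for these round figures.
estimated_snps = GENOME_LENGTH_NT // NUCLEOTIDES_PER_SNP

print(f"Estimated SNPs in the human genome: {estimated_snps:,}")
```

Under these assumptions the calculation yields 30,000,000, in agreement with the upper figure given above.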
Complete DNA sequencing is presently the definitive procedure for detecting SNPs. However, current DNA sequencing technology is costly, time-consuming and, in order to assure accuracy, highly redundant. Most sequencing projects require 5- to 10-fold coverage of each nucleotide to achieve an acceptable error rate of 1 in 2,000 to 1 in 10,000 bases. In addition, DNA sequencing is an inefficient way to detect SNPs. While on average a SNP occurs once in about every 100 nucleotides, when variance between only two copies of a gene is being examined, for example the copies carried on the two chromosomes of a pair, a SNP may occur as infrequently as once in 1,000 or more bases. Thus, only a small segment of the gene in the vicinity of the SNP locus is really of interest. If full sequencing is employed, a tremendous number of nucleotides will have to be sequenced before any useful information is obtained. For example, to compare ten versions of a 3,000-nucleotide DNA sequence for the purpose of detecting four variances among them, even if only 2-fold redundancy is employed (each strand of the double-stranded 3,000-nucleotide DNA segment from each individual is sequenced once), 60,000 nucleotides would have to be sequenced (10×3,000×2). Furthermore, sequencing problems are often encountered that require additional runs, often with new primers. Thus, as many as 100,000 nucleotides might have to be sequenced to detect four variances.
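The sequencing burden in the foregoing example can be expressed as a simple calculation. The following is a minimal sketch using only the figures stated above; the function name and its parameters are hypothetical and are introduced solely for illustration.

```python
def nucleotides_to_sequence(num_versions: int, segment_length: int, redundancy: int) -> int:
    """Total nucleotides read when each version of a segment is
    sequenced `redundancy` times (hypothetical helper)."""
    return num_versions * segment_length * redundancy


# Figures from the example: ten versions of a 3,000-nucleotide segment,
# 2-fold redundancy (each strand sequenced once), four variances sought.
baseline_total = nucleotides_to_sequence(num_versions=10, segment_length=3_000, redundancy=2)
variances_sought = 4

# Upper figure from the text once repeat runs and new primers are included.
total_with_reruns = 100_000

print(f"Baseline nucleotides sequenced: {baseline_total:,}")
print(f"Nucleotides per variance (baseline): {baseline_total // variances_sought:,}")
print(f"Nucleotides per variance (with reruns): {total_with_reruns // variances_sought:,}")
```

On these figures, between 15,000 and 25,000 nucleotides are sequenced for each variance detected, which illustrates why full sequencing is an inefficient means of locating SNPs.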
Determination of whether a particular gene of a species or of an individual of that species contains a SNP is called genotyping. Complete sequencing is, therefore, one method of genotyping but, as indicated above, it is slow, costly and extremely inefficient.
What is needed is a rapid, inexpensive, efficient, yet accurate, method for genotyping. The present invention provides such a method.