The Human Genome Project has enabled broader assessment of disease risk, diagnosis, and prognosis, as well as prediction of responses to medication. Nucleotide sequence analysis of a plurality of individuals reveals polymorphic sites, which are referred to as SNPs (single nucleotide polymorphisms). An SNP is a single-nucleotide variation that occurs above a specific frequency in the chromosomal nucleotide sequence of an organism. In humans, SNPs occur about once every 1,000 bases. Given the size of the human genome, millions of SNPs exist in it. Because SNPs are regarded as a means of explaining characteristic differences between individuals, they can be used in the prevention or treatment of disease by helping to identify its cause.
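The estimate of "millions of SNPs" follows from simple arithmetic. The sketch below assumes a genome size of roughly 3 billion bases, a commonly cited approximate figure that is not stated in the text; the function and constant names are illustrative only.

```python
# Rough estimate of the number of SNP sites in the human genome.
GENOME_SIZE_BP = 3_000_000_000  # assumption: ~3 billion bases, not given in the text
BASES_PER_SNP = 1_000           # from the text: about one SNP every 1,000 bases


def estimate_snp_count(genome_size_bp: int, bases_per_snp: int) -> int:
    """Return the approximate number of SNP sites for the given spacing."""
    return genome_size_bp // bases_per_snp


if __name__ == "__main__":
    print(estimate_snp_count(GENOME_SIZE_BP, BASES_PER_SNP))
```

Under these assumptions the estimate is on the order of three million sites, consistent with "millions of SNPs".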
SNPs discovered by the Human Genome Project show only that the polymorphisms exist in humans; they do not show how those polymorphisms are related to disease. To reveal the relationship between SNPs and diseases, a comparative analysis of the polymorphism patterns found in healthy people and in patients, known as SNP scoring, is required. For precise examination of the relationship between SNPs and disease, a large number of SNPs must be analyzed without error.
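As a minimal sketch of what comparing polymorphism patterns between groups can look like, the following computes allele frequencies at a single SNP site for two groups. The genotype lists are made up for illustration, and `allele_frequencies` is a hypothetical helper, not part of any cited method.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency} for diploid genotypes such as 'AG'."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}


# Hypothetical genotypes at one SNP site (illustrative data only).
healthy = ["AA", "AG", "AA", "AA", "AG"]
patients = ["AG", "GG", "AG", "GG", "AA"]

if __name__ == "__main__":
    print("healthy :", allele_frequencies(healthy))
    print("patients:", allele_frequencies(patients))
```

A marked difference in allele frequency between the two groups would suggest, but not prove, an association; real studies need many more samples and an appropriate statistical test.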
SNP scoring methods include DNA sequencing, PCR-SSCP (polymerase chain reaction-single-strand conformation polymorphism), allele-specific hybridization, oligo-ligation, mini-sequencing, and the enzyme cleavage method. A method using a DNA chip has also been introduced, but in principle it does not differ from allele-specific hybridization, except that the oligonucleotide probes are attached to a support.
The two classical methods of DNA sequencing are the Maxam-Gilbert chemical procedure and the Sanger method, the latter being the one used more recently. DNA sequencing determines the nucleotide sequence of all or part of a gene rather than examining genetic variations at specific sites. Because variations at specific sites can be identified from the nucleotide sequence, DNA sequencing can be used for SNP scoring. However, it is inefficient for this purpose because adjacent nucleotide sequences that do not need to be examined are read along with the target SNP.
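The inefficiency can be illustrated with a small sketch: to find one variant site by sequencing, every position of the read is compared with a reference, although only one position is informative. The sequences and the helper `find_variants` are hypothetical and chosen only for illustration.

```python
def find_variants(reference: str, sample: str):
    """Return (position, reference_base, sample_base) for each mismatch.

    Assumes both sequences are already aligned and of equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (i, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Hypothetical aligned sequences differing at a single site.
reference = "ATGCGTACCTGA"
sample = "ATGCGTGCCTGA"

if __name__ == "__main__":
    variants = find_variants(reference, sample)
    print(variants)
    print(f"{len(reference)} bases read, {len(variants)} informative")
```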
In PCR-SSCP (Orita, M. et al., Genomics, 1989, 5:874-879), sequences including the SNPs to be analyzed are amplified by PCR and then separated into single strands. Thereafter, electrophoresis is performed on a polyacrylamide gel. Because the secondary structure of a DNA strand changes with its sequence, sequence variations are detected from differences in electrophoretic mobility that result from the differences in structure.
In allele-specific hybridization, variations are examined by hybridizing radioisotope-labeled DNAs to probes attached to a nylon filter while regulating hybridization conditions such as temperature.
In oligo-ligation (Nucleic Acids Research 24, 3728, 1996), sequence variations are examined by performing a ligation reaction under conditions in which ligation does not occur if the target DNA is not complementary to the template DNA, and then confirming whether ligation has occurred.
Mini-sequencing (Genome Research 7:606, 1997) was developed for SNP scoring. In this method, DNA polymerization is performed under conditions in which only the single base of interest can be polymerized, and the identity of the polymerized base is then determined.
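The principle can be sketched as follows, assuming a primer that anneals immediately adjacent to the SNP site so that the single base added to its 3' end is the complement of the template base at that site. The sequences and function names are hypothetical; this models only the base-pairing logic, not the chemistry.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def extended_base(template: str, primer: str) -> str:
    """Return the single base added to the 3' end of the primer.

    Both sequences are written 5'->3'. The primer pairs with the template
    region equal to its reverse complement, and the base added is the
    complement of the template base just 5' of that region.
    """
    site = reverse_complement(primer)
    pos = template.find(site)
    if pos <= 0:
        raise ValueError("primer does not anneal next to a readable base")
    return COMPLEMENT[template[pos - 1]]


# Hypothetical templates that differ only at the SNP site (index 4).
flank = "TTGACCGTAAGC"
template_g = "CCTA" + "G" + flank
template_a = "CCTA" + "A" + flank
primer = reverse_complement(flank)

if __name__ == "__main__":
    print(extended_base(template_g, primer))
    print(extended_base(template_a, primer))
```

Note that this model reports a base wherever the primer anneals; it has no way of telling whether the primer bound the intended site.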
PCR-SSCP, allele-specific hybridization, and oligo-ligation are inefficient for analyzing many samples because they use polyacrylamide gels. In addition, these methods cannot identify errors that result from probes binding to undesired sites.
Although mini-sequencing is simple and effective for analyzing many samples, incorrect results caused by mismatching errors still cannot be identified, and base deletions and insertions cannot be detected by this method.
The enzyme cleavage method was also developed for SNP scoring (WO 01/90419). In this method, the sequences to be analyzed are amplified by an appropriate method such as PCR. The amplified products include sequences that can be cleaved or recognized by two restriction enzymes. Sequence variations are examined by cleaving the amplified products with the two restriction enzymes and measuring the molecular weight of the cleaved fragments. The method is simple and rapid because the molecular weight of the fragments obtained from the restriction enzyme reaction is measured by mass spectrometry immediately after the genes are amplified by PCR. However, the enzyme cleavage method described in WO 01/90419 cannot identify incorrect analyses caused by errors. An incorrect analysis may result when primers bind to undesired sites during PCR, but this goes undetected. For example, a primer used to examine polymorphisms of CYP2C9 may bind to CYP2C8. In this case, it is difficult to discover that an error has occurred, because it cannot be determined whether the primer has bound to CYP2C8 rather than CYP2C9. Furthermore, this method can detect a single-base substitution but cannot detect deletions or insertions, and substitutions of two or more adjacent bases cannot be detected at the same time.
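The idea of distinguishing alleles by fragment mass can be sketched as follows. This is an illustrative approximation, not the procedure of WO 01/90419: it assumes single-stranded fragments with unphosphorylated ends, uses commonly quoted approximate average residue masses, and the fragment sequences, tolerance, and function names are hypothetical.

```python
# Approximate average masses (Da) of nucleotide residues within a DNA strand.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
# Approximate correction for the strand termini (assumes no 5' phosphate).
TERMINAL_CORRECTION = -61.96


def fragment_mass(seq: str) -> float:
    """Return the approximate average mass of a single-stranded fragment."""
    return sum(RESIDUE_MASS[base] for base in seq) + TERMINAL_CORRECTION


def call_allele(observed_mass: float, candidates: dict, tolerance: float = 2.0):
    """Return the one candidate whose expected mass matches, else None."""
    matches = [
        name
        for name, seq in candidates.items()
        if abs(fragment_mass(seq) - observed_mass) <= tolerance
    ]
    return matches[0] if len(matches) == 1 else None


# Hypothetical cleaved fragments that differ by one base (A versus G).
candidates = {"A": "TCAGGACT", "G": "TCGGGACT"}

if __name__ == "__main__":
    for name, seq in candidates.items():
        print(name, round(fragment_mass(seq), 2))
    print(call_allele(2409.6, candidates))
```

In this sketch a single A-to-G substitution shifts the fragment mass by about 16 Da, which is why a one-base substitution is distinguishable by mass.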