The present invention relates to a method of DNA or RNA base sequence determination, and in particular to a method of base sequence determination for long DNA or RNA more than 500 bases in length.
According to the conventional method, DNA base sequence determination is carried out in the following order: preparation of fragmented DNA samples, separation by gel electrophoresis, detection of the separated DNA band pattern, and sequence determination. This method is described below with reference to the dideoxy chain termination method. The target DNA fragment is inserted into the M13 phage, which is used to transform sensitive bacteria, and the transformed bacteria are cultured to increase the number of copies, from which single-stranded DNA is prepared. For DNA sequencing, an oligonucleotide is hybridized to a specific position on the single-stranded DNA, and the complementary DNA strand is then synthesized by a DNA polymerase reaction. The reaction mixture contains a deoxyribonucleotide labeled with a radioisotope, in addition to the four types of deoxyribonucleotides serving as substrates. Dideoxyribonucleotides are further added to produce short DNA fragments terminated at a specific base species. Using the dideoxyribonucleotides, this operation is performed to obtain four families of DNA fragments whose terminal bases are A, G, C and T, respectively. These DNA fragments are separated by gel electrophoresis, and the separated DNA band pattern is then detected, thereby determining the DNA base sequence.
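The way the band pattern yields the sequence can be illustrated with a minimal Python sketch. It is an idealized model and not part of the described method: it assumes that exactly one terminated fragment exists for every position of the synthesized strand and that electrophoresis orders the fragments perfectly by length. The function names (reverse_complement, chain_termination_lanes, read_ladder) are hypothetical.

```python
# Idealized model of reading a dideoxy chain termination ladder.
# Assumption: every position of the synthesized strand yields one fragment,
# and the gel separates fragments perfectly by length.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(template: str) -> str:
    """Return the strand synthesized by the polymerase, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(template))


def chain_termination_lanes(synthesized: str) -> dict:
    """For each terminal base, list the lengths of the terminated fragments.

    Each of the four lanes corresponds to one dideoxy reaction (A, G, C or T).
    """
    lanes = {"A": [], "G": [], "C": [], "T": []}
    for length, base in enumerate(synthesized, start=1):
        lanes[base].append(length)
    return lanes


def read_ladder(lanes: dict) -> str:
    """Read the sequence by visiting the bands from shortest to longest."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


template = "TGTAATC"                       # single-stranded template, 5' to 3'
synthesized = reverse_complement(template)  # strand built from the primer
lanes = chain_termination_lanes(synthesized)
sequence = read_ladder(lanes)
```

In this model the sequence read from the four lanes equals the synthesized strand. With real gels the bands for longer fragments crowd together, which is the resolution limit discussed in the following paragraph.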
In the said method, DNA fragments have conventionally been detected by autoradiography, in which the DNA fragments are labeled with a radioisotope. Because radioisotopes are inconvenient to handle, however, DNA has come to be labeled with a fluorophore (L. M. Smith et al., Nature, Vol. 321, pp. 674-679 (1986) and J. M. Prober et al., Science, Vol. 238, pp. 336-341 (1987)). This method permits analysis of up to about 500 bases for each DNA. When the base length is greater, the separation of the DNA bands becomes poorer and the signal intensity is substantially reduced, making the analysis very difficult. Longer regions are analyzed by (1) a method in which an oligomer complementary to the sequence close to the end of the already analyzed DNA sequence is prepared and used as a new primer to determine longer sequences successively (primer walking), or (2) a method in which one side or both sides of the target DNA are degraded with an enzyme, and the resulting DNA is inserted into a vector and then analyzed.
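The primer walking procedure of method (1) can be sketched as a simple loop. The following Python sketch is a simplified illustration under stated assumptions: each sequencing round reads a fixed number of bases without error, the next primer is taken from the very end of the sequence known so far, and the first primer site is already known (for example from the vector). The function name primer_walk and the default values of 500 bases per read and 20-base primers are assumptions chosen for illustration.

```python
# Simplified simulation of primer walking.
# Assumptions: error-free reads of fixed length, and each new primer is the
# last `primer_length` bases of the sequence determined so far.


def primer_walk(target: str, read_length: int = 500, primer_length: int = 20):
    """Return the assembled sequence and the list of primers that were used."""
    known = target[:primer_length]  # starting primer site, assumed known
    primers = []
    while len(known) < len(target):
        primer = known[-primer_length:]  # primer for this sequencing round
        primers.append(primer)
        start = len(known)               # the read begins just after the primer
        known += target[start:start + read_length]
    return known, primers
```

Under these assumptions a 2020-base target needs four rounds, each with its own electrophoresis run, and every round after the first needs a primer that could not be synthesized until the previous round had finished.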
The primer walking technique is effective for reliable determination of a long DNA sequence, but it has the problem that a new primer has to be synthesized for every DNA analysis. Instead of synthesizing oligomers each time, it is possible to prepare in advance a set of oligomers that includes all different base sequences. However, oligomers of at least 10 bases (10-mers) are necessary for priming the DNA polymerase reaction, because shorter oligomers do not form stable base pairs. The set would therefore have to contain a very large number of oligomers, 4^10, or approximately 10^6, which is not practical. The number of required oligomers should be reduced to less than 1000. Furthermore, the primer walking method requires many repeated electrophoresis operations, which is also very troublesome. To solve this problem, an attempt has been made to form an oligonucleotide array having various kinds of oligomers on a solid surface. The DNA to be analyzed is placed on the surface so that it hybridizes to the complementary oligomer sites. Oligomers having all combinations of base sequences 8 to 10 bases in length are prepared, and whether or not each one hybridizes with the target DNA is detected; because the hybridized oligomer sequences overlap with one another, the sequence can be reconstructed from them (Z. Strezoska et al., Proc. Natl. Acad. Sci. USA, Vol. 88, pp. 10089-10093 (1991)). This method uses successive hybridization rather than primer extension, and it does not need electrophoresis. Theoretically it should be able to determine the sequence of long DNA fragments, but it requires a great number of oligomers with different denaturing temperatures, and no single set of hybridization conditions can be found that suits all of the sequences. It is therefore difficult to determine all sequences by this method. Furthermore, since the determination of repeated sequences is likely to be ambiguous, the target DNA must be cut into short fragments in advance and the long DNA must be analyzed little by little, which requires much labor.
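Two of the points above can be checked with a short Python sketch: the size of a complete oligomer set, and the ambiguity that repeated sequences cause when a sequence is reconstructed from overlapping hybridized oligomers. The sketch is illustrative only. The function names and the two example sequences are hypothetical, and 3-base oligomers are used instead of 8 to 10 bases to keep the example small.

```python
# Size of a complete oligomer set, and ambiguity caused by repeats when a
# sequence is reconstructed from its set of hybridizing oligomers.
# The example sequences are hypothetical; k = 3 is used only for brevity.


def complete_set_size(k: int) -> int:
    """Number of distinct oligomers of length k over the bases A, C, G, T."""
    return 4 ** k


def spectrum(sequence: str, k: int) -> list:
    """Sorted list of all length-k oligomers that occur in the sequence."""
    return sorted(sequence[i:i + k] for i in range(len(sequence) - k + 1))


size_10 = complete_set_size(10)  # 1048576, i.e. roughly 10^6
size_8 = complete_set_size(8)    # 65536

# Two different sequences in which the repeat "AT" occurs three times.
# Swapping the bases between the repeats leaves the 3-mer spectrum unchanged,
# so hybridization data alone cannot tell the two sequences apart.
first = "CATGATTATC"
second = "CATTATGATC"
same_spectrum = spectrum(first, 3) == spectrum(second, 3)
```

Here same_spectrum evaluates to True although the two sequences differ, which illustrates why a target containing repeated sequences has to be cut into short fragments before hybridization-based analysis.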