1. Field of the Invention
This invention relates generally to a method for gene analysis. More specifically, the invention relates to a method for gene analysis whereby a gene-derived single-stranded nucleic acid analyte is analyzed for a given purpose by hybridization of degenerate probes.
2. Related Background Art
In molecular biology, identification of genes, detection of gene variations and analysis of the base sequence of genes (hereunder referred to as "gene analysis") are not only important for understanding the functions and regulatory mechanisms of the genes, but also provide a practical and useful means of analysis that can be applied in fields such as genome analysis, gene diagnosis and forensic medicine.
Gene analysis methods are largely classified into methods that are capable of detecting only the presence or absence of gene variations and methods that are capable of detecting and analyzing the actual base sequences. Examples of the methods of the former type, which have been put to use, include the method called Restriction Fragment Length Polymorphism (RFLP), which analyzes length polymorphism of nucleic acid analytes that have been fragmented by sequence-specific restriction endonucleases, and the method called Single Strand Conformation Polymorphism (SSCP), which, after denaturing nucleic acid analytes into single strands and then restoring the non-denaturing conditions in order to attain stable structures within the single strands, analyzes polymorphic structures by subjecting them to electrophoresis. Methods of the latter type that have been put into use utilize partial base sequence data for a gene to synthesize the complementary strands of oligonucleotide, in which hybridization, thermo-cycled polymerase reaction, ligase reaction, etc. are carried out utilizing the discriminating power of the oligonucleotide with the specific base sequence, and the reaction product is analyzed. Also commonly employed are DNA sequencing methods such as the Sanger method and the Maxam-Gilbert method, which directly determine gene base sequences.
However, all of the methods described above suffer from drawbacks. Specifically, the former type of method gives little information on base sequence, and variation cannot necessarily be detected in all cases. The latter type of method gives base sequence data, but since such methods rely on the partial base sequence data of a given nucleic acid analyte, they are not universally applicable and cannot be used if a portion or the full length of the base sequence data is unknown.
It is, therefore, an object of this invention to provide a universal analysis method that can be applied to genes of any base sequence and that can even be applied to genes of which a portion or the full length of the base sequence data has not been obtained. It is another object to provide a method that can be applied to sequencing of nucleic acids (or genes) of lengths exceeding 1 kb, for which the Sanger method is unsuitable.
In the course of pursuing these objects, a few of the present inventors have already discovered a rapid DNA sequencing method by hybridization, and have developed a DNA sequencing apparatus employing the method (Japanese Unexamined Patent Publication HEI No. 10-243785). Because the method is a base sequence determination method, it can be used for universal gene analysis. However, in practical use the disclosed method has required oligonucleotide probes of 8 or more bases in consideration of the efficiency of polymerase reaction and the lengths that can be sequenced. This has resulted in a tremendous increase in the number of combinations of probe base sequences (i.e. the number of probes), for which reason the method is not believed to be suitable for practical use without employing fine processing technology such as DNA chips.
A possible alternative is to use electrophoresis to carry out gene analysis according to the method described above. To this end the number of bases of an oligonucleotide probe (i.e. the probe base length) must be no greater than 7, and preferably as small as about 3, 4 or 5. The number of possible probe combinations is 64 in the case of 3 bases, 256 in the case of 4 bases and 1024 in the case of 5 bases. If oligonucleotide probes of such short length are used, the method can be satisfactorily accomplished even using currently available electrophoresis apparatuses. However, since a hybridizable site appears on an average of once every 64 bases for a 3-base length, once every 256 bases for a 4-base length and once every 1024 bases for a 5-base length, multiple probe-complementary sites will be present when it is attempted to sequence nucleic acid analytes that are over 1 kb long using the aforementioned oligonucleotide probes, and this presents a problem that renders it unsuitable for methods employing DNA chips. Also, when such short oligonucleotide probes are used for polymerase reaction, the melting point of the double-stranded nucleic acid produced with the oligonucleotide probe as such and the annealing temperature onto the nucleic acid analyte are extremely low. The melting point Tm is calculated by one of the following equations according to three known methods.
Nearest neighbor method  (1)
%GC method (Tm = 81.5° C. + 16.6(log M) + 0.41(%GC) − 0.61(%form) − 500/L)  (2)
2+4 method (Tm = 2(A+T) + 4(G+C))  (3)
When estimated according to the 2+4 method that is often used for short oligomers,
for a 3-base length: Tm=6-12° C.,
for a 4-base length: Tm=8-16° C., and
for a 5-base length: Tm=10-20° C.
Since the temperature for achieving hybridization is usually set 10° C. or more below the annealing temperature, the hybridization must be conducted at almost freezing temperature, and DNA polymerases, including Taq polymerase (optimal temperature: 72° C.), which is used for the thermo-cycled polymerase reaction amplification, are virtually inactive at such a low temperature. Consequently, oligonucleotide probes of such base lengths are unlikely to achieve gene analysis.
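The probe counts and the 2+4 estimates quoted above can be checked with a short calculation. The following is a minimal sketch only; the function names are illustrative and are not part of the disclosed method. It enumerates every sequence of a given base length and applies equation (3) to each.

```python
from itertools import product

def probe_count(k: int) -> int:
    """Number of distinct k-base sequences over A, T, G and C."""
    return 4 ** k

def tm_2plus4(seq: str) -> int:
    """Tm in deg C by the 2+4 method: 2 per A or T, 4 per G or C."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

def tm_range(k: int) -> tuple[int, int]:
    """Lowest and highest 2+4 Tm over all k-base sequences."""
    tms = [tm_2plus4("".join(p)) for p in product("ATGC", repeat=k)]
    return min(tms), max(tms)

for k in (3, 4, 5):
    low, high = tm_range(k)
    print(f"{k}-base probes: {probe_count(k)} sequences, Tm {low}-{high} deg C")
```

The lower bound corresponds to a probe consisting only of A and T, and the upper bound to a probe consisting only of G and C.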
In order to overcome the problems described above, this invention provides a method for gene analysis by hybridization that is universally applicable.
Specifically, according to a first aspect of this invention there is provided a method for gene analysis by hybridization comprising:
a first step of preparing a set of degenerate probes;
a second step of hybridizing a single-stranded nucleic acid analyte derived from the gene to each probe of the set of degenerate probes;
a third step of using the hybridized nucleic acid analyte as a template and each of the probes as a primer to carry out a thermo-cycled polymerase reaction;
a fourth step of separating the reaction product obtained from each probe by gel electrophoresis to obtain an electrophoresis pattern; and
a fifth step of comparing the electrophoresis patterns for each of the probes.
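The comparison in the fifth step may be pictured as follows. This is a hedged sketch, not the disclosed procedure: it assumes that each electrophoresis pattern has already been reduced to a list of band lengths per probe, that bands match only when their lengths are identical (a real gel would need a length tolerance), and the band values and the function name `differing_probes` are made up for illustration.

```python
def differing_probes(pattern_a, pattern_b):
    """Return the probes whose band patterns differ between two samples.

    Each pattern maps a probe to its band lengths. The result maps each
    differing probe to (bands only in a, bands only in b).
    """
    diffs = {}
    for probe in sorted(set(pattern_a) | set(pattern_b)):
        bands_a = set(pattern_a.get(probe, ()))
        bands_b = set(pattern_b.get(probe, ()))
        if bands_a != bands_b:
            diffs[probe] = (sorted(bands_a - bands_b), sorted(bands_b - bands_a))
    return diffs

# Illustrative band lengths only; not measured data.
reference = {"NNNNNNACGT": [120, 340], "NNNNNNGGCA": [85], "NNNNNNTTAG": [210, 415]}
sample = {"NNNNNNACGT": [120, 340], "NNNNNNGGCA": [85, 260], "NNNNNNTTAG": [415]}

print(differing_probes(reference, sample))
```

Probes with identical patterns are omitted from the result, so an empty dictionary indicates that no difference was found with the probes used.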
According to another aspect of the invention there is provided a method for gene analysis by hybridization comprising:
a first step of preparing a set of degenerate probes, each probe having a prescribed base sequence;
a second step of hybridizing a single-stranded nucleic acid analyte derived from the gene to each probe of the set of degenerate probes;
a third step of using the hybridized nucleic acid analyte as a template and each of the probes as a primer to carry out a thermo-cycled polymerase reaction, and extending the primer;
a fourth step of separating the extension reaction product obtained from each of the probes into extension fragments by gel electrophoresis, and determining the base length of each of the extension fragments; and
a fifth step of correlating the base lengths of the extension fragments with the prescribed base sequence of each of the probes to characterize the base sequence of the nucleic acid analyte.
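One way to picture the correlation in the fifth step is a partial sequence map. The sketch below rests on assumptions that are not stated in this section: extension is assumed to run to the end of the template, so that a fragment of length F begins F bases before the end of the strand complementary to the analyte; each probe is assumed to carry n random bases on the 5' side of its prescribed bases; and the analyte length is assumed to be known. The function name `place_cores` and the toy values are hypothetical.

```python
def place_cores(length, n, observations):
    """Build a partial map of the strand complementary to the analyte.

    length: assumed length of the strand in bases
    n: number of random bases preceding the prescribed bases in each probe
    observations: (prescribed bases, fragment length) pairs
    Unknown positions are written as '.'.
    """
    strand = ["."] * length
    for core, fragment_len in observations:
        # Under the run-off assumption the primer starts at length - fragment_len,
        # and the prescribed bases start n positions later.
        start = length - fragment_len + n
        if start < 0 or start + len(core) > length:
            raise ValueError(f"fragment length {fragment_len} does not fit")
        for offset, base in enumerate(core):
            pos = start + offset
            if strand[pos] not in (".", base):
                raise ValueError(f"conflicting bases at position {pos}")
            strand[pos] = base
    return "".join(strand)

# Toy values chosen for illustration only.
observed = [("ACGT", 18), ("GTTA", 16), ("CCAG", 8)]
print(place_cores(20, 2, observed))  # ....ACGTTA....CCAG..
```

Even with only three probes, the overlapping placements of ACGT and GTTA join into a six-base stretch, which illustrates how the sequence can be characterized without being fully determined.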
According to yet another aspect of the invention there is provided a method for gene analysis by hybridization comprising:
a first step of preparing a set of degenerate probes, each probe having a prescribed base sequence;
a second step of hybridizing a single-stranded nucleic acid analyte derived from the gene to each probe of the set of degenerate probes;
a third step of using the hybridized nucleic acid analyte as a template and each of the probes as a primer to carry out a thermo-cycled polymerase reaction, and extending the primer;
a fourth step of separating the extension reaction product obtained from each of the probes into extension fragments by gel electrophoresis, and determining the base length of each of the extension fragments; and
a fifth step of aligning the prescribed base sequence of each of the probes in the order of the base length of the extension fragment according to a Eulerian path-finding algorithm, to determine a portion of the base sequence of the nucleic acid analyte.
Preferably, the full length base sequence of the nucleic acid analyte is determined in the method for gene analysis described above.
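The Eulerian path-finding step can be illustrated with the textbook construction used in sequencing by hybridization. The sketch below is an assumption about how such an alignment might be coded, not the procedure of the invention: each prescribed m-base sequence is treated as an edge from its first m−1 bases to its last m−1 bases, and Hierholzer's algorithm finds a path using every edge once. The function name `eulerian_reconstruct` and the toy input are hypothetical, and the fragment lengths are not used here.

```python
from collections import defaultdict

def eulerian_reconstruct(kmers):
    """Reconstruct a sequence that uses every k-mer exactly once."""
    if not kmers:
        raise ValueError("no k-mers given")
    graph = defaultdict(list)   # (k-1)-mer -> list of following (k-1)-mers
    balance = defaultdict(int)  # out-degree minus in-degree per node
    for kmer in kmers:
        prefix, suffix = kmer[:-1], kmer[1:]
        graph[prefix].append(suffix)
        balance[prefix] += 1
        balance[suffix] -= 1
    # An Eulerian path starts at the node with one surplus outgoing edge;
    # if there is none, the path is a cycle and any node will do.
    starts = [node for node, value in balance.items() if value == 1]
    start = starts[0] if starts else next(iter(graph))
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if graph[node]:
            stack.append(graph[node].pop())
        else:
            path.append(stack.pop())
    path.reverse()
    if len(path) != len(kmers) + 1:
        raise ValueError("k-mers do not form a single Eulerian path")
    return path[0] + "".join(node[-1] for node in path[1:])

# Toy 4-base sequences given in arbitrary order.
cores = ["GCGT", "ATGG", "TGCA", "GGCG", "CGTG", "TGGC", "GTGC"]
print(eulerian_reconstruct(cores))  # ATGGCGTGCA
```

When an (m−1)-base stretch occurs more than once, several Eulerian paths may exist; ordering the probes by extension fragment length, as the fifth step describes, is what would distinguish them, and that refinement is left out of this sketch.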
The invention further provides the method for gene analysis as described above wherein each of the probes is an oligonucleotide represented by N1N2N3 . . . NnX1X2 . . . Xm (formula 1), N1N2N3 . . . X1X2 . . . XmNn (formula 2), N1N2N3 . . . X1X2 . . . XmNn−1Nn (formula 3), . . . or X1X2 . . . XmN1N2N3 . . . Nn−1Nn (formula n) (where N1-Nn designate any of the four bases A, T, G and C but are random, X1-Xm designate any of A, T, G and C but are predetermined, and m and n are each a natural number).
The invention still further provides any of the above methods for gene analysis wherein the set of degenerate probes is the set of all of the 4^m combinations comprising each of the aforementioned probes, or a partial subset thereof.
Here, m is preferably 3, 4 or 5.
More preferably, n is 5, 6, 7 or 8.
Most preferably, m is 4 and n is 6.
The invention still further provides any of the above methods for gene analysis wherein in the first step there is prepared an array vessel having a number of wells corresponding to the total number of the set of degenerate probes, and each probe of the set of degenerate probes is fractionally dispensed into one of the wells.
The invention still further provides the method for gene analysis as described above wherein each of the probes is as defined above, and the total number of the set of degenerate probes is 4^m.
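The size of the probe set and its assignment to wells can be sketched as follows for probes of formula 1. This is an illustration under assumptions: the random positions are written with the letter N, the well numbering is a simple running index, and the function name `degenerate_probe_set` is hypothetical.

```python
from itertools import product

def degenerate_probe_set(m, n):
    """All 4**m formula-1 probes: n random positions (N) followed by m prescribed bases."""
    return ["N" * n + "".join(core) for core in product("ATGC", repeat=m)]

# The most preferred case in the text: m is 4 and n is 6.
probes = degenerate_probe_set(m=4, n=6)

# One probe per well, numbered from 0 (an illustrative layout only).
wells = dict(enumerate(probes))

print(len(probes))            # 256
print(wells[0], wells[255])   # NNNNNNAAAA NNNNNNCCCC
```

Each entry stands for a mixture rather than a single molecule: with n random positions, a well holds up to 4^n different oligonucleotides sharing the same m prescribed bases, which is 4096 for n = 6.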
The invention still further provides a gene analysis kit comprising:
a set of degenerate probes each probe of which is an oligonucleotide represented by N1N2N3 . . . NnX1X2 . . . Xm (formula 1), N1N2N3 . . . X1X2 . . . XmNn (formula 2), N1N2N3 . . . X1X2 . . . XmNn−1Nn (formula 3), . . . or X1X2 . . . XmN1N2N3 . . . Nn−1Nn (formula n) (where N1-Nn designate any of the four bases A, T, G and C but are random, X1-Xm designate any of A, T, G and C but are predetermined, and m and n are each a natural number);
an array vessel having a number of wells corresponding to the total number of the set of degenerate probes;
a buffer solution; and
DNA polymerase,
wherein each of the probes is fractionally dispensed in one of the wells of the array vessel.
The invention still further provides the gene analysis kit as described above wherein each of the dispensed probes is immobilized on one well of the array vessel.
According to the method for gene analysis of this invention, degenerate probes are used for hybridization with a single-stranded nucleic acid analyte derived from a gene, the nucleic acid analyte is used as a template and the probes are used as primers to carry out a thermo-cycled polymerase reaction, the reaction products obtained from the respective probes are separated by gel electrophoresis and the electrophoresis patterns are compared; therefore, it allows the feature of base sequence of the gene to be extracted without sequencing the entire base sequence thereof.
Also, according to the gene sequencing method of the invention, degenerate probes are used for hybridization with a single-stranded nucleic acid analyte derived from a gene, the nucleic acid analyte is used as a template and the probes are used as primers to carry out a thermo-cycled polymerase reaction and to extend the primers, the extension reaction products obtained from the respective probes are separated into extension fragments by gel electrophoresis, the base length of each extension fragment is determined and the base lengths of the extension fragments are correlated with the prescribed base sequence of each probe to characterize the base sequence of the nucleic acid analyte; therefore it allows the feature of base sequence of the gene to be extracted with the need of sequencing only a part thereof and without sequencing the entire base sequence thereof.
Furthermore, according to the gene sequencing method of the invention, degenerate probes are used for hybridization with a single-stranded nucleic acid analyte derived from a gene, the nucleic acid analyte is used as a template and the probes are used as primers to carry out a thermo-cycled polymerase reaction to extend the primers, the extension reaction products obtained from the respective probes are separated into extension fragments by gel electrophoresis, the base length of each extension fragment is determined and the prescribed base sequence of each probe is aligned in the order of the base length of the extension fragment according to the Eulerian path-finding algorithm to determine a portion of the base sequence of the nucleic acid analyte; therefore, it allows the base sequence to be determined visually in a relatively simple manner. Moreover, the present method allows the full length base sequence of the nucleic acid analyte to be determined without cloning.
The method for gene analysis of the invention is therefore widely and universally applicable to not only sequencing but also gene identification, gene variation detection and other purposes of gene base sequence data analysis. More specifically, it can be applied to gene diagnosis (infection, cancer, genetic diseases and the like) and gene-related drug development (drug discovery, gene screening), as well as in general gene detection-related fields (drugs, foods, agriculture, environment, etc.).
The present invention will be more fully understood from the detailed description given hereinbelow and the accompanying drawings, which are given by way of illustration only and are not to be considered as limiting the present invention.
Further scope of applicability of the present invention will become apparent from the detailed description given hereinafter. However, it should be understood that the detailed description and specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will be apparent to those skilled in the art from this detailed description.