1. Field of the Invention
The present invention relates to a method for elucidating and analyzing the function of a novel DNA, such as a cDNA, which may be associated with certain types of diseases in a specific organ or tissue and whose nucleotide sequence has been only partially determined, the method using an antisense oligonucleotide complementary to the partial sequence of said cDNA. More particularly, the present invention relates to a method for efficiently probing the function of a protein encoded by a DNA whose nucleotide sequence has been partially determined, e.g., by random decoding of a cDNA library and the like, whereby pharmaceuticals, diagnostic agents, chemical reagents, etc. can be developed.
2. Description of Related Art
At present, it is assumed that there are several tens of thousands of genes which code for proteins and peptides in the cells of animals, including humans. In the human genome project now being advanced in the field of genetic engineering, revealing the primary structure of the proteins encoded by all human genes is one of the most important targets. However, there is a problem that, even if the nucleotide sequence of the genome is revealed, the primary structure of the protein encoded by said sequence cannot be determined directly because genes usually contain many intron regions.
In addition, even if the structure of each gene is determined, this does not help elucidate life phenomena, particularly the various biological phenomena based upon the functions of the proteins encoded by human genes, until it becomes clear what functions these many genes have. Nor does it contribute to the development of useful pharmaceuticals or to the therapy of various diseases and illnesses. As a result of the development of biochemical and molecular biological techniques, the structures and functions of some proteins and peptides have already been revealed. The actual status is, however, that only a few genes have been isolated and characterized in terms of the structure and function of the encoded proteins. For most of the rest, not even their types and numbers are known. It is expected that, among such unknown proteins and peptides, some will be targets for the development of unique pharmaceuticals, diagnostic agents and chemical reagents. Thus, it is very important to know not only the sequence information of a gene but also the functions of said gene, particularly the functions of the proteins encoded by the gene.
According to the conventional methods, however, a substantial amount of labor and cost is required for finding novel proteins and peptides and for elucidating their structures and functions. Therefore, it has not been possible to investigate them in an efficient manner. In brief, in such conventional methods, it is necessary to find a biological function in which a novel protein/peptide is thought to participate. However, when a known assessment system for biological functions is utilized, discrimination from known proteins/peptides is difficult, so that it is preferable, and often necessary, to find a novel biological function. Consequently, most endeavors for finding novel proteins and peptides have been devoted to the investigation of such novel biological functions. However, even if novel proteins/peptides are found by chance, they are obtained only in very small amounts in most cases. It is therefore necessary to purify the protein with much labor, to determine a partial amino acid sequence and, based upon the resulting sequence, to clone the cDNA which codes for said protein/peptide. Although expression cloning using a biological function as an index has been conducted, there have been few cases of success. As such, utilization for the development of pharmaceuticals, diagnostic agents and chemical reagents has been possible only as the result of a series of voluminous and complicated studies, from the finding of the function to the cloning of the cDNA, which require much labor and cost. Under such circumstances, it is not easy to investigate many proteins and peptides. Furthermore, such a means cannot be utilized where the biological function is entirely unknown.
In recent years, there has been rapid progress in genetic engineering techniques and studies have been carried out effectively but, in spite of that, techniques such as cDNA cloning require skill and experience. In particular, sequencing of cloned cDNA formerly had to be conducted by hand using radioisotopes (RI). However, as a result of recent development and improvement of analyzers for nucleotide sequences (DNA sequencers), it has become possible to analyze nucleotide sequences without the use of RI and, moreover, automatically. As such, the analysis of nucleotide sequences, which used to be conducted only on cloned cDNA fragments, can now be carried out on far more cDNA fragments. Thus, if nucleotide sequences can be elucidated one after another by such a means and if the structures of the proteins and peptides encoded thereby can be clarified, it will be possible to obtain novel proteins and peptides efficiently.
In view of the above, projects (a representative example is the human genome project) in which only partial sequences of a cDNA library are randomly decoded and the resulting sequences are placed in a database and analyzed have been carried out throughout the world. As a result, when a cDNA having a homology with a cDNA coding for a known protein/peptide is obtained, the functions of the protein/peptide encoded by said cDNA can now be presumed to a considerable extent from the resulting sequence information alone. In fact, subtypes of known proteins and peptides have been successively elucidated by means of such an approach.
However, most cDNAs do not exhibit, with the cDNAs which code for known proteins and peptides, a homology sufficient for estimating their function. Therefore, even if many partial sequences of cDNA are determined, there is very little possibility of success in revealing the function. Accordingly, the actual present status is that many partial cDNA sequences have accumulated which are themselves novel but for which neither the structure nor the function of the encoded product has been elucidated. It is not easy to elucidate the structure and function of the encoded proteins/peptides based on the partial nucleotide sequence of a novel cDNA.
For example, when an amino acid sequence is presumed from the nucleotide sequence of a cDNA having a novel sequence, there are six translation frames which may code for the protein or peptide. When, in one of those six frames, no TGA, TAG or TAA (a putative translation termination codon) occurs downstream of an ATG (a putative translation initiation codon) or upstream of a TGA, TAG or TAA (a putative translation termination codon), it is strongly suggested that the amino acid sequence encoded by said frame is the amino acid sequence of the protein/peptide encoded by said cDNA. There is also a possibility that, when no putative translation termination codon (TGA, TAG or TAA) is present at all, all of those translation frames may code for the peptide or protein. In addition, when the nucleotide sequence is partially substituted or deleted due to an error in analyzing the nucleotide sequence, the partially encoded amino acid sequence may be substituted or the translation frame may be shifted. Accordingly, at present, it is possible to decide the correct amino acid sequence of the protein/peptide encoded by said cDNA only when a cDNA having a full-length translation frame (the region between the upstream translation initiation codon [ATG] and the downstream translation termination codon [TGA/TAG/TAA] in a single translation frame) is available and its entire nucleotide sequence is determined.
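The six-frame ambiguity described above can be illustrated with a minimal Python sketch. It is not part of the disclosed method; the function names (`reverse_complement`, `codons`, `open_frames`) and the frame labels (`+1` to `+3`, `-1` to `-3`) are hypothetical and chosen only for illustration. The sketch reports which of the six frames of a partial sequence contain no TGA, TAG or TAA, assuming the input contains only the letters A, C, G and T.

```python
# Illustrative only: names and frame labels are assumptions, not from the source.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TGA", "TAG", "TAA"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def codons(seq, offset):
    """Split seq into complete triplets, starting at the given offset (0, 1 or 2)."""
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]


def open_frames(seq):
    """Return labels of the frames (out of six) that contain no stop codon."""
    seq = seq.upper()
    strands = {"+": seq, "-": reverse_complement(seq)}
    result = []
    for strand, strand_seq in strands.items():
        for offset in range(3):
            if not any(c in STOP_CODONS for c in codons(strand_seq, offset)):
                result.append(f"{strand}{offset + 1}")
    return result


if __name__ == "__main__":
    # A short fragment with no stop codon in any frame: all six frames stay open,
    # so the coding frame cannot be decided from this fragment alone.
    print(open_frames("ATGGCCGCC"))
    # A fragment whose first forward frame ends in TGA: that frame is excluded.
    print(open_frames("ATGGCCTGA"))
```

For a short partial sequence, several frames typically remain open, which is why the correct frame can be decided only from a full-length translation frame as stated above.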
Only when such problems are overcome is it possible to chemically synthesize a partial peptide of said protein/peptide or to prepare said protein/peptide in full and in large quantities by means of genetic engineering. For example, it then becomes possible to prepare an antiserum/antibody against said protein/peptide or to label it with a suitable fluorescent substance or radioisotope. However, investigation of the function still involves considerable additional difficulty and labor.
In addition, it has been known to investigate the function by introducing the cDNA, either as it is or after modification, into cultured cells or transgenic animals, followed by excessive expression or disruption; however, all of those approaches need much cost and time and cannot be commonly applied to many cDNAs.
As mentioned hereinabove, although it is easy to determine the partial sequence of a cDNA, it is still difficult to determine the function of the protein/peptide encoded thereby.
Now, the role played by cDNA will be considered. Since cDNA is a DNA artificially synthesized by a reverse transcriptase using mRNA as a template, it has a nucleotide sequence complementary to mRNA. Accordingly, unlike genomic DNA, in which protein-coding regions are interrupted by intervening sequences (introns), the amino acid sequence of the encoded protein can be determined by determining the nucleotide sequence of the cDNA. The characteristic features of cDNA as compared with genomic DNA are: (1) since there is no intron, the primary structure of the encoded protein can be determined as soon as the nucleotide sequence is determined; (2) since its size is small, it can be handled easily; and (3) when a suitable promoter or the like is utilized, it can be expressed in all types of cells, from eukaryotic to prokaryotic ones.
In addition, whereas in the case of genomic DNA it is not possible to tell from the structure when and in which cells a gene is expressed, in the case of cDNA its expression in the cell is established once cloning is completed. Such tissue specificity is a very useful piece of information for checking the function of the gene. When these properties of cDNA are taken into consideration, the importance of the role played by cDNA in the human genome project becomes clear.
In the current technology, a cDNA whose structure has been completely analyzed is essential for analyzing the structure and function of a gene, which is a functional unit of the genome. However, in the current art, it is still difficult to obtain a full-length cDNA clone. Moreover, it is substantially impossible to check the function directly from a cDNA whose structure is elucidated only partly. Accordingly, so far as conventional means for function analysis are utilized, a study for elucidating function cannot be initiated from the partial sequence information alone of a voluminous cDNA library, and this is a large problem in advancing such studies. In addition, even when the structure of the full-length cDNA is elucidated, much difficulty and labor are needed for constructing an expression system which effectively expresses the desired protein from said information alone. In fact, at present, no research means for investigating the function from cDNA sequence information in which only a partial sequence is known has yet been established.
Now, an art in which the expression of a gene is repressed by a chemically synthesized oligonucleotide having a sequence complementary to a sequence in a domain specific to said gene or to the mRNA transcribed from said gene is widely known as an antisense method (G. Zon, Pharmaceutical Res., 5(9), 539 (1988); C. A. Stein et al., Cancer Res., 48, 2659 (1988); E. Uhlman et al., Chemical Rev., 90(4), 543 (1990); J. Goodchild, Bioconjugate Chem., 1(3), 165 (1990)). Attempts to repress the expression of viral genes, oncogenes, etc. using such an antisense oligonucleotide have been investigated in particular detail. The antisense oligonucleotide used therefor has the disadvantage that it is easily decomposed by a hydrolase such as a nuclease when the oligonucleotide has the natural phosphate structure. Therefore, in order to improve the stability against such enzymes, various nucleotide derivatives which are chemically modified at the phosphate groups or sugar hydroxyl groups have been developed. Examples of derivatives in which a phosphate group in the nucleotide is modified are a phosphorothioate (F. Eckstein, Angew. Chem., 6, 431 (1983); F. Eckstein et al., Biochemistry, 23, 3443 (1984); J. W. Stec et al., J. Am. Chem. Soc., 106, 6077 (1984); F. Eckstein et al., Ann. Rev. Biochem., 54, 367 (1985)), a methylphosphonate (P. S. Millar et al., Biochemistry, 18, 5134 (1979); P. S. Millar et al., Biochimie, 67, 769 (1985); P. O. P. Ts'O et al., Ann. N. Y. Acad. Sci., 507, 220 (1988)), etc.
In addition, methylation of the 2'-hydroxyl group has been proposed as one of the attempts to improve stability by modifying the 2'-hydroxyl group of the ribose ring, which is a constituent unit of the nucleotide (Y. Furukawa et al., Chem. Pharm. Bull., 13, 1273 (1965); E. Ohtsuka et al., Nucleic Acids Res., 15, 6131 (1987)). Among those, a phosphorothioate derivative (which hybridizes with the target mRNA, after which the hybrid can serve as a substrate hydrolyzable with RNase H) is fully capable of achieving an antisense effect, is stable to hydrolases and the like, and exhibits relatively low cell toxicity; accordingly, it is regarded as the best antisense derivative at this moment. As such, phosphorothioate derivatives have been widely utilized as an effective means for repressing the expression of genes.
At present, the elucidation of the function of a cDNA by an antisense method using an antisense oligonucleotide such as said phosphorothioate derivatives has been reported for the purpose of confirming the presumed function of said cDNA, after almost all of the functions of the cDNA have been presumed, by a homology investigation or the like of the sequence information against known genes, from an analysis of the full-length nucleotide sequence of the cDNA (H. Weintraub et al., Trends in Genetics, 1, 22 (1985); C. V. Cabrera et al., Cell, 50, 659 (1987); C. Inoue et al., Proc. Natl. Acad. Sci. USA, 84, 6659 (1987); R. Heikkila et al., Nature, 328, 445 (1987); P. Harrison et al., Lancet, 342, 254 (1993); C. Wahlestdt et al., Nature, 363, 260 (1993); A. Osensand et al., Nature, 364, 445 (1993)). However, there has been no case in which the antisense method has been utilized for elucidating the function of a novel cDNA of which only a partial sequence is known.
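The complementarity on which the antisense method rests can be sketched in a few lines of Python. This is an illustration of the general principle only, not a procedure taken from the cited reports. It assumes that the partial cDNA sequence is given in the sense orientation (the same orientation as the mRNA, with T in place of U), and the function name `antisense_oligo` and the default window length of 15 nucleotides are hypothetical choices made for the example.

```python
# Illustrative only: function name, parameters and default length are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(sense_cdna, start=0, length=15):
    """Return a DNA oligonucleotide (written 5'->3') complementary to a window
    of a sense-strand partial cDNA sequence (A/C/G/T only)."""
    window = sense_cdna.upper()[start:start + length]
    if len(window) < length:
        raise ValueError("window extends beyond the known partial sequence")
    # The antisense strand is the reverse complement of the sense window.
    return window.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    partial = "ATGGCCTGAAAC"  # made-up sequence used only for demonstration
    print(antisense_oligo(partial, start=0, length=6))
```

Because only the chosen window of the partial sequence is needed, such an oligonucleotide can in principle be designed without knowledge of the full-length sequence or of the translation frame.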
As such, the means which have been known for investigating the function from the structure of gene are:
(1) a method in which the function which is intensified by an excessive expression of the gene is analyzed; and
(2) a method in which a homology with known genes is investigated using computers.
(1) a method for probing the function of a protein encoded by a cDNA wherein a nucleotide of said cDNA is partially sequenced, which comprises assessing the change of biological actions when an antisense oligonucleotide substantially complementary to the partial nucleotide sequence is applied to an assessment system for biological functions thereof; and
(2) a method according to (1) in which said partial nucleotide sequence is a segment positioned at the 5'-terminal region of a full-length nucleotide sequence of said cDNA.
(3) a method according to (1) in which said antisense oligonucleotide is selected from the group consisting of polydeoxynucleotides containing 2'-deoxy-D-ribose, polyribonucleotides containing D-ribose, any other type of polynucleotide which is an N-glycoside of a purine or pyrimidine base, or other polymers containing nonnucleotide backbones including protein nucleic acids and synthetic sequence specific nucleic acid polymers commercially available or nonstandard linkages, providing that the polymers contain nucleotides in a configuration which allows for base pairing and base stacking such as is found in DNA and RNA;
(4) a method according to (1) wherein said antisense oligonucleotide is selected from the group consisting of methyl phosphonates, phosphorotriesters, phosphoramidates, carbamates, phosphorothioates, and phosphorodithioates; and
(5) a method for probing the function of a protein encoded by a cDNA wherein a nucleotide of said cDNA is partially sequenced but no biological action thereof is known, which comprises:
(6) a method for probing the function of a protein encoded by cDNA wherein a nucleotide of said DNA is partially sequenced by techniques including a random decoding, etc. of the cDNA library, etc., which comprises the following steps:
(7) a method according to any of (1) to (6) in which the assessment system for the biological function wherein the protein or peptide encoded by said cDNA can be expressed is a cell having a possibility of expressing the protein or peptide encoded by said cDNA;
(8) a method according to any of (1) to (7) in which the assessment system for the biological function wherein the protein or peptide encoded by said cDNA is a cell which is capable of expressing the mRNA having a nucleotide sequence complementary to said cDNA;
(9) a method according to any of (1) to (8) in which the cell used for the assessment system is a cell culture;
(10) a method according to any of (1) to (9) in which the cell used for the assessment system is a cell selected from the group consisting of nerve cell, bone cell, muscle cell, blood vessel cell, endocrine cell, lymphocyte, reticuloendothelial cell and the like or, more particularly, a cell selected from the group consisting of smooth muscle cell, endothelial cell, fibroblast, epithelial cell, blood cell and the like;
(11) a method according to any of (1) to (10) in which the cell used for the assessment system is a cell derived from the tissues which constitute epidermis, mucous membrane, exocrine gland, endocrine gland, lung, digestive organ, urinary/genital gland, genital cell, liver, fat tissue, connective tissue, blood vessel, muscle, hematocyte, bone marrow, nerve, etc. and is an established cell strain or a primary cultured cell from the above normal or tumorized tissue;
(12) a method according to any of (1) to (11) in which the change of biological actions is a promotion or an inhibition of the production of enzyme, extracellular matrix and adhesion molecule, transcription controlling factor, growth factor, hormone, cytokine, differentiation/induction factor, chemotactic factor, neurotransmitter and the like in the cell;
(13) a method according to any of (1) to (12) in which the change of biological actions is a production of a factor related to various amplifications and differentiations or a gene-expression regulating factor such as growth factor, hormone, cytokine, a chemotactic factor such as a factor capable of promoting the migration of leukocyte, protein capable of increasing the phagocytosis or bactericidal ability of leukocyte, lymphocyte growth factor, T cell activating factor, T cell growth factor, antigen-specific inhibiting factor specifically acting on an immune system, antigen-nonspecific inhibiting factor, etc. as well as various gene products including various enzymes and regulatory factors and also a promotion or an inhibition of the production of said gene products in cells;
(14) a method according to any of (1) to (11) in which the change of biological actions is a change on the surface of the cell, such as production of receptor protein or production of adhesion molecules, or a change outside the cell, such as a formation of extracellular matrix;
(15) a method according to any of (1) to (11) in which the change of biological actions is:
(16) a method according to any of (1) to (11) in which the change of biological actions is a change in the second messenger (such as liberation of arachidonic acid, liberation of acetylcholine, liberation of Ca2+, generation of cAMP, generation of cGMP, production of inositol phosphate metabolites, change in cell membrane potentials, phosphorylation of protein, activation of c-fos and change in pH);
(17) a method according to any of (1) to (11) in which the change of biological actions is a morphological change of cells such as elongation of neurite, shrinking or expansion of the cell, generation and disappearance of intracellular granules, etc.;
(18) a method according to any of (1) to (17) which is a series of processes comprising:
(19) a pharmaceutical or diagnostic composition or chemical reagent comprising an effective amount of a protein whose function is elucidated by a method according to any of (1) to (18);
(20) a method for developing a pharmaceutical or diagnostic composition or chemical reagent which comprises using an assessment system for the biological function of the protein elucidated by a method according to any of (1) to (18);
(21) a pharmaceutical or diagnostic composition or chemical reagent comprising an effective amount of an antisense oligonucleotide per se prepared in a probing method according to any of (1) to (18);
(22) a method according to any of (1) to (17), in which the assessment step (b) is carried out in a system capable of detecting a biological response relying on said cDNA;
(23) a method according to any of (1) to (18), in which said detecting system is an in vitro system; and
(24) a method according to any of (1) to (18), in which said detecting system is an in vivo system.
In elucidating the function of a cDNA using the conventional method (1), in which an excessively expressed gene is analyzed, it is unavoidable first to clone a full-length cDNA which completely codes for the protein so that all of its sequence information can be checked. Moreover, much difficulty and labor are needed in constructing an expression system in which the desired protein can be efficiently expressed from such information.
In the conventional method (2), in which homology with known genes is investigated, it is possible to investigate the function from the information on a partial sequence provided that the sequence is highly homologous to a known sequence; however, when the sequence shows only low homology, elucidating the function of a cDNA of which only the partial sequence is known (which is an object of the present invention) is not possible.