The present invention relates to a technique for automatically extracting genetic motifs, and particularly to a technique for improving the efficiency of genetic motif extraction.
Recent progress in genetic engineering has brought rapid advances in techniques for determining gene arrangements, such as those expressed by DNA sequences and/or amino acid sequences. Further, genome projects are being conducted worldwide to clarify all gene arrangements of specific organisms (i.e., to sequence or identify gene information) for various species, including humans. As a result, databases of gene arrangement information, which are built so that previously clarified gene arrangement information (sequence gene information or genetic sequences) can be utilized effectively, have grown explosively.
For most of these gene arrangements, the arrangement information itself has been clarified, but the functions and structures remain unknown. Effective methods for presuming the functions and structures of such genes from the gene arrangements include the extraction of motifs having characteristic regularities. To this end, the present applicant has proposed, in a prior Japanese patent application (see Japanese Unexamined Patent Publication No. 7-274965), a technique for automatically extracting motifs by comparing a plurality of pieces of gene arrangement information with one another.
According to such a motif extracting technique, however, the extracted motifs have merely been printed out by a printer or output as a file to a database, so that the extracted motifs have not, in fact, been fully reutilized. Consequently, when the extracted motifs are to be added to the gene arrangement information and motifs are to be extracted again in order to promote clarification of the functions and structures of genes, a human operator has had to input the extracted motif information manually. This manual step has inevitably limited any further improvement in motif extracting efficiency.
The present invention has been made in view of the conventional problems described above, and it is therefore an object of the present invention to provide a technique for improving the motif extracting efficiency for presuming functions, structures and the like of genes, by adding a mechanism for reutilizing extracted motifs.
As a first aspect to achieve the aforementioned object, the present invention provides a genetic motif extracting and processing apparatus comprising: gene arrangement information storing means for storing clarified gene arrangement information; gene arrangement information inputting means for inputting at least one piece of gene arrangement information; motif extracting means for extracting a genetic motif from the gene arrangement information input by the gene arrangement information inputting means; gene arrangement information retrieving means for retrieving, based on the motif extracted by said motif extracting means, gene arrangement information including said motif as a part thereof, from said gene arrangement information storing means; and gene arrangement information adding means for adding the gene arrangement information retrieved by the gene arrangement information retrieving means to the gene arrangement information input by the gene arrangement information inputting means.
According to such a constitution, when extracting a genetic motif, at least one piece of gene arrangement information is input through the gene arrangement information inputting means. Then, a genetic motif is extracted from the input gene arrangement information by the motif extracting means. Further, gene arrangement information including, as a part thereof, the extracted motif is retrieved from the gene arrangement information storing means, by the gene arrangement information retrieving means. The retrieved gene arrangement information is added by the gene arrangement information adding means, as required, to the gene arrangement information input by the gene arrangement information inputting means. Thereafter, motif extraction by the motif extracting means and retrieval of gene arrangement information by the gene arrangement information retrieving means are repeated, to thereby clarify functions and structures of gene arrangement information gradually. Namely, the gene arrangement information including, as a part thereof, the extracted motif can be added to the input gene arrangement information, so that the extracted motif is reutilized to thereby improve a motif extracting efficiency for presuming functions and structures of genes.
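By way of a non-limiting illustration, the repeated cycle of extraction, retrieval and addition described above may be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the `min_len` and `max_rounds` parameters, and the use of a longest common substring as the "motif" are hypothetical stand-ins chosen for brevity, and do not represent the extraction technique of the prior application referred to above.

```python
from typing import List, Tuple


def longest_common_substring(seqs: List[str], min_len: int = 3) -> str:
    """Toy stand-in for the motif extracting means (an assumption):
    the longest substring shared by every sequence, or "" if none
    of at least min_len characters exists."""
    if not seqs:
        return ""
    shortest = min(seqs, key=len)
    # Try candidate lengths from longest to shortest.
    for length in range(len(shortest), min_len - 1, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            if all(candidate in s for s in seqs):
                return candidate
    return ""


def retrieve(motif: str, database: List[str]) -> List[str]:
    """Stand-in for the retrieving means: stored sequences that
    include the motif as a part thereof."""
    return [s for s in database if motif and motif in s]


def extract_and_expand(inputs: List[str], database: List[str],
                       max_rounds: int = 5) -> Tuple[str, List[str]]:
    """Repeat extraction, retrieval and addition until no new
    sequence is retrieved or max_rounds is reached."""
    working = list(inputs)
    motif = ""
    for _ in range(max_rounds):
        motif = longest_common_substring(working)
        if not motif:
            break
        new_hits = [s for s in retrieve(motif, database) if s not in working]
        if not new_hits:
            break
        # Adding means: retrieved sequences join the input set.
        working.extend(new_hits)
    return motif, working
```

For example, calling `extract_and_expand(["MKTAYIAKQR", "GGTAYIAKLL"], ["PPTAYIAKWW", "AAAAAA"])` extracts the shared fragment from the two input sequences, retrieves the one stored sequence that contains it, and adds that sequence to the working set before the next round.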
The genetic motif extracting and processing apparatus may further comprise: motif extraction range designating means for designating, in the gene arrangement information input by the gene arrangement information inputting means, a motif extraction range for the motif extracting means; wherein the motif extracting means extracts a genetic motif from within the extraction range designated by the motif extraction range designating means.
According to such a constitution, if a motif extraction range is designated by the motif extraction range designating means, a genetic motif is extracted from within the designated extraction range. Thus, from among input gene arrangement information of various organisms, only gene arrangement information of similar organisms can be designated as an extraction range, to thereby facilitate clarification of functions and structures of gene arrangements.
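As a hedged illustration of such range designation, the following sketch restricts extraction to a chosen subset of the input sequences and, optionally, to a span of positions within them. The function name `apply_extraction_range`, the `rows` and `columns` parameters, and the half-open column convention are all assumptions introduced for this example only.

```python
from typing import List, Optional, Tuple


def apply_extraction_range(seqs: List[str],
                           rows: List[int],
                           columns: Optional[Tuple[int, int]] = None) -> List[str]:
    """Return only the designated part of the input.

    rows    : indices of the input sequences to keep (for example,
              those belonging to similar organisms).
    columns : optional half-open span (start, end) of positions to
              keep within each selected sequence (an assumption).
    """
    selected = [seqs[i] for i in rows]
    if columns is not None:
        start, end = columns
        selected = [s[start:end] for s in selected]
    return selected
```

The list returned by `apply_extraction_range` would then be passed to whatever motif extracting routine is in use, so that sequences outside the designated range do not influence the extracted motif.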
Further, the genetic motif extracting and processing apparatus may further comprise gene arrangement information editing means for editing the gene arrangement information.
According to such a constitution, various edits are made to the gene arrangement information by the gene arrangement information editing means, thereby allowing motif extraction in accordance with the intention of a user.
In addition, the genetic motif extracting and processing apparatus may further comprise motif editing means for editing the motif extracted by the motif extracting means.
According to such a constitution, various edits are made to the extracted motif by the motif editing means, thereby allowing retrieval of gene arrangement information in accordance with the intention of a user.
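One conceivable form of motif editing, offered purely as an assumed example, is to relax selected positions of an extracted motif into wildcards so that retrieval also returns sequences that differ at those positions. In the sketch below, the lowercase `x` wildcard, the function names and the conversion to a regular expression are hypothetical choices and are not prescribed by the present description.

```python
import re
from typing import List


def relax_positions(motif: str, positions: List[int], wildcard: str = "x") -> str:
    """Edit a motif by replacing the given positions with a wildcard."""
    chars = list(motif)
    for p in positions:
        chars[p] = wildcard
    return "".join(chars)


def motif_to_regex(motif: str, wildcard: str = "x") -> "re.Pattern":
    """Translate the edited motif into a regular expression in which
    the wildcard matches any single residue."""
    return re.compile(
        "".join("." if ch == wildcard else re.escape(ch) for ch in motif)
    )


def retrieve_with_motif(motif: str, database: List[str]) -> List[str]:
    """Return stored sequences containing a match for the edited motif."""
    pattern = motif_to_regex(motif)
    return [s for s in database if pattern.search(s)]
```

Under these assumptions, editing `"TAYIAK"` with `relax_positions("TAYIAK", [2])` yields `"TAxIAK"`, which then matches stored sequences carrying any residue at the third position of the motif.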
The genetic motif extracting and processing apparatus may further comprise alignment means for alignment-processing a plurality of gene arrangement information input by the gene arrangement information inputting means.
According to such a constitution, since the plurality of pieces of input gene arrangement information are alignment-processed by the alignment means, gene arrangement information assumed to be necessary may be input in any order, thereby improving the efficiency of the input operation for gene arrangement information. Further, since users can visually recognize similar regions among the plurality of pieces of input gene arrangement information, it becomes possible to readily designate a motif extraction range by the motif extraction range designating means.
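To illustrate how alignment makes similar regions visible, the following is a deliberately simplified sketch that shifts each sequence against the first one, without inserting internal gaps, so that the largest number of identical residues line up. The function names are hypothetical, non-empty sequences are assumed, and this ungapped procedure is only a stand-in; an actual alignment means would typically allow gaps and use a scoring scheme.

```python
from typing import List


def best_offset(ref: str, seq: str) -> int:
    """Shift of seq relative to ref (seq[j] is placed at column
    j + offset of ref) that maximizes identical positions."""
    best, best_score = 0, -1
    for offset in range(-(len(seq) - 1), len(ref)):
        score = sum(
            1 for j, ch in enumerate(seq)
            if 0 <= j + offset < len(ref) and ref[j + offset] == ch
        )
        if score > best_score:
            best, best_score = offset, score
    return best


def align_ungapped(seqs: List[str]) -> List[str]:
    """Pad every sequence with '-' on both sides so that each one is
    placed at its best shift relative to the first sequence."""
    ref = seqs[0]
    offsets = [0] + [best_offset(ref, s) for s in seqs[1:]]
    left = min(offsets)
    starts = [o - left for o in offsets]
    width = max(st + len(s) for st, s in zip(starts, seqs))
    return [
        "-" * st + s + "-" * (width - st - len(s))
        for st, s in zip(starts, seqs)
    ]
```

When the rows returned by `align_ungapped` are displayed one above another, a shared region appears as a vertical block of identical characters, which is the kind of visual cue that assists designation of an extraction range.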
The genetic motif extracting and processing apparatus may further comprise motif storing means for storing motifs; and motif registering means for registering the motif extracted by the motif extracting means into the motif storing means.
According to such a constitution, the extracted motif is registered into the motif storing means by the motif registering means, so that it becomes possible for another human operator to reutilize the extracted motif, thereby allowing improvement of a working efficiency for clarifying functions and structures of gene arrangement information.
The genetic motif extracting and processing apparatus may further comprise motif displaying means for displaying at least one motif from those motifs registered in the motif storing means.
According to such a constitution, since motifs registered in the motif storing means are displayed as required, various motifs can be readily compared with one another, so that clarification of functions and structures of gene arrangements can be readily performed.
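The registering and displaying of motifs described above might, as one assumed arrangement, be organized as a small keyed store. The class name `MotifStore`, its methods, the `note` field and the tab-separated display format are hypothetical; an in-memory dictionary is used here only to keep the sketch short, whereas a practical motif storing means would more likely be a persistent database.

```python
from typing import Dict, List, Optional


class MotifStore:
    """In-memory stand-in for the motif storing means (an assumption)."""

    def __init__(self) -> None:
        self._motifs: Dict[str, Dict[str, str]] = {}

    def register(self, name: str, pattern: str, note: str = "") -> None:
        """Motif registering means: store a motif under a unique name."""
        if name in self._motifs:
            raise ValueError("motif already registered: " + name)
        self._motifs[name] = {"pattern": pattern, "note": note}

    def display(self, names: Optional[List[str]] = None) -> List[str]:
        """Motif displaying means: one tab-separated line per motif,
        for the named motifs or, by default, all of them in name order."""
        chosen = sorted(self._motifs) if names is None else names
        return [
            f"{n}\t{self._motifs[n]['pattern']}\t{self._motifs[n]['note']}"
            for n in chosen
        ]
```

Because each registered motif is kept under a name rather than in a printout, another operator can later list several motifs side by side with `display` and feed any of them back into retrieval.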
As a second aspect to achieve the aforementioned object, the present invention provides a genetic motif extracting and processing method comprising: a gene arrangement information inputting process for inputting at least one piece of gene arrangement information; a motif extracting process for extracting a genetic motif from the gene arrangement information input by the gene arrangement information inputting process; a gene arrangement information retrieving process for retrieving, based on the motif extracted by said motif extracting process, gene arrangement information including said motif as a part thereof, from a gene arrangement information database; and a gene arrangement information adding process for adding the gene arrangement information retrieved by the gene arrangement information retrieving process to the gene arrangement information input by the gene arrangement information inputting process.
According to such a constitution, when extracting a genetic motif, at least one piece of gene arrangement information is input through the gene arrangement information inputting process. Then, a genetic motif is extracted from the input gene arrangement information by the motif extracting process. Further, gene arrangement information including, as a part thereof, the extracted motif is retrieved from the gene arrangement information database, by the gene arrangement information retrieving process. The retrieved gene arrangement information is added by the gene arrangement information adding process, as required, to the gene arrangement information input by the gene arrangement information inputting process. Thereafter, motif extraction by the motif extracting process and retrieval of gene arrangement information by the gene arrangement information retrieving process are repeated, to thereby clarify functions and structures of gene arrangement information gradually. Namely, the gene arrangement information including, as a part thereof, the extracted motif can be added to the input gene arrangement information, so that the extracted motif is reutilized to thereby improve a motif extracting efficiency for presuming functions and structures of genes.
As a third aspect to achieve the aforementioned object, the present invention provides a recording medium recorded with a genetic motif extracting and processing program for realizing: a gene arrangement information inputting function for inputting at least one piece of gene arrangement information; a motif extracting function for extracting a genetic motif from the gene arrangement information input by the gene arrangement information inputting function; a gene arrangement information retrieving function for retrieving, based on the motif extracted by said motif extracting function, gene arrangement information including said motif as a part thereof, from a gene arrangement information database; and a gene arrangement information adding function for adding the gene arrangement information retrieved by the gene arrangement information retrieving function to the gene arrangement information input by the gene arrangement information inputting function.
In this respect, the term “recording medium” means a medium which is capable of reliably recording electronic information and of reliably reading out the recorded electronic information as required, and which includes a portable recording medium such as a magnetic tape, magnetic disk, magnetic drum, IC card, or CD-ROM.
According to such a constitution, the recording medium is recorded with the genetic motif extracting and processing program for realizing the gene arrangement information inputting function, motif extracting function, gene arrangement information retrieving function, and gene arrangement information adding function. Thus, those who obtain the recording medium recorded with such a program can readily construct the genetic motif extracting and processing apparatus according to the present invention, utilizing a general computer system.
Further objects, features and advantages of the present invention will become more apparent from the following description of preferred embodiments when read in conjunction with the accompanying drawings.