Owing to recent advances in molecular biology, an enormous amount of gene information is now available. As a consequence, computational methods are needed to extract information from the rapidly growing volume of newly determined sequence data and gene expression data. The development of various computer tools for homology screening, protein classification, gene pooling, and the like has attracted attention.
In connection with these efforts, a small number of studies are known that relate to methods of inferring a gene regulation network (hereinafter referred to as a “gene network”) from gene expression data. Gene expression data can be obtained either as time series data, that is, data obtained by measuring the expression amounts of a subject group of genes over the course of time, or as steady state data, that is, data obtained by measuring the expression amounts of a subject group of genes under a plurality of differing experimental conditions (for example, gene mutation or administration of a medicament).
Time series analysis can predict a network using various techniques, for example, information theory, genetic algorithms, or simulated annealing (Non-Patent Document 1). However, an approach based on time series analysis requires experimental results that are obtained at very short intervals and are free of experimental noise. This is very difficult to achieve with current techniques.
On the other hand, a number of methods of analyzing steady state data have already been proposed. Steady state data can be obtained by altering the activity of a specific gene, for example, by deleting or over-expressing the gene. Deletion is presently being performed on a large scale by the Yeast Genome Deletion Consortium and the like, and as a result, deletion-type expression profiles for various genes will become readily available in the near future (Non-Patent Document 2).
The present inventors have developed a new method and program for predicting a gene network. The method uses, as basic data, a gene expression profile (detected values) obtained by inducing mutation, and predicts the gene network without simplifying (binarizing) the gene expression amounts (Patent Document 1). With this method, one of a plurality of genes is expressed under two conditions, and the expression amount of this gene is detected for each of the conditions. At the same time, the expression amount of each of the other genes is detected for each of the conditions. The difference between the detected values obtained under the two conditions is then determined and used as an indicator to derive the causal relationship between the one gene and each of the others.
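The difference-based inference described above can be illustrated with a minimal sketch. It is not the procedure of Patent Document 1 itself. It assumes that the two conditions are a reference (wild-type) condition and a deletion of the perturbed gene, and that a fixed cut-off decides whether a difference is meaningful. The function name `infer_edges`, the `threshold` parameter, and the example values are all hypothetical.

```python
def infer_edges(wild_type, mutant_profiles, threshold):
    """Derive candidate causal relationships from expression differences.

    wild_type: dict mapping gene -> expression amount under the reference condition.
    mutant_profiles: dict mapping a deleted gene -> dict of gene -> expression
        amount measured when that gene is deleted.
    threshold: assumed cut-off below which a difference is treated as noise.
    Returns a list of (regulator, target, sign) triples.
    """
    edges = []
    for regulator, profile in mutant_profiles.items():
        for target, level in profile.items():
            if target == regulator:
                continue  # skip the perturbed gene itself
            diff = level - wild_type[target]
            if abs(diff) <= threshold:
                continue  # no detectable effect on this target
            # Assumption: deleting an activator lowers the target level,
            # deleting a repressor raises it.
            sign = "+" if diff < 0 else "-"
            edges.append((regulator, target, sign))
    return edges


# Hypothetical expression amounts, for illustration only.
wild_type = {"a": 1.0, "b": 1.0, "c": 1.0}
mutant_profiles = {
    "a": {"a": 0.0, "b": 0.2, "c": 1.05},  # deletion of gene a
    "b": {"a": 1.0, "b": 0.0, "c": 2.0},   # deletion of gene b
}
edges = infer_edges(wild_type, mutant_profiles, threshold=0.5)
```

Because the comparison uses the measured amounts directly, no binarization of the expression data is needed; only the final decision about whether a relationship exists depends on the assumed cut-off.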
Patent Document 1 also discloses a method for predicting a gene network that detects and removes indirect causal relationships (referred to as “redundant causal relationships” in Patent Document 1) from a given gene network.
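One simplified way to picture the removal of indirect causal relationships is sketched below. It assumes an acyclic network and ignores whether each relationship is activating or repressing, so it is only an illustration and not the procedure disclosed in Patent Document 1. Under these assumptions, a relationship from one gene to another is treated as indirect when the target can still be reached through at least one intermediate gene. The function name `remove_indirect_edges` is hypothetical.

```python
def remove_indirect_edges(edges):
    """Drop every edge that is also explained by a longer path.

    edges: set of (source, target) pairs forming an acyclic network.
    Returns the set of edges that have no alternative path.
    """
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)

    def reachable(start, goal, skipped_edge):
        # Depth-first search from start to goal that never uses skipped_edge.
        stack, seen = [start], {start}
        while stack:
            node = stack.pop()
            for nxt in successors.get(node, ()):
                if (node, nxt) == skipped_edge:
                    continue
                if nxt == goal:
                    return True
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    return {e for e in edges if not reachable(e[0], e[1], skipped_edge=e)}


# Hypothetical network: a -> b -> c, plus a -> c, which is explained by the path via b.
network = {("a", "b"), ("b", "c"), ("a", "c")}
direct_only = remove_indirect_edges(network)
```

In this example the relationship from a to c is removed because it can be accounted for by the chain a to b to c, while the two direct relationships are kept.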
Non-Patent Document 1: Liang, S. et al., Proc. Pacific Symp. Biocomputing '98, World Scientific, 18-29, 1998; Morohashi, M. and Kitano, H., Proc. 5th Euro. Conf. Artificial Life, Springer, 477-486, 1999; Mjolsness, E. et al., Tech. Rept. JPL-ICTR-99-4, Jet Propulsion Lab., NASA, 1999.
Non-Patent Document 2: Winzeler, E. A. et al., Science, 285 (5429): 901-906, 1999.
Patent Document 1: WO 2002/038749