1. Field of the Invention
The present invention relates to a method for processing expression data of two or more genes. For example, it is suitable as a method for processing expression data of a large number of genes on a DNA microarray and applying the results to diagnosis.
2. Description of Related Art
In recent years, studies have been made on extracting useful data from the large quantity of gene expression data obtained from a DNA microarray and analyzing that data. Logistic regression and similar methods are known as general medical statistical methods for this extraction and analysis.
Gene expression data can also be processed by the logistic regression method to obtain useful information.
However, where the gene expression data is processed for the diagnosis of patients, high reliability is required of the results. Because logistic regression is a linear analysis method, it is difficult for it to predict with high precision a non-linear phenomenon, such as one addressed by DNA analysis, in which the result is determined by a combination of two or more factors. Therefore, the NN (Neural Network) modeling method has recently been proposed as a non-linear analysis method. The NN modeling method incorporates a learning process, and the precision of its predictions is very high. In the diagnosis of diseases involving two or more factors, for example, the NN method offers higher prediction precision than the logistic regression method.
On the other hand, in diagnosis it is desirable to indicate the evidence leading to the diagnosis. As a method capable of deriving such evidence together with the diagnostic result, the present inventors have directed their attention to the FNN (Fuzzy Neural Network) model, which is one type of NN model.
To construct such an FNN model, as with the NN model, it is necessary to determine the parameters (coefficients) and the input variables.
Conventionally, the back propagation method has been proposed as a method for determining these parameters and input variables. However, the human genome is said to contain 30,000 or more genes, all of which are candidates for input variables. It is impossible to build an NN with 30,000 inputs, so the important genes must be selected from among them. In the NN method, the fewer the input variables, the easier they are to process, and the practical upper limit is about 20 input variables. When 30,000 genes are narrowed down to 20, the number of combinations is practically unlimited (about 10^70), so the candidate models cannot all be compared by determining the parameters of every one. Moreover, even with a simple input variable selection method such as the parameter increasing method, the number of candidate inputs is so large that selecting a single input takes 100 hours or more. Because the calculation time increases exponentially as the number of input variables increases, selecting up to around 5 inputs, as needed for a model of substantially high reliability, is expected to take 50 times as long or more.
Furthermore, only one model is obtained after all this time. This is because the parameters are determined by the back propagation method, so constructing a single model takes too long. As described later, the causal relationship between gene expression and the diagnosis of the onset or prognosis of a disease is not determined by a single pathway; two or more pathways are involved.
Therefore, to predict the causal relationship between gene expression and the diagnosis of diseases with high precision, two or more models must be proposed; constructing only one model is not sufficient.
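The scale of the input selection problem described above can be checked with a short calculation. The following Python sketch is illustrative only and is not part of the described method. It counts the ways of choosing 20 genes from 30,000 and, assuming that the parameter increasing method adds one input per step after trying every remaining candidate, counts the candidate models evaluated when 5 inputs are selected. The function names are hypothetical. The exact count of 20-gene subsets is on the order of 10^71, the same astronomical scale as the approximate figure given above.

```python
import math


def count_gene_subsets(n_genes: int, n_inputs: int) -> int:
    """Number of distinct ways to choose n_inputs genes out of n_genes."""
    return math.comb(n_genes, n_inputs)


def stepwise_candidate_count(n_genes: int, n_inputs: int) -> int:
    """Candidate models tried when inputs are added one at a time.

    Assumption: at step k (starting from 0), each of the n_genes - k genes
    not yet selected is tried once as the next input.
    """
    return sum(n_genes - k for k in range(n_inputs))


subsets = count_gene_subsets(30_000, 20)
order_of_magnitude = len(str(subsets)) - 1  # floor(log10) of the exact integer
candidates = stepwise_candidate_count(30_000, 5)

print(f"20-gene subsets of 30,000 genes: about 10^{order_of_magnitude}")
print(f"candidate models tried to select 5 inputs stepwise: {candidates}")
```

The second figure counts candidate models only. It does not account for the time needed to determine the parameters of each model, which the description above identifies as the dominant cost.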