Many diseases and disorders, such as cancer, have very complex genetic and phenotypic abnormalities and an unpredictable biological behaviour. The cancer cell represents the end-point of successive generations of clonal cell evolution, multiple gene mutations, genomic instability, and erroneous gene expression.
The biological behaviour of cancer is determined by multiple factors: most importantly the biological characteristics of the individual cancer, but also the biology of the patient (such as age, sex, race and genetic constitution) and the location of the cancer. This biological and genetic complexity means that any individual cancer may follow an unpredictable clinical course, with an uncertain outcome for the patient.
Where multiple treatment options are available for a particular cancer, it is necessary to have an accurate prognosis for the patient, so that treatment can be tailored to the individual disease of that patient.
The clinical and information tools currently available to clinicians for the classification and prognostic evaluation of cancer have serious limitations, especially when applied to an individual patient. It would be desirable to integrate the clinical and biological information available for an individual patient's particular cancer.
Gene expression data can be obtained using standard microarrays. Gene expression microarrays, such as those available from Affymetrix™, provide a large volume of data and can be used to characterise a particular disease or condition in a patient by comparing diseased or abnormal tissue with healthy normal tissue. However, the data obtained can be difficult to process in order to extract meaningful information about a particular condition or disease.
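By way of illustration only, the comparison of diseased tissue with healthy tissue can be sketched as a per-gene log2 fold change. This measure is a common choice assumed here for the example; it is not specified in the text above. The function name, the pseudocount parameter and the gene names are hypothetical.

```python
import math

def log2_fold_changes(diseased, normal, pseudocount=1.0):
    """Return the log2 ratio of diseased to normal expression for each shared gene.

    `diseased` and `normal` map a gene name to an expression intensity.
    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    return {
        gene: math.log2((diseased[gene] + pseudocount) / (normal[gene] + pseudocount))
        for gene in diseased
        if gene in normal
    }

# Hypothetical intensities for two made-up genes.
diseased_sample = {"GENE_A": 7.0, "GENE_B": 1.0}
normal_sample = {"GENE_A": 1.0, "GENE_B": 3.0}
changes = log2_fold_changes(diseased_sample, normal_sample)
```

A positive value indicates higher expression in the diseased tissue and a negative value indicates lower expression. With many thousands of genes per array, such a table of values is large, which is part of the reason the data can be difficult to interpret.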
This problem is particularly acute in medical applications relating to patient treatment. In order to influence patient management in a clinical environment, clinical decision support systems must have a high level of confidence. Shipp et al. have elegantly demonstrated the potential of machine learning techniques for the prognostic stratification of patients; however, their approach misclassified 30% of the patients in terms of predicting the outcome of their treatment. They achieved 70% correct prognosis of cured cases of B-cell lymphoma, and wrongly predicted 12% of the cases as cured when the actual outcome was fatal. This accuracy is not adequate for clinical application of the model. The models presented in Alizadeh et al. for the same data are not clinically applicable either.
Another difficulty with prior art approaches using machine learning is that they often do not provide an easy means for adapting the model should new data become available. Instead, complete retraining of the model is required. This is time-consuming and potentially expensive, as it requires intensive computational resources. It is desirable that any system used be able to adapt to the addition of new data without complete retraining of the system.
It has been found by the inventors that, in general, techniques utilising Evolving Connectionist Systems (ECOS) have the following advantages when compared with traditional statistical and neural network techniques: (i) they have a flexible structure that reflects the complexity of the data used for their training; (ii) they perform both clustering and classification/prediction; (iii) the models can be adapted on new data without the need to be retrained on old data; (iv) they can be used to extract rules (profiles) of different sub-classes of samples. The rules (profiles) are fuzzy, with some statistical coefficients attached.
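The listed properties can be illustrated with a deliberately simplified sketch. It is not the ECOS algorithm itself: it is a toy one-pass prototype classifier, written under the assumption that samples are numeric vectors compared by Euclidean distance, and the class name, method names and radius parameter are all hypothetical. It grows a node whenever a sample falls outside every existing node of its class, learns from each sample once without revisiting earlier data, and exposes each node as a crude (non-fuzzy) profile.

```python
import math

class EvolvingPrototypeClassifier:
    """Toy illustration only; a hypothetical stand-in, not an ECOS implementation."""

    def __init__(self, radius=1.0):
        self.radius = radius  # assumed maximum distance for merging a sample into a node
        self.nodes = []       # each node: {"centre": [...], "label": ..., "count": int}

    def _nearest(self, x, label=None):
        """Return (node, distance) for the closest node, optionally of one label."""
        best, best_d = None, math.inf
        for node in self.nodes:
            if label is not None and node["label"] != label:
                continue
            d = math.dist(x, node["centre"])
            if d < best_d:
                best, best_d = node, d
        return best, best_d

    def learn_one(self, x, label):
        """Adapt to a single new sample without retraining on earlier samples."""
        node, d = self._nearest(x, label)
        if node is not None and d <= self.radius:
            # Move the centre to the running mean of the samples it has absorbed.
            node["count"] += 1
            n = node["count"]
            node["centre"] = [c + (xi - c) / n for c, xi in zip(node["centre"], x)]
        else:
            # The structure grows to reflect the complexity of the data.
            self.nodes.append({"centre": list(x), "label": label, "count": 1})

    def predict(self, x):
        """Classify by the label of the nearest node (None if nothing learned yet)."""
        node, _ = self._nearest(x)
        return None if node is None else node["label"]

    def profiles(self):
        """Return one (label, centre, sample count) profile per node."""
        return [(n["label"], n["centre"], n["count"]) for n in self.nodes]

# Hypothetical two-gene expression vectors with made-up outcome labels.
model = EvolvingPrototypeClassifier(radius=1.0)
for sample, outcome in [([0.1, 0.2], "cured"), ([0.3, 0.1], "cured"), ([2.0, 2.1], "fatal")]:
    model.learn_one(sample, outcome)

# A later sample is absorbed on its own; the earlier samples are not needed again.
model.learn_one([5.0, 5.0], "cured")
```

In this sketch the two nearby "cured" samples merge into one node, the "fatal" sample and the later distant "cured" sample each create their own node, and `profiles()` lists the resulting sub-classes. A real ECOS model would additionally attach fuzzy membership information and statistical coefficients to each rule, which this example omits.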
It is therefore an object of the present invention to provide a method for determining a relationship between gene expression data and one or more conditions or prognostic outcomes, or at least to provide the public with a useful choice.