2.1. General
In large-scale data analysis, neural networks have supplemented or replaced conventional methods of analysis in many fields. This is because neural networks have been shown to be better than conventional methods at discovering and identifying hidden, not immediately evident dependencies between individual input data within the datasets. When new data of the same type is input, neural networks which have been trained using a known dataset therefore deliver more reliable results than earlier methods of analysis.
In the field of medical applications for example, the use of neural networks to determine a survival function for patients suffering from a particular disease, such as cancer, is known. Said survival function indicates the probability of a predetermined event occurring for the patient in question depending on the time that has elapsed since the first occurrence of the disease. Said predetermined event need not necessarily be the death of the patient, as would be inferred from the designation “survival function”, but may be any event, for example a recurrence of cancer.
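The survival function described above can be made concrete with a minimal sketch. The function name and the sample event times below are purely hypothetical; the sketch merely illustrates the definition S(t) = P(event occurs later than t):

```python
# Illustrative sketch only: an empirical survival function estimated from
# a list of made-up event times (here: months until cancer recurrence).
def empirical_survival(event_times, t):
    """Fraction of patients whose event occurred later than time t."""
    return sum(1 for T in event_times if T > t) / len(event_times)

recurrence_months = [5, 12, 18, 24, 30, 41]  # hypothetical data
print(empirical_survival(recurrence_months, 20))  # → 0.5
```

As required of a survival function, the value starts at 1 for t = 0 and decreases monotonically as the elapsed time grows.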
The data records comprise a whole range of objectifiable information, that is to say data on whose value the neural network operator has no influence and whose value can be automatically captured if desired. In the case of breast cancer this is information about the patient's personal data, such as age, sex and the like, information about the medical condition, such as the number of lymph nodes affected by cancer, biological tumor factors such as uPA (Urokinase Plasminogen Activator), its inhibitor PAI-1 and similar factors, as well as information about the treatment method, for example the type, duration and intensity of chemotherapy or radiotherapy. It goes without saying that a whole range of the abovementioned information, in particular the information about the medical condition, can only be determined using suitable measuring apparatus. Furthermore, the personal data can be automatically read in from suitable data media, for example machine-readable identity cards or the like. If the objectifiable data are not all available at the same time, which is often the case especially with laboratory measurements, they can of course be temporarily stored in a database on a suitable storage medium before they are fed to the neural network as input data.
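As a rough sketch, one such data record might be represented as follows. All field names and values are hypothetical and merely mirror the categories of objectifiable information listed above:

```python
from dataclasses import dataclass

@dataclass
class PatientRecord:
    # personal data (could be read in from a machine-readable card)
    age: int
    sex: str
    # medical condition, determined with suitable measuring apparatus
    affected_lymph_nodes: int
    upa_level: float     # uPA (Urokinase Plasminogen Activator)
    pai1_level: float    # its inhibitor PAI-1
    # treatment method
    therapy_type: str
    therapy_duration_weeks: int

# hypothetical record, e.g. as stored in a database before training
record = PatientRecord(58, "female", 3, 2.4, 11.0, "chemotherapy", 12)
print(record.affected_lymph_nodes)  # → 3
```

A collection of such records could then be held in a database until all laboratory values are available, as described above.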
2.2. The Neural Network as Signal Filter
In accordance with the foregoing, therefore, it is possible to conceive of a neural network as a kind of “signal filter” that filters out a meaningful output signal from a noisy, and therefore as yet non-meaningful input signal. As with any filter, whether or how well the filter is able to fulfill its function depends on whether it is possible to keep the intensity of the filter's intrinsic noise low enough that the signal to be filtered out is not lost in this intrinsic noise.
The intensity of a neural network's “intrinsic noise” decreases as the number of data records available for training increases on the one hand, and as the structure of the neural network becomes simpler on the other. Moreover, the simpler the structure of the neural network, the better its generalizability. In the conventional prior-art procedure, therefore, one part of the training of neural networks is concerned with locating and eliminating parts of the structure that can be dispensed with while still obtaining a meaningful output signal. With this “thinning out” (also known as “pruning” in the jargon), however, a further constraint must be taken into account: the structure of the neural network cannot be “pruned” ad infinitum, because as the complexity of the neural network is reduced, its ability to map complex interrelationships, and hence its meaningfulness, is also diminished.
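One common form of such thinning out is magnitude-based pruning, sketched below under the assumption that connections with very small weights contribute little to the output signal. The function name, the weights and the threshold are illustrative only, not the specific pruning method referred to above:

```python
# Illustrative sketch of magnitude-based pruning: connections whose
# absolute weight falls below a threshold are removed (set to zero).
def prune_by_magnitude(weights, threshold):
    """Zero out connections whose absolute weight is below the threshold."""
    return [w if abs(w) >= threshold else 0.0 for w in weights]

layer_weights = [0.8, -0.05, 0.3, -0.9, 0.02]  # made-up weights
print(prune_by_magnitude(layer_weights, 0.1))  # → [0.8, 0.0, 0.3, -0.9, 0.0]
```

Raising the threshold prunes more aggressively, which mirrors the trade-off described above: a simpler structure, but a reduced ability to map complex interrelationships.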
2.3. Problems with Medical Application
In practice, and in particular in the case of the medical application of neural networks mentioned at the beginning, the problem is often encountered that only very small datasets of typically a few hundred data records are available for training the neural network. To compound the difficulty, not only a training dataset, but also a validation dataset and a generalization dataset must be provided for the training. The significance of said two datasets will be discussed in greater detail below in sections 5.5 and 5.7.
With such small datasets, the use of known pruning methods always led to so great a simplification of the structure of the neural network that its meaningfulness diminished to an unacceptable level. To nevertheless obtain neural networks that delivered meaningful output signals after completion of the training phase, the prior art used neural networks with a rigid, that is to say fixed and invariable, structure wherever only small training datasets were available. The degree of complexity, or simplicity, of this rigid structure was selected on the basis of empirical knowledge in such a way that the neural network had on the one hand a high degree of meaningfulness while on the other hand still having an acceptable intrinsic noise level. It has hitherto been assumed that the specification of an invariable structure was unavoidable.
Another problem with medical applications of neural networks is the fact that only “censored” data are available for training. The term “censored” is used to denote the circumstance that it is not possible to foresee the future development for patients who have fortunately not yet suffered a relapse at the time of data capture, and statements about the survival function are therefore only possible up until the time the data were recorded.
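How such censored observations enter a survival estimate can be illustrated with the classical Kaplan-Meier estimator. This is not the neural-network method discussed in this document; it is shown only to make the role of censoring concrete, using made-up follow-up times:

```python
# Illustrative sketch: Kaplan-Meier survival curve for right-censored data.
def kaplan_meier(times, observed):
    """Survival curve from possibly censored follow-up data.

    times    -- follow-up time of each patient
    observed -- True if the event occurred, False if the patient was
                censored (no relapse up to the time of data capture)
    """
    s, curve = 1.0, []
    data = sorted(zip(times, observed))
    n_at_risk = len(data)
    for t, event in data:            # assumes distinct event times
        if event:
            s *= (n_at_risk - 1) / n_at_risk
            curve.append((t, s))
        n_at_risk -= 1               # censored patients leave the risk set
    return curve

print(kaplan_meier([2, 3, 5, 8], [True, False, True, True]))
# → [(2, 0.75), (5, 0.375), (8, 0.0)]
```

Note that the censored patient (time 3) never causes a downward step in the curve; they merely shrink the set of patients still at risk, which is exactly why statements beyond the time of data capture are not possible.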
It goes without saying that, in medical applications in particular, a truly meaningful result cannot be foregone under any circumstances whatsoever. It is never acceptable for even a single patient to be denied a treatment simply because the neural network did not consider it necessary; the consequences for the patient could be incalculable.
With respect to the details of the prior art outlined above, please see the articles listed in section 6. “References”.