Learning-capable systems such as neural nets are being used increasingly for risk assessment, because they are capable of recognizing and representing complex relationships between measured factors and outcomes that are not known a priori. This capability allows them to provide more reliable and/or more precise risk probability estimates than conventional procedures that are forced to assume a special form of the relationship, such as linear dependence.
In the field of medical applications, e.g., in treatment of cancer, the use of learning-capable systems such as neural nets or recursive partitioning (such as the well-known CART, “Classification and Regression Trees”, see for example: L Breiman et al., “Classification and Regression Trees”, Chapman and Hall, New York (1984)) for assessment of the risk probability of an event is known, even for censored data. (Outcome data is known as “censored” if some events that eventually occur are not necessarily observed due to the finite observation time.) An example of the application of learning-capable systems in cancer is the task of determining, at a point in time just after primary therapy, a patient's risk probability (say, risk of future disease (relapse)), in order to support the therapy decision.
The “factors” of the data sets comprise a set of objective characteristics whose values are not influenced by the person operating the learning-capable system. In the case of primary breast cancer, these characteristics may typically comprise:
    Patient age at time of surgery
    Number of affected lymph nodes
    Laboratory measurement of the factor uPA
    Laboratory measurement of the factor PAI-1
    Characteristic of tumor size
    Laboratory measurement of the estrogen receptor
    Laboratory measurement of the progesterone receptor
The form of therapy actually administered can also be coded as a factor so that the system also recognizes relationships between therapy and outcome.
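As an illustration of how such factors, including the administered therapy, might be coded for presentation to a learning-capable system, the following is a minimal sketch. The field names, the list of therapy forms, and the one-hot coding of therapy are assumptions made for this example; the text does not prescribe a particular coding.

```python
# Hypothetical field names for the characteristics listed above.
FACTOR_NAMES = [
    "age_at_surgery",
    "affected_lymph_nodes",
    "upa",
    "pai1",
    "tumor_size",
    "estrogen_receptor",
    "progesterone_receptor",
]

# Assumed therapy forms, for illustration only.
THERAPY_FORMS = ["none", "chemotherapy", "endocrine"]


def encode_patient(record):
    """Turn one patient record (a dict) into a flat list of numbers.

    Measured factors are passed through as floats; the therapy actually
    administered is appended as a one-hot block so that it is treated as
    just another factor.
    """
    vector = [float(record[name]) for name in FACTOR_NAMES]
    vector += [1.0 if record["therapy"] == form else 0.0 for form in THERAPY_FORMS]
    return vector


example = {
    "age_at_surgery": 57,
    "affected_lymph_nodes": 2,
    "upa": 3.1,
    "pai1": 14.0,
    "tumor_size": 2.5,
    "estrogen_receptor": 120,
    "progesterone_receptor": 80,
    "therapy": "chemotherapy",
}
encoded = encode_patient(example)
```

In practice the measured values would usually also be rescaled before training; that step is omitted here.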
The values are stored on an appropriate storage medium and are presented to the learning-capable system. However, as a rule, individual measurements are subject to uncertainty analogous to the noise in a measured signal. The task of the learning-capable system is to process these noisy values into refined signals which provide, within the framework of an appropriate probability representation, a risk assessment.
The learning capability of networks even for nonlinear relationships is a consequence of their architecture and functionality. For example, a so-called “multilayer perceptron” (abbreviated “MLP” in the literature) comprises, in its simplest form, one input layer, one hidden layer, and one output layer. The “hidden nodes” present in a neural net serve the purpose of generating signals for the probability of complex internal processes. Hence, they have the potential to represent and reveal, for example, underlying aspects of biological processes that are not directly observable but which nonetheless are ultimately critical for the future course of a disease.
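The signal flow through such a network can be sketched as follows. This is a minimal illustration with one hidden layer and a single output node; the logistic activation function and all weight values are assumptions chosen for the example, not details taken from the text.

```python
import math


def sigmoid(x):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def mlp_forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass: input layer -> hidden layer -> one output node.

    hidden_weights holds one row of weights per hidden node.
    Returns the hidden-node signals and the output signal.
    """
    hidden = [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + bias)
        for row, bias in zip(hidden_weights, hidden_biases)
    ]
    output = sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)
    return hidden, output


# Two input factors, two hidden nodes, illustrative weights.
hidden, risk = mlp_forward(
    inputs=[0.4, -1.2],
    hidden_weights=[[0.8, -0.3], [-0.5, 0.9]],
    hidden_biases=[0.1, -0.2],
    output_weights=[1.5, -0.7],
    output_bias=0.0,
)
```

Each entry of `hidden` is the signal of one hidden node; it is these intermediate signals that can come to stand for internal processes not directly observable in the input factors.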
Internal biological processes can proceed in parallel, at different rates, and can also interact. Learning-capable systems are capable of recognizing and representing even such internal processes that are not directly observable; in such cases, the quality of this recognition manifests itself indirectly, after learning has taken place, by virtue of the quality of the prediction of the events actually observed.
By recursive partitioning (e.g., CART), classification schemes are created that are analogous to the capabilities of neural nets in their representation of complex internal relationships.
The course of a disease may lead to distinct critical events whose prevention might require different therapy approaches. In the case of first relapse in breast cancer, for example, it is possible to classify findings uniquely into the following mutually exclusive categories:
    1. “distant metastasis in bone tissue”
    2. “distant metastasis but no findings in bone”
    3. “loco-regional” relapse.
Now, once one of these events has occurred, the subsequent course of the disease, in particular the probability of the remaining categories, can be affected; hence, in a statistical treatment of such data it is often advisable to investigate just first relapses. For illustration, in the case of a breast cancer patient suffering local relapse at 24 months after primary surgery and observed with “bone metastasis” at 48 months, only category 3 is relevant if one restricts to first relapse. The follow-up information on bone metastasis would not be used in this framework, i.e., the patient is regarded as “censored” for category 1 as soon as an event in another “competing” category (here local relapse) has occurred.
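The first-relapse rule described above can be sketched as follows. The data layout (a list of `(month, category)` pairs per patient) and the function names are assumptions made for this illustration.

```python
# Category numbers follow the three first-relapse categories in the text.
CATEGORIES = (1, 2, 3)


def first_relapse(events, follow_up_months):
    """Return (time, category) of the earliest observed event.

    events is a list of (month, category) pairs. If no event was observed,
    the patient is censored at the end of follow-up and category is None.
    """
    if not events:
        return follow_up_months, None
    return min(events, key=lambda event: event[0])


def status_by_category(events, follow_up_months):
    """For each category, give (time, observed) under the first-relapse rule.

    A patient counts as 'observed' only for the category of the first
    relapse and as censored (observed=False) for every competing category.
    """
    time, first = first_relapse(events, follow_up_months)
    return {category: (time, category == first) for category in CATEGORIES}


# Patient from the text: local relapse at 24 months, bone metastasis at 48.
status = status_by_category([(24, 3), (48, 1)], follow_up_months=60)
```

For this patient the later bone metastasis is ignored: `status` marks category 3 as observed at 24 months and categories 1 and 2 as censored at 24 months.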
Competing risks can also occur, for example, due to a patient's dying of an entirely different disease or of a side effect of therapy, so that the risk category of interest to the physician is not observed.
For one skilled in the art, it is relatively obvious that by applying an exclusive endpoint classification with a censoring rule for unrealized endpoints, the data can be projected onto a form such that for each possible endpoint, according to the prior art, a separate neural net can be trained or a classification tree can be constructed by recursive partitioning. In the example with categories 1-3, three completely independent neural networks or three independent decision trees would need to be trained.
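The projection onto one data set per endpoint can be sketched as follows. The record layout (`factors`, `time`, `first_event`) and the function name are hypothetical; the sketch only shows the data preparation and leaves out the training of the separate learners.

```python
def split_by_endpoint(patients, endpoints=(1, 2, 3)):
    """Build one training set per endpoint from a common patient list.

    Each patient is a dict with 'factors' (list of numbers), 'time'
    (time of first event or of last follow-up) and 'first_event'
    (endpoint number, or None if no event was observed).
    In the data set for endpoint k, a patient has indicator 1 only if the
    first event was k; every other patient is treated as censored (0).
    """
    return {
        k: [
            (p["factors"], p["time"], 1 if p["first_event"] == k else 0)
            for p in patients
        ]
        for k in endpoints
    }


cohort = [
    {"factors": [57.0, 2.0], "time": 24, "first_event": 3},
    {"factors": [63.0, 0.0], "time": 60, "first_event": None},
    {"factors": [48.0, 5.0], "time": 18, "first_event": 1},
]
datasets = split_by_endpoint(cohort)
```

Every data set contains all patients and the same factor values; only the event indicator differs. A separate net or tree trained on each data set therefore has no way of sharing what it has learned with the others.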
A problem with this use of the prior art is that detection of possible predictive value of internal nodes with respect to one of the disease outcomes is lost with respect to the remaining disease outcomes. In reality, however, an internal biological process, detected by internal nodes of a neural network, could contribute to several different outcomes, albeit with different weightings. For example, the biological “invasiveness” of a tumor has a differing but significant impact both on distant metastasis and local relapse. The separately trained nets would each need to “discover” independently the impact of an internal relationship coded in a node.
It is evident that the number of real events presented to a learning-capable system is an important determinant of the detection quality, analogously to the statistical power of a system. This number is usually limited in medical applications. Hence, the probability is relatively high that an internal process will barely exceed the detection threshold with respect to one outcome but not with respect to the others. Under these circumstances, the potential to distinguish factor influences, as well as the biological explanatory potential of an internal node for the other outcomes, is lost.
Since therapies often have side effects, it is typical of the medical decision context that the reduction of one risk category may occur at the expense of an increase in another risk. In this situation, the need to train a completely new neural net for each separate risk, as required by the prior art, is unsatisfactory.
The time-varying impact of factors on outcomes can be represented according to the prior art by different nodes in the output layer corresponding to particular time-dependent functions (e.g., by the known method of fractional polynomials). Although a time-varying assessment of the hazard rate is possible according to the prior art, the problem of competing risks cannot be formulated according to the prior art without interfering with a proper assessment of time-varying hazards.
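To illustrate what a time-dependent function of the fractional-polynomial kind looks like, the following is a minimal sketch. It assumes the conventional fractional-polynomial power set, in which the power 0 denotes the logarithm, and it assumes that the log hazard is modeled as a weighted sum of such terms, with one weight per time function. How those weights are tied to output nodes is not specified here, and the function names and numerical values are illustrative.

```python
import math

# Conventional fractional-polynomial power set; power 0 stands for log(t).
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_term(t, power):
    """Evaluate one fractional-polynomial term at time t > 0."""
    if t <= 0:
        raise ValueError("fractional polynomials are defined for t > 0")
    return math.log(t) if power == 0 else t ** power


def log_hazard(t, weights, baseline=0.0):
    """Log hazard at time t as a weighted sum of fractional-polynomial terms.

    weights maps a power from FP_POWERS to the weight of that time function.
    In this sketch each weight plays the role of one output signal.
    """
    return baseline + sum(w * fp_term(t, p) for p, w in weights.items())


# Illustrative weights: a log(t) term and a linear term.
value = log_hazard(1.0, {0: 5.0, 1: 0.5}, baseline=-1.0)
```

Because the weights may themselves depend on the factors, the resulting hazard can rise or fall over time differently for different patients.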
In view of the deficiencies of the prior art, the task of the invention is to provide a method for detecting, identifying, and representing competing risks according to their intrinsic logical and/or causal relationship, in particular in such a manner that determination of a time-varying assessment is not restricted.