The subject matter of each of the above-noted applications and provisional application is herein incorporated in its entirety by reference thereto.
Three computer Appendices containing computer program source code for programs described herein have been submitted in prior applications. The Computer Appendices are each incorporated herein by reference in its entirety. Appendix III is provided herewith.
Thus, a portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
The subject matter of the invention relates to the use of prediction technology, particularly nonlinear prediction technology, for the development of medical diagnostic aids for pregnancy-related and fertility-related conditions. In particular, training techniques operative on neural networks and other expert systems with inputs from patient historical information for the development of medical diagnostic tools and methods of diagnosis are provided.
Data Mining, Decision-Support Systems and Neural Networks
A number of computer decision-support systems have the ability to classify information and identify patterns in input data, and are particularly useful in evaluating data sets having large quantities of variables and complex interactions between variables. These computer decision systems, which are collectively identified as “data mining” or “knowledge discovery in databases” (and herein as decision-support systems), rely on similar basic hardware components, e.g., personal computers (PCs) with a processor, internal and peripheral devices, memory devices and input/output interfaces. The distinctions between the systems arise within the software, and more fundamentally, the paradigms upon which the software is based. Paradigms that provide decision-support functions include regression methods, decision trees, discriminant analysis, pattern recognition, Bayesian decision theory, and fuzzy logic. One of the more widely used decision-support computer systems is the artificial neural network.
Artificial neural networks or “neural nets” are parallel information processing tools in which individual processing elements called neurons are arrayed in layers and furnished with a large number of interconnections between elements in successive layers. The functioning of the processing elements is modeled to approximate biologic neurons where the output of the processing element is determined by a typically non-linear transfer function. In a typical model for neural networks, the processing elements are arranged into an input layer for elements which receive inputs, an output layer containing one or more elements which generate an output, and one or more hidden layers of elements therebetween. The hidden layers provide the means by which non-linear problems may be solved. Within a processing element, the input signals to the element are weighted arithmetically according to a weight coefficient associated with each input. The resulting weighted sum is transformed by a selected non-linear transfer function, such as a sigmoid function, to produce an output, whose values range from 0 to 1, for each processing element. The learning process, called “training”, is a trial-and-error process involving a series of iterative adjustments to the processing element weights so that a particular processing element provides an output which, when combined with the outputs of other processing elements, generates a result which minimizes the resulting error between the outputs of the neural network and the desired outputs as represented in the training data. Adjustment of the element weights is triggered by error signals. Training data are described as a number of training examples in which each example contains a set of input values to be presented to the neural network and an associated set of desired output values.
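By way of illustration only, the computation performed by a processing element and by a layered network as described above may be sketched as follows. The function names, the layer sizes, and the weight values are hypothetical and are chosen solely for the example; they are not taken from the Computer Appendices.

```python
import math

def sigmoid(x):
    """Sigmoid transfer function; its output lies between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def element_output(inputs, weights):
    """Weight each input signal, sum the results, and apply the transfer function."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return sigmoid(weighted_sum)

def forward(inputs, hidden_weights, output_weights):
    """Pass inputs through one hidden layer and then through the output layer.

    Each layer is a list of weight lists, one weight list per processing element.
    """
    hidden = [element_output(inputs, w) for w in hidden_weights]
    return [element_output(hidden, w) for w in output_weights]

# Hypothetical network: 2 inputs, 2 hidden elements, 1 output element.
hidden_weights = [[0.5, -0.4], [-0.3, 0.8]]
output_weights = [[1.2, -0.7]]
outputs = forward([1.0, 0.0], hidden_weights, output_weights)
```

In this sketch the single value in `outputs` necessarily falls between 0 and 1, because it is produced by the sigmoid transfer function of the output element.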
A common training method is backpropagation or “backprop”, in which error signals are propagated backwards through the network. The error signal is used to determine the error gradient and how much any given element's weight is to be changed, with the goal being to converge to a global minimum of the mean squared error. The path toward convergence, i.e., the gradient descent, is taken in steps, each step being an adjustment of the input weights of the processing element. The size of each step is determined by the learning rate. The error surface traversed by the gradient descent includes flat and steep regions with valleys that act as local minima, giving the false impression that convergence has been achieved, leading to an inaccurate result.
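By way of illustration only, a backpropagation step of the kind described above may be sketched for a small network having one hidden layer. The network size, the training examples (a logical OR), the learning rate, and the use of a constant 1.0 input as a bias are all assumptions made for the example and are not taken from the Computer Appendices.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hidden, w_out):
    """Return (hidden outputs, network output). A constant 1.0 input serves as a bias (assumption)."""
    h = [sigmoid(sum(w * xi for w, xi in zip(ws, x + [1.0]))) for ws in w_hidden]
    y = sigmoid(sum(w * hi for w, hi in zip(w_out, h + [1.0])))
    return h, y

def mean_squared_error(data, w_hidden, w_out):
    return sum((t - forward(x, w_hidden, w_out)[1]) ** 2 for x, t in data) / len(data)

def train_step(x, target, w_hidden, w_out, learning_rate):
    """One backpropagation step for one training example; weights are updated in place."""
    h, y = forward(x, w_hidden, w_out)
    # Error signal at the output element: output error times the sigmoid slope.
    delta_out = (target - y) * y * (1.0 - y)
    # Error signal propagated backwards to each hidden element.
    delta_hidden = [delta_out * w_out[j] * h[j] * (1.0 - h[j]) for j in range(len(h))]
    # The size of each weight adjustment is set by the learning rate.
    for j, hj in enumerate(h + [1.0]):
        w_out[j] += learning_rate * delta_out * hj
    for j, ws in enumerate(w_hidden):
        for i, xi in enumerate(x + [1.0]):
            ws[i] += learning_rate * delta_hidden[j] * xi

# Hypothetical training examples: (input values, desired output value).
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

random.seed(0)
w_hidden = [[random.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)]
w_out = [random.uniform(-0.5, 0.5) for _ in range(3)]

error_before = mean_squared_error(data, w_hidden, w_out)
for _ in range(2000):
    for x, target in data:
        train_step(x, target, w_hidden, w_out, learning_rate=0.5)
error_after = mean_squared_error(data, w_hidden, w_out)
```

A larger learning rate takes larger steps along the gradient and may overshoot, whereas a smaller rate converges more slowly; the value of 0.5 used here is arbitrary.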
Some variants of backprop incorporate a momentum term in which a proportion of the previous weight-change value is added to the current value. This adds momentum to the algorithm's trajectory in its gradient descent, which may prevent it from becoming “trapped” in local minima. One backpropagation method which includes a momentum term is “Quickprop”, in which the momentum rates are adaptive. The Quickprop variation is described by Fahlman (see, “Fast Learning Variations on Back-Propagation: An Empirical Study”, Proceedings of the 1988 Connectionist Models Summer School, Pittsburgh, 1988, D. Touretzky, et al., eds., pp. 38-51, Morgan Kaufmann, San Mateo, Calif.; and, with Lebiere, “The Cascade-Correlation Learning Architecture”, Advances in Neural Information Processing Systems 2 (Denver, 1989), D. Touretzky, ed., pp. 524-32, Morgan Kaufmann, San Mateo, Calif.). The Quickprop algorithm is publicly accessible, and may be downloaded via the Internet, from the Artificial Intelligence Repository maintained by the School of Computer Science at Carnegie Mellon University. In Quickprop, a dynamic momentum rate is calculated based upon the slope of the gradient. If the slope is smaller but has the same sign as the slope following the immediately preceding weight adjustment, the weight change will accelerate. The acceleration rate is determined by the magnitude of successive differences between slope values. If the current slope is in the opposite direction from the previous slope, the weight change decelerates. The Quickprop method improves convergence speed, giving the steepest possible gradient descent, helping to prevent convergence to a local minimum.
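By way of illustration only, the adaptive weight change described above may be sketched for a single weight. The sketch is a simplified reading of the Quickprop update in which the new weight change is the previous change scaled by the ratio of the current slope to the difference between the previous and current slopes. The function names, the growth limit of 1.75, the learning rate, and the one-dimensional error function are assumptions made for the example, and further safeguards of the published algorithm are omitted.

```python
def quickprop_step(slope, prev_slope, prev_step, learning_rate=0.1, max_growth=1.75):
    """Return the next weight change for one weight (simplified sketch).

    slope and prev_slope are the error slopes with respect to the weight after
    the current and the immediately preceding weight adjustments.
    """
    if prev_step == 0.0:
        # No previous change to scale: fall back to a plain gradient-descent step.
        return -learning_rate * slope
    denom = prev_slope - slope
    if denom == 0.0:
        # Slope unchanged: keep moving in the same direction at the growth limit.
        return max_growth * prev_step
    # Same-sign, smaller slope gives a positive factor (the change continues);
    # an opposite-sign slope gives a negative factor below 1 in size (the change reverses and shrinks).
    factor = min(slope / denom, max_growth)
    return factor * prev_step

def error_slope(w):
    """Slope of the hypothetical error function (w - 3)^2, whose minimum is at w = 3."""
    return 2.0 * (w - 3.0)

w, prev_slope, prev_step = 0.0, 0.0, 0.0
for _ in range(10):
    slope = error_slope(w)
    step = quickprop_step(slope, prev_slope, prev_step)
    w += step
    prev_slope, prev_step = slope, step
```

On this quadratic error function the weight reaches the minimum within a few adjustments, because the scaled step lands where the slope, extrapolated linearly from its two most recent values, would reach zero.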
When neural networks are trained on sufficient training data, the neural network acts as an associative memory that is able to generalize to a correct solution for sets of new input data that were not part of the training data. Neural networks have been shown to be able to operate even in the absence of complete data or in the presence of noise. It has also been observed that the performance of the network on new or test data tends to be lower than the performance on training data. The difference in the performance on test data indicates the extent to which the network was able to generalize from the training data. A neural network, however, can be retrained and thus learn from the new data, improving the overall performance of the network.
Neural nets, thus, have characteristics that make them well suited for a large number of different problems, including areas involving prediction, such as medical diagnosis.
Neural Nets and Diagnosis
In diagnosing and/or treating a patient, a physician will use patient condition, symptoms, and the results of applicable medical diagnostic tests to identify the disease state or condition of the patient. The physician must carefully determine the relevance of the symptoms and test results to the particular diagnosis and use judgement based on experience and intuition in making a particular diagnosis. Medical diagnosis involves integration of information from several sources including a medical history, a physical exam and biochemical tests. Based upon the results of the exam and tests and answers to the questions, the physician, using his or her training, experience and knowledge and expertise, formulates a diagnosis. A final diagnosis may require subsequent surgical procedures to verify or to formulate. Thus, the process of diagnosis involves a combination of decision-support, intuition and experience. The validity of a physician's diagnosis is very dependent upon his/her experience and ability.
Because of the predictive and intuitive nature of medical diagnosis, attempts have been made to develop neural networks and other expert systems that aid in this process. The application of neural networks to medical diagnosis has been reported. For example, neural networks have been used to aid in the diagnosis of cardiovascular disorders (see, e.g., Baxt (1991) “Use of an Artificial Neural Network for the Diagnosis of Myocardial Infarction,” Annals of Internal Medicine 115:843; Baxt (1992) “Improving the Accuracy of an Artificial Neural Network Using Multiple Differently Trained Networks,” Neural Computation 4:772; Baxt (1992), “Analysis of the clinical variables that drive decision in an artificial neural network trained to identify the presence of myocardial infarction,” Annals of Emergency Medicine 21:1439; and Baxt (1994) “Complexity, chaos and human physiology: the justification for non-linear neural computational analysis,” Cancer Letters 77:85). Other medical diagnostic applications include the use of neural networks for cancer diagnosis (see, e.g., Maclin, et al. (1991) “Using Neural Networks to Diagnose Cancer” Journal of Medical Systems 15:11-9; Rogers, et al. (1994) “Artificial Neural Networks for Early Detection and Diagnosis of Cancer” Cancer Letters 77:79-83; Wilding, et al. (1994) “Application of Backpropagation Neural Networks to Diagnosis of Breast and Ovarian Cancer” Cancer Letters 77:145-53), neuromuscular disorders (Pattichis, et al. (1995) “Neural Network Models in EMG Diagnosis”, IEEE Transactions on Biomedical Engineering 42:5:486-495), and chronic fatigue syndrome (Solms, et al. (1996) “A Neural Network Diagnostic Tool for the Chronic Fatigue Syndrome”, International Conference on Neural Networks, Paper No. 108).
These methodologies, however, fail to address significant issues relating to the development of practical diagnostic tests for a wide range of conditions and do not address the selection of input variables.
Computerized decision-support methods other than neural networks have been reported for their applications in medical diagnostics, including knowledge-based expert systems, such as MYCIN (Davis, et al., “Production Systems as a Representation for a Knowledge-based Consultation Program”, Artificial Intelligence, 1977; 8:1:15-45) and its progeny TEIRESIAS, EMYCIN, PUFF, CENTAUR, VM, GUIDON, SACON, ONCOCIN and ROGET. MYCIN is an interactive program that diagnoses certain infectious diseases and prescribes anti-microbial therapy. Such knowledge-based systems contain factual knowledge and rules or other methods for using that knowledge, with all of the information and rules being pre-programmed into the system's memory rather than the system developing its own procedure for reaching the desired result based upon input data, as in neural networks. Another computerized diagnosis method is the Bayesian network, also known as a belief or causal probabilistic network, which classifies patterns based on probability density functions from training patterns and a priori information. Bayesian decision systems are reported for uses in interpretation of mammograms for diagnosing breast cancer (Roberts, et al., “MammoNet: A Bayesian Network diagnosing Breast Cancer”, Midwest Artificial Intelligence and Cognitive Science Society Conference, Carbondale, Ill., April 1995) and hypertension (Blinowska, et al. (1993) “Diagnostica - A Bayesian Decision-Aid System - Applied to Hypertension Diagnosis”, IEEE Transactions on Biomedical Engineering 40:230-35). Bayesian decision systems are somewhat limited in their reliance on linear relationships and in the number of input data points that can be handled, and may not be as well suited for decision-support involving non-linear relationships between variables.
Implementation of Bayesian methods using the processing elements of a neural network can overcome some of these limitations (see, e.g., Penny, et al. (1996) In “Neural Networks in Clinical Medicine”, Medical Decision-support, 1996; 16:4: 386-98). These methods have been used, by mimicking the physician, to diagnose disorders in which important variables are input into the system. It, however, would be of interest to use these systems to improve upon existing diagnostic procedures.
Preterm Delivery and Other Pregnancy-Related Conditions
Determination of impending preterm births and the risk of preterm births is critical for increasing neonatal survival of preterm infants. Many methods for detecting or predicting the risk of preterm birth and/or the risk of impending preterm delivery are subjective, not sufficiently sensitive, and not specific. In particular, preterm neonates account for more than half, and maybe as many as three-quarters, of the morbidity and mortality of newborns without congenital anomalies. Although tocolytic agents that delay delivery were introduced 20 to 30 years ago, there has been only a minor decrease in the incidence of preterm delivery. It has been postulated that the failure to observe a larger reduction in the incidence of preterm births is due to errors in the diagnosis of preterm labor and the risk of preterm delivery and because the conditions are too advanced by the time they are recognized for tocolytic agents to successfully delay the birth.
There are a number of biochemical tests for assessing the risk of preterm delivery and other traditional methods of diagnosis based on symptomologies. These methods have false-negative and false-positive error rates. Traditional diagnosis also can require subjective interpretation and may require sophisticated training or equipment. The validity of the diagnosis is related to the experience and ability of the physician. Thus, there is a need for improved methods for assessing risk of preterm delivery, predicting imminent delivery and assessing time of delivery.
Therefore, it is an object herein to provide a non-invasive diagnostic aid for assessing the risk of preterm delivery. It is also an object herein to identify new variables, to identify new biochemical tests and markers for preterm delivery, and to design new diagnostic tests that improve upon existing diagnostic methodologies.
Methods using decision-support systems for the diagnosis of and for aiding in the diagnosis of diseases, disorders and other medical conditions are provided. In particular, methods provided herein assess the risk of preterm delivery and also the risk of delivery in a selected period of time (delivery-related risks). These methods are useful for assessing these risks in symptomatic pregnant female mammals, particularly human females.
Also provided are methods that use patient history data and identification of important variables to develop a diagnostic test for assessing these delivery-related risks; a method for identification of important selected variables for use in assessing these delivery-related risks; a method of designing a diagnostic test for assessing these delivery-related risks; a method of evaluating the usefulness of a diagnostic test for these assessments; a method of expanding clinical utility of a diagnostic test to include assessment of these delivery-related risks; and a method of selecting a course of treatment to reduce the risk of delivery within a selected period of time or preterm by predicting the outcome of various possible treatments.
Also provided are disease parameters or variables to aid in predicting pregnancy-related events, such as the likelihood of delivery within a particular time period, and for assessing the risk of preterm delivery.
Also provided are means to use neural network training to guide the development of the tests to improve their sensitivity and specificity, and to select diagnostic tests that improve overall diagnosis of, or potential for, assessment of the risk of preterm delivery or delivery within a selected period of time. A method for evaluating the effectiveness of any given diagnostic test in assessment of the risk of preterm delivery or delivery within a selected period of time is also provided. Also provided herein is a method for identifying variables or sets of variables that aid in the assessment of the risk of preterm delivery or delivery within a selected period of time.
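By way of illustration only, sensitivity and specificity, the two measures of test performance referred to above, may be computed from predicted and observed outcomes as sketched below. The function name and the example outcomes are hypothetical and do not represent patient data.

```python
def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity) for paired boolean predictions and outcomes."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    true_neg = sum(1 for p, a in pairs if not p and not a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    # Sensitivity: fraction of actual positives that the test identifies.
    sensitivity = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else float("nan")
    # Specificity: fraction of actual negatives that the test identifies.
    specificity = true_neg / (true_neg + false_pos) if (true_neg + false_pos) else float("nan")
    return sensitivity, specificity

# Hypothetical outcomes: True denotes preterm delivery (actual) or a positive test (predicted).
actual = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, False, False, False, True, False]
sens, spec = sensitivity_specificity(predicted, actual)
```

In this example two of the three actual positives and four of the five actual negatives are identified correctly, so `sens` is 2/3 and `spec` is 4/5.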
Methods are provided for developing medical diagnostic tests for assessment of the risk of preterm delivery or delivery within a selected period of time using computer-based decision-support systems, such as neural networks and other adaptive processing systems (collectively, “data mining tools”). The neural networks or other such systems are trained on the patient data and observations collected from a group of test patients in whom the condition is known or suspected; a subset or subsets of relevant variables are identified through the use of a decision-support system or systems, such as a neural network or a consensus of neural networks; and another set of decision-support systems is trained on the identified subset(s) to produce a consensus decision-support system based test, such as a neural net-based test for the condition. The use of consensus systems, such as consensus neural networks, minimizes the negative effects of local minima in decision-support systems, such as neural network-based systems, thereby improving the accuracy of the system.
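By way of illustration only, one way of forming a consensus of several trained networks is sketched below. Combining the networks by averaging their outputs is an assumption made for the example; the networks themselves are hypothetical single-element stand-ins with arbitrary weights, and none of the names is taken from the Computer Appendices.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_network(weights):
    """Build a hypothetical stand-in for a trained network (a single sigmoid element)."""
    def network(inputs):
        return sigmoid(sum(w * x for w, x in zip(weights, inputs)))
    return network

def consensus_output(networks, inputs):
    """Average the outputs of several networks for the same inputs (assumed combining rule)."""
    outputs = [network(inputs) for network in networks]
    return sum(outputs) / len(outputs)

# Three stand-in networks, as might result from training runs with different starting weights.
networks = [
    make_network([0.9, -0.2, 0.4]),
    make_network([1.1, -0.1, 0.3]),
    make_network([0.7, -0.3, 0.6]),
]
inputs = [1.0, 0.0, 1.0]
individual = [network(inputs) for network in networks]
consensus = consensus_output(networks, inputs)
```

Because the consensus is an average, it always lies between the lowest and highest individual outputs, so a single network that settled in a poor local minimum has a reduced influence on the result.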
To refine or improve performance, the patient data can be augmented by increasing the number of patients used. Also biochemical test data and other data may be included as part of additional examples or by using the data as additional variables prior to the variable selection process.
The resulting systems are used as an aid in assessment of the risk of preterm delivery or delivery within a selected period of time. In addition, as the systems are used, patient data can be stored and then used to further train the systems and to develop systems that are adapted for a particular genetic population. This inputting of additional data into the system may be implemented automatically or done manually. By doing so, the systems continually learn and adapt to the particular environment in which they are used. The resulting systems have numerous uses in addition to assessment of the risk of preterm delivery or delivery within a selected period of time, which include predicting the outcome of a selected treatment protocol. The systems may also be used to assess the value of other data in a diagnostic procedure, such as biochemical test data and other such data, and to identify new tests that are useful for assessment of the risk of preterm delivery or delivery within a selected period of time.
The methods are exemplified with reference to neural networks; however, it is understood that other data mining tools, such as expert systems, fuzzy logic, decision trees, and other statistical decision-support systems which are generally non-linear, may be used. Although the variables provided herein are intended to be used with decision-support systems, once the variables are identified, a person, typically a physician, armed with knowledge of the important variables can use them to aid in diagnosis in the absence of a decision-support system or using a less complex linear system of analysis.
In the methods for identifying and selecting important variables and generating systems for diagnosis, patient data or information, typically patient history or clinical data that are the answers to particular queries, are collected and variables based on these data are identified. For example, the data include the answer to a query regarding the number of pregnancies each patient has had. The extracted variable is, thus, number of pregnancies, and the query is how many prior pregnancies the patient has had (set forth herein as prior pregnancies). The variables are analyzed by the decision-support systems, exemplified by neural networks, to identify important or relevant variables.
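By way of illustration only, the extraction of variables from answers to queries may be sketched as follows. The record field names, the answer formats, and the coding of yes/no answers as 1.0 and 0.0 are assumptions made for the example and are not taken from the Computer Appendices.

```python
def extract_variables(record):
    """Convert a patient's answers to queries into numeric variables.

    record is a hypothetical dictionary of raw answers, e.g. as read from an intake form.
    """
    prior_pregnancies = int(record["prior_pregnancies"])
    return {
        # Answer to the query regarding the number of prior pregnancies.
        "prior pregnancies": float(prior_pregnancies),
        # Derived yes/no variable, coded 1.0 for yes and 0.0 for no (assumed coding).
        "no previous pregnancies": 1.0 if prior_pregnancies == 0 else 0.0,
        "vaginal bleeding at time of sampling": 1.0 if record["vaginal_bleeding"] == "yes" else 0.0,
    }

# Hypothetical answers for one patient.
answers = {"prior_pregnancies": "2", "vaginal_bleeding": "no"}
variables = extract_variables(answers)
# The values, taken in a fixed order, form one input example for a decision-support system.
input_vector = [variables[name] for name in sorted(variables)]
```

A fixed ordering of the variable names is used here so that each patient's values are presented to the decision-support system in the same positions.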
A plurality of factors, twelve to about sixteen, particularly a set of fourteen factors, in a specific trained neural network extracted from a collection have been identified as indicia for preterm delivery.
In other embodiments, for example, a method for assessing the risk of delivery prior to completion of 35 weeks of gestation is provided, comprising assessing a subset of variables containing at least three and up to all of the responses to the following queries: Ethnic Origin Caucasian; Marital Status living with partner; EGA by sonogram; EGA at sampling; estimated date of delivery by best; cervical dilatation (CM); parity-preterm; vaginal bleeding at time of sampling; cervical consistency at time of sampling; and previous pregnancy without complication. The method uses a decision-support system that has been trained to assess the risk of delivery prior to 35 weeks of gestation.
A method for assessing the risk for delivery in 7 or fewer days is provided, comprising assessing a subset of variables containing at least three and up to all of the following variables: Ethnic Origin Caucasian; Uterine contractions with or without pain; Parity-abortions; vaginal bleeding at time of sampling; uterine contractions per hour; and No previous pregnancies. The method uses a decision-support system that has been trained to assess the risk of delivery within seven days.
A method for assessing the risk for delivery in 14 or fewer days is provided, comprising assessing a subset of variables containing at least three and up to all of the following variables: Ethnic Origin Hispanic; Marital Status living with partner; Uterine contractions with or without pain; Cervical dilatation; Uterine contractions per hour; and No previous pregnancies. This method uses a decision-support system that has been trained to assess the risk of delivery within fourteen days.
As shown herein, variables or combinations thereof that heretofore were not known to be important in aiding in assessment of the risk of preterm delivery or delivery within a selected period of time are identified. In addition, patient history data, without supplementing biochemical test data, can be used to diagnose or aid in diagnosing a disorder or condition when used with the decision-support systems, such as the neural nets provided herein.
Also provided herein is a method of identifying and expanding clinical utility of a diagnostic test. The results of a particular test, particularly one that had heretofore not been considered of clinical utility with respect to assessment of the risk of preterm delivery or delivery within a selected period of time, are combined with the variables and used with the decision-support system, such as a neural net. If the performance of the system, i.e., the ability to correctly diagnose a disorder, is improved by addition of the results of the test, then the test will have clinical utility or a new utility in assessing the risk of preterm delivery.
Similarly, the resulting systems can be used to identify new utilities for drugs or therapies and also to identify uses for particular drugs and therapies for reducing the risk of preterm delivery. For example, the systems can be used to select subpopulations of patients for whom a particular drug or therapy is effective. Thus, methods for expanding the indication for a drug or therapy and identifying new drugs and therapies are provided. Diagnostic software and exemplary neural networks that use the variables for assessment of the risk of delivery before a specified time are also provided.
In other embodiments, the performance of a diagnostic neural network system for assessing risk of preterm delivery is enhanced by including variables based on biochemical test results from a relevant biochemical test as part of the factors (herein termed biochemical test data) used for training the network. One of the exemplary networks described herein that results therefrom is an augmented neural network that employs 6 input factors, including results of a biochemical test and the 7 clinical parameters. The set of weights of the augmented neural networks differs from the set of weights of the clinical data neural networks. The exemplified biochemical test employs an immuno-diagnostic test format, such as the ELISA diagnostic test format. Neural networks, thus, can be trained to predict the disease state based on the identification of factors important in predicting the disease state and combining them with biochemical data.
The resulting diagnostic systems may be adapted and used not only for diagnosing the presence of a condition or disorder, but also the severity of the disorder and as an aid in selecting a course of treatment.