The creation of computer programs that emulate the complex task of medical diagnosis is a goal that many researchers have pursued for the past half century. Success to date has been limited. Several diagnostic programs exist, some of them commercially available, including QMR, ILIAD, DXplain and GIDEON, among others, but they offer restricted diagnostic information. When a patient's symptoms or clinical data are provided to the computer, these programs typically retrieve a long list of possible diagnoses instead of pinpointing one or a few specific diagnoses. Moreover, they exclude very rare diseases.
Prior art calculation of the probability of a given diagnosis, including diagnoses of rare diseases, is routinely inaccurate because it usually relies on Bayes formula. Bayes formula requires that the clinical data manifested by a patient be independent, and that the diagnoses be exhaustive and incompatible. These conditions are frequently not fulfilled by actual clinical cases encountered in medical practice.
As a consequence of Ledley and Lusted's efforts, Bayes formula has become extensively used in medical applications. However, when improperly applied in a diagnostic algorithm, as is often the case, it can cause significant inaccuracies. Bayes formula is valid only when the three above-mentioned conditions are fulfilled. More precisely stated:
(i) The diseases processed by the formula must be exhaustive: all known diseases that manifest the considered clinical datum must be included in its denominator. If this condition is violated, a clinical datum originating from a disease not included in the formula will distort the calculated result. Accordingly, the calculated probability of the diagnosis under consideration will be incorrect and will adversely affect the differential diagnosis.
(ii) Clinical data used for calculation of the conditional probability of a diagnosis must be independent: that is, a specific clinical datum should neither favor nor disfavor any other clinical datum of the same disease. In other words, the probability that one clinical datum is manifested by a specific disease should not depend on the presence of another clinical datum. This is not true in actual clinical cases, where clinical data result from a chain of reactions that originate in a common cause or lesion and are necessarily related. These clinical data configure syndromes, which by definition are associations of related clinical data (e.g., jaundice, increased blood bilirubin, and dark urine). Bayes formula is often applied erroneously to interrelated clinical data of a specific disease, violating this condition of independence and yielding an inaccurate result. To solve the problems of independence and incompatibility, so-called Bayesian networks have been devised, but their application to diagnostic algorithms is excessively complicated and hard to compute.
(iii) The diseases must be incompatible, which means that clinical data justified by one disease cannot be justified by another disease. When concurrent diseases occur, some clinical data may be caused by more than one of them. Because Bayes formula can calculate probabilities only for competing diagnoses, which are incompatible, it is unsuitable for handling concurrent diseases that afflict the same patient simultaneously.
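The effect of violating condition (ii) can be illustrated with a minimal sketch. It assumes a toy setting with only two candidate diagnoses and two clinical data; the function name `bayes_posterior`, the disease labels, and all probability values are hypothetical and chosen only for illustration. The function applies Bayes formula exactly as the three conditions require: the denominator sums over every listed disease, and the likelihoods of the observed data are multiplied as if the data were independent.

```python
from math import prod

def bayes_posterior(priors, likelihoods, observed):
    """Return P(disease | observed data) for each disease.

    Assumes the three conditions hold: the diseases in `priors` are
    exhaustive and incompatible, and the observed data are independent
    given each disease.
    """
    # Numerator for each disease: prior times the product of P(datum | disease).
    joint = {
        d: priors[d] * prod(likelihoods[d][x] for x in observed)
        for d in priors
    }
    # Denominator: sum of the numerators over every listed disease.
    total = sum(joint.values())
    return {d: joint[d] / total for d in joint}

# Hypothetical values, for illustration only.
PRIORS = {"hepatitis": 0.5, "other": 0.5}
LIKELIHOODS = {
    "hepatitis": {"jaundice": 0.8, "dark_urine": 0.8},
    "other": {"jaundice": 0.2, "dark_urine": 0.2},
}

one_datum = bayes_posterior(PRIORS, LIKELIHOODS, ["jaundice"])
two_data = bayes_posterior(PRIORS, LIKELIHOODS, ["jaundice", "dark_urine"])

print(round(one_datum["hepatitis"], 3))
print(round(two_data["hepatitis"], 3))
```

With these assumed numbers, jaundice alone gives a posterior of 0.8 for the first diagnosis, and adding dark urine raises it to about 0.94. If dark urine always accompanies jaundice, as within a syndrome, the second datum carries no new information and the posterior should remain 0.8; treating the two data as independent counts the same evidence twice.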
In addition to the above limitations, few of the prior art programs, if any, recommend a probabilistically calculated best cost-benefit clinical datum to investigate in the patient at each diagnostic step. Such a recommendation, however, could lead to a more efficient, economical and rapid final diagnosis. In addition, none of the prior art programs recommends, on a probabilistically calculated basis, a set of clinical data to be investigated in the patient simultaneously, so that the physician need not contact the patient after each new test result to order the next test. Furthermore, although the cost of investigating a clinical datum is mentioned by some prior art teachings, none offers an effective way to extend the expense beyond dollar cost to include the discomfort and risk of the procedure required to obtain the datum.
Moreover, prior art programs typically do not diagnose concurrent diseases afflicting a patient. Nor do they handle complex clinical presentations of a specific disease with its diverse complications and associations of clinical entities, interactions between concurrent diseases or drugs that may mask important clinical data of the primary disease, and many other nuances of clinical practice. Part of this problem arises because the prior art programs encounter fundamental difficulties in trying to distinguish competing diagnoses from concurrent diagnoses.
Many of the limitations of the prior art diagnostic programs stem from their underlying mathematical approach. Specifically, they are usually based on entangled networks, Bayesian networks or neural networks, any of which is difficult or even impossible to implement and update in a manner compatible with an efficient, real-time, computer-implemented diagnostic method.
Another complicating aspect of prior art computer-implemented diagnostic approaches has to do with the way in which clinical data are treated. Clinical data, especially subjective symptoms, typically have diverse and non-exclusive qualities. For example, angina pectoris typically is retrosternal, radiating to the neck, jaw and upper extremities; it is oppressive, lasting only a few minutes; and it is exertion related and relieved by nitroglycerine. A number of prior art methods confer values on these qualities, their chronology, and their evolution. This is appropriate when such qualities overwhelmingly support a diagnosis. Nevertheless, the algorithmic handling of these subjective qualities vastly complicates the diagnostic method and, under many circumstances, is detrimental rather than beneficial to its overall efficacy.
Still another factor encumbering prior art computer-implemented medical diagnostics is related to the prevalence of diseases. Prevalence statistics are of epidemiological importance. However, it may be harmful to include prevalence values when calculating the probability of a patient having a rare disease, because the small prevalence value for such a disease can considerably reduce the probability of the corresponding diagnosis, causing it to be improperly ruled out. If a patient has a disease afflicting only one in a million persons, the calculated probability of that diagnosis would be very small, yet for that patient the disease is a certainty. A perfect program should diagnose every possible disease, including those that are rare. Furthermore, accurate epidemiological information is difficult to obtain because many disease cases remain unreported. Moreover, representing prior probabilities of diseases introduces considerable mathematical complexity into probabilistic computations and the Bayes formula. To manage this complication, rare diseases are commonly excluded from the list of candidate diseases for diagnosis. This, once again, has the obvious negative consequence that the system is unable to diagnose patients afflicted with such rare diseases.
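The influence of prevalence on the calculated probability can be shown with a short numerical sketch. It assumes the simplest two-hypothesis form of Bayes formula (disease present versus disease absent); the function name `posterior_with_prior` and every number below are hypothetical and serve only to illustrate the point made above.

```python
def posterior_with_prior(prior, p_data_given_disease, p_data_given_no_disease):
    """Two-hypothesis Bayes formula: P(disease | data).

    `prior` stands for the prevalence of the disease in the population.
    """
    numerator = prior * p_data_given_disease
    denominator = numerator + (1.0 - prior) * p_data_given_no_disease
    return numerator / denominator

# Hypothetical finding: present in 99% of patients with the disease
# and in 1% of patients without it.
SENSITIVITY = 0.99
FALSE_POSITIVE_RATE = 0.01

# Same finding, two assumed prevalence values.
rare = posterior_with_prior(1e-6, SENSITIVITY, FALSE_POSITIVE_RATE)    # one in a million
common = posterior_with_prior(1e-2, SENSITIVITY, FALSE_POSITIVE_RATE)  # one in a hundred

print(f"rare disease:   {rare:.6f}")
print(f"common disease: {common:.6f}")
```

Under these assumptions, the same clinical finding yields a probability of about one half for the common disease but roughly one in ten thousand for the rare one. A program that ranks or prunes diagnoses by such values would discard the rare disease even in a patient who actually has it.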
Given the numerous limitations of prior art medical diagnostic methods and corresponding computer-based systems, it would be an advance in the art to provide a method and system that embody a more constructive and fruitful approach. Specifically, it is highly desirable to devise a computer-implemented method that can better address the limitations listed above and be a more effective and efficient diagnostic companion for physicians and health care professionals.