The present embodiments relate to determining the state of a patient from medical transcripts. Medical transcripts are a prevalent source of information for analyzing and understanding the state of patients. Medical transcripts are stored as text in various forms, commonly as natural language. The terminology used in the medical transcripts varies from patient to patient due to differences in medical practice. Because of this variation and the use of medical terminology, a trained or skilled medical practitioner is required to understand the medical concept relayed by a given transcript, such as an indication that a patient has had a heart attack.
Automated analysis is difficult because of the unstructured nature of the text and the various ways used to refer to the same medical condition (e.g., disease, event, symptom, billing code, or standard label). One approach is phrase spotting, such as searching for specific key terms in the medical transcript. The presence of a word or words is taken to indicate the state of the patient. The presence of the word or words may also be used with other information to infer a state, such as disclosed in U.S. Published Application No. 2003/0120458. Rules are used to determine the contribution of any identified word to the overall inference. However, certain conditions may be only implied through a reference to related symptoms or diseases and never mentioned explicitly. The mere presence or absence of certain phrases or words immediately associated with the condition may therefore not be enough to infer the condition of patients with high certainty.
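For illustration only, phrase spotting with a simple rule may be sketched as follows. The key terms, the weights, the threshold, and the function names `spot_phrases` and `infer_state` are hypothetical and are not taken from the cited application; the sketch merely shows how a condition that is only implied may be missed.

```python
import re

# Hypothetical key terms and rule weights, chosen only for illustration.
KEY_TERMS = {
    "heart attack": 0.9,
    "myocardial infarction": 0.9,
    "chest pain": 0.3,
}


def spot_phrases(transcript, key_terms=KEY_TERMS):
    """Return the key terms present in the transcript (case-insensitive, whole words)."""
    found = []
    for term in key_terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, transcript, flags=re.IGNORECASE):
            found.append(term)
    return found


def infer_state(transcript, key_terms=KEY_TERMS, threshold=0.5):
    """Toy rule: infer the state when the largest weight among spotted terms reaches the threshold."""
    found = spot_phrases(transcript, key_terms)
    score = max((key_terms[t] for t in found), default=0.0)
    return score >= threshold, found


# Made-up transcripts: one names the condition, one only implies it.
explicit = "Patient admitted after a heart attack in 2001."
implied = "Elevated troponin; ST elevation on ECG; stent placed in LAD."

explicit_result = infer_state(explicit)
implied_result = infer_state(implied)
```

In this sketch, the first transcript is flagged because it names the condition, while the second is not flagged because none of the listed key terms appear, even though the findings it describes may point to the same condition.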