Patient interactions with health care providers are being digitized at a rapidly accelerating pace. In many cases, digital records of these interactions include data regarding early presentations of symptoms, the diagnostic tests administered and their results, passive monitoring results, series of interventions, and detailed reports of health progression by health practitioners. These records can be as simple as textual input or as detailed as video of a clinician-patient interaction. Consequently, the modern hospital tends to generate large volumes of data. With the recent ubiquity of electronic health record (EHR) databases, much, if not all, of this patient information is often documented within a single storage system.
Hospital EHR databases include discharge summaries that describe the conditions, symptoms, and treatments of a patient during the patient's stay in a hospital. These discharge summaries include free-form text that can be mined programmatically using natural language processing techniques to classify the health conditions of the patient. The mined classifications can be used to facilitate medical billing for services rendered to the patient during the stay. For example, the mined classifications can include medical billing codes, such as codes based on the International Statistical Classification of Diseases and Related Health Problems (commonly referred to as “ICD”). Versions of ICD classification codes often used by medical billing systems include ICD-9 and ICD-10 codes.
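As a rough illustration of mining free-form discharge text for billing codes, the following minimal sketch matches a few keywords against a small lookup table. It is an assumption made for illustration, not the technique described here: the table `KEYWORD_TO_ICD10` and the function `suggest_icd10_codes` are hypothetical names, the three ICD-10 codes are a tiny hand-picked sample, and simple keyword matching does not handle negation (for example, "no pneumonia"), abbreviations, or context the way a trained natural language processing model might.

```python
import re

# Hypothetical keyword-to-code table, for illustration only. A practical
# system would cover the full code set and use a trained classifier.
KEYWORD_TO_ICD10 = {
    "hypertension": "I10",       # essential (primary) hypertension
    "type 2 diabetes": "E11.9",  # type 2 diabetes mellitus without complications
    "pneumonia": "J18.9",        # pneumonia, unspecified organism
}


def suggest_icd10_codes(summary_text):
    """Return the sorted ICD-10 codes whose keyword appears in the text."""
    # Lowercase and collapse whitespace so multi-word keywords still match.
    normalized = re.sub(r"\s+", " ", summary_text.lower())
    codes = set()
    for keyword, code in KEYWORD_TO_ICD10.items():
        # Word boundaries avoid matching a keyword inside a longer word.
        if re.search(r"\b" + re.escape(keyword) + r"\b", normalized):
            codes.add(code)
    return sorted(codes)


summary = "Patient admitted with pneumonia. History of hypertension."
print(suggest_icd10_codes(summary))  # prints ['I10', 'J18.9']
```

The returned codes could then be attached to the patient's record as candidate billing codes for review.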