Traditionally, medical dictation systems allow physicians or other caregivers to dictate free-form speech that is later typed by a transcriptionist or transformed into written text by a computer using automated speech recognition (ASR). The resulting report may then be used to document an encounter with a patient and may subsequently be added to the patient's medical record. There have been a few attempts to construct natural language processing (NLP) software that automatically extracts key clinical information such as problems, medications, and procedures from medical reports. Extracting these data with a high degree of accuracy has proven difficult due to the complex nature of language, the many ways that a medical concept can be expressed, and the inherent complexity of the subject matter. As a result, NLP software tends to be large and complex, is difficult to develop and maintain, and demands significant processing power, working memory, and time to run.
Because traditional systems are not fully capable of extracting all of the relevant information from, for example, a medical report, either because of system limitations or the failure of a medical professional to record the information, Health Information Management (HIM) personnel often spend a significant amount of time compiling data for back-end reporting purposes. Back-end reporting may be required for tasks such as compliance, accreditation with a standards body, government/Medicare reporting, and billing. These data are usually gathered manually by individuals who must read through all supporting documentation in a patient's file and then enter the data in a paper form or into a software package or database.
Practitioners in the medical field face other problems that may adversely affect their ability to properly record and catalog relevant data. One such problem is that some of the data that need to be collected for record-keeping purposes do not necessarily come up in ordinary patient-physician interaction. Additionally, at least in the medical field, records may be kept for a number of different purposes and under a number of different schemes, for example, the ORYX quality reporting initiative that the Joint Commission on the Accreditation of Healthcare Organizations (JCAHO) has incorporated into its accreditation process for hospitals, CPT-4 (Current Procedural Terminology, 4th Edition) billing codes, ICD-9-CM (International Classification of Diseases, 9th Revision, Clinical Modification) codes, and Medicare E&M (Evaluation and Management) codes. Due to the number of potential uses of medical reports and the corresponding medical information fields that may need to be filled, it may be difficult for a physician to remember to include all of the relevant information for each of these predetermined categorization schemes.
A first predetermined categorization scheme may include the ORYX quality-reporting initiative that has been incorporated into the hospital accreditation process by JCAHO. The ORYX initiative identifies a number of core measures that would be used to evaluate a hospital's performance. These may include core measure sets for the following conditions: (1) acute myocardial infarction (AMI); (2) heart failure (HF); (3) community acquired pneumonia (CAP); and (4) pregnancy-related conditions. Other core measure sets may include surgical infection prevention (SIP).
The JCAHO estimates that collecting the data for the AMI and HF core measures alone, assuming a monthly average of 28 AMI cases and 40 HF cases, requires 27.4 hours a month. Some of the information sought may be obscure and may not come up in ordinary conversation. Some of it may therefore be lost entirely when physicians or other health-care professionals dictate their interviews with, and treatments of, their patients. For example, as of Jul. 1, 2002, the core measures related to AMI included: (1) whether aspirin was administered upon admission; (2) whether aspirin was administered on discharge; (3) whether an angiotensin converting enzyme inhibitor (ACEI) was used on patients exhibiting anterior infarctions or a left ventricular ejection fraction (LVEF); (4) whether the patient was counseled to stop smoking; (5) whether a beta blocker was prescribed at discharge; (6) whether a beta blocker was prescribed at arrival; (7) time to thrombolysis (the administration of an enzyme configured to break down a blood clot); (8) time to percutaneous transluminal coronary angioplasty (PTCA); and (9) inpatient mortality.
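To make the risk of lost information concrete, the AMI core measures listed above can be treated as a checklist against which a dictated report is compared. The following is only a minimal sketch under stated assumptions: the field names, the dictionary-based report, and the `missing_measures` helper are hypothetical illustrations and are not part of ORYX or of any existing system.

```python
# Hypothetical field names, one per AMI core measure listed in the text
# (as of Jul. 1, 2002). The names are illustrative assumptions only.
AMI_CORE_MEASURES = [
    "aspirin_at_admission",
    "aspirin_at_discharge",
    "acei_use",
    "smoking_cessation_counseling",
    "beta_blocker_at_discharge",
    "beta_blocker_at_arrival",
    "time_to_thrombolysis",
    "time_to_ptca",
    "inpatient_mortality",
]


def missing_measures(record, required=AMI_CORE_MEASURES):
    """Return the required measures that the record does not document.

    A measure counts as documented when its key is present with a value
    other than None. A value of False still counts as documented, because
    "not administered" is itself a recorded answer.
    """
    return [name for name in required if record.get(name) is None]


# Hypothetical data extracted from one dictated report.
report = {
    "aspirin_at_admission": True,
    "beta_blocker_at_arrival": False,  # documented: not prescribed
    "time_to_ptca": None,              # mentioned, but no value captured
}

gaps = missing_measures(report)
for name in gaps:
    print("Not documented:", name)
```

In this sketch, every measure left in `gaps` is one that back-end reporting personnel would otherwise have to obtain from another source or go without.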
A second predetermined categorization scheme may include the ICD-9-CM classification. This classification is intended to facilitate coding and identifying the relative incidence of diseases. The ICD-9-CM is recommended for use in all clinical settings and is, along with CPT-4, the basis for medical reimbursements; it is also required for reporting diagnoses and diseases to all U.S. Public Health Service and Centers for Medicare & Medicaid Services programs. Therefore, the importance of maintaining accurate records for this type of reporting is apparent.
A third example of a predetermined categorization scheme may include the Current Procedural Terminology, Fourth Edition (CPT-4), which is a listing of descriptive terms and identifying codes for reporting medical services and procedures. The purpose of the CPT listings is to provide a uniform language that accurately describes medical, surgical, and diagnostic services, and thereby serves as an effective means for reliable nationwide communication among physicians, patients, and third parties. As noted above, CPT-4 is, along with ICD-9-CM, the basis for medical reimbursements for procedures.
A fourth example of a predetermined categorization scheme may include the Medicare Evaluation and Management (E&M) codes. To determine the appropriate E&M code, physicians may, in some circumstances, be required to make judgments about the patient's condition for one or more key elements of service. These key elements of service may include, for example, patient history, examination, and medical decision-making. Additionally, the physician may, in some situations, be required to make a judgment call regarding the nature and extent of the services rendered. For example, when a cardiologist sees a new patient for a cardiology consultation in an outpatient clinic setting, the cardiologist may have to select among a number of predetermined billing codes to bill for the encounter. The physician may select E&M codes from category 99241 to 99245, and then may select the appropriate service from one of the category's five E&M levels. Inaccurate determination of these levels, either down-coding (by providing a code below the appropriate level and thereby billing at an inappropriately low level) or up-coding (by providing a code above the appropriate level and thereby billing at an inappropriately high level), may result in financial penalties, which in some instances may be severe.
These four exemplary systems for identifying and coding medical problems, procedures, and medications each provide the user of the particular coding system with a different informational structure. For example, the JCAHO ORYX information structure used for reporting for accreditation of a hospital to the JCAHO will likely be different from the information structure required for submissions to the Centers for Medicare & Medicaid Services for purposes such as Medicare reimbursement, which will in turn differ from the informational structure required for ICD-9-CM, CPT-4, and E&M billing.
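The distinction between down-coding and up-coding in the E&M example can be illustrated with a short sketch. It assumes that the five codes 99241 through 99245 correspond to levels one through five in ascending order; the `level_of` and `compare_coding` helpers are hypothetical names introduced only for illustration, and the sketch does not decide which level is appropriate, since that remains the physician's judgment.

```python
# The consultation category named in the text. Mapping the codes to
# levels 1..5 in ascending order is an assumption of this sketch.
CONSULT_CODES = [99241, 99242, 99243, 99244, 99245]


def level_of(code):
    """Return the 1-based E&M level of a code within the category."""
    return CONSULT_CODES.index(code) + 1


def compare_coding(billed, appropriate):
    """Classify a billed code relative to the appropriate code."""
    difference = level_of(billed) - level_of(appropriate)
    if difference < 0:
        return "down-coded"  # billed below the appropriate level
    if difference > 0:
        return "up-coded"    # billed above the appropriate level
    return "correct"


print(compare_coding(billed=99242, appropriate=99244))
print(compare_coding(billed=99245, appropriate=99243))
print(compare_coding(billed=99243, appropriate=99243))
```

A code outside the category raises `ValueError` from `list.index`, which in this sketch stands in for rejecting a code that does not belong to the consultation category.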
As mentioned above, when dictating patient reports, physicians may fail to document key pieces of data which are required for these back-end reporting processes, requiring the individuals responsible for the back-end reporting processes to either get the information from some other source, go back to the physician and request the required information, or go without the information, leaving a gap in their data set. This results in reduced efficiency, increased expenses and time-on-task, and also contributes to increased error and omission rates.
As can be seen from the foregoing, the process of recording and entering medical information may be very costly, and despite the costs, data may still be incomplete. Current natural language processing implementations that work from free-form text (“non-bounded” input data or text) require complex data- and processing-intensive techniques that are not always consistent, accurate, and comprehensive. Therefore, what is needed is a simplified method and apparatus for identifying terms of art, such as, for example, medical terms, within a stream of input data and classifying those terms. Additionally, there is a need for a classification system that may provide the user with both prompts or reminders to collect certain predetermined information and assistance in collecting and classifying these terms.