Information in healthcare is all around us and comes in many different forms. In medical record systems today, only about 20% of data are structured (also called discrete or machine-readable). Information that is not structured is ignored by, or unusable in, conventional processes designed to improve care or reduce costs. This situation is often referred to as a big data problem.
All quality improvement and cost reduction efforts are founded on the same paradigm: measure, intervene, and measure again. The measurement steps, often called quality measures, require significant individual- and population-based patient data. Whether the data are originally collected through or for revenue cycle management, transcription, electronic health record (EHR), compliance, analytics, or other efforts, the ultimate goal of data collection in healthcare is improved quality, reduced costs, or both.
Current methods of data extraction from the healthcare workflow are typically manual. A physician may use dropdowns or text boxes in an application to code a medical problem, or a billing coder may review a chart and assign billing codes. A “coder” is a person who reviews medical data and identifies a corresponding medical code. A quality team may be tasked with seeing every patient every day to manually document quality measures. These data extraction processes are slow, expensive, prone to error, and often ineffective.
As data flow into systems designed to improve care or reduce costs, whether those systems are based on analytics, compliance, or something else, the underlying quality of the data determines the efficacy of the effort. Conventionally, the data introduced into these systems have come from insurance claims data, administrative data, and discrete EHR data, with minimal use of what is known as unstructured data. Unstructured data in healthcare are primarily the medical narratives captured during every patient encounter. Where an encounter may produce a full-page narrative note to document the visit, the coded portion may be only 3 or 4 ICD-9 codes. The difference in content between a one-page narrative note and 3 or 4 codes accounts for a large portion of the gap between the 80% of healthcare data that are unstructured and the 20% that are discrete. There has been criticism that although the large majority of meaningful information is captured daily in medical narratives, this content is rarely used for quality improvement. The health system needs coded data, rather than unstructured medical narratives, to address revenue capture, quality improvement, analytics, compliance, interoperability, and countless other application market segments. Thus, a manual system is built up and maintained to produce these discrete, coded data. The process leads to double work: a narrative is recorded to describe important patient information, and then a provider, coder, quality team member, or other personnel manually records much of the same information as discrete data. Even when only 3 or 4 items are manually coded, documentation time can double relative to previous systems in which only the narrative was required. It is becoming increasingly clear that manual coding is not scalable.
Use of the 80% of healthcare data that are unstructured could power a new generation of applications to improve care and reduce costs. It can support two of the three critical steps in the measure, intervene, and measure again cycle of healthcare quality improvement. Unfortunately, the technology to extract this information and make it meaningful is limited. If effective, easy-to-use systems and methods could extract the knowledge contained in the large existing stores and the ongoing collection of unstructured clinical data, the benefit would be tremendous. Using this knowledge would not only reduce the need for manual processes but also make the full breadth of clinical content available to address quality and costs. Care could be improved and costs reduced through disease management, population health, local and regional quality improvement, efficiency programs, research, comparative effectiveness, and other healthcare applications and systems, all powered by robust, processed narrative data.
There is a need for systems and methods that provide for improved data structuring, including data extraction and understanding. But the need does not end with a single application. Rather, if narrative data are to power a new generation of applications in multiple segments of healthcare, the output should be easy to integrate and not customized solely for a single application. For processed narrative (that is, unstructured) data to be properly utilized, there is a need for systems and methods that transform narrative content into highly annotated documents that are clearly organized and easy to consume at a programmatic level. As healthcare applications become more modularized, just as other information technology markets and segments have in recent decades, data extraction engines will need to integrate with multiple types of applications, such as end-user applications, data warehouses, and other content sources and care interventions within healthcare. Allowing for independent, modularized, best-of-breed technologies is a time-proven way to stimulate innovation and increase the speed of development of powerful applications.
For unstructured data to be used in healthcare, the processed output should, at a conceptual level, be as easy to consume as discrete data entered manually by the provider, coder, quality team, or other data entry personnel. The computer should address the needs of the people, rather than healthcare personnel addressing the needs of the computer. Currently, the output in healthcare that is easiest to consume programmatically is the discrete, manually tagged concept, tagged by either the physician or the billing coder using dropdowns, text boxes, or check boxes, and ultimately stored as an annotated data element. To provide similarly usable content, unstructured data technologies would ideally model this output, preferably using clear clinical modeling and schema-based output to define where individual pieces of information will reside and how they can be used. Making automatically structured narrative as easy to consume programmatically as discrete data requires extensive technology expertise and innovation.
Thus, there is a need in the field of processing healthcare data, and more specifically the field of processing electronic narrative content, for new and improved data structuring systems and methods for transforming a narrative note into a highly annotated document that is clearly organized and easily retrievable by other applications. Output based on a clear clinical model, schema, and terminology can help bring automatically processed unstructured data in line with the quality and usability of discretely documented data elements.
The information required by most healthcare applications is known. There is a need for systems and methods to output clear representations of unstructured narrative data within a modeled, schema-driven, elemental, and coded approach.
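As a purely illustrative aid, the following minimal Python sketch shows one way a modeled, schema-driven, elemental, and coded representation of a narrative note might look to a consuming application. It is not the structure of any particular system: the class names (`CodedElement`, `Section`, `AnnotatedDocument`), the helper `make_element`, the sample note, and the hand-assigned ICD-9-CM codes are all assumptions introduced for this example, and the annotations are entered by hand rather than extracted automatically.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical sample narrative, invented for this sketch.
NOTE = "Assessment: type 2 diabetes and hypertension. Denies chest pain."


@dataclass(frozen=True)
class CodedElement:
    """One discrete concept tied back to its span in the narrative."""
    text: str          # phrase exactly as it appears in the note
    start: int         # character offset where the phrase begins
    end: int           # character offset just past the phrase
    code_system: str   # terminology name, e.g. "ICD-9-CM"
    code: str          # code within that terminology
    negated: bool = False


@dataclass
class Section:
    """A named part of the clinical model holding related elements."""
    name: str
    elements: List[CodedElement] = field(default_factory=list)


@dataclass
class AnnotatedDocument:
    """The narrative plus its schema-organized annotations."""
    narrative: str
    sections: List[Section] = field(default_factory=list)

    def codes(self, code_system: str, include_negated: bool = False) -> List[str]:
        """Return codes from one terminology, in document order."""
        return [
            e.code
            for s in self.sections
            for e in s.elements
            if e.code_system == code_system and (include_negated or not e.negated)
        ]


def make_element(note: str, phrase: str, code_system: str, code: str,
                 negated: bool = False) -> CodedElement:
    """Build an element by locating a phrase in the note (hand annotation)."""
    start = note.index(phrase)
    return CodedElement(phrase, start, start + len(phrase), code_system, code, negated)


# Hand-built annotations; the codes are illustrative assignments only.
doc = AnnotatedDocument(
    narrative=NOTE,
    sections=[
        Section("Assessment", [
            make_element(NOTE, "type 2 diabetes", "ICD-9-CM", "250.00"),
            make_element(NOTE, "hypertension", "ICD-9-CM", "401.9"),
        ]),
        Section("Symptoms", [
            make_element(NOTE, "chest pain", "ICD-9-CM", "786.50", negated=True),
        ]),
    ],
)

if __name__ == "__main__":
    # A consuming application can query coded data without parsing free text.
    print(doc.codes("ICD-9-CM"))
```

In this sketch, a downstream application reads the coded elements in the same way it would read manually entered discrete data, while each element keeps its offsets into the original narrative so the source text remains traceable.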
When made available, a robust data infrastructure built around structuring narrative content can allow that content to power a broad range of applications, forgoing or supplementing manually entered discrete data and addressing needs in quality analytics, reporting compliance, transcription, electronic health record, interoperability, revenue cycle management, and other applications. Described herein are devices, systems, and methods that address the problems and meet the needs identified above.