The present invention generally relates to a system and method for processing and actionizing structured and unstructured patient experience data. The system and method described herein may be utilized for processing disparate patient experience data sources such as medical records, surveys, doctor review sites, and social media.
Natural language processing (“NLP”) is a field of computer science, artificial intelligence, and linguistics concerned with the interactions between computers and human languages. It involves the processing of a natural language input. A natural language input is generally language used by a person (as opposed to a computer language or other artificial language), including all of the idioms, assumptions and implications of an utterance in a natural language input. Natural language processing implemented by a computer is typically an attempt to determine the meaning of a natural language input such that the natural language input can be “understood” and/or acted on by the computer. To interact with humans, natural-language computing systems may use a data store that is parsed and annotated.
Presently, in the healthcare industry, there is a need for systems and methods that are able to rapidly parse, combine, and interpret multiple structured and unstructured data sources. Healthcare information, such as information related to a patient's care experience and satisfaction, is fractured across many isolated data stores in varying formats. To compound the problem, even when data is available, there are no easily available means of processing this data with a high degree of accuracy or efficiency.
Moreover, in healthcare data management systems today, only about 20% of data is structured or machine-readable. Information that is not structured or machine-readable is ignored or unusable in conventional analytics systems. Online data sources, such as doctor review sites and social media, consist largely of unstructured data. Additionally, data collected from surveys or other public and private sources is often a mixture of unstructured and structured data that varies between data stores. Due to the lack of interoperability between these data stores and formats, these sources have not been analyzed in conjunction with one another.
Significantly, online data sources have risen in importance for healthcare providers, as they have in most customer-focused industries. Data from online sources must be extracted, transformed, and loaded into a structured, compatible form. Extract, Transform, Load (ETL) jobs extract data from a source, transform the extracted data using one or more transformations to a format compatible with a target, and load the data into the target, for example a target database. Extraction refers to obtaining the data from individual data sources. Transformation refers to processing the data to put it into a more useful form or format. Loading refers to the process of loading the data into the tables of a relational database.
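By way of illustration only, the three ETL stages described above can be sketched in Python using the standard library. The sample records, the column names (provider, rating, comment), the feedback table, and the function names are hypothetical and are not part of any described embodiment; an in-memory SQLite database stands in for the target relational database.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from an online review source (assumed columns).
RAW_CSV = (
    "provider,rating,comment\n"
    "Dr. Lee, 5 ,Very attentive staff.\n"
    "Dr. Patel,2,Long wait time\n"
    "Dr. Lee,,No rating given\n"
)

def extract(source_text):
    """Extract: read rows from a CSV-formatted source into dictionaries."""
    return list(csv.DictReader(io.StringIO(source_text)))

def transform(rows):
    """Transform: trim whitespace and convert ratings to integers (or None)."""
    cleaned = []
    for row in rows:
        rating = row["rating"].strip()
        cleaned.append((
            row["provider"].strip(),
            int(rating) if rating else None,  # a missing rating becomes NULL
            row["comment"].strip(),
        ))
    return cleaned

def load(records, conn):
    """Load: insert the transformed records into a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS feedback "
        "(provider TEXT, rating INTEGER, comment TEXT)"
    )
    conn.executemany("INSERT INTO feedback VALUES (?, ?, ?)", records)
    conn.commit()

def run_etl(source_text, conn):
    """Run the three stages in order against the given database connection."""
    load(transform(extract(source_text)), conn)

if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    run_etl(RAW_CSV, connection)
    for record in connection.execute("SELECT * FROM feedback"):
        print(record)
```

Keeping each stage in its own function is one possible design; it allows a different source format or target database to be substituted without changing the other stages.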
Attempts have been made to use customer-focused NLP systems from the hospitality and restaurant industries in the healthcare space, but these systems' lack of specificity for healthcare makes them inaccurate and ineffective for actionizing patient feedback. Further, investments in such technologies do not yield the comprehensive, reliable, or actionable information necessary to improve a healthcare organization's viability. Instead, the value added by the data reviewed by these technologies is diminished, as true data integration and interoperability are not achieved.
There have been few attempts to construct healthcare-specific NLP systems that may automatically collect and annotate key information related to the patient's care experience and satisfaction, such as the patient's sentiment regarding the experience, identification of key staff involved in the experience and key themes describing the care experience.
Performing these annotations with a high degree of accuracy has proven to be a difficult task due to the complex nature of language, the many ways that a care experience concept can be expressed, the inherent complexity of the subject matter, and the distributed and varied nature of the available data sources. As a result, NLP software tends to be large, expensive, and complex, is difficult to develop and maintain, and demands significant processing power, working memory, and time to run. Further, when attempting to process data from isolated sources in differing formats, annotation accuracy is difficult to achieve. This is especially true for unstructured data, where annotations regarding sentiments, named entities, key themes, and the like may fall below a traditional threshold for statistical significance. Nevertheless, unstructured data may indicate real problems with care experiences that are of value to healthcare administrators. Despite its value, such data has traditionally been difficult to process and understand.
Furthermore, current methods of data extraction are slow and ineffective. Even so, existing systems, which use only a fraction of the data available, have already been shown to reduce cost and improve outcomes. If systems and methods were capable of efficiently using the knowledge contained in unstructured data to improve patient experience, the benefits would be tremendous. By utilizing this knowledge, care could be improved and cost reduced through quality improvement, efficiency, comparative effectiveness, safety, and other healthcare analytics powered by this data.
Thus, there is a need in the field of processing patient experience data, and more specifically in the field of processing disparate data sources such as medical records, government surveys, doctor review sites and social media, for new and improved systems and methods for processing data. In particular, systems and methods are needed that are able to rapidly parse, combine, and interpret multiple structured and unstructured data sources. Described herein are devices, systems, and methods that address the problems and meet the identified needs described above.