Electronic health record (EHR) systems are currently being widely implemented to help manage patient records, improve the ability of analysts to assess the quality of healthcare, and reduce patient suffering due to medical errors. Clinical decision support tools are essential for leveraging the value of the data collected in EHR systems. Such tools may allow doctors to use this data to reach patient-specific decisions. Although textual description in natural language is one of the main modalities of EHR data, tools that automatically, robustly, and accurately extract useful information from patient records have yet to be developed.
A significant barrier to implementing such methods within the clinical setting is the lack of machine-understandable clinical text. That is, the meaning of text reports created in clinical practice normally cannot be extracted by a computer or other kind of machine. Clinical reports, such as discharge summaries and radiology and pathology reports, are typically stored as natural language documents rather than in more semantics-aware structured data formats. Such structured and semantically rich data formats are useful when implementing more advanced supporting tools, such as clinical decision support (CDS) tools. To overcome this barrier, various natural language processing (NLP) and machine learning techniques have been developed specifically for identifying concepts and relationships in free text. However, much of the work in this field has been conducted using scientific textual data, which differs in important ways from the largely ungrammatical, idiosyncratic text commonly found in clinical reports. Using NLP approaches to extract relevant information in real clinical cases has proven to be extremely challenging. Free text is here to stay, being the way of reporting that clinicians prefer for both objective and subjective reasons, yet computers do not cope well with free text when it comes to interpreting its semantics. As the amount of data collected within clinical care increases, it becomes increasingly hard for clinical users to make sense of that data and to filter and extract the pieces of information that are actually relevant. In this context, making the data, including the semantics hidden in it, understandable to the computer becomes very valuable. For example, to find patients who are eligible for a particular clinical trial, the eligibility criteria of the trial need to be reliably compared with data in the patient record.
The approach of fully structuring the data collected in clinical care has met with considerable resistance in the clinical domain. Additionally, recent studies deem such a fully structured approach unrealistic and counterproductive, due to the complexity of clinical care and of the associated reporting.
U.S. Pat. No. 7,493,253 B1 discloses a system and a method for indexing free text documents using both language-dependent terms and a language-independent formal ontology of concepts to extract the deep meaning of free text documents. The natural language understanding system is taught which relationships between concepts are and are not appropriate by providing the linguistic ontology as part of the formal ontology. The linguistic ontology contains the rules about how language works as well as the principles that the human mind adheres to when representing reality at the conscious level of a human being.
US 20110033093 discloses a method of reporting radiological information. A system and method are provided for the graphical presentation of the contents of radiological image study reports. A system and method are also provided for presenting, in a single diagram, the contents of structured radiological reports that include multiple imaging studies and their corresponding findings. An ontology of radiological knowledge is used to interpret report content and generate the information to be displayed in the graphical diagram.