The present application relates generally to an improved data processing apparatus and method, and more specifically to mechanisms for classifying medically relevant phrases from a patient's electronic medical records into relevant categories.
Information retrieval and information extraction are significant issues in the medical and health care domains, where both the accuracy of the retrieved information and the ability to obtain it in time-critical situations are extremely important. Information retrieval (IR) is the activity of obtaining information resources relevant to an information need from a collection of information resources. Searches can be based on full-text or other content-based indexing. As a field, information retrieval is the science of searching for information within documents, for documents themselves, for metadata that describes data, and for databases of texts, images, or sounds. Information extraction (IE) is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents. In most cases, this activity involves processing human language texts by means of natural language processing (NLP).