With the increasing prevalence and use of computing systems, the amount of data that can be obtained regarding various problem spaces has grown exponentially. While the amount of data that may be obtained with respect to a particular space may have increased significantly, the integration of heterogeneous data from multiple sources, the sharing of information in a distributed and collaborative environment, and the mining of such data remain challenging informatics problems. Nowhere are these challenges more evident than in the case of a natural disaster or epidemic, as the understanding, diagnosis, treatment and prevention of human diseases require the collection, integration and understanding of information and knowledge from a wide variety of highly distributed sources, which may present a unique challenge in such circumstances. This problem is exacerbated because most clinical research environments lack the informatics resources and infrastructure needed to assist with the preparation, implementation and maintenance of data collection and management platforms that can consistently and concurrently support collection, integration and contextualization across multiple research projects and many participating sites.
It is thus desired to provide advanced informatics platforms that enable complete, reliable and fast collection and validation of information throughout various research projects and among different participating locations. Moreover, in conjunction with the collection of data for such systems, it may be desired to process natural language (sometimes referred to as free text). This desire is particularly strong in the field of medicine, as free text entries in the form of discharge diagnoses, chief complaints, nurse and practitioner notes, diagnostic reports, consultations, etc. are an extremely important part of a patient's electronic health record, yet are frequently unavailable for decision support and research queries due to their unstructured and unconstrained format. While human experts can effortlessly understand the meaning of such text and its implications in multiple different contexts (decision support, research, quality of care, etc.), or answer questions regarding patient health status, current computational processes are not able to process such health-related free text to produce structured data that allows data mining of such free text.