Data-driven approaches are currently used extensively in several domains, such as finance, healthcare, life sciences, and the social and physical sciences, to discover patterns in data. Unsupervised learning methods, such as clustering or self-organizing maps, are among the most commonly preferred tools for this purpose.
In the field of healthcare analytics, clustering is extensively used to discover patterns in disease risk profiles and treatment responses through common procedures, such as stratification of patients. Such procedures may be applied at various levels, for example, in analyses within a hospital that use electronic medical records or electronic health records, and in population-level analyses in hospital information systems (HIS). A clustering method, such as a partitioning-based method (for example, K-Means), relies on a measure of similarity or distance between the objects being clustered. The efficacy of such a method depends on the distance or similarity measure used, such as Euclidean distance, cosine similarity, or Jaccard distance, the choice of which in turn depends on the data and the application. While such measures are useful in a large number of applications, other applications may require specialized distance metrics.
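By way of a non-limiting illustration, the three measures named above may be sketched in Python as follows. The function names, the numeric patient vectors, and the diagnosis-code sets are hypothetical and are introduced only for this example; they are not drawn from any particular system.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: treat an all-zero vector as dissimilar
    return dot / (norm_a * norm_b)


def jaccard_distance(s, t):
    """One minus the ratio of shared items to all items in two sets."""
    s, t = set(s), set(t)
    union = s | t
    if not union:
        return 0.0  # assumption: two empty sets are treated as identical
    return 1.0 - len(s & t) / len(union)


# Hypothetical numeric feature vectors for two patients.
patient_a = [1.0, 2.0, 2.0]
patient_b = [4.0, 6.0, 2.0]

# Hypothetical sets of diagnosis labels for the same two patients.
codes_a = {"diabetes", "hypertension", "asthma"}
codes_b = {"diabetes", "hypertension", "copd"}

print(euclidean_distance(patient_a, patient_b))
print(cosine_similarity(patient_a, patient_b))
print(jaccard_distance(codes_a, codes_b))
```

Euclidean distance and cosine similarity operate on numeric vectors, whereas Jaccard distance operates on sets, which is one reason the suitable measure depends on how the data is represented.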
In healthcare analytics, patient data that includes unstructured notes, such as discharge summaries and nursing notes, is clustered by using text-processing techniques. The text in the unstructured notes may be analyzed to identify postoperative complications, to detect clinical conditions associated with an ailment with a consistency that is indistinguishable from that of physicians reviewing the same reports, and to predict mortality in intensive care units (ICUs). In certain scenarios, because of the vast amount of real-time information associated with different health-related parameters of a patient, it may be difficult to predict certain medical conditions that may need immediate attention. Thus, an automated technique may be desired to cluster the recorded medical reports of a patient in a structured manner for predicting the health of the patient.
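As a non-limiting sketch of how unstructured notes might be clustered with text processing, the following Python example converts each note into a normalized term-frequency vector and groups the vectors with a basic K-Means loop. The sample notes, the function names, and the choice of the first two notes as initial centroids are assumptions made purely for illustration; a practical system may rely on different tokenization, weighting, and initialization.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic word tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def vectorize(notes):
    """Turn each note into an L2-normalized term-frequency vector."""
    counts = [Counter(tokenize(note)) for note in notes]
    vocab = sorted(set().union(*counts))
    vectors = []
    for c in counts:
        v = [float(c[term]) for term in vocab]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vectors


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(vectors, seed_indices, max_iter=20):
    """Basic K-Means; initial centroids are the vectors at seed_indices."""
    centroids = [list(vectors[i]) for i in seed_indices]
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each vector to its nearest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda j: squared_distance(v, centroids[j]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the loop has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical nursing notes, invented for this example only.
notes = [
    "Patient reports chest pain and shortness of breath",
    "Wound infection noted at surgical site, wound dressing changed",
    "Chest pain persists, shortness of breath on exertion",
    "Surgical wound shows infection, dressing changed again",
]

vectors = vectorize(notes)
labels = k_means(vectors, seed_indices=[0, 1])
print(labels)  # [0, 1, 0, 1]
```

In this toy input, the two notes about chest pain share one label and the two notes about the surgical wound share the other, because notes within each pair have several words in common and notes across pairs have none.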
Further limitations and disadvantages of conventional and traditional approaches will become apparent to a person having ordinary skill in the art, through comparison of described systems with some aspects of the present disclosure, as set forth in the remainder of the present application and with reference to the drawings.