The present invention relates generally to extracting interpretable sequential events from medical data, and more specifically to methods, systems and computer program products for reducing feature space in analysis of medical data.
The desire to connect past medical history with patient outcomes demands analysis of increasingly large amounts of medical data. As more patient medical record data is digitized and collected, not only does the volume of input data increase, but the number of patterns mined from the data also increases. Pattern mining algorithms, which are used in the analysis of data to find correlating features, tend to produce a relatively large number of features, i.e., variables used to predict an output variable. For example, a predictive model could yield hundreds of thousands of features. Accordingly, as the number of patterns increases, so does the feature space. Yet large feature spaces, involving high volumes of data, are difficult to interpret and detrimentally impact computation speed. Because parsimonious models are preferable, reduction of the feature space is highly desirable. However, reducing the feature space can result in a loss of predictive power of the analysis.