A heat map is an abstract representation of correspondence between two data sets. Heat maps are often employed to compare and analyze categorical data. For example, the cluster heat map is a display of a data matrix that reveals row and column hierarchical cluster structure in the data matrix. It consists of a rectangular tiling with each tile shaded on a color scale to represent the value of the corresponding element of the data matrix. Within a relatively compact display area, it facilitates inspection of row, column and joint cluster structure. The cluster heat map compacts large amounts of information (e.g., several thousand rows/columns) into a small space to bring out coherent patterns in the data.
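As a rough illustration of the tiling described above, the following Python sketch maps each element of a small data matrix to a shade on a light-to-dark scale. It is a minimal, text-only sketch rather than a definitive implementation: the function name `shade_matrix`, the character ramp standing in for a color scale, and the sample matrix are all assumptions introduced here for illustration, and the rows and columns are assumed to be already arranged in cluster order.

```python
def shade_matrix(matrix, ramp=" .:-=+*#%@"):
    """Map each matrix element to a character on a light-to-dark ramp.

    The ramp is a hypothetical stand-in for a color scale: the smallest
    value maps to the first character and the largest to the last.
    """
    flat = [value for row in matrix for value in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant matrix
    last = len(ramp) - 1
    return [
        "".join(ramp[round((value - lo) / span * last)] for value in row)
        for row in matrix
    ]


# Hypothetical 4x4 data matrix with two row/column clusters.
data = [
    [0.0, 0.1, 0.9, 1.0],
    [0.1, 0.0, 1.0, 0.9],
    [0.9, 1.0, 0.0, 0.1],
    [1.0, 0.9, 0.1, 0.0],
]

tiles = shade_matrix(data)
for line in tiles:
    print(line)
```

Each output line corresponds to one row of the matrix, and each character to one tile. With the rows and columns in cluster order, the dark tiles group into two contiguous blocks, which is the kind of joint cluster structure the display is meant to expose.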
Identification of patterns formed by variations and clusters of data points in a heat map (e.g., rendered as pixels of a digital image representation of the heat map) can reveal various correlations between data sets. Several techniques have evolved to facilitate identification of such patterns. The patterns formed in a heat map and the techniques for identifying such patterns depend on the type of data represented by the matrices and the manner in which the data is organized, filtered and arranged. One popular mechanism for evaluating heat maps uses a seriation loss function that involves analysis of the sum of distances between adjacent rows and columns. Another mechanism involves sampling values from known bivariate distributions, randomizing rows and columns in the sampled data matrix, and comparing solutions from different seriation algorithms. Other forms of analysis involve identification of various patterns where row and column covariances are determined by different covariance structures, including Toeplitz, band, circular, equicovariance, and block diagonal.
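The seriation loss mentioned above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a definitive implementation: Euclidean distance is assumed as the distance measure (the text does not specify one), and the names `seriation_loss`, `adjacent_distance_sum`, and `permute`, as well as the sample matrix, are hypothetical and introduced only for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def adjacent_distance_sum(vectors):
    """Sum of distances between each vector and its immediate successor."""
    return sum(
        euclidean(vectors[i], vectors[i + 1]) for i in range(len(vectors) - 1)
    )


def seriation_loss(matrix):
    """Sum of adjacent-row distances plus adjacent-column distances."""
    columns = [list(col) for col in zip(*matrix)]
    return adjacent_distance_sum(matrix) + adjacent_distance_sum(columns)


def permute(matrix, row_order, col_order):
    """Return a copy of the matrix with rows and columns reordered."""
    return [[matrix[r][c] for c in col_order] for r in row_order]


# Hypothetical block-structured matrix in a well-seriated order.
ordered = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# The same data with rows and columns interleaved.
scrambled = permute(ordered, [0, 2, 1, 3], [0, 2, 1, 3])

print(seriation_loss(ordered), seriation_loss(scrambled))
```

Under this loss, an ordering that places similar rows and columns next to each other scores lower than an interleaved ordering of the same data, so the loss can be used to compare the orderings produced by different seriation algorithms.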
Various techniques have evolved to facilitate automated heat map analysis and pattern identification. However, many of these techniques are insufficient. This problem is exacerbated with heat maps based on data involving one dimension of input and two dimensions of output. Such heat maps are generally associated with substantial redundancy, which causes many potential patterns to appear as noise or exhibit low complexity. As a result, many automated techniques produce false positives (e.g., identification of patterns that are not representative of an accurate data correlation) and false negatives (e.g., failure to identify patterns that are representative of a data correlation). Accordingly, a more sensitive and granular approach to automatically identifying patterns and correlations in heat maps is needed.