The present application relates generally to an improved data processing system, apparatus, computer program product, and method, and more specifically to dimensional reduction mechanisms for representing massive communication network graphs for structural querying.
Graph mining has applications in numerous domains, including computational biology, chemical applications, the Internet, social networking, and the like. In recent years, a number of data mining and management applications have been designed in the context of graphs and structural data. Data mining is the process of extracting patterns from compilations of data. That is, data may be analyzed to identify patterns within the data, and these patterns may be used as a basis for deducing some behavior of a system. Structured data mining is the process of finding and extracting useful information, e.g., patterns, from semi-structured data sets. Graph mining is a special case of structured data mining in which the data sets being mined represent information in graph form. Detailed information about known graph mining mechanisms may be found in Cook et al., Mining Graph Data, 2007, available from John Wiley and Sons, Inc.
The use of graph mining is significantly limited by the ever-expanding size of the data sets defining the various graphs being mined and by the limited amount of memory available in most systems to store such data sets. For example, the data sets may correspond to graphs of a large communication network, social network, biological system, or the like, and thus may comprise many thousands of nodes, edges, and the like. It may not be possible to maintain all of this data in memory for use in performing graph mining.