Users of computational devices such as computers, smartphones, personal digital assistants and the like may have access to very large quantities of data. A small fraction of the data may reside on the device itself, while the vast majority may be stored in databases and/or may be accessible over communication networks such as the Internet or other wired or wireless networks capable of transmitting data.
Search engines have made the discovery and retrieval of useful and relevant data somewhat easier, but often cannot help users make sense of the data. Search results may include items unrelated to the user's search, which can make the search experience overwhelming. There is therefore a need for a method and system to analyze data quickly and display its essence in a succinct but holistic manner that highlights important themes, topics or concepts in the data and how they might relate to one another. Such a method may aid the user in making sense of the data.
Text mining and analysis is a rich field of study [M. W. Berry, Survey of Text Mining: Clustering, Classification, and Retrieval, Springer, 2003]. Proposed solutions to make sense of text data may rely on combinations of statistical and rule-based methods. Many methods may be computationally intensive, particularly methods that attempt to incorporate semantics, taxonomies or ontologies into the analysis, as well as methods that require global computations such as the calculation of eigenvalues and eigenvectors of extremely large matrices. As a result, such methods may be used mostly by specialists.
Another characteristic of methods that rely on semantics is their dependence on context and language. There is therefore a need for a method and system generic enough to work with virtually any context and with a broad range of languages without any modification of the method.
Several methods use a network approach but may fall short in generality, speed and ease of use [Carley, Kathleen. (1997) Network Text Analysis: The Network Position of Concepts. Text Analysis for the Social Sciences: Methods for Drawing Statistical Inferences from Texts and Transcripts, 79-100. Mahwah, N.J.: Lawrence Erlbaum Associates; Corman S R, Kuhn T, McPhee R D, Dooley K J. Studying complex discursive systems: Centering resonance analysis of communication. Human Communication Research 2002; 28: 157-206].