1. Technical Field
Present invention embodiments relate to interactive analysis of a subset of data from a massive data store (e.g., petabytes of data).
2. Discussion of the Related Art
There are a number of industries that require analysis of large amounts of data, both structured and unstructured, in order to obtain meaningful access to a smaller subset of data that is of interest. For example, investigative agencies, such as law enforcement, intelligence and counter-fraud agencies, have access to very large (e.g., petabytes in data size) sources of data including call records, financial and/or computerized (electronic) transactions, etc. When these sources are combined with other conventional types of data, including unstructured data (e.g., intelligence reports) and entity-link-property (ELP) data, obtaining a meaningful subset of the data for a particular search or analysis becomes a massive task for an analyst. In particular, consider typical sources of call data records, which may include a number of years of data for several million individuals, resulting in excess of one trillion items of data.
Such data cannot be processed for visual, interactive analysis due to the size of the result sets obtained from the data source. For example, the result sets that need to be analyzed, while smaller than the overall data source, are still too large to be stored in the memory of a conventional desktop or other computer, or to be analyzed by conventional data analysis tools. In addition, the result sets cannot be visualized using conventional techniques, and they are too large to be transferred between computing devices or nodes because the network bandwidth required for such transfers is not available.