Modern data centers and other computing environments can comprise anywhere from a few host computer systems to thousands of systems configured to process data, service requests from remote clients, and perform numerous other computational tasks. During operation, various components within these computing environments often generate significant volumes of machine-generated data (“machine data”). In general, machine data can include performance data, diagnostic information and/or any of various other types of data indicative of performance or operation of equipment in a computing system. Such data can be analyzed to diagnose equipment performance problems, monitor user interactions, and to derive other insights. This is in contrast with human-generated data, i.e., data created by a human being.
A number of tools are available today to analyze machine-generated data. In order to reduce the volume of the potentially vast amount of machine data that may be generated, many of these tools typically pre-process the data based on anticipated data-analysis needs. For example, pre-specified data items may be extracted from the machine data and stored in a database to facilitate efficient retrieval and analysis of those data items at search time. However, the rest of the machine data typically is not saved and is discarded during pre-processing. As storage capacity becomes progressively cheaper and more plentiful, there are fewer incentives to discard these portions of machine data and many reasons to retain more of the data.
This plentiful storage capacity is presently making it feasible to store massive quantities of minimally processed machine data for later retrieval and analysis. In general, storing minimally processed machine data and performing analysis operations at search time can provide greater flexibility because it enables an analyst to search all of the machine data, instead of searching only a pre-specified set of data items. This may, for example, enable an analyst to investigate different aspects of the machine data that previously were unavailable for analysis. However, analyzing and searching massive quantities of machine data presents a number of challenges.
Various tools are also available today to analyze human-generated data. For example, customer relations management (CRM) software applications are available for gathering and reporting sales data for a business enterprise. However, such tools are focused on a single business unit within an enterprise, such as the sales group, limiting their usefulness. Further, existing data analyses and reporting tools do not provide the ability to gather, report and/or facilitate analysis of both human-generated data and machine data.