The rapid increase in the production and collection of machine-generated data has created relatively large data sets that are difficult to search. The machine data can include sequences of time-stamped records that may occur in one or more usually continuous streams. Further, machine data often represents some type of activity made up of discrete events.
Often, searching large data sets requires different ways to express search criteria. Search tools today typically allow users to search by the most frequently occurring terms or keywords within the data and generally have little notion of event-based searching. Given the large volume and typically repetitive characteristics of machine data, users often need to start by narrowing the set of potential search results using event-based search mechanisms and then, through examination of the results, choose one or more keywords to add to their search parameters. Timeframes and event-based metadata such as frequency, distribution, and likelihood of occurrence are especially important when searching machine data, but can be difficult to express as search criteria with current search tools.
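As a purely illustrative sketch of the two-step narrowing described above, the following Python fragment first restricts a set of time-stamped records to a timeframe, then computes term frequencies over the narrowed set so that a keyword can be chosen by examination. The record contents and the helper names (`RECORDS`, `in_timeframe`, `term_frequencies`, `with_keyword`) are hypothetical and are not drawn from any particular search tool.

```python
from collections import Counter
from datetime import datetime

# Hypothetical time-stamped machine data records: (timestamp, raw text).
RECORDS = [
    (datetime(2024, 1, 1, 9, 0, 5), "sshd login failed user=root"),
    (datetime(2024, 1, 1, 9, 0, 9), "sshd login failed user=admin"),
    (datetime(2024, 1, 1, 9, 1, 30), "cron job started id=42"),
    (datetime(2024, 1, 1, 13, 15, 0), "sshd login accepted user=alice"),
    (datetime(2024, 1, 1, 13, 20, 0), "disk usage warning volume=/var"),
]


def in_timeframe(records, start, end):
    """Keep records whose timestamp falls in the half-open range [start, end)."""
    return [(ts, text) for ts, text in records if start <= ts < end]


def term_frequencies(records):
    """Count how often each whitespace-separated term occurs in the records."""
    return Counter(term for _, text in records for term in text.split())


def with_keyword(records, keyword):
    """Keep records that contain the keyword as a whole term."""
    return [(ts, text) for ts, text in records if keyword in text.split()]


# Step 1: narrow the candidate results by timeframe (an event-based criterion).
window = in_timeframe(
    RECORDS, datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 10, 0)
)

# Step 2: examine term frequencies in the narrowed set to pick a keyword.
freqs = term_frequencies(window)

# Step 3: add the chosen keyword to the search parameters.
hits = with_keyword(window, "failed")
```

In this sketch the timeframe filter runs before any keyword is chosen, mirroring the order in which a user would narrow repetitive machine data; a practical system would rely on an index rather than a linear scan.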
Also, users often generate ad hoc queries to produce results from data repositories. In some cases, generating queries sufficient for retrieving the desired results may require an undesirably high level of knowledge about the data domain and/or the operation or structure of the data repository. Thus, systems related to the searching of relatively large sets of data are the subject of considerable innovation.