The low cost of data storage hardware has led to the collection of large volumes of data. Merchants, for example, generate and collect large volumes of data during the course of their business. To compete effectively, it is necessary for a merchant to be able to identify and use information hidden in the collected data. Where the merchant operates a call centre, this data could include operational parameters such as the average time a caller spends waiting and the geographic location from which each caller interacts with the merchant. The task of identifying this hidden information has proved very difficult for merchants.
Traditionally, analysis of data has been achieved by running a query on a set of data records stored in a database. The merchant or other party first creates a hypothesis, converts this hypothesis to a query, runs the query on the database, and interprets the results obtained with respect to the original hypothesis.
One disadvantage of this verification-driven hypothesis approach is that the merchant must form the desired hypothesis in advance. This is merely confirming what the merchant already suspects and does not provide the merchant with information which may be unexpected. Another disadvantage is that the merchant needs to have available the technical knowledge to formulate the appropriate queries.