Advances in technology have reduced the cost of computers to the point where many events in a person's day are recorded by a computer. Such events are numerous and include, for example, transactions made by an individual. Computers store the data associated with the transactions they process, resulting in very large databases of information. Also, companies and individuals frequently use computers to record events related to a specific domain. For example, a meteorologist may enter into a computer database many records of data relating to weather occurrences.
One problem that arises is how to make efficient use of the tremendous amount of information in these databases. When the number of records in a database rises to a certain level, simply sorting the information in the database provides no meaningful results. While statistical analysis of the records in a database may yield useful information, such analysis must generally be performed by persons with advanced training in mathematics or computer science. Typically, these people are also needed to interpret the results of the analyses. Additionally, translating the statistical analysis of the information in a large database into a useful form is difficult. For example, a strategic business activity such as marketing may require analytical information to be converted into a form specifically suited to that activity. Difficulties in providing or obtaining information in a useful form may prevent the effective use of the information in a database and preclude the use of a possibly valuable data resource.
Organizations of all types commonly collect and store business and technical data in various types of databases. Strategic and/or technical knowledge may be contained in the databases. In some instances, based on many years of experience, experts are able to glean knowledge from databases existing in their particular domain of expertise. In the absence of such experts, however, strategically useful information may not be available to the organization controlling or accessing a given database. The inability to obtain this knowledge may be detrimental to the business objectives of the organization. For example, if a business cannot extract useful knowledge from the data it possesses, it will likely be at a competitive disadvantage compared to a business that can discover such knowledge. Thus, the ability to discover knowledge from data contained in databases would be a valuable asset to any organization.
Certain tools, such as data mining tools, are available to help a non-expert gain some knowledge from a database. Other tools, such as OLAP and multidimensional database analysis tools, help analysts validate hypotheses through interactive exploration. For example, some data analysis tools respond to queries input by the user. A query might be: “How many people within the database are between the ages of 30 and 35?” The data analysis tool examines all the records and selects those in which the age field falls within the age range specified by the query. Then, the tool simply counts the selected records. Query tools require the user to have extensive knowledge of the database domain, and the queries are generally very rigid in their structure. One example of a data mining tool is described in U.S. Pat. No. 5,933,818, entitled “Autonomous Knowledge Discovery System and Method.”
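The age-range query described above can be illustrated with a minimal sketch in Python. It is not taken from any particular query tool. The `Record` type, the `count_in_age_range` function, and the sample data are hypothetical names and values introduced only for illustration, and the sketch assumes that "between" includes both endpoints of the range.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical database record with an age field."""
    name: str
    age: int


def count_in_age_range(records, low, high):
    """Count the records whose age lies in [low, high].

    Treating both endpoints as inclusive is an assumption; the query
    wording does not say whether the bounds are included.
    """
    return sum(1 for record in records if low <= record.age <= high)


# Illustrative sample data, not drawn from any real database.
records = [
    Record("A", 29),
    Record("B", 30),
    Record("C", 33),
    Record("D", 35),
    Record("E", 41),
]

matches = count_in_age_range(records, 30, 35)
print(matches)  # prints 3
```

The sketch also shows the rigidity noted above: the user must already know that an age field exists and must state the bounds exactly, and the tool returns only a count rather than any discovered pattern.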
Data analysis tasks typically require skilled analysts and significant time, and steps that require manual intervention may introduce opportunities for error.