A prediction query for data mining (DM) applies a prediction model to transactional data, or other kinds of data, and generates predictive results that can serve as the basis for sound business decisions in marketing, operations, budgeting and many other areas. The advantages and capabilities of data mining are similar to those of On-Line Analytical Processing (OLAP), but DM goes considerably further. Like OLAP, DM exists to help one obtain qualitative information from otherwise dry, transactional data. While OLAP achieves this by optimizing drill-down queries and letting users observe patterns in data, DM actively analyzes data and determines patterns on its own. DM is based in part on artificial intelligence (AI) principles and algorithms, and is also based heavily on statistics. DM is relevant to a variety of applications, including, but not limited to, client/server applications and services, data warehousing, web site personalization, on-line customer assessment and fraud detection.
FIG. 1 illustrates an exemplary prior art user interface for a relational query builder 30. For instance, join operations between relational tables 40 and 42 can be specified, and automatic mappings 44 are created between tables 40 and 42. Grid view 50 enables a user to select, e.g., “drag and drop,” columns from any of the tables to the grid in order to build a join query in a relational system. Relational query builder 30 thus provides a standard way to build relational queries; however, to date, there is no standard way to build a prediction query.
An application or object that allows prediction models to be built using data mining algorithms is sometimes called a prediction query builder or generator. A prediction query builder typically can be applied to databases of a variety of kinds and sizes. In this regard, a prediction query builder enables the incorporation of predictive data mining models (DMMs) from wherever they may be located. A DMM is like a relational table, except that it typically includes special columns that can be used for training on data and for making predictions, i.e., the DMM enables both the creation of a prediction model and the generation of predictions. Whereas a standard relational table stores raw data, however, a DMM stores the patterns discovered by the particular data mining algorithm that was utilized.
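The distinction between storing raw data and storing discovered patterns can be illustrated with a minimal Python sketch. Everything in it is hypothetical and chosen only for illustration: the class name `TinyMiningModel`, the columns `age_band` and `buys_bike`, and the simple frequency-count "algorithm," which merely stands in for whatever data mining algorithm is actually utilized.

```python
from collections import Counter, defaultdict


class TinyMiningModel:
    """Hypothetical stand-in for a DMM: keeps learned patterns, not training rows."""

    def __init__(self, input_column, predict_column):
        self.input_column = input_column      # column used as the model input
        self.predict_column = predict_column  # column the model learns to predict
        # Pattern store: input value -> counts of the outcomes seen with it.
        self.patterns = defaultdict(Counter)

    def train(self, rows):
        # Only aggregate counts are retained; the rows themselves are discarded.
        for row in rows:
            self.patterns[row[self.input_column]][row[self.predict_column]] += 1

    def predict(self, value):
        # Return the most frequent outcome seen for this input value, if any.
        counts = self.patterns.get(value)
        if not counts:
            return None
        return counts.most_common(1)[0][0]


# Illustrative training data (assumed, not taken from any real source).
training_rows = [
    {"age_band": "young", "buys_bike": "yes"},
    {"age_band": "young", "buys_bike": "yes"},
    {"age_band": "young", "buys_bike": "no"},
    {"age_band": "senior", "buys_bike": "no"},
]

model = TinyMiningModel("age_band", "buys_bike")
model.train(training_rows)
```

After training, the model holds only per-value outcome counts, which is the sense in which a DMM resembles a table yet stores patterns rather than the underlying records.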
A prediction join operation is an operation that is mapped to a join query between a trained data mining model and a designated input data source so that one can generate a tailored prediction result. The prediction result can then be stored, interpreted, output or displayed in a variety of formats.
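As a rough illustration of the prediction join described above, the following Python sketch joins a trained model to an input data source through an explicit column mapping. The function `prediction_join`, the dictionary layout of `trained_model`, and all column names are assumptions made for this example and do not reflect any particular product's interface.

```python
def prediction_join(model, input_rows, column_mapping):
    """Join a trained model to input rows and attach a prediction to each row.

    column_mapping maps each model input column to the source column that
    supplies its value, playing the role of the join condition.
    """
    results = []
    for row in input_rows:
        # Translate source column names into the model's column names.
        model_inputs = {m_col: row[s_col] for m_col, s_col in column_mapping.items()}
        # Look up the stored pattern that matches this combination of inputs.
        key = tuple(model_inputs[col] for col in model["input_columns"])
        prediction = model["patterns"].get(key)  # None when no pattern matches
        # Emit the original row plus the predicted column.
        results.append({**row, model["predict_column"]: prediction})
    return results


# Hypothetical trained model: patterns keyed by tuples of input values.
trained_model = {
    "input_columns": ["age_band"],
    "predict_column": "buys_bike",
    "patterns": {("young",): "yes", ("senior",): "no"},
}

# Hypothetical input data source whose column names differ from the model's.
new_customers = [
    {"customer_id": 1, "AgeGroup": "young"},
    {"customer_id": 2, "AgeGroup": "senior"},
    {"customer_id": 3, "AgeGroup": "middle"},
]

predictions = prediction_join(trained_model, new_customers, {"age_band": "AgeGroup"})
```

The mapping argument is the part that must be stated unambiguously: each model column has to be paired with exactly one source column before any prediction can be generated.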
Whatever platform is used to interact with the data, a DM engine accesses the data to be mined by formulating a query according to the format of the platform, e.g., SQL Server, in which the data is stored. Regardless of the platform, describing a prediction query in an unambiguous way can be challenging, and creating prediction queries from scratch can therefore be a complex, tedious and error-prone process. Among the data mining tools currently available in the marketplace, no product provides a simple, graphical way to build a prediction query. Thus, there exists a need in data mining products for a tool that can assist a user in building and executing a data mining prediction query simply, easily and in a standard manner. There is a further need for a prediction query builder that allows a user to build and execute data mining queries in a manner similar to building and executing relational join queries. There is thus a need for improvement over these and other deficiencies of the prior art.