Large data sets are now commonly used in business organizations. In fact, so much data has been gathered that responding to even a simple question about the data has become a challenge. The modern information revolution is creating huge data stores that, instead of offering increased productivity and new opportunities, are threatening to drown the users in a flood of information. Tapping into large databases for even simple browsing can result in an explosion of irrelevant and unimportant facts. Even people who do not ‘own’ large databases face the overload problem when accessing databases on the Internet. A large challenge now facing the database community is how to sift through these databases to find useful information.
Existing database management systems (DBMS) reliably store data and retrieve it using a data access language, such as Structured Query Language (SQL). One major use of database technology is to help individuals and organizations make decisions and generate reports based on the data contained in the database.
In these databases it is usual to relate data in various tables using joins that allow the data to be accessed in different ways. The manner of performing such joins is well understood, but as the data being analyzed becomes increasingly complex, there are several ways in which information can be misinterpreted. For example, one such mechanism results in the double counting of data. In these more complex data environments, it is well known to use modeling software applications to provide a convenient mechanism to relate the data in ways that make most sense to the users. Such modeling applications are intended to minimize the knowledge required of a user to make appropriate queries of the data. However, in some cases, the very nature and complexity of the data and its structure have meant that the user is required to have considerable knowledge of the actual structure of the data. What is needed is a way to reduce this requirement.
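By way of illustration only, the double counting referred to above can arise when a measure held in one table is summed after a join to a table holding several matching rows. The following minimal sketch uses Python's standard sqlite3 module. The table names (orders, shipments), the column names, and the values are hypothetical and are not taken from any particular system.

```python
import sqlite3


def double_counting_demo():
    """Return (naive_total, correct_total) for a one-to-many join.

    The schema and data are hypothetical: each order has one amount,
    but an order may be fulfilled by more than one shipment.
    """
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.executescript(
        """
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL);
        CREATE TABLE shipments (shipment_id INTEGER PRIMARY KEY,
                                order_id INTEGER);
        INSERT INTO orders VALUES (1, 100.0), (2, 50.0);
        -- Order 1 is split across two shipments; order 2 has one.
        INSERT INTO shipments VALUES (10, 1), (11, 1), (12, 2);
        """
    )

    # Naive query: the join repeats order 1 once per shipment, so its
    # amount is added twice when the joined rows are summed.
    naive_total = cur.execute(
        """
        SELECT SUM(o.amount)
        FROM orders AS o
        JOIN shipments AS s ON s.order_id = o.order_id
        """
    ).fetchone()[0]

    # Correct query: sum the measure at its own level of detail.
    correct_total = cur.execute(
        "SELECT SUM(amount) FROM orders"
    ).fetchone()[0]

    conn.close()
    return naive_total, correct_total


if __name__ == "__main__":
    naive, correct = double_counting_demo()
    print("naive join total:", naive)
    print("correct total:", correct)
```

In this sketch the joined query reports 250.0, whereas the true total of the order amounts is 150.0. A user can avoid the error only by knowing that the relationship between orders and shipments is one-to-many, which is the kind of structural knowledge discussed above.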