U.S. Pat. No. 5,555,403, which is incorporated by reference herein, discloses techniques for associating user-friendly business names to database entities. Those business names (sometimes referred to as “business objects”) may then be used to form a query applied to a database. In particular, one or more SQL statements are formed to retrieve database values. For instance, on might associate the business objects “Revenue” and “Customer” and retrieve the revenue generated by each customer across all products and time periods. One could also associate the objects “Product” and “Revenue” and retrieve the revenue generated by each product across all customers and time periods. Such queries can also specify restrictions and allow, for instance, for retrieving the revenue for each customer for a specific product.
Frequently, the intuitive meaning of a query associating two or more objects is very clear and unambiguous. However, there are cases where a single association of multiple business objects may have different plausible meanings. For instance, assume that customers can rent products or buy them (or both). A query associating the two objects “Customer” and “Product” would be ambiguous. One may want to retrieve some association between customers and products, but this query alone does not tell if the user wants to know which customers have bought which products, or which customers have rented which products, or both. The computer system cannot solve this ambiguity. Instead, a user needs to provide some additional information and to choose between the different possible interpretations of the query.
In order to enable this, a system may rely upon the notion of context. A context is a list of joins between tables. A context allow for more precision when forming a query. In the previous example, the designer may have authored a context of sales and a context of rentals. By picking one context, or both, the user can specify which of the possible interpretations of the query was contemplated.
Today, a context works in the following manner. Each business object used in a query relates to one or several tables in a database. Any two database tables may be related by zero, one or more “joins”, that represent ways of relating data from one table with data from the other. For instance, an object “Customer” may be related to a database table “CUST”. The object “Products” may be related to the table “PROD”. A measure “Rentals” may be related to FACT_RENT and a measure “Revenue” may be related to the table FACT_SALES. The two tables “FACT_RENT” and “FACT_SALES” are fact tables: each of them contains foreign keys to both the “CUST” and “PROD” dimension tables. Thus, in this example there are four joins.
In order to retrieve the data for a query, one has to select which of the “joins” will be used to combine data from the different tables associated with the objects in the query. In mathematical terms, one wants to relate all the tables pertaining to objects in a query using a list of joins, so that the resulting graph is a tree (a graph without loops, in which exactly one list of joins relates any two tables). For instance, when analyzing the query “Customer, Revenue”, the two tables CUST and FACT_SALES will be selected. The system will then look for joins that can relate the tables. In particular, the system will attempt to find the join that relates the customer foreign key in FACT_SALES to the customer information in table CUST.
An ambiguity exists in a query if more than one possibility exists, i.e, if there are multiple joins that could be used to relate all the tables in a query. This is the case for the query (“Customer, Product”), because two lists of joins could be used to relate the tables CUST and PROD. One could either use the two joins that go through FACT_SALES, or the two joins that go through FACT_RENT. Such situations occur each time the graph that contains tables as nodes, related by joins as edges, has a loop.
In traditional tools, a context is a list of joins. By specifying a context, one specifies that all the joins used to relate the tables underlying objects in a query must belong to this context. If the context is correctly built, it contains only one list of joins that can accommodate a given query. In the previous example, a context called “sales” would only contain the two joins that relate the FACT_SALES table with the two dimension tables CUST and PROD. By choosing this context, all the joins used to relate elements of a query belong to this context. Therefore, the query “Customer, Product” becomes unambiguous.
The current notion of context, however, suffers from a major drawback. A context is a list of joins. If new tables with new join possibilities are added to the underlying database, they also need to be added to all existing contexts. Consequently, the cost of maintaining contexts is high.
Moreover, in order to cover all possible cases, one needs to secure enough contexts to resolve all loops. The number of requested contexts may become prohibitive when dealing with large database schemas or with federated multi-source schemas. It is not uncommon to have to accommodate hundreds of contexts. The cost of creating and maintaining these contexts can become unbearable.
Accordingly, it would be desirable to introduce a new technique for processing contexts. Such a technique should reduce the number of contexts and the cost of creating and maintaining contexts. Such a technique should be largely automated so that the maintenance of contexts is simplified.