Modern databases can contain hundreds of gigabytes, if not many terabytes, of data, most of which is stored in large tables. Consequently, despite the computing power of modern computer platforms, queries against large databases can take a considerable amount of time to complete; run times on the order of several hours are not unusual. Queries against large databases are computationally intensive both because of the volume of data that must be processed and because the data manipulation needed to answer a query can itself be complex and expensive.
A number of techniques have been developed to reduce the computation and overhead needed to extract useful information from a database, including materialized views and query optimization. A materialized view, also known as an indexed view, is a mechanism for expediting query processing. More specifically, a materialized view is a pre-computed result that can be used to answer a series of queries, an individual query, or a sub-query, rather than computing the result ab initio from base tables each time the series of queries, individual query, or sub-query is executed against the database. Currently, most major database systems support materialized views. Query optimization, on the other hand, involves rewriting a user's query into an optimized query by substituting equivalent but less computationally intensive queries or expressions. Used together, these techniques can yield significant efficiency gains and, consequently, an appreciable reduction in the computational power and time needed to render query results.
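The effect of answering a query from a pre-computed result rather than from base tables can be illustrated with a minimal sketch. The sketch below uses Python's built-in sqlite3 module; because SQLite has no native materialized views, an ordinary summary table stands in for one. The table names `sales` and `mv_sales_by_region` and the sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('east', 10), ('east', 20), ('west', 5);

    -- Hypothetical stand-in for a materialized view: the aggregate is
    -- computed once and stored as ordinary rows.
    CREATE TABLE mv_sales_by_region AS
        SELECT region AS region, SUM(amount) AS total
        FROM sales
        GROUP BY region;
""")

# Original query: scans and aggregates the base table on every execution.
from_base = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

# Equivalent rewritten query: reads the pre-computed rows instead.
from_view = conn.execute(
    "SELECT region, total FROM mv_sales_by_region ORDER BY region"
).fetchall()

print(from_base == from_view)
```

In this sketch the stored result is never refreshed; a database system that supports materialized views must also keep the stored result consistent with its base tables.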
Although materialized views, either alone or in combination with query optimization, can dramatically reduce query processing time, the combined benefit is realized only when the query optimizer is able to determine whether and how a query or sub-query can be computed from one or more materialized views. The problem of determining whether and how a query or sub-query can be computed from a materialized view is known as the view matching problem.
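The "whether and how" of view matching can be made concrete with a deliberately simplified sketch. It assumes single-table queries and views whose selections are conjunctions of column-equals-constant predicates, which is far narrower than what a real optimizer handles. The names `SelectProject` and `match_view`, and the example tables and columns, are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

Predicate = Tuple[str, object]  # (column, value), read as "column = value"


@dataclass(frozen=True)
class SelectProject:
    table: str                       # relation the expression reads from
    columns: FrozenSet[str]          # projected output columns
    predicates: FrozenSet[Predicate] # conjunction of equality predicates


def match_view(query: SelectProject, view: SelectProject,
               view_name: str) -> Optional[SelectProject]:
    """Return a rewrite of `query` over the view, or None if no match."""
    if query.table != view.table:
        return None
    # "Whether": every view predicate must also be a query predicate,
    # otherwise the view may have discarded rows the query needs.
    if not view.predicates <= query.predicates:
        return None
    # "How": predicates the view did not apply must be applied on top of it,
    # so the view has to expose the columns they reference.
    residual = query.predicates - view.predicates
    needed = query.columns | {column for column, _ in residual}
    if not needed <= view.columns:
        return None
    return SelectProject(view_name, query.columns, residual)


query = SelectProject(
    "orders", frozenset({"id", "total"}),
    frozenset({("status", "open"), ("region", "east")}))
open_orders = SelectProject(
    "orders", frozenset({"id", "total", "region"}),
    frozenset({("status", "open")}))
narrow_view = SelectProject(
    "orders", frozenset({"id", "total"}),
    frozenset({("status", "open")}))

rewritten = match_view(query, open_orders, "open_orders")
no_match = match_view(query, narrow_view, "narrow_view")
```

Here `rewritten` reads from `open_orders` and applies only the remaining `region` predicate, while `narrow_view` is rejected because it does not expose the `region` column needed for that predicate.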
To date, systems and methods for view matching have been confined to materialized views defined by expressions consisting of projections, selections, and inner joins with an optional group-by operator on top (PSJG views). In contrast, view matching for materialized views defined by projections, selections, and outer joins with an optional aggregation, or group-by, operator on top (PSOJG views) has yet to be satisfactorily addressed. Because many user queries and sub-queries use both inner joins and outer joins, and because no solution to the view matching problem exists for PSOJG views, the potential efficiency gains offered by such materialized views have not been available to these types of queries.
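One reason outer-join views call for separate treatment is that they contain null-extended rows that an inner-join result does not. The sketch below, again using Python's sqlite3 module with a plain table standing in for a materialized view, shows a single hand-picked case in which an inner-join query can be answered from an outer-join view by discarding those rows. It is an illustration under assumed, hypothetical table names (`customers`, `orders`, `v_cust_orders`), not a general matching method.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (cid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (oid INTEGER PRIMARY KEY, cid INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (100, 1, 50);

    -- Hypothetical stand-in for an outer-join materialized view. Bob has no
    -- orders, so his row is null-extended on the orders columns.
    CREATE TABLE v_cust_orders AS
        SELECT c.cid AS cid, c.name AS name, o.oid AS oid, o.amount AS amount
        FROM customers c LEFT OUTER JOIN orders o ON o.cid = c.cid;
""")

# Inner-join query evaluated directly against the base tables.
inner_from_base = conn.execute("""
    SELECT c.cid, c.name, o.oid, o.amount
    FROM customers c JOIN orders o ON o.cid = c.cid
    ORDER BY c.cid
""").fetchall()

# Same result from the view: oid is a key of orders and therefore never NULL
# in a matched row, so "oid IS NOT NULL" removes exactly the null-extended rows.
inner_from_view = conn.execute("""
    SELECT cid, name, oid, amount
    FROM v_cust_orders
    WHERE oid IS NOT NULL
    ORDER BY cid
""").fetchall()

view_row_count = conn.execute("SELECT COUNT(*) FROM v_cust_orders").fetchone()[0]
print(inner_from_base == inner_from_view, view_row_count)
```

The view holds two rows while the inner join yields one, so a matcher that treated the outer-join view as if it were an inner-join view would return an incorrect result.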