1. Field of Use
The present invention relates to data processing systems and more particularly to database management systems.
2. Prior Art
Typically, today's enterprise or legacy systems store large quantities of data in database systems accessed by database management system (DBMS) software. In such database systems, data is logically organized into relations or tables wherein each relation can be viewed as a table where each row is a tuple and each column is a component of the relation designating an attribute. It has become quite common to use relational database management systems (RDMS) for enabling users to enter queries into the database in order to obtain or extract requested data. To extract desired data, the user enters a query derived from a database query language into the RDMS.
One well known query language is Sequel Query Language (SQL). The query language provides a set of commands for storing, retrieving and deleting data. Such language for relational database management systems does not require users to specify the manner in which data needs to be accessed. Generally, a query optimizer component is included in the database management system to select the manner in which queries will be processed. That is, the query optimizer component analyzes how best to conduct the user's query of the database in terms of optimum speed in accessing the requested data (i.e., provides an access plan for executing an SQL statement including the type of access to each table, order of access, whether any sorts or joins are performed and related information).
As known in the art, to conserve space in the database, it becomes desirable that the stored data values not be unnecessarily duplicated. Therefore, in a relational database, instead of having one very large table to hold duplicate data values, the user generally creates several smaller tables that contain unique data values that are related to each other through common attributes. A user can retrieve data for a given set of criteria by "joining" the smaller tables to represent the large table. Data is extracted from the system using a query command that locates all rows in a table or tables that meet specified criteria. In such systems, one of the most expensive operations to conduct in terms of time and resources is the "join operation" which joins together two or more entire relations which can be fairly large. When multiple joins are present in a user generated query, as is the typical case, the cost of processing the query increases dramatically. Thus, the time expended in developing, optimizing and processing complex queries can be exceedingly costly in terms of time and resources. This is particularly true as more and more complex data types.
To improve system performance in executing programs that use SQL statements to access such databases, one approach has been the introduction of an SQL cache which makes it possible to reuse the results of having processed certain types of complex query statements. This involves associating each process (i.e., each program in execution) with a local cache which is used to save the code generated for each SQL statement by such process.
The above approach has allowed the system in certain cases to bypass the step of code generation when the system encounters an identical SQL statement during the execution of a particular process. Since programs that access such database files repeatedly execute identical SQL statements, the use of an SQL cache with each process has been found to significantly improve performance. An example of a system that makes use of an SQL cache is the INTEREL product developed and marketed by Bull HN Information Systems Inc. For information concerning this product, reference may be made to the publication entitled, "Database Products INTEREL Reference Manual INTEREL Performance Guidelines, Copyright, 1996 by Bull HN Information Systems Inc., Order No. LZ93 Rev01B.
To improve performance of the above type of system, a multicache system organization was developed which enables a plurality of processes to utilize code generated in response to statements processed by each of the other processes which are stored in a central cache system. This substantially more efficient method of improving cache performance through the use of a central cache system is disclosed in the parent patent application entitled, "A Method and Apparatus For Improving the Performance of A Database Management System Through A Central Cache Mechanism".
While the above approach provides improved performance, it was noted that changes in the states of the relational database objects could have a considerable performance impact on overall system operation. That is, changes in such objects can easily render various associated cache entries obsolete. The use of obsolete entries can produce incorrect results from database entries, create software-triggered hardware faults or other exception conditions or even cause corruption in databases.
Hence the user is faced with several options which include (1) manually clearing such caches, (2) avoiding the use of the data definition language (DDL) facilities used in making such changes when such a cache is active or (3) run the risk of using obsolete entries. The first option is quite time consuming in terms of system overhead particularly in the case of large complex databases while the second option limits the user in terms of when such changes or updates can be made to the database. The last option could markedly reduce the integrity of the database.
Accordingly, it is a primary object of the present invention to provide a more efficient way of removing cache entries that have been made obsolete by changes made to database objects.
It is a more specific object of the present invention to provide a mechanism for efficiently detecting and removing obsolete cache entries.