1. Field of the Invention
The present invention relates generally to systems and methods for managing consistency constraints in database middleware that interfaces with a variety of data sources.
2. Description of the Related Art
Consistency constraints are implemented in database systems to ensure that the stored data is "correct" with respect to the real world reflected by the data, such that applications and users of the data are shielded from incorrect data. For example, a database constraint might be that an employee cannot be paid more than the employee's manager, and any attempt to enter into the database a salary for the employee that is higher than the salary of the employee's manager would be forbidden by the database constraint manager by, e.g., aborting the transaction that caused the constraint violation.
Moreover, when a database system includes plural data sources, constraints might exist for ensuring that the data in the sources is consistent across the sources. To support consistency constraint management, the sources might notify a constraint manager every time an update has taken place, so that the constraint manager can access the source and check the new data for consistency. If a violation occurs, the constraint manager typically has authority to write a correction to the affected source, to remedy the constraint violation. Or, the constraint manager might from time to time monitor the sources for constraint violations by evaluating an entire data source for consistency on a periodic basis.
In any case, the present invention makes the critical observation that in the case of database middleware, i.e., systems that respond to external user-generated queries for information by accessing multiple data sources, the middleware might access many diverse types of data sources. Some of the sources might notify the middleware of updates, some might permit the middleware to only monitor the sources for updates, and some sources (e.g., Internet-based sources) might not support either notification or monitoring and in any case might not grant middleware the privilege to repair inconsistencies at all, even when the inconsistencies are identified, thus depriving the middleware of the authority to enforce constraints at the source level. The present invention makes the further critical observation that partially inconsistent data might nonetheless be useful to an application, as when, for example, residence data violating a zip code constraint but nonetheless properly listing telephone numbers is supplied to a telemarketer. Accordingly, the problem addressed by the present invention is how to manage consistency constraints in middleware that uses various types of data sources to respond to queries from various different users/applications, some of which might have constraint compliance requirements that are different than other users/applications.
A general purpose computer is programmed according to the inventive steps herein to manage consistency constraints. The invention can also be embodied as an article of manufacture (a machine component) that is used by a digital processing apparatus and which tangibly embodies a program of instructions that are executable by the digital processing apparatus to execute the present logic. This invention is realized in a critical machine component that causes a digital processing apparatus to perform the inventive method steps herein.
The invention can be implemented by a computer system including a general purpose computer and a query processor. Plural data sources are accessible to the computer such that the query processor can retrieve data from the sources in response to a user query, and logic is provided that is executable by the computer for undertaking method acts to determine whether a source generates notifications of data updates, and if so, receiving the notifications. The logic also includes determining whether the source permits monitoring of updates and whether monitoring is feasible, and if the source permits monitoring of updates and monitoring is feasible, the logic monitors for updates from the source. Otherwise, the logic envisions undertaking just in time checking for consistency, with the just in time checking including receiving a query from the query processor and checking for constraint consistency, prior to executing the query to return a query result.
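The selection among notification, monitoring, and just in time checking described above can be illustrated with a minimal sketch. It is not the claimed implementation; the names `CheckMode`, `SourceCapabilities`, and `select_check_mode` are hypothetical and are introduced only for illustration, and the capability flags are assumed to be known to the middleware for each source.

```python
from dataclasses import dataclass
from enum import Enum


class CheckMode(Enum):
    """Hypothetical labels for the three checking strategies."""
    NOTIFICATION = "notification"
    MONITORING = "monitoring"
    JUST_IN_TIME = "just_in_time"


@dataclass(frozen=True)
class SourceCapabilities:
    """Assumed per-source capability flags (illustrative only)."""
    name: str
    sends_notifications: bool
    permits_monitoring: bool
    monitoring_feasible: bool


def select_check_mode(source: SourceCapabilities) -> CheckMode:
    # A source that generates update notifications is handled by
    # receiving those notifications.
    if source.sends_notifications:
        return CheckMode.NOTIFICATION
    # Monitoring is used only when it is both permitted and feasible.
    if source.permits_monitoring and source.monitoring_feasible:
        return CheckMode.MONITORING
    # Otherwise consistency is checked when a query arrives,
    # before the query is executed.
    return CheckMode.JUST_IN_TIME


if __name__ == "__main__":
    web_source = SourceCapabilities("internet_source", False, False, False)
    print(select_check_mode(web_source).value)
```

Under these assumptions, a source that supports neither notification nor monitoring falls through to just in time checking, as does a source that permits monitoring where monitoring is not feasible.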
In a preferred implementation, the just in time checking includes virtual repair logic, which repairs inconsistencies in data after the data has been received from the source to render repaired data. Moreover, the preferred logic further includes sending the repaired data to the query processor, such that the query is executed by the query processor using the repaired data instead of the inconsistent data.
As set forth further below, the preferred logic includes identifying at least one data inconsistency during at least one of: the act of receiving the notifications, and the act of monitoring for updates from the source. Then, it is determined whether a repair is known for the inconsistency, and if so, whether a write interface to the data source of the inconsistency is provided. Also, it is determined whether authority to update the data source of the inconsistency has been granted, and if so, the inconsistency is repaired at the data source.
In contrast, if it is determined that a repair is not known for the inconsistency, the logic includes determining whether marking is requested, and if so, marking the inconsistency in the query result. Otherwise, the logic envisions nulling out the inconsistency or sending an alert with respect to the inconsistency to a user generating the query. The act of nulling out or alerting includes determining whether to null out or alert based on at least one of: the inconsistency, and the user. Thus, for example, an inconsistent zip code in a residence record containing a telephone number can be nulled out for a telemarketer requiring only a correct telephone number, whereas an alert of an inconsistent zip code can be sent to a direct mailer user requiring the same record.
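The repair decisions described in the two preceding paragraphs can be sketched as a single function. This is a hedged illustration rather than the claimed logic: the names `Action`, `Inconsistency`, and `choose_action` are hypothetical, the `user_wants_alert` flag stands in for the determination based on the inconsistency and the user, and the fallback to a virtual repair when a repair is known but cannot be written to the source is an assumption drawn from the repair of data at a location other than the data source.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """Hypothetical labels for the possible outcomes."""
    REPAIR_AT_SOURCE = "repair_at_source"
    VIRTUAL_REPAIR = "virtual_repair"
    MARK = "mark"
    NULL_OUT = "null_out"
    ALERT = "alert"


@dataclass(frozen=True)
class Inconsistency:
    """Assumed facts about one detected inconsistency."""
    repair_known: bool
    write_interface_provided: bool
    update_authority_granted: bool


def choose_action(inc: Inconsistency,
                  marking_requested: bool,
                  user_wants_alert: bool) -> Action:
    if inc.repair_known:
        # Repair at the source requires both a write interface and
        # authority to update the source.
        if inc.write_interface_provided and inc.update_authority_granted:
            return Action.REPAIR_AT_SOURCE
        # Assumption: otherwise the repair is applied away from the source.
        return Action.VIRTUAL_REPAIR
    # No known repair: mark the inconsistency if marking is requested.
    if marking_requested:
        return Action.MARK
    # Otherwise null out or alert, depending on the user.
    return Action.ALERT if user_wants_alert else Action.NULL_OUT


if __name__ == "__main__":
    bad_zip = Inconsistency(False, False, False)
    # Telemarketer needs only the telephone number: null out the zip code.
    print(choose_action(bad_zip, False, False).value)
    # Direct mailer needs the address: send an alert instead.
    print(choose_action(bad_zip, False, True).value)
```

In this sketch the same inconsistent zip code yields a different outcome for the telemarketer and for the direct mailer, mirroring the example in the text.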
In another aspect, a computer-implemented method for constraint management in a database middleware system includes accessing at least one data source supporting neither update notification nor monitoring by the middleware system. The method includes receiving a query requiring accessing of the data source, and checking for constraint consistency of data from the data source after receiving the query and prior to executing the query to return a query result.
In still another aspect, a computer program device includes a computer program storage device readable by a digital processing apparatus and a program on the program storage device and including instructions executable by the digital processing apparatus. The instructions include code means for receiving a query for data, and code means for communicating with a data source in response to the query. Code means check for data constraint inconsistencies in data stored at the data source, prior to executing the query. Also, code means repair any inconsistencies at a location other than the data source, and code means execute the query, after inconsistencies have been repaired.
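The sequence of receiving a query, checking the source data for constraint inconsistencies, repairing them away from the source, and then executing the query can be sketched as follows. This is an illustrative sketch only: the function names, the record layout, the five-digit zip code constraint, and the choice to null out an invalid zip code are all assumptions made for the example and are not taken from the specification.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, Optional[str]]


def zip_is_valid(record: Record) -> bool:
    """Assumed constraint: a zip code is a string of exactly five digits."""
    z = record.get("zip")
    return isinstance(z, str) and len(z) == 5 and z.isdigit()


def null_out_zip(record: Record) -> Record:
    """Assumed repair: return a copy with the zip code nulled out."""
    repaired = dict(record)  # copy, so the source data is left untouched
    repaired["zip"] = None
    return repaired


def just_in_time_query(fetch: Callable[[], List[Record]],
                       constraint: Callable[[Record], bool],
                       repair: Callable[[Record], Record],
                       predicate: Callable[[Record], bool]) -> List[Record]:
    # Communicate with the data source in response to the query.
    rows = fetch()
    # Check each record and repair violations in the middleware,
    # that is, at a location other than the data source.
    repaired = [r if constraint(r) else repair(r) for r in rows]
    # Execute the query only after inconsistencies have been repaired.
    return [r for r in repaired if predicate(r)]


if __name__ == "__main__":
    source_rows = [
        {"name": "A", "phone": "555-0100", "zip": "95120"},
        {"name": "B", "phone": "555-0101", "zip": "9512X"},
    ]
    result = just_in_time_query(lambda: source_rows, zip_is_valid,
                                null_out_zip,
                                lambda r: r["phone"] is not None)
    for row in result:
        print(row)
```

Because the repair operates on a copy, the record held by the source keeps its original, inconsistent zip code while the query result carries the repaired value.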
The details of the present invention, both as to its structure and operation, can best be understood in reference to the accompanying drawings, in which like reference numerals refer to like parts, and in which: