1. Technical Field
The present invention relates to design of relational data bases. In particular, the present invention relates to the problem of creating and maintaining semantic consistency in the presence of redundant data in a relational database.
2. Prior Art
The design of relational data bases is a well established discipline. The classical design of relational databases is based on the development of an information model, often in an Entity/Relationship (E/R)-Notation.
A major concern for the design is the avoidance of redundant information in order to avoid updates which may lead to inconsistent data.
However, the disadvantage of having no redundant information in the data base is that time critical applications may not be able to access the data quickly enough.
These opposing problems, performance optimisation through redundant data vs. maintaining the consistency of the redundant data, can not be addressed sufficiently through prior art approaches.
Only one prior art approach has been found which appears to be related to the subject of the current invention which is disclosed in U.S. Pat. No. 5,369,761. The invention disclosed therein is to save time on read operations by avoiding joins between different tables. It is proposed to simply copy fields from a source tablexe2x80x94in the normalized database layoutxe2x80x94into other tables where they may be used with the other fields thus avoiding excessive database join operations which are known to be costly.
Said prior art invention and the present invention target at different problems in different ways but both using not- normalized databases. The cited prior art approach is based upon the existence of a normalized database which is then stepwise de-normalized transparently by the database administrator. Upon every accessxe2x80x94read, insert, or updatexe2x80x94the query needs to be translated to access the fields.
The understanding, however, that normalized databases are not the most intuitive data model, but artifacts which need to be built to compensate for the lack of support to cope with redundancy in databases should be considered for the purpose of the present invention.
The more natural design and usage of databases, however, is based upon the original un-normalized database design. With the approach according to the present invention, there is no need of any modifications on read statements, because the applications accessing the database use the real existing database layout. For database usage profiles with a large percentage of reads compared to writes, this is a considerable performance gain.
Therefore, an object of the present invention is to provide an improved method and system for database access having a better performance in using redundant data.
It is another object of the present invention to provide such method and system having a broader area of application, i.e. which is applicable to database modifications initiated by any program interface accessing the database.
It is another object of the present invention to provide such method and system which avoids complicated procedures required on recovering data consistency in case of a system failure.
These objects of the invention are achieved by the features stated in enclosed independent claims. Further advantageous arrangements and embodiments of the invention are set forth in the respective subclaims.
The present invention is based on the fact that in relational database technology there is no formalised information about dependencies between fields of the data base in state of the art data base schema information. Thus, every application using the database needs to have built-in knowledge of these dependencies, or determine that by some other means.
According to a first aspect of the present invention it is proposed to provide a method and system for optimizing access to a database of a computer, in which said database has a schema information being extendible to comprise information on redundant data, and which method comprises the provision of formalized information about dependencies between fields of the database in order to consistently update redundant data contained in the database.
In particular, the method of the present invention is characterized by the steps of providing said formalized information as an extension to the existing database schema information.
This is achieved advantageously by providing a dependency model identifying source fields and respective derived fields, said model comprising the necessary information for dependent data to be identified and calculated on a change of respective source data according to some predetermined rules, further by recording changes, i.e., modifications to data involved in a redundancy, and propagating changes including cascaded changes directed to said derived fields using said rules defined in said dependency model. The step of recording changes is separated from the step of propagating changes to derived fields in order to improve performance.
Many advantages are achieved by this inventional approach:
As the inventional Dependency Model is concerned it provides the necessary information for redundant data to be identified and calculated in a timely manner. The Dependency Model contains information that identifies the dependency, the type of dependency, documents the rules for derivation (e.g. calculation, summation), identifies the set of source and destination fields of the redundant data. A general advantage of the Dependency Model is that it provides the base for a generic solution to the problem of redundant data.
With respect to the Change Recorder it records changes to data involved in a redundancy. Preferably, the Change Recorder is implemented as a database trigger. This has the advantage over other solutions that all changes are guaranteed to be recorded. Further, because all changes are written into a Change Table as a part of the original transaction that modifies the data preferably only the data changes that have been committed are recorded.
With reference to the Change Propagator it uses the rules defined in the Dependency Model and the entries in the Change Table to derive the value of the redundant data. If, according to a preferred feature of the inventional concepts the Change Propagator is implemented as a database job then the advantage results that all redundant data can be calculated and maintained in a central place.
A further advantage is that in modern relational databases the timing of such a job can be configured. Such a precisely timed change propagating job can advantageously be run either directly after the commit of the transaction involving the Change Recorder, so that a consistent data is always present, or after a predetermined time interval as e.g., every second or at the end of each month which depends on the actual business environment.
Further, as a great advantage over prior art any application logic regarding redundancy problems will be removed. This is an important factor as there are often a plurality of applications dedicated to process derived fields. This is achieved by the automatic propagation of changes to fields to the semantically dependent fields. By only having to update the source data and not the redundant data, said applications updating data involved in a redundancy can not corrupt the consistency of the data base. Thus, the applications are relieved from understanding semantical dependencies in the data base and from computing. The semantical dependencies are encapsulated in the Dependency Model and the logic for recording changes and distributing said changes properly to derived fields, is implemented in the Change Recorder and Change Propagator, respectively.