The following notions are used in this application:
“Data management system” is an entity, which comprises one or more databases and/or data management systems, whereby the system is responsible for reading the data structures contained in the databases and/or data management systems and for changing these data structures.
“Data element” is an information structure, which can comprise other data elements or data elements that can be construed as atomic. For instance, in a relational database, data elements are represented by tables comprising rows. The rows comprise columns, which are typically atomic data elements.
“Database” is an information structure, which comprises one or more data elements, and the use of which is controlled by the data management system. The invention is applicable both in relational databases and in databases of other forms, such as in object oriented databases.
“Database Server” is a software process that manages the data of a database and through which applications can access and modify the data of the database.
“Database operation” is an event, during which data elements are read from the database, during which data elements of the database are modified, during which data elements are removed from the database, or during which data elements are added to the database.
“Database Catalogue” is a logical database within a database instance. A physical database can manage data of multiple database catalogues. Each database catalogue can act as a master or replica database node in a database synchronization environment.
“Database Schema” is the structure of a logical database, described in a formal language supported by the database management system (DBMS). In a relational database, the schema defines the tables, the columns in each table, and the relationships between columns and tables.
“Master database” is a logical database in a database synchronization system that contains the official version of synchronized/distributed data, such as data about significant financial transactions. The master database can have multiple replica databases in the network.
“Replica database” is a logical database in a database synchronization system that contains a full or partial copy of the master data.
“Synchronization” is the operation between replica and master database catalogues in which changed data is exchanged between the catalogues. In one known embodiment, this means propagating Intelligent Transactions from the replica to the master and subscribing to at least one publication to download changed data from the master to the replica [1].
“Push synchronization” is synchronization between replica and master database catalogues initiated by the master database server.
“Publication” is a set of data in a database catalogue that has been published in the master database for synchronization to one or multiple replica databases. A publication can contain parameters that are used to filter the data of the publication.
“Transaction” is a plurality of database operations acting on the data pieces or elements. A transaction is an atomic operation that is completed or discarded as a whole. A transaction can also comprise further transactions. A transaction may be, for example, a financial transaction.
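The atomicity described in the definition of “Transaction” can be illustrated with a minimal sketch. It uses Python's standard `sqlite3` module; the `account` table, the `transfer` function and the `balances` helper are hypothetical names chosen for this illustration and are not part of the described system.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from account `src` to `dst` as one atomic transaction.

    Returns True if the transaction was committed, False if it was discarded.
    """
    try:
        # Used as a context manager, the connection commits when the block
        # succeeds and rolls back every operation when it raises.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a {id: balance} dictionary."""
    return dict(conn.execute("SELECT id, balance FROM account"))


# Hypothetical example data: two accounts in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 0)])
conn.commit()
```

In this sketch a transfer that would overdraw the source account leaves both balances unchanged, because the two updates are completed or discarded as a whole.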
There are presently many different ways of auditing or verifying the health of a database. A typical way to audit a database is to use a separate application that analyses the consistency of the database. According to prior art solutions, audit information for database updates and the status of transactions in process is often written sequentially as audit records in an audit file. The audit file is typically used to restore the database to a consistent state following a system failure.
These audit files can also be used for verifying that the content of a database matches the combined effect of the transactions that have been committed in that database. A mismatch between the two may indicate a security breach, a technical malfunction, an error in an application program or a user error.
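The check described above, comparing database content with the combined effect of committed transactions, might be sketched as follows. The audit record layout (a `status` field and a list of `(operation, key, value)` tuples) and the use of a plain dictionary as the database are assumptions made for illustration only; real audit files have system-specific formats.

```python
_MISSING = object()  # sentinel for "key not present"


def replay(audit_log):
    """Rebuild the expected content from committed transactions only."""
    expected = {}
    for record in audit_log:
        if record["status"] != "committed":
            continue  # aborted or in-process transactions have no effect
        for operation, key, value in record["operations"]:
            if operation in ("insert", "update"):
                expected[key] = value
            elif operation == "delete":
                expected.pop(key, None)
    return expected


def find_mismatches(audit_log, database):
    """Return the sorted keys whose stored value differs from the replay."""
    expected = replay(audit_log)
    keys = set(expected) | set(database)
    return sorted(
        key
        for key in keys
        if expected.get(key, _MISSING) != database.get(key, _MISSING)
    )


# Hypothetical audit log with one aborted transaction.
audit_log = [
    {"status": "committed",
     "operations": [("insert", "acct-1", 100), ("insert", "acct-2", 50)]},
    {"status": "aborted",
     "operations": [("update", "acct-1", 0)]},
    {"status": "committed",
     "operations": [("update", "acct-2", 75)]},
]

consistent_db = {"acct-1": 100, "acct-2": 75}
tampered_db = {"acct-1": 100, "acct-2": 900}
```

Here `find_mismatches(audit_log, consistent_db)` finds nothing to report, while the tampered copy is flagged at `acct-2`. The sketch does not say which of the possible causes produced the mismatch.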
There is disclosed a prior art solution in U.S. Pat. No. 6,275,824 [4] which features an audit module that may validate the enforcement of the data privacy parameters in a database management system, by ad hoc queries or otherwise.
A further prior art publication is U.S. Pat. No. 5,982,890 [3], in which a remote monitor computer connected to distributed computers detects fraudulent data updates. “The monitor computer collects initial data of the databases of the distributed computers via the network to generate parities for data”. The parities are later compared to detect an inconsistency.
There is disclosed a prior art solution in U.S. Pat. No. 5,758,150 for auditing of the databases, where a migratory application processes the audit trail files of the remote computer or database to create a database of change. According to U.S. Pat. No. 5,758,150, the migratory application processes the audit trail until the database of change reaches a threshold size, at which point the data extract and transfer application shuts down the migratory application, processes the database of change and restarts the migratory application to begin creation of another database of change.
However, the prior art methods have disadvantages when databases must be audited securely and in a trustworthy manner. Most importantly, there is no automated method for securely auditing the databases of distributed systems that would be able to audit a replica database with unpredictable audit logic. In the prior art, a database audit is performed using separate audit application logic, which gives rise to uncertainties about the security and integrity of the system. In addition, in the prior art solutions the database server may prepare the desired results of the audit beforehand, taking advantage of the data or instructions for performing the audit that are situated in the database server, and thus falsify the results of the audit. Besides falsification, incomplete inspection is a further weakness of the prior art. When only data privacy parameters are enforced by inspecting them regularly, the database can still be corrupted by parties that do not violate data privacy. Parity inspection is a very limited method of inspection and requires continuous surveillance in order to guarantee inspection reliability. It is possible that a corrupting action changes an even parity to an odd parity and then back to an even parity between inspections.
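The parity limitation noted above can be illustrated with a short sketch. It assumes, purely for illustration, a single even/odd parity computed over all bits of a stored value; the parity scheme of the cited publication may differ, and the `parity` and `flip_bit` helpers are hypothetical.

```python
def parity(data: bytes) -> int:
    """Return 0 if the number of set bits in `data` is even, 1 if it is odd."""
    return sum(bin(byte).count("1") for byte in data) % 2


def flip_bit(data: bytes, index: int) -> bytes:
    """Return a copy of `data` with the bit at position `index` inverted."""
    out = bytearray(data)
    out[index // 8] ^= 1 << (index % 8)
    return bytes(out)


original = b"balance=100"
baseline = parity(original)        # parity recorded at the first inspection

once = flip_bit(original, 0)       # first corrupting change
twice = flip_bit(once, 9)          # second corrupting change, different bit

detected_after_one = parity(once) != baseline
detected_after_two = parity(twice) != baseline
```

A single bit flip changes the parity and would be detected, but after the second flip the parity equals the baseline again although the data still differs from the original. An inspection that runs only after both changes therefore sees nothing wrong.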
Additionally, relying on cryptography or another security mechanism alone to enforce data integrity is not always sufficient, because any cryptography-based security system can be cracked if an “unlimited” amount of computer processing power is applied or if a hacker finds a loophole in the security system.