Field of the Invention
The present invention relates to a method and a system for archiving data from a source database to a target database.
A prior art method and system for storing data identities in a database structure, for example, in the database structure of an SQL server, includes storing value in the fields of a table having several records. The fields may be of various data types such as char, varchar, nvarchar, decimal, integer, real, and the like. Each data type has its own size which can be predetermined or variable. For example, ‘varchar’, ‘nvarchar’ data type is adapted to store value with variable size, while ‘char’, ‘decimal’, ‘integer’ data type store values with fixed size.
In this respect, a database management system (DBMS) allocates memory space of non-constant size for the fields having the data-type ‘nvarchar’. In some known implementation, for avoiding memory waste caused by using fields with constant size, the database management system allocates dynamically different size of the memory space, depending on the size of actual data to be stored, for instance depending on an application writing and reading the database.
However, a problem permanently existing in an industrial environment of a manufacturing execution system (MES) is the problem of inserting/merging data between different databases which are subject to physical and/or logical and/or application-caused constraints relating to the data persistency. In fact, in those scenarios a raw copy of the data identities between two databases would not achieve the desired result of a merged database because the existing constraints would not be adapted accordingly and any foreign key relations would not be preserved.
The same criticality applies to the archiving of data stemming for example from an industrial process. Also for this task, the problem of preserving data inconsistency during the archiving process is still hard to be resolved when considering also possible database engine limits. Further, it is often required that the archiving is executable as scheduled transfer operation which will divide the archiving operation into an arbitrary number of time-frames.
Another not less critical problem is the performance of the archiving operation since the transfer of large amounts of data usually has a non-linear behavior increasing with the amount of data to be archived. This is valid all the more for the archiving of hierarchical elements (i.e. elements characterized by father/child relations). Moreover, it is often required that the transfer and archive operation have to be associated with the deletion of the transferred data in the source database. In this case, caused by the intrinsic nature of the most popular database engine architectures, a specific procedure would be desirable locking only very few data tables each during the archiving in order to avoid a block of the source database operation thereby harming for example the execution of a production process managed by an MES environment.
For dealing with these challenges in data transfer operations, usually a backup approach is used while each database engine usually provides a simple backup/restore utility for this purpose. Starting from a source database, a portion of the data (or the database in its entirety) can be moved into a selected target destination. Unfortunately, these operations are performed without considering possible physical relations, such as foreign key relations or logical and application relations. Furthermore, if there are table hierarchies present in the source database, these approaches known in the prior art do not consider these table hierarchies but require the database administrator to specify all the tables involving such hierarchies whose data is required to be transferred. Moreover, these approaches often delete the previous data or require specific blank tables in order to avoid possible data collisions and transfer errors.
And last but not least, the option to subdivide the transfer operation in many time-frames thereby only locking few data tables is only manually achievable or requires the definition of an ad hoc SQL procedure which simply calls the backup functionalities for each specified time frame. Unfortunately, also this approach is still lacking the solution of the data consistency problem.