The present invention relates to the generation and alteration of a database and in particular to a database where different versions of the data therein should be accessible.
A number of computer and data storage theories and practices exist, one of which is the so-called shadow paging principle. It is designed especially to address problems often encountered when a number of alterations, which may be interconnected, are to be performed on e.g. a database. If such an operation fails while the data are being altered, it may not be possible to regenerate the former data in order to attempt the alteration again. The database is then in an unknown and thus undesired state.
Shadow paging solves this problem by not overwriting or deleting data, but instead first copying and then altering all parts thereof which must be altered as a consequence of the desired alterations of the data. These new parts are stored separately from the former data. The actual data of the shadow paging principle are stored as a number of individually addressable data blocks, and a tree structure is generated whose lowest-level nodes, normally termed the leaves of the tree, point to these data blocks. Altering a data block requires copying it and performing the alteration on the copy. The address of this new data block is entered into a copy of the tree node that pointed to the former data block. This new tree node is also stored at a new address, and any node pointing to the former node is likewise copied and altered, and so on. This process is applied recursively until the root node of the tree has been processed.
This provides a new set of data blocks, of which some are new and some are unamended and thus old, as well as some old data blocks which will no longer be relevant once the commit operation has been successfully completed. Also, a new tree structure is provided, part of which is old and part of which is new. Each of these tree structures has a root node, and these root nodes are different.
The actual commit operation is finally performed by having an overall pointer to the actual tree structure, and thereby to the actual data structure, switch from the old root to the new root. The advantage of this is that the commit operation is indivisible and is performed in a single operation. This operation can hardly be interrupted part-way, so the fact that the overall pointer points to the new root means that the operation has been completed successfully. If this is not the case, the old root is still current, as are the old tree structure and the old data, and no alterations have been performed on the data.
This principle however has the disadvantage that upon a commit operation the old data will no longer be available. Thus, it will not be possible to actually retrieve an older version of the data.
The present invention relates to a solution to that problem.
In a first aspect, the invention relates to a method for storing information stored in one or more files on a permanent storage medium, the method comprising:
storing data transaction-wise according to the shadow paging principle,
but retaining, in a commit operation, the previous data and their physical storage on the storage medium together with a separate storage on the storage medium representing new data as changes to the previous data,
the previous data and the changes together constituting, upon commit, a new version of the data,
both the previous and the new versions of the data being separately accessible.
In the present context, "transaction-wise" will mean that a transaction is performed wherein all desired alterations to the data or database are assembled and performed in a single operation. This transaction is completed with the commit operation where e.g. the operator "commits" himself to the desired changes, whereafter these are performed on the data.
Upon the commit operation, a new version of the data is generated and stored separately from the older data in a manner so that both the new and old version of the data are separately accessible.
In the standard shadow paging principle, the old data are not accessible upon a successful commit operation.
According to the invention, the old version of the data is separately accessible.
Preferably, the data of a file are stored as a number of data blocks, and a change of the contents of the file (the previous data) results in a change of the contents of one or more of the data blocks.
Also, preferably, a single commit operation causes all changes, required by the transaction, to be applied to all of the one or more files.
During operation of a program accessing the database, it may be desired that at least a number of previous versions representing the maximum simultaneously outstanding transactions plus two are retained.
By a shadow paging principle it is normally meant that the data are stored as a number of individually addressable data blocks (normally the smallest individually addressable unit in the storage medium), addresses representing the physical storage of the individual data blocks being stored in a tree structure of one or more first data elements, and comprising:
a) identifying data blocks to be modified,
b) copying the identified data blocks,
c) performing the modification(s) on the copied data blocks,
d) storing the modified data blocks at addresses not coinciding with any of the addressable data blocks or any of the first data elements,
e) for each identified data block, identifying one or more of the first data elements of the tree structure from a root of the tree structure to the data block,
f) copying each identified first data element at an address not coinciding with any of the addressable data blocks or any of the first data elements,
g) replacing, in each copied first data element, the address of the identified data block or first data element with the address of the corresponding modified data block or first data element, and
h) providing a new root of the modified tree structure and having the new root represent the modified first data element corresponding to the first data element represented by the root of the tree structure.
If a first data element represents addresses of more than one data block having been altered by the procedure, preferably this first data element is only copied, altered and stored once.
A tree structure of the present type comprises a number of nodes (one of which is a root) each having at least two pointers pointing toward leaves or other nodes. Which pointer to choose will be determinable by the property of the desired leaf.
Normally, as mentioned, the commit operation comprises only step h) in shadow paging.
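Steps a) to h) can be illustrated with a minimal sketch. The `Store` class, the in-memory dictionary standing in for the storage medium, and the helper names `update`, `commit` and `read` are assumptions made purely for illustration and are not taken from the specification.

```python
from typing import Dict, List, Union

# A block is either user data (bytes) or a first data element (list of addresses).
Block = Union[bytes, List[int]]


class Store:
    """Hypothetical block store: address -> block, never overwriting an address."""

    def __init__(self) -> None:
        self.blocks: Dict[int, Block] = {}
        self.next_free = 0
        self.root = -1  # the overall pointer to the current root

    def write_new(self, block: Block) -> int:
        # Steps d) and f): store at an address not used by any existing block.
        addr = self.next_free
        self.next_free += 1
        self.blocks[addr] = block
        return addr


def update(store: Store, node_addr: int, path: List[int], data: bytes) -> int:
    """Copy the path from node_addr down to one leaf; return the copy's address."""
    if not path:
        # Steps b) to d): the modified copy of the data block gets a new address.
        return store.write_new(data)
    node = list(store.blocks[node_addr])  # step f): copy the first data element
    child = path[0]
    node[child] = update(store, node[child], path[1:], data)  # step g)
    return store.write_new(node)


def commit(store: Store, new_root: int) -> int:
    """Step h): switch the root pointer. The old root is returned, not discarded."""
    old_root, store.root = store.root, new_root
    return old_root


def read(store: Store, root: int, path: List[int]) -> Block:
    """Follow child indices from a given root down to a block."""
    addr = root
    for index in path:
        addr = store.blocks[addr][index]
    return store.blocks[addr]


# Build version 1 with two data blocks, then alter the second one.
store = Store()
store.root = store.write_new([store.write_new(b"A"), store.write_new(b"B")])
new_root = update(store, store.root, [1], b"B2")
old_root = commit(store, new_root)
```

Because nothing is overwritten, both `old_root` and `store.root` remain valid entry points, and the unmodified data block is shared between the two trees.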
One of the advantages of the shadow paging principle is that the depth of a tree describing a file is determined by the maximum size of the file and the block size of the underlying physical storage medium. The maximum depth of a tree can be expressed as
tmd = (log2(maxFileSize) - log2(blockSize)) / log2(blockSize / pointerSize)
If we assume a maxFileSize of 2**32, a blockSize of 512 and a pointerSize of 4, the maximum depth of a tree is less than or equal to 4. Thus, a memory of 4 GBytes may be described by a tree of depth 4, and the tree structure itself uses only 1/128th of the space of the memory.
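The arithmetic can be checked with a short sketch. The function name `max_tree_depth` is hypothetical, and rounding the quotient up to a whole number of levels is an assumption added here; the formula itself does not state it.

```python
import math


def max_tree_depth(max_file_size: int, block_size: int, pointer_size: int) -> int:
    """Evaluate the depth formula, rounded up to whole tree levels (assumption)."""
    block_index_bits = math.log2(max_file_size) - math.log2(block_size)
    bits_per_level = math.log2(block_size / pointer_size)
    return math.ceil(block_index_bits / bits_per_level)


# Values from the text: 4 GByte file, 512-byte blocks, 4-byte pointers.
depth = max_tree_depth(2**32, 512, 4)

# Each first data element holds this many pointers, so the lowest tree level
# needs one block of pointers per 128 data blocks.
pointers_per_block = 512 // 4
```

With these values the quotient is (32 - 9) / 7, roughly 3.29, which is within the stated bound of 4.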
In order not to store an altered data block or first data element at an address representing an existing data block or first data element, it is desired to maintain an updated knowledge of free addresses or free space on the storage medium. One manner of obtaining that knowledge is one where:
i) prior to step d), information is provided relating to the free addresses of the data storage medium which are not occupied by the data blocks and the first data elements,
j) step d) comprises storing the modified data blocks at free addresses and removing the addresses from the free addresses,
k) step f) comprises storing the modified first data elements at free addresses and removing the addresses from the free addresses.
One manner of determining which addresses are free is to have step i) comprise:
I) identifying at least substantially all addresses of the storage medium or a relevant part thereof and denoting these addresses free addresses,
II) for each root, identifying all first data elements and data blocks of the corresponding tree structure and removing the corresponding addresses from the free addresses.
In the present context, the relevant part of a storage medium may be a certain number of addresses thereof. Normally, other parts of the storage medium will be reserved for other purposes.
In that manner, updated knowledge is maintained and finding an unused address for the next altered data block or first data element is simple.
It is clear that in shadow paging or similar principles where data is copied and old data not actively deleted, the actual space taken up by the database will increase for each transaction. One solution may be to maintain the total space taken up by the data/database below e.g. a predetermined size. In this manner, the number of free addresses (when the total number of available addresses is known) may provide that information. If this limit is exceeded, a previous version of the data may be deleted and the pertaining addresses released for new altered data blocks or first data elements. Another solution is to simply maintain only a predetermined number of e.g. the latest versions of the data.
Thus, step II) may be performed only for a predetermined number of roots. In that manner, as only the data blocks and first data elements of these predetermined roots or versions are "reserved", the addresses of data blocks or first data elements of other versions will be released and free, and thereby potentially deleted over time as the pertaining addresses are selected for new altered data blocks and first data elements.
The number of root pointers (and thereby versions available) retained depends on the application area. This number ranges from 2 to any desired number and does not in principle require an upper limit. An application like a database server might retain only a few root pointers, whereas a backup application would desirably not impose any limit on the number of retained root pointers.
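Steps I) and II), restricted to a chosen set of retained roots, can be sketched as follows. The dictionary representation of the storage medium and the function names `reachable` and `free_addresses` are illustrative assumptions, as is the convention that a first data element is a list of addresses and a data block is a `bytes` value.

```python
from typing import Dict, Iterable, List, Set, Union

Block = Union[bytes, List[int]]


def reachable(blocks: Dict[int, Block], root: int) -> Set[int]:
    """Addresses of all first data elements and data blocks under one root."""
    seen: Set[int] = set()
    stack = [root]
    while stack:
        addr = stack.pop()
        if addr in seen:
            continue  # blocks shared between versions are visited once
        seen.add(addr)
        block = blocks[addr]
        if isinstance(block, list):  # a first data element: follow its pointers
            stack.extend(block)
    return seen


def free_addresses(
    blocks: Dict[int, Block],
    all_addresses: Iterable[int],
    retained_roots: Iterable[int],
) -> Set[int]:
    free = set(all_addresses)          # step I): start with everything free
    for root in retained_roots:        # step II): remove what each root uses
        free -= reachable(blocks, root)
    return free


# Two versions sharing data block 0: root 2 is the old tree, root 4 the new one.
blocks: Dict[int, Block] = {0: b"A", 1: b"B", 2: [0, 1], 3: b"B2", 4: [0, 3]}
free_both = free_addresses(blocks, range(8), [2, 4])
free_latest_only = free_addresses(blocks, range(8), [4])
```

Dropping root 2 from the retained set releases addresses 1 and 2, while the shared data block at address 0 stays reserved because the newer version still references it.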
The choice of the number of retained root pointers is a trade-off between the desire to retain old data and the capacity of the underlying storage medium. The important fact is that the number of retained versions can be limited to a predetermined number, thus limiting the storage capacity required and enabling reuse of storage blocks.
The limit on the number of retained root pointers enables the reuse of external data blocks: a data block can be reused when it is not referenced from any retained root pointer. The data blocks not referenced by the root pointers, directly or indirectly, are the free blocks, and these are described in free lists.
The data structure implementing a free list must allow efficient adding and removal of blocks to and from the list. The address of a data block is augmented with the type of the data it is pointing to, the possible types being a data block and a descriptor block (first data element). A data block contains data stored by the users but not interpreted by the system. A first data element contains pointers to either first data elements or data blocks. The augmented pointers are used in the tree describing files. The file access and maintenance routines have no use for the type information of the augmented pointers, but maintain it purely for the purpose of efficient handling of the free list.
Thus, step II) may be performed only for a predetermined number of roots, normally a number of the youngest versions.
In one situation, step I) comprises, upon a commit operation,
1) storing the addresses of the identified data blocks and first data elements together with a reference to an identity of the new version of the data,
2) providing information relating to free addresses of the storage medium prior to the commit operation, and
3) adding stored addresses referring to an identity of a predetermined prior version of the data to the information relating to the free addresses.
A version may be given any identification, but normally these will be numbered consecutively.
A predetermined number of versions of the data may be maintained available and step 3) may then comprise adding the stored addresses referring to a version generated prior to the predetermined number of versions.
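One possible reading of steps 1) to 3) is sketched below. It assumes that the addresses stored at each commit are those of the blocks superseded by that commit, that versions are numbered consecutively from 0, and that a fixed number `keep` of the youngest versions is retained. The class name `VersionedFreeList` and its fields are hypothetical.

```python
from collections import deque
from typing import Deque, Iterable, List, Set, Tuple


class VersionedFreeList:
    """Free list that regains superseded addresses once old versions expire."""

    def __init__(self, free: Iterable[int], keep: int) -> None:
        self.free: Set[int] = set(free)  # step 2): free addresses before commit
        self.keep = keep                 # number of youngest versions retained
        self.version = 0                 # version 0 is the initial data
        self.pending: Deque[Tuple[int, List[int]]] = deque()

    def commit(self, superseded: Iterable[int]) -> int:
        """Record the addresses replaced by the new version, then expire lists."""
        self.version += 1
        # Step 1): store the addresses together with the new version's identity.
        self.pending.append((self.version, list(superseded)))
        # Step 3): a list recorded at version w holds blocks used only by
        # versions older than w, so it is safe once the oldest kept version >= w.
        oldest_kept = self.version - self.keep + 1
        while self.pending and self.pending[0][0] <= oldest_kept:
            _, addresses = self.pending.popleft()
            self.free.update(addresses)
        return self.version


free_list = VersionedFreeList(free={5, 6, 7}, keep=2)
free_list.commit([1, 2])          # version 1 replaces blocks 1 and 2
free_after_first = set(free_list.free)
free_list.commit([4])             # version 2; version 0 is no longer retained
free_after_second = set(free_list.free)
```

After the first commit nothing is released, since version 0 is still retained. After the second commit, the blocks that only version 0 referenced become free again.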
When the method further comprises storing the addresses of the identified data blocks and first data elements in one or more second data elements stored in the storage medium, a number of advantages may be seen in e.g. the fact that the free list will increase if the number of versions, or the space required thereby, decreases, and vice versa.
Preferably, the second data elements are linked together in a linear list.
In a preferred embodiment, the method comprises:
identifying and reserving an existing version of the data, and
performing step 3) only after release of the reserved version.
In this manner, a reserved version will be maintained until released again. This means that new versions may be generated and stored, but that the data blocks and first data elements of the reserved version are not added to the free list until it is released.
Reserving a version has a number of advantages, such as when obtaining a snapshot of the data and when generating time consuming reports of the data. Reserving a version and then performing the reports thereon will not delay the access to further amendments of the data to the users.
Due to the fact that a reserved version may actually contain historical data which have subsequently been amended, it is preferred that a reserved version cannot be amended. Also, consistency of the data may be guaranteed if no amendments are performed to the reserved version; those amendments are to be seen in the later versions.
A version may be reserved by a number of users or for a number of purposes, and is only released when the version is no longer required by any of them. Subsequent to that, the addresses in the address list pertaining to the now released version may be added to the free list and subsequently reused in new versions.
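Reservation by several users can be sketched with a simple reference count. The class `Reservations` and its methods `reserve`, `retire` and `release` are hypothetical names; the sketch assumes that some other component decides when a version falls outside the retained set and then calls `retire` with the addresses pertaining to that version.

```python
from typing import Dict, Iterable, List, Set


class Reservations:
    """Defers freeing a version's addresses while any reservation is held."""

    def __init__(self) -> None:
        self.counts: Dict[int, int] = {}          # version -> open reservations
        self.deferred: Dict[int, List[int]] = {}  # version -> withheld addresses
        self.free: Set[int] = set()

    def reserve(self, version: int) -> None:
        self.counts[version] = self.counts.get(version, 0) + 1

    def retire(self, version: int, addresses: Iterable[int]) -> None:
        """Called when the version would normally be released to the free list."""
        if self.counts.get(version, 0) > 0:
            self.deferred[version] = list(addresses)  # hold until released
        else:
            self.free.update(addresses)

    def release(self, version: int) -> None:
        remaining = self.counts.get(version, 0) - 1
        if remaining > 0:
            self.counts[version] = remaining  # other users still need it
            return
        self.counts.pop(version, None)
        self.free.update(self.deferred.pop(version, []))


reservations = Reservations()
reservations.reserve(3)            # e.g. a long-running report
reservations.reserve(3)            # e.g. a snapshot copy
reservations.retire(3, [10, 11])   # version 3 ages out while still reserved
reservations.release(3)
free_after_one_release = set(reservations.free)
reservations.release(3)
free_after_both_releases = set(reservations.free)
```

The addresses only reach the free list after the last holder has released the version, which is what allows a time-consuming report to run against a stable version while newer versions continue to be committed.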
In order to ensure the integrity of the data even upon direct hostile access to the storage medium, it may be preferred that each data block is encrypted prior to storing. A DES encryption is presently preferred.
Another solution is one wherein, optionally or additionally, each first data element is encrypted prior to storing. Especially where both are encrypted, neither the data nor the structure thereof will be derivable by third persons.
As described above, the method preferably comprises collecting a number of desired changes to the data of the one or more files, preparing the new data by performing changes to the previous data and finally separately storing the new data by performing the commit operation.
In a second aspect, the invention relates to a method of generating a database, the method comprising:
providing one or more files comprising data,
storing the data of the files on a data storing medium as a number of individually addressable data blocks,
representing addresses of the data blocks in one or more first data elements organised in a tree structure having a root,
storing additional data in the database using the above-mentioned method.
The method may comprise copying a version of the database by identifying a relevant root of a tree structure and copying the tree structure and all data elements represented by first data elements thereof. Due to the version handling ease of the invention, copying of a version is simple.
This is also seen when the method comprises retrieving a version of the data. Then the method may simply comprise identifying a root relating to the desired version and retrieving the pertaining tree structure of first data elements and all data blocks the addresses of which are represented thereby. In that manner both the data and the structure are retrieved.
In a third aspect, the invention relates to a database generated according to the above method.