This invention relates to data file storage apparatus.
The invention is particularly, although not exclusively, concerned with a data file store for use as a work-in-progress store in a transaction processing system.
In a transaction processing system, the operation of the system is divided into units, referred to as transactions. Each transaction is atomic, in the sense that, at the end of the transaction, either all the data updates associated with the transaction must be applied to their respective databases, or none, depending on whether the transaction is deemed to have been successfully completed. That is, it is not permissible to apply only some of the updates.
To ensure atomicity of transactions, it has been proposed to use a special store, referred to as the work-in-progress (WIP) store, to keep a log of data updates that have been initiated by the transactions, but which have not yet been confirmed as committed (see our co-pending European patent application No. 92304728.6). Such a store has the property that, in normal use, the only operations on the store are to write data or to delete data: reading of the store is required only in the event of a failure, requiring recovery of any transactions that have not yet committed their data updates.
Two different methods have been proposed for organising such a store: random access and serial file.
In the random access technique, each session is allocated an area of file store for holding the required log of data updates. When the data recorded for the session is no longer required, or is updated, the old data is invalidated by overwriting it either with the new data or with a free pattern.
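The random access organisation just described can be illustrated with a minimal Python sketch. The class name `RandomAccessStore`, the zero-byte free pattern and the fixed `area_size` are assumptions made purely for illustration; none of them comes from the proposals referred to above.

```python
FREE = b"\x00"  # assumed free pattern; the real pattern is not specified


class RandomAccessStore:
    """Hypothetical store giving every session one fixed-size area."""

    def __init__(self, n_sessions: int, area_size: int) -> None:
        self.area_size = area_size
        # The whole file store starts out filled with the free pattern.
        self.store = bytearray(FREE * (n_sessions * area_size))

    def _offset(self, session: int) -> int:
        return session * self.area_size

    def write(self, session: int, data: bytes) -> None:
        """Invalidate the old data by overwriting it with the new data."""
        if len(data) > self.area_size:
            raise ValueError("data exceeds the area allocated to the session")
        start = self._offset(session)
        padded = data.ljust(self.area_size, FREE)
        self.store[start:start + self.area_size] = padded

    def invalidate(self, session: int) -> None:
        """Invalidate the old data by overwriting it with the free pattern."""
        start = self._offset(session)
        self.store[start:start + self.area_size] = FREE * self.area_size

    def read(self, session: int) -> bytes:
        """Needed only for recovery; assumes data never ends in a free byte."""
        start = self._offset(session)
        return bytes(self.store[start:start + self.area_size]).rstrip(FREE)
```

In this sketch the space for every session is reserved when the store is created, whether or not the session is ever used, and a write larger than `area_size` is simply rejected rather than diverted to an overflow area.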
This technique is potentially the most compact in terms of file store usage. However, the major disadvantages of this technique are:
(i) unless the application is well behaved, the area allocated to each transaction would either have to be large enough to hold the largest possible amount of data which can be expected, or else an overflow area has to be provided, leading to internal fragmentation or variations in performance.
(ii) such a technique may not be well suited to systems where the number of sessions is highly variable and sessions are short lived, since each session requires a fixed space allocated to it whether or not it is in use.

a) a file store comprising a plurality of blocks,
b) means for designating each block as a free block, a data block or a forget block,
c) means for arranging the free blocks in a chain,
d) means for creating a data area, by assigning at least one of the free blocks to the area and writing data into the area,
e) means for writing into the file store a forget block pointing to a data area to be discarded, freeing all the data blocks in that area and adding those blocks to the chain of free blocks, and
f) means operative whenever at least one of the blocks in a data area pointed to by a forget block is reused, for freeing that forget block and adding the freed block to the chain of free blocks.
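Elements a) to f) above can be read as a small state machine over blocks. The following Python sketch is one possible in-memory interpretation, not the apparatus itself: the names `BlockStore`, `create_area`, `forget_area` and `covered_by` are hypothetical, the free chain is modelled as a first-in, first-out queue, and one chunk of data per block is assumed.

```python
from collections import deque
from dataclasses import dataclass

FREE, DATA, FORGET = "free", "data", "forget"


@dataclass
class Block:
    kind: str = FREE
    payload: bytes = b""
    area: tuple = ()  # forget blocks only: block numbers of the discarded area


class BlockStore:
    def __init__(self, n_blocks: int) -> None:
        self.blocks = [Block() for _ in range(n_blocks)]    # a) and b)
        self.free_chain = deque(range(n_blocks))            # c)
        self.covered_by: dict[int, int] = {}  # freed block -> its forget block

    def _release_forget(self, forget_no: int) -> None:
        # f) the forget block is freed and joins the chain of free blocks.
        for n in self.blocks[forget_no].area:
            self.covered_by.pop(n, None)
        self.blocks[forget_no] = Block()
        self.free_chain.append(forget_no)

    def _take_free(self) -> int:
        if not self.free_chain:
            raise RuntimeError("no free blocks")
        n = self.free_chain.popleft()
        forget_no = self.covered_by.get(n)
        if forget_no is not None:  # a block of a forgotten area is being reused
            self._release_forget(forget_no)
        return n

    def create_area(self, chunks: list) -> tuple:
        # d) assign at least one free block to the area and write the data.
        if not chunks:
            raise ValueError("a data area needs at least one block")
        if len(chunks) > len(self.free_chain):
            raise RuntimeError("not enough free blocks")
        area = []
        for chunk in chunks:
            n = self._take_free()
            self.blocks[n] = Block(DATA, chunk)
            area.append(n)
        return tuple(area)

    def forget_area(self, area: tuple) -> int:
        # e) write a forget block pointing to the area, then free its blocks.
        f = self._take_free()
        self.blocks[f] = Block(FORGET, area=tuple(area))
        for n in area:
            self.blocks[n].kind = FREE  # stale payload remains until reuse
            self.covered_by[n] = f
            self.free_chain.append(n)
        return f
```

With six blocks, creating a two-block area and then forgetting it leaves the forget block in place while other free blocks are consumed; the forget block is released only when the first block of the forgotten area is handed out again.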
In the serial file technique the file store is viewed as an infinite serial file, and data is written only to the end of this file. When an item of data has to be invalidated, a special control area, referred to as a "forget block", is written to the file, indicating that the previous data item is invalid.
The main advantage of this technique is that the amount of data written for a transaction is not fixed by a predetermined size. Also, since data is written only to one point in the file (the current end pointer), several items of unrelated data can be written in one request, thus reducing head movement (in the case of a disc store).
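As a rough illustration of the serial file technique, the sketch below models the file as an append-only list of records. The record types `Data` and `Forget`, the integer item identifiers and the `recover` method are assumptions introduced for the example; the text above specifies only that data is written at the end of the file and invalidated by a later forget block.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Data:
    item_id: int
    payload: bytes


@dataclass(frozen=True)
class Forget:
    item_id: int  # identifies the earlier data item that is now invalid


class SerialFile:
    """Hypothetical append-only file; nothing is ever overwritten."""

    def __init__(self) -> None:
        self.records = []
        self._next_id = 0

    def append(self, *payloads: bytes) -> list:
        """Write several unrelated items at the end pointer in one request."""
        ids = []
        for payload in payloads:
            self.records.append(Data(self._next_id, payload))
            ids.append(self._next_id)
            self._next_id += 1
        return ids

    def forget(self, item_id: int) -> None:
        """Invalidate an item by appending a forget block."""
        self.records.append(Forget(item_id))

    def recover(self) -> dict:
        """Read pass, needed only after a failure: replay the file in order."""
        live = {}
        for record in self.records:
            if isinstance(record, Data):
                live[record.item_id] = record.payload
            else:
                live.pop(record.item_id, None)
        return live
```

Note that `forget` makes the file longer rather than shorter, which is why the size of this structure grows without bound unless space is reclaimed separately.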
The principal disadvantage of the serial file technique is that there is, of course, no such thing as an infinite serial file. Any practical implementation therefore requires a garbage collector to go through the file, freeing space and re-allocating areas to allow the next pass through the file to proceed. This is an expensive process in terms of processing time, and may lead to the garbage collection process becoming a bottleneck.
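To make the cost of such a garbage collector concrete, here is a deliberately simple sketch of one collection pass. It assumes records are tuples of the form `("data", item_id, payload)` or `("forget", item_id)` with unique item identifiers; this representation and the function name `collect` are hypothetical and chosen only for the example.

```python
def collect(records: list) -> list:
    """One full pass over the file, keeping only data items still valid.

    Every record is examined, so the work grows with the length of the
    file rather than with the amount of live data.
    """
    forgotten = {r[1] for r in records if r[0] == "forget"}
    return [r for r in records if r[0] == "data" and r[1] not in forgotten]


log = [
    ("data", 0, b"update A"),
    ("data", 1, b"update B"),
    ("forget", 0),
]
compacted = collect(log)
```

Here `compacted` holds only the record for item 1; both the invalidated item and its forget block have been dropped, at the price of scanning the entire log.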
The object of the present invention is to provide a new technique for organising a data file store which builds on the advantages of the serial file organisation, while avoiding the worst aspects of its performance (i.e. mandatory garbage collection).