The present invention relates to data processing, and particularly to a maintenance mechanism for organizing data which has been previously stored on a storage medium during a data processing operation.
In certain data processing operations it is useful to maintain a log file related to ongoing data processing activity. For example, in the transaction processing field, it is common to maintain a log file of ongoing transactions involving a particular data processing apparatus so that if the data processing apparatus should fail for some reason (e.g., power failure) the log file can be used to recover the data processing apparatus back to the state it was in, with respect to processing the transactions, before the failure. This is a very important consideration because the business world depends on the processing of transactions to be tolerant of system failures in order to maintain the integrity of mission critical data which is involved in the transactions.
Such log files are typically maintained on a direct access non-volatile storage device (such as a hard disk drive) so that the data will not be lost in case of a power failure. As the data processing apparatus processes ongoing transactions, log records containing information pertaining to the ongoing transactions are written into files (called hereinafter "extent" files) in the non-volatile storage device. These log records would thus be retrieved from the extent files to reconstruct the transaction processing environment within the data processing apparatus in the event of a failure and a subsequent recovery from such failure.
The size of an extent file can be configured by the operator and log records are written to an extent file up to the configured size, at which point a new extent file is created in order to store more log records. It is often the case that an extent file is very large as compared to a log record, and thus many log records are stored in a single extent file. Special "link" records are used to link one extent file to the next. A spare extent file known as a "cushion" file is also pre-allocated. This cushion file can be used as an extent file instead of creating a new extent file in the event that the log runs short on disk storage.
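By way of illustration only, the writing of log records into extent files of a configured size might be sketched as follows. The names Extent, ExtentLog and extent_size are hypothetical and do not come from any particular product. The sketch keeps extents in memory and omits the link records and the cushion file.

```python
from dataclasses import dataclass, field


@dataclass
class Extent:
    """One extent file, modelled in memory (an assumption of this sketch)."""
    capacity: int                       # configured size in bytes
    records: list = field(default_factory=list)
    used: int = 0

    def fits(self, record: bytes) -> bool:
        return self.used + len(record) <= self.capacity

    def append(self, record: bytes) -> None:
        self.records.append(record)
        self.used += len(record)


class ExtentLog:
    """Writes records to the current extent, creating a new one when full."""

    def __init__(self, extent_size: int):
        self.extent_size = extent_size
        self.extents = [Extent(extent_size)]

    def write(self, record: bytes) -> None:
        if len(record) > self.extent_size:
            raise ValueError("record larger than the configured extent size")
        current = self.extents[-1]
        if not current.fits(record):
            # The current extent has reached its configured size.
            current = Extent(self.extent_size)
            self.extents.append(current)
        current.append(record)


log = ExtentLog(extent_size=32)
for i in range(10):
    log.write(b"record-%02d" % i)       # each record is 9 bytes long

print([len(e.records) for e in log.extents])
```

With a 32-byte extent and 9-byte records, three records fit in each extent, so the ten records occupy four extents.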
Eventually, the stored log records will become unnecessary. For example, in transaction processing, when a transaction has completed there is no longer a need for the data processing apparatus to store the log records for that transaction since there is no longer a need for that transaction to be taken into account in the event of the apparatus recovering from a failure. It is clear that if the amount of data written to the log is DW, then there will be a value DA which represents the data associated with active transactions, such that DA is less than or equal to DW.
While a transaction processing operation is progressing, it is thus useful to carry out a maintenance operation called "key-pointing" the log, which involves rewriting all active data (DA) to the log as follows:
1) write a key-point start record to the log
2) rewrite all active data (DA)
3) write a key-point end record to the log
While the key-pointing operation is being carried out, concurrent accesses to the log for other purposes (updating, reading etc.) are blocked. Via the key-pointing operation, the log file is re-organized such that all of the active data (DA) is stored in log records between the key-point start record and the key-point end record. This means that all extent files which logically come prior to the extent file which contains the key-point start record can be deleted. Because DA is less than DW once one or more transactions have ended, the amount of data in the log is reduced to a minimum and the overall size of the transaction log file can be managed.
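The three key-pointing steps, together with the subsequent deletion of earlier extent files, might be sketched as follows. This is a minimal illustration under stated assumptions: extents are modelled as in-memory lists, capacity is counted in records rather than bytes, the record layout and the function names write and key_point are hypothetical, and the blocking of concurrent log accesses is not modelled.

```python
def write(extents, record, capacity):
    """Append a record, opening a new extent when the current one is full."""
    if len(extents[-1]) >= capacity:
        extents.append([])
    extents[-1].append(record)


def key_point(extents, active_txns, capacity):
    """Rewrite active data between start/end records, then drop older extents."""
    # Select the active data (DA) out of everything written so far (DW).
    active = [r for extent in extents for r in extent
              if r[0] == "data" and r[1] in active_txns]

    write(extents, ("kp_start", None), capacity)     # step 1
    start_index = len(extents) - 1                   # extent holding the start record
    for record in active:                            # step 2
        write(extents, record, capacity)
    write(extents, ("kp_end", None), capacity)       # step 3

    # Extents logically prior to the one holding the start record can go.
    del extents[:start_index]


extents = [[]]
for txn in ["T1", "T2", "T1", "T3", "T2", "T1", "T3", "T2"]:
    write(extents, ("data", txn), 4)

# Assume T1 and T2 have completed, so only T3 is still active.
key_point(extents, {"T3"}, 4)
print(extents)
```

In this example the eight data records fill two extents exactly, so the key-point start record lands in a third extent and both earlier extents are deleted, leaving only the start record, the two T3 records and the end record.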
A decision is required as to the best point at which to key-point the log. If the key-pointing is done too often, the result will be a reduction in the performance of the data processing apparatus, since the apparatus will be spending too much time key-pointing. On the other hand, if the key-pointing operation is carried out too infrequently, the log will accumulate extent files unnecessarily.
IBM Corp. has a software product called Component Broker which has a transaction service which performs a key-point operation of a transactions log file for every 100 transactions that are executed. It is not clear whether this is the best point at which to carry out the key-point operation. For example, if the extent file size is large enough to accommodate the log records for several hundred transactions, the key-point pattern will be of the form:
1) Perform a number of key-points within a single extent file. But these multiple key-points will consume much processing time, as the processing apparatus is blocked from doing log writes associated with transactional activity while the key-pointing operation is being carried out.
2) Perform a key-point operation which runs over into a new extent file. This key-point will not allow the removal of the first extent file (as the key-point start record remains within the first extent file) and therefore much file space will be unnecessarily consumed.
IBM's CICS transaction processing software product ("CICS" is a trademark of the IBM Corp.) performs key-pointing after a configurable number of log write operations has been carried out. However, as the amount of data in each write can vary widely, the same inefficiencies pointed out above can also occur with this type of key-pointing.
Thus, the current state of the art with respect to the triggering of the key-pointing operation in a data processing environment employing log files is highly inefficient as too much processing time and file space is expended. There is a strong need in the art for a more efficient result.
According to a first aspect, the present invention provides a data processing apparatus comprising: a direct access non-volatile memory storage device having a plurality of extent files for storing log records therein; allocating means for allocating a current extent file to be used for storing log records; writing means for writing log records into the current extent file until the current extent file cannot store any further log records; and key-pointing means for performing a key-pointing operation on the written log records when the writing means has reached the point where no further log records can be stored in the current extent file.
According to a second aspect, the present invention provides a computer program product stored on a computer readable storage medium for running on a data processing system, the data processing system having a direct access non-volatile memory storage device having a plurality of extent files for storing log records therein, the program product for carrying out the following steps when run on the data processing system: allocating a current extent file to be used for storing log records; writing log records into the current extent file until the current extent file cannot store any further log records; and performing a key-pointing operation on the written log records when the writing step has reached the point where no further log records can be stored in the current extent file.
According to a third aspect, the present invention provides a data processing method being carried out on a data processing system, the data processing system having a direct access non-volatile memory storage device having a plurality of extent files for storing log records therein, the method having steps of: allocating a current extent file to be used for storing log records; writing log records into the current extent file until the current extent file cannot store any further log records; and performing a key-pointing operation on the written log records when the writing step has reached the point where no further log records can be stored in the current extent file.
Since the triggering of the key-pointing operation occurs when a new extent file is allocated, an optimal timing for key-pointing is achieved which saves greatly on file space and processing cycles. Specifically, since key-pointing takes place in the newly allocated extent file, the previous extent file can be deleted without losing active data, thus freeing up file space. Further, a considerable savings in processing cycles is attained since the frequency of key-pointing is reduced when extent files are large in size, because no key-pointing will take place until a new extent file is prepared for the writing of log records therein.
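The triggering described above might be sketched as follows. This is an illustrative sketch rather than a definitive implementation: the class KeyPointingLog and its members are hypothetical names, extents are held in memory, capacity is counted in records, and the active data is assumed to fit within one newly allocated extent.

```python
class KeyPointingLog:
    """Key-points only when the current extent cannot take another record."""

    def __init__(self, capacity):
        self.capacity = capacity        # records per extent (simplification)
        self.extents = [[]]
        self.active = set()             # ids of transactions still in flight
        self.key_points = 0

    def write(self, txn_id, payload):
        self.active.add(txn_id)
        if len(self.extents[-1]) >= self.capacity:
            self._allocate_and_key_point()
        self.extents[-1].append(("data", txn_id, payload))

    def complete(self, txn_id):
        # Records of a completed transaction are no longer active data.
        self.active.discard(txn_id)

    def _allocate_and_key_point(self):
        live = [r for extent in self.extents for r in extent
                if r[0] == "data" and r[1] in self.active]
        self.extents.append([])                     # allocate the new current extent
        self.extents[-1].append(("kp_start",))
        self.extents[-1].extend(live)               # rewrite active data (DA)
        self.extents[-1].append(("kp_end",))
        del self.extents[:-1]                       # previous extents can be deleted
        self.key_points += 1


log = KeyPointingLog(capacity=5)
log.write("T1", "a")
log.write("T1", "b")
log.complete("T1")
log.write("T2", "a")
log.write("T2", "b")
log.write("T3", "a")        # the first extent is now full
log.complete("T2")
log.write("T3", "b")        # triggers allocation and a single key-point

print(log.key_points, log.extents)
```

In this example no key-pointing takes place while the first extent is being filled. The write that no longer fits causes one key-point in the new extent, after which the previous extent is deleted and only the records of the still-active transaction T3 remain.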