The present invention relates to data center management and file systems, and more specifically, this invention relates to locking at the index level of a record-oriented file system to ensure data integrity when performing operations on data records.
File systems that allow direct and sequential access to data stored therein typically utilize a B+ tree structure, which is a variation of the basic B tree structure in which all terminal nodes contain data records. The non-terminal nodes of the B+ tree structure are referred to as an index structure. The top of the B+ tree structure is a single node referred to as the root. The B+ tree structure is a balanced tree with all the terminal nodes at the same level, such that all data records stored therein have the same or substantially the same search length. The effectiveness and popularity of the B+ tree structure may be attributable to the shape of the tree. The B+ tree tends to be short and wide, typically referred to as “flat,” i.e., it has few hierarchical levels and many nodes at each level.
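By way of illustration only, the equal search length described above may be sketched as follows. The class names `Leaf` and `Index`, the function `search`, and the sample keys and records are hypothetical and are chosen solely for this example; they are not taken from any particular access method.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass
class Leaf:
    """Terminal node: holds the keyed data records (hypothetical layout)."""
    keys: List[int]
    records: List[str]


@dataclass
class Index:
    """Non-terminal node: separator keys and child pointers (hypothetical layout)."""
    keys: List[int]
    children: List[Union["Index", Leaf]]


def search(root: Union[Index, Leaf], key: int) -> Tuple[Optional[str], int]:
    """Return (record or None, number of index levels traversed)."""
    node = root
    levels = 0
    # Descend through the index structure until a terminal node is reached.
    while isinstance(node, Index):
        # Keys equal to a separator are routed to the child on its right.
        node = node.children[bisect_right(node.keys, key)]
        levels += 1
    if key in node.keys:
        return node.records[node.keys.index(key)], levels
    return None, levels


# A flat, one-level tree: a root index node above three terminal nodes.
root = Index(
    keys=[30, 60],
    children=[
        Leaf([10, 20], ["r10", "r20"]),
        Leaf([30, 40], ["r30", "r40"]),
        Leaf([60, 70], ["r60", "r70"]),
    ],
)
```

Because every terminal node sits at the same level, each call to `search` on this tree traverses the same number of index levels, whether or not the key is found.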
The B+ tree structure has become somewhat of a standard for the organization of files. Many database systems (relational or otherwise) and general-purpose access methods, such as virtual storage access method (VSAM), are designed using the B+ tree structure. VSAM includes some additional features over other typical access methods, such as key compression. For ease of discussion, and because VSAM was one of the first commercial products in the world to use the B+ tree structure, VSAM terminology may be used in the descriptions provided, but the descriptions are not limited to VSAM alone, as any access method may be used in relation to a B+ tree structure.
The index structure of a B+ tree, such as a VSAM key-sequenced data set (KSDS), includes two parts, the ‘sequence set’ and the ‘index set.’ The terminal nodes of the B+ tree structure are keyed data records which are organized into one or more control intervals (CIs). Above the CIs are one or more control areas (CAs), with each CA being capable of organizing a plurality of CIs. Each node in the index is an index CI.
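By way of illustration only, the relationship among the index set, the sequence set, CAs, and data CIs may be sketched as follows. The sketch assumes the usual VSAM arrangement, which is not spelled out above, in which each sequence-set node corresponds to one CA and records the highest key of each data CI in that CA, while index-set nodes sit above the sequence set. All class, field, and function names are hypothetical.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DataCI:
    """Data control interval: keyed data records (hypothetical layout)."""
    records: Dict[int, str]


@dataclass
class SequenceSetNode:
    """Lowest index level; assumed here to cover exactly one CA."""
    high_keys: List[int]      # highest key held by each data CI in the CA
    data_cis: List[DataCI]


@dataclass
class IndexSetNode:
    """Upper index level; points at sequence-set nodes."""
    high_keys: List[int]      # highest key covered by each child
    children: List[SequenceSetNode]


def lookup(index_root: IndexSetNode, key: int) -> Optional[str]:
    """Walk index set -> sequence set -> data CI and return the record, if any."""
    # Choose the first child whose high key is >= the search key.
    i = bisect_left(index_root.high_keys, key)
    if i == len(index_root.high_keys):
        return None                      # key is above every CA
    seq = index_root.children[i]
    j = bisect_left(seq.high_keys, key)
    if j == len(seq.high_keys):
        return None                      # key is above every CI in this CA
    return seq.data_cis[j].records.get(key)


# Two CAs, each organizing two data CIs.
ca1 = SequenceSetNode([20, 40], [DataCI({10: "r10", 20: "r20"}),
                                 DataCI({30: "r30", 40: "r40"})])
ca2 = SequenceSetNode([60, 80], [DataCI({50: "r50", 60: "r60"}),
                                 DataCI({70: "r70", 80: "r80"})])
index_root = IndexSetNode([40, 80], [ca1, ca2])
```

In this sketch a lookup touches one index-set node, one sequence-set node, and one data CI, which is why a split that moves records between CIs or CAs also requires the corresponding index nodes to be updated.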
To ensure data integrity during parallel access, any searches, updates, and insertions of data records in a B+ tree are conducted in a serialized manner, commonly with the aid of locks or locking mechanisms. The choice of the level of serialization, e.g., at the key level, the record level, the index level, or the data set level, directly influences the functional and performance characteristics of the file system. For instance, if all record insertions are serialized at the data set level by locking the entire data set for each insertion request, performance would be far worse than if only the inserted record were locked. However, locking only the inserted data record will not achieve data integrity if the insertion causes a CA split, because the split modifies one or more index nodes.
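By way of illustration only, the effect of the serialization level on concurrent insertions may be sketched as follows. The class `LockManager`, its resource names, and the helper `second_insert_can_proceed` are hypothetical; the sketch runs in a single thread and uses non-blocking lock attempts merely to show whether a second insertion would have to wait.

```python
import threading
from collections import defaultdict


class LockManager:
    """Hands out locks at either data set or record granularity (hypothetical)."""

    def __init__(self, granularity: str) -> None:
        if granularity not in ("data_set", "record"):
            raise ValueError("granularity must be 'data_set' or 'record'")
        self.granularity = granularity
        # One lock object per lockable resource, created on first use.
        self._locks = defaultdict(threading.Lock)

    def _resource(self, key: int):
        # At data set granularity every key maps to the same single resource.
        if self.granularity == "data_set":
            return "DATASET"
        return ("RECORD", key)

    def try_lock(self, key: int) -> bool:
        """Attempt to lock without waiting; False means the caller would block."""
        return self._locks[self._resource(key)].acquire(blocking=False)

    def unlock(self, key: int) -> None:
        self._locks[self._resource(key)].release()


def second_insert_can_proceed(granularity: str) -> bool:
    """While key 10 is being inserted, can an insert of key 20 start?"""
    mgr = LockManager(granularity)
    assert mgr.try_lock(10)          # first insertion holds its lock
    ok = mgr.try_lock(20)            # second insertion, different record
    if ok:
        mgr.unlock(20)
    mgr.unlock(10)
    return ok
```

With data set granularity the second insertion must wait even though it targets a different record, whereas with record granularity it may proceed. The sketch deliberately omits splits, which are the case in which a record-level lock alone is insufficient.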
Currently known solutions for file systems using a B+ tree structure, including VSAM, typically handle splits by locking at the record level for the insertion of a data record and, if a split occurs, also obtaining a lock at the data set level. Locking the entire data set for splits creates severe performance problems, because it single-threads all split processing operations against the locked data set. This has forced users of file systems using a B+ tree structure to devise numerous schemes to minimize splits, some of which create other adverse performance issues.
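By way of illustration only, the conventional scheme described above, a record-level lock for every insertion plus a data-set-level lock whenever a split occurs, may be sketched as follows. The class `DataSet`, its fields, the fixed CI capacity, and the halving split policy are all assumptions made for this example and do not reproduce the internals of any actual access method. The sketch is intended to show which locks are obtained, not to serve as a complete concurrent implementation.

```python
import threading
from bisect import insort
from typing import Dict, List


class DataSet:
    """Toy keyed data set: a list of sorted CIs with a fixed capacity (hypothetical)."""

    def __init__(self, ci_capacity: int) -> None:
        self.ci_capacity = ci_capacity
        self.cis: List[List[int]] = [[]]          # each CI is a sorted list of keys
        self.data_set_lock = threading.Lock()     # single lock for the whole data set
        self.record_locks: Dict[int, threading.Lock] = {}
        self.split_count = 0                      # number of data-set-level lock uses

    def _find_ci(self, key: int) -> int:
        """Index of the first CI whose highest key is >= key, else the last CI."""
        for i, ci in enumerate(self.cis):
            if i == len(self.cis) - 1 or key <= ci[-1]:
                return i
        return len(self.cis) - 1

    def insert(self, key: int) -> None:
        record_lock = self.record_locks.setdefault(key, threading.Lock())
        with record_lock:                         # record-level lock for the insertion
            i = self._find_ci(key)
            if len(self.cis[i]) < self.ci_capacity:
                insort(self.cis[i], key)          # room in the CI: no split needed
                return
            # The CI is full, so a split is required. The conventional scheme
            # additionally locks the entire data set for the split.
            with self.data_set_lock:
                ci = self.cis[i]
                insort(ci, key)
                mid = len(ci) // 2
                self.cis[i:i + 1] = [ci[:mid], ci[mid:]]
                self.split_count += 1
```

Every pass through the `with self.data_set_lock` branch is a point at which all other split processing against the same data set would be forced to wait, which is the single-threading effect noted above.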