1. Technical Field
The present invention is directed to a method and apparatus for managing a set of data structures associated with a large file. More specifically, the present invention is directed to a method and apparatus for maintaining and using a tree structure to represent the relationship of region control blocks of a data file.
2. Description of Related Art
In the AIX operating system, it is useful to divide files into regions of 256 megabytes. Each region is assigned a unique identifier that is used by the virtual memory manager to find the memory associated with each region. The unique identifier can be mathematically converted to an address of a region control block, which is used to maintain information about the region, such as how many pages of the region are currently in memory, and the like.
In AIX version 5.2, files as large as 16 terabytes are supported. A 16-terabyte file contains 65,536 256-megabyte regions. The AIX operating system is designed to support 4-petabyte files (2^52 bytes), and a file this large contains 16,777,216 regions.
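The region counts quoted above follow directly from dividing the file size by the 256-megabyte region size. The following is a minimal arithmetic check, not AIX code; the names REGION_SIZE and region_count are hypothetical and chosen only for illustration.

```python
# Illustrative arithmetic only; REGION_SIZE and region_count are
# hypothetical names, not part of any AIX interface.
REGION_SIZE = 256 * 2**20  # 256 megabytes = 2**28 bytes


def region_count(file_size_bytes: int) -> int:
    """Return how many 256-megabyte regions are needed to cover the file."""
    # Ceiling division, so a partially filled last region still counts.
    return -(-file_size_bytes // REGION_SIZE)


regions_16tb = region_count(16 * 2**40)  # 16-terabyte file
regions_4pb = region_count(2**52)        # 4-petabyte file

print(regions_16tb, regions_4pb)
```

Because both file sizes are exact powers of two, the division is exact: 2^44 / 2^28 = 2^16 regions and 2^52 / 2^28 = 2^24 regions.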
In the AIX operating system, a file that is currently in use by an application is uniquely identified by the identifier assigned to the file's first 256-megabyte region. This region is called the “primary region” or “base region.” If the file is larger than 256 megabytes, additional region control blocks are allocated, each with its own unique identifier. These regions are called “extended regions.” The regions are numbered consecutively: region 0 is the base region, region 1 is the extended region covering the second 256 megabytes of the file, and so on. The unique identifier associated with a region is independent of the region number.
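The consecutive numbering means that a byte offset within a file determines its region number by simple division. The sketch below assumes only the 256-megabyte region size stated above; the function names are hypothetical, and the sketch says nothing about how AIX assigns the unique identifiers, which are independent of the region number.

```python
# Hypothetical helpers illustrating region numbering; not an AIX API.
REGION_SHIFT = 28                 # 2**28 bytes = 256 megabytes
REGION_SIZE = 1 << REGION_SHIFT


def region_number(file_offset: int) -> int:
    """Return the number of the region that contains the given byte offset."""
    return file_offset >> REGION_SHIFT


def offset_within_region(file_offset: int) -> int:
    """Return the byte position of the offset inside its region."""
    return file_offset & (REGION_SIZE - 1)


def is_base_region(file_offset: int) -> bool:
    """Region 0 is the base (primary) region; all others are extended."""
    return region_number(file_offset) == 0
```

For example, the last byte of the first 256 megabytes falls in region 0, and the very next byte is the first byte of region 1, the first extended region.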
Files can be created without writing to every region of the file. In addition, an application can read a portion of a file without reading the entire file. In both cases, the operating system allocates new region control blocks dynamically.
Typically, the control blocks for a file are part of a linked list in which each control block for a region has a pointer to the control block for another region. Thus, an operation performed on a region of the file requires traversing the linked list until the desired region control block is encountered and then performing the operation. Since a large file in AIX may have as many as 16,777,216 regions, the linked list can become very long, and the computing time required to traverse it becomes prohibitive.
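The cost of the linked-list arrangement can be illustrated with a small model. This is a sketch under stated assumptions, not the AIX data structure: the RegionControlBlock class, its fields, and the identifier values are all hypothetical, and the list is kept deliberately small.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class RegionControlBlock:
    """Hypothetical stand-in for a region control block."""
    region_number: int
    identifier: int                      # made-up value, independent of number
    next: Optional["RegionControlBlock"] = None


def build_list(region_numbers) -> Optional[RegionControlBlock]:
    """Link control blocks for the given region numbers, in order."""
    head = None
    for number in reversed(list(region_numbers)):
        head = RegionControlBlock(number, identifier=0x1000 + number, next=head)
    return head


def find_region(head, wanted: int) -> Tuple[Optional[RegionControlBlock], int]:
    """Walk the list; return the matching block (or None) and nodes visited."""
    visited = 0
    node = head
    while node is not None:
        visited += 1
        if node.region_number == wanted:
            return node, visited
        node = node.next
    return None, visited


head = build_list(range(1000))
block, visited = find_region(head, 999)
missing, visited_missing = find_region(head, 5000)
```

In this model, finding the last of 1,000 blocks visits all 1,000 nodes, and so does a search for a region that has no control block. The number of nodes visited grows linearly with the length of the list, which is the behavior that becomes prohibitive at millions of regions.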
Thus, it would be beneficial to have a method and apparatus for managing a set of data structures associated with a large file that arranges the control blocks in a tree, so that a desired region control block can be found much more quickly, thereby reducing the amount of computing time necessary to identify the control blocks for regions of interest.