A conventional database system uses a datastore to persistently store data pages, and a cache to provide fast access to the data pages. For example, in response to a request to access a data page, the data page is loaded from the datastore into the cache, and may thereafter be accessed from and/or modified within the cache.
The datastore may include a tree of converter pages. Converter pages at the lowest level of the tree map the logical page numbers of data pages to locations of the datastore at which the data pages are stored. The tree of converter pages is loaded into the cache upon initialization of the database system, and the cached converter pages are used to determine the location of a particular page within the datastore in response to a request to access the particular page.
In order to determine the location of a particular page in the datastore, the cached tree of converter pages is traversed from its root level to a particular converter leaf page, which is located at the lowest level of the tree and which specifies the location of the particular page. FIG. 1 illustrates converter 10 of a prior database system. Converter 10 includes converter inner pages 11 and converter leaf pages 12.
Each inner page 11 includes an indexed list of unique identifiers of one or more child converter pages, which may comprise one or more other inner pages 11 or one or more leaf pages 12. The child converter pages may be located within the cache based on a hashmap which associates the unique identifiers with cache memory addresses at which corresponding child converter pages are located. Each converter leaf page 12 maps a set of logical page numbers to datastore locations at which corresponding data pages are stored. Inner pages 11 may therefore be used to locate a converter leaf page 12 corresponding to a particular logical page number within the datastore.
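The relationship between inner pages, leaf pages and the hashmap may be illustrated with a minimal sketch. The class names, the helper `child_of`, the identifier `LEAF_B` and the location strings are hypothetical and chosen only for illustration. The identifiers 28, 0FC, 33A and 94 are those of the FIG. 1 example discussed below. A Python dictionary stands in for the hashmap from unique identifiers to cache memory addresses, and only one path of the tree is populated.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class InnerPage:
    # Indexed list of child page identifiers; the list position is the index.
    child_ids: List[str]


@dataclass
class LeafPage:
    # Position i holds the datastore location of the i-th logical page
    # covered by this leaf (location strings are hypothetical).
    locations: List[str]


Page = Union[InnerPage, LeafPage]

# Stand-in for the hashmap: identifier -> cached page object.
cache: Dict[str, Page] = {
    "0FC": InnerPage(child_ids=["94", "LEAF_B"]),  # "LEAF_B" is a placeholder
    "94": LeafPage(locations=[f"loc-{n}" for n in range(14, 21)]),
}

# Root page with the indexed list 0:28, 1:0FC, 2:33A.
root = InnerPage(child_ids=["28", "0FC", "33A"])


def child_of(page: InnerPage, index: int) -> Page:
    """Resolve the child at `index` through the identifier hashmap."""
    return cache[page.child_ids[index]]


inner = child_of(root, 1)   # page with identifier 0FC
leaf = child_of(inner, 0)   # page with identifier 94
```

In this sketch each descent costs one list access plus one hashmap lookup, in addition to whatever arithmetic is needed to choose the index.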
Generally, the size of each page of converter 10 is a power of two (i.e., 2^x), and each page includes a header. For example, each page may be 256 Kb (i.e., 2^8 Kb) in size, with each inner page 11 including a 16 byte header and each leaf page including an 8 byte header. Accordingly, each inner page 11 includes 2^8−2 child identifiers and each leaf page 12 includes 2^8−1 logical page number-to-datastore location mappings.
To describe an example of converter traversal according to some prior systems, it will be assumed that a process has requested logical page number seventeen of a database. Accordingly, assuming that logical page number seventeen is not located within the database cache, the location of logical page number seventeen within the datastore must be determined. Converter root page 13 is initially located and the number of page number-to-page location mappings corresponding to each of descendant branches 14-16 is determined. In the present example, each of branches 14-16 includes fourteen page number-to-page location mappings. More specifically, each of branches 14-16 includes two converter leaf pages, and each converter leaf page includes seven page number-to-page location mappings.
To determine which of branches 14-16 to descend, the logical page number (i.e., seventeen) is initially divided by the number of mappings addressable by one descendant branch (i.e., fourteen). The resulting quotient in the present example is “1”, while the remainder (i.e., “3”) is ignored (or not computed). As shown in FIG. 1, root page 13 stores an indexed list of child page identifiers (i.e., 0:28, 1:0FC and 2:33A). The quotient is used to determine a child page identifier from the indexed list. In the present example, the identifier 0FC is determined because it is associated with index 1. Therefore, converter page 18 corresponding to the identifier 0FC is located within the cache using the aforementioned hashmap.
Next, the smallest logical page number accessible in the current branch is determined. The difference (i.e., three) between this number (i.e., fourteen) and the logical page number of interest (i.e., seventeen) is determined. As before, the number of page number-to-page location mappings corresponding to each of descendant branches 20 and 21 is then determined (i.e., seven). The difference (i.e., three) is divided by the number of mappings (i.e., seven) to produce a quotient (i.e., 0) and a remainder (i.e., 3).
The quotient is used to determine a child page identifier from the indexed list stored in converter page 18. The determined identifier in the present example is 94 because the quotient is 0, and the identifier is used in conjunction with the hashmap to locate leaf page 22 within the cache. The remainder is then used to identify an appropriate mapping stored within leaf page 22. Specifically, the remainder 3 points to the fourth mapping of leaf page 22. In this regard, the first mapping (i.e., having index 0) of leaf page 22 is associated with logical page fourteen, the second mapping is associated with logical page fifteen, and the third mapping is associated with logical page sixteen, so the fourth mapping of leaf page 22 (i.e., having index 3) is associated with logical page seventeen and specifies the location of logical page seventeen within the datastore.
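The arithmetic of the foregoing traversal may be summarized in a short sketch. The function name `converter_path` and its parameters are hypothetical. The sketch applies a combined division and remainder at every level, whereas the example above uses a subtraction at the root level; both yield the same values.

```python
from typing import List, Tuple


def converter_path(logical_page_number: int,
                   child_spans: List[int]) -> Tuple[List[int], int]:
    """Return the child index chosen at each inner level and the leaf slot.

    child_spans[i] is the number of page number-to-page location mappings
    addressable by one child of the inner page at depth i (root first).
    """
    indexes = []
    remaining = logical_page_number
    for span in child_spans:
        # The quotient selects the child; the remainder is carried downward.
        quotient, remaining = divmod(remaining, span)
        indexes.append(quotient)
    return indexes, remaining


# FIG. 1 example: each root branch addresses fourteen mappings and each
# leaf page addresses seven.
indexes, slot = converter_path(17, [14, 7])
print(indexes, slot)  # [1, 0] 3
```

The result corresponds to index 1 of root page 13 (identifier 0FC), index 0 of converter page 18 (identifier 94), and the fourth mapping (index 3) of leaf page 22.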
As described, the descent from one converter inner page (including the root page) to a child converter inner page requires a division operation (i.e., to determine the identifier of the child converter inner page), and the descent from a converter inner page to a child converter leaf page also requires a division operation (i.e., to determine the identifier of the child converter leaf page). To identify an appropriate mapping within the child converter page, systems may employ either a modulo operation (as illustrated above when descending from inner converter page 18 to leaf converter page 22) or, as an optimization, subtraction of the smallest logical page number addressed by the converter page (as illustrated above when descending from root converter page 13 to inner converter page 18). Division and modulo operations are computationally expensive and significantly impede the speed of tree traversal.
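The two ways of locating a position within a child converter page may be compared in a minimal sketch. The function names are hypothetical, and the sketch only demonstrates that the two approaches produce the same offset; it does not measure their relative cost, which depends on the hardware.

```python
def offset_by_modulo(page_number: int, mappings_per_child: int) -> int:
    """Offset within the child, obtained with a modulo operation."""
    return page_number % mappings_per_child


def offset_by_subtraction(page_number: int, child_index: int,
                          mappings_per_child: int) -> int:
    """Offset within the child, obtained by subtracting the smallest
    logical page number addressed by that child."""
    smallest_in_child = child_index * mappings_per_child
    return page_number - smallest_in_child


# Descent from root page 13 to converter page 18 in the example above.
child_index = 17 // 14  # a division is still needed to choose the child
by_modulo = offset_by_modulo(17, 14)
by_subtraction = offset_by_subtraction(17, child_index, 14)
```

Under these assumptions the subtraction replaces the modulo with a multiplication and a subtraction, but the division that selects the child remains at every level of the tree.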