The field of the invention relates to computer databases. More specifically, the invention relates to main memory databases.
Traditional database systems, whether relational or hierarchical, are disk based. The recent trend toward main memory relational databases reflects market demand for high-performance database systems. A main memory database system can significantly reduce system response time, but it requires loading the entire database contents into main memory. In most cases, however, fitting an entire database into main memory is not feasible because of the limited size of addressable main memory and its prohibitive cost.
Currently, two approaches are used to manage the indexes in a database system. One approach, used in traditional database index management, is to store all indexes on disk and use memory-based buffers to hold frequently accessed contents. The problem with this approach is the slow response time caused by disk access. Even if some portion of the indexes is buffered, the access methods for those buffered indexes are still disk oriented, so CPU utilization cannot be optimized.
A second approach, used in main memory database management, is to store all indexes in memory. Because the entire database, including all indexes, is loaded into main memory, database access speed is significantly improved. However, this approach requires a very large memory space for a realistically sized database. Most current hardware platforms cannot even support more than 2 gigabytes of main memory because of the 32-bit memory address limitation. Even when machines based on a 64-bit architecture are available, the cost of main memory is still prohibitive compared to the cost of disk storage. Currently, main memory costs about 200-500 times as much per megabyte as disk.
While some progress has been made in the field of main memory relational databases, little to no progress has been made in applying a hierarchical database model to a main memory database. The data model for a hierarchical database system is a tree or a forest of trees. A tree can be described using a node-labeled approach, in which a tree contains a set of nodes. A node consists of some form of identification of the node and, optionally, a set of attributes. A node may contain a list of nodes as its attributes. The list can be ordered or unordered. An attribute consists of a name and one or more values. A node inside a tree can have a data type. A set of nodes with a certain data type is called a collection. A partition of a hierarchical database is a sub-tree of the database tree or forest.
An example of a hierarchical database structure is illustrated in FIG. 1. For example, the database could store information about a company's structure. In one embodiment, the first level 100 of the hierarchy contains information about the company, such as name and phone number. The second level 110 contains information about departments within the company 100. The third level 120 contains information about groups within the department 110. The fourth level 130 contains information about either people within the group 120 or, if the departments 110 are not divided into groups, people 130 within the departments 110.
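For illustration only, the node-labeled data model and the company hierarchy of FIG. 1 can be sketched in Python as follows. The class name `Node`, its field names, the helper `collection`, and all sample values are hypothetical and are not part of the described system. The sketch assumes that attributes map a name to a list of values and that child nodes are kept in an ordered list.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional


@dataclass
class Node:
    """A node: an identifier, an optional data type, named attributes
    (each with one or more values), and an ordered list of child nodes."""
    node_id: str
    data_type: Optional[str] = None
    attributes: Dict[str, List[str]] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)

    def walk(self) -> Iterator["Node"]:
        """Yield this node and every descendant, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


def collection(root: Node, data_type: str) -> List[Node]:
    """Return the collection: all nodes under root with the given data type."""
    return [node for node in root.walk() if node.data_type == data_type]


# Hypothetical sample data following the four levels of FIG. 1.
alice = Node("alice", "person", {"phone": ["555-0101", "555-0102"]})
bob = Node("bob", "person")
storage_group = Node("storage", "group", children=[alice])
engineering = Node("engineering", "department", children=[storage_group])
sales = Node("sales", "department", children=[bob])  # no groups in this department
company = Node("acme", "company", {"name": ["Acme"]}, [engineering, sales])

# A partition is simply a sub-tree, for example the engineering department.
partition = engineering
people_in_partition = collection(partition, "person")
```

In this sketch a partition needs no separate type: any node, taken together with its descendants, is a sub-tree and can be treated as a partition.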
The present invention is a system and method for selectively loading indexes into main memory to improve hierarchical database performance. The technique is called hot indexing. With hot indexing, only the most frequently accessed portions of the database contents or indexes are loaded into memory. Therefore, the size of the database is not limited by the size of main memory. At the same time, targeting the most frequently accessed portions makes it likely that the desired database content or index is in the faster main memory rather than in the slower disk-based storage. Moving the selected index portions entirely into main memory, as opposed to making disk-based queries for content, speeds up the entire data access process dramatically. A synchronization method guarantees that changes made to the portion in main memory are reflected in the more permanent disk-based storage.
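As a minimal sketch of the hot indexing idea, and not the claimed implementation, the following Python example keeps only the most frequently accessed indexes in memory and writes in-memory changes back to disk on synchronization. Everything named here is an assumption made for illustration: the class `HotIndexManager`, its methods, the use of a simple access counter to decide which indexes are hot, and the use of a dictionary to stand in for disk-based storage. The text above does not specify how access frequency is measured or how synchronization is performed.

```python
from collections import Counter
from typing import Dict, Optional, Set

Index = Dict[str, str]  # key -> record locator; a stand-in for a real index


class HotIndexManager:
    """Hypothetical manager that keeps the hottest indexes in main memory."""

    def __init__(self, disk: Dict[str, Index], capacity: int) -> None:
        self.disk = disk                      # simulated disk-based storage
        self.capacity = capacity              # number of indexes that fit in memory
        self.memory: Dict[str, Index] = {}    # hot indexes currently loaded
        self.hits: Counter = Counter()        # access count per index (assumption)
        self.dirty: Set[str] = set()          # hot indexes changed since last sync

    def lookup(self, index_name: str, key: str) -> Optional[str]:
        """Read from memory when the index is hot, otherwise from disk."""
        self.hits[index_name] += 1
        source = self.memory.get(index_name)
        if source is None:
            source = self.disk[index_name]
        return source.get(key)

    def update(self, index_name: str, key: str, value: str) -> None:
        """Write to the in-memory copy when hot (marking it dirty), else to disk."""
        self.hits[index_name] += 1
        if index_name in self.memory:
            self.memory[index_name][key] = value
            self.dirty.add(index_name)
        else:
            self.disk[index_name][key] = value

    def sync(self) -> None:
        """Copy every changed in-memory index back to disk."""
        for name in self.dirty:
            self.disk[name] = dict(self.memory[name])
        self.dirty.clear()

    def reload_hot(self) -> None:
        """Sync, then load the most frequently accessed indexes into memory."""
        self.sync()
        hot = [name for name, _ in self.hits.most_common(self.capacity)]
        self.memory = {name: dict(self.disk[name]) for name in hot}


# Hypothetical usage: three indexes on disk, room for one in memory.
disk = {
    "people": {"alice": "rec1"},
    "groups": {"storage": "rec2"},
    "departments": {"engineering": "rec3"},
}
manager = HotIndexManager(disk, capacity=1)
for _ in range(3):
    manager.lookup("people", "alice")
manager.lookup("groups", "storage")
manager.reload_hot()                        # "people" becomes the hot index
manager.update("people", "bob", "rec4")     # change lands in memory first
manager.sync()                              # change is now reflected on disk
```

The sketch defers disk writes for hot indexes until `sync` is called. A design that wrote through to disk on every update would be simpler to reason about but would give up part of the speed advantage the passage describes.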