Relational databases store data in tables. Generally, a table in a relational database consists of data organized into columns and rows. Each column represents a particular field, such as “last name” or “ID.” Each row represents a data record whose values are stored in those columns, such as “Smith” and “12345.” A particular table may have millions of rows of data, making a search for a particular row slow and cumbersome. To speed access to the records in a table, databases index the rows with an index structure that supports efficient algorithmic search. Historically, relational databases have used an index structure called a B+ tree to provide the shortest possible path to the desired data. In a B+ tree, a search is performed from the root of the tree through intermediate nodes down to leaf nodes. The root node and the intermediate nodes are collectively known as index nodes and point to other index nodes or to the leaf nodes. The leaf nodes point directly to records in the database (the row data). A B+ tree contains copies of the keys in the nodes and has a high number of children per node, making the path from the root node to the leaf nodes short. A short path is desirable because it results in fewer accesses to the disk storing the index. Disk accesses are much slower than main memory accesses, but because of the cost of main memory, the tables and indexes are generally stored on disk-type storage devices.
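The root-to-leaf search described above can be illustrated with a minimal sketch. The class names IndexNode and LeafNode, the search function, and the sample keys are hypothetical and chosen only for illustration; a practical B+ tree would also handle insertion, deletion, and node splitting.

```python
from bisect import bisect_right


class IndexNode:
    """Root or intermediate node: separator keys plus pointers to children."""

    def __init__(self, keys, children):
        self.keys = keys          # sorted separator keys
        self.children = children  # always one more child than keys


class LeafNode:
    """Leaf node: keys paired with references to the row data."""

    def __init__(self, keys, rows):
        self.keys = keys
        self.rows = rows


def search(node, key):
    """Walk from the root through index nodes to a leaf; return the row or None."""
    while isinstance(node, IndexNode):
        # Keys equal to or greater than a separator follow the child to its right.
        node = node.children[bisect_right(node.keys, key)]
    for leaf_key, row in zip(node.keys, node.rows):
        if leaf_key == key:
            return row
    return None


# A two-level tree: one index node (the root) over two leaf nodes.
leaves = [
    LeafNode([10, 20], ["row10", "row20"]),
    LeafNode([30, 40], ["row30", "row40"]),
]
root = IndexNode([30], leaves)
```

In this sketch, search(root, 30) follows the right-hand child and returns "row30", while search(root, 25) reaches the left-hand leaf and returns None because no such key is stored.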
As main memory decreased in price, databases stored in main memory became practical. Because disk accesses are not a concern for main memory databases, the index for a main memory database should optimize cache memory usage rather than reduce disk accesses. To accommodate this, two techniques arose to increase the search performance of the B+ tree. The first decreases the number of “down” pointers in a node by addressing groups of nodes rather than individual child nodes. In such an index, the child nodes are stored contiguously in main memory in a node group. Thus, only one pointer is needed to point to the child nodes even though the node group contains several children. The location of each child (index key) can be calculated by simple arithmetic, since the nodes contain a fixed number of bytes and are contiguous. The node groups are also organized to avoid crossing cache lines (e.g., if a particular cache is organized in 64-byte blocks, a cache-line boundary falls every 64 bytes). This technique allows nodes to contain more keys, increasing processor cache efficiency. The root group of such a cache-sensitive index contains a single node, but subsequent groups may have one node more than the keys in the parent group. A cache-sensitive index focuses on reducing pointer overhead (i.e., reducing the number of pointers) and improving space utilization so that more keys can be added to the same-sized node. This trades update speed for search speed: searches are faster, but updates are slower because they involve copying entire groups of nodes rather than individual nodes. FIG. 1 depicts a cache-sensitive index with each node group represented by a dashed rectangle.
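The address arithmetic for a node group can be sketched as follows. The 64-byte node size, the function names, and the sample base addresses are assumptions made for illustration; they are not taken from FIG. 1.

```python
NODE_SIZE = 64    # assumed fixed node size in bytes
CACHE_LINE = 64   # assumed cache-line size in bytes


def child_address(group_base, child_index):
    """Address of a child node, computed from the single stored group pointer."""
    # Nodes are fixed-size and contiguous, so no per-child pointer is needed.
    return group_base + child_index * NODE_SIZE


def crosses_cache_line(address):
    """True if a node starting at this address would straddle a cache-line boundary."""
    return address % CACHE_LINE + NODE_SIZE > CACHE_LINE


# A group whose base address is aligned to a cache line keeps every node aligned.
aligned_base = 256
third_child = child_address(aligned_base, 3)   # 256 + 3 * 64 = 448
```

With an aligned base of 256, every child address is a multiple of 64 and no node crosses a cache line. A misaligned base such as 260 would cause every node in the group to straddle a boundary, which is why the groups are laid out on cache-line boundaries.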
The second cache-conscious version of the B+ tree is a partial key index. A partial key index reduces the size of the index by storing only partial keys, rather than full keys, in the index nodes. Each node contains a set of down pointers, which point to other nodes or to records (rows). The nodes also contain partial keys and pointers to the full key, which is located in the record itself. The partial key information includes a two-byte offset indicating the position at which the key first differs from the base key, and the two bytes of data beginning at that offset. For example, if a base key contains “ABCDEF” and the next key contains “ABEGXY”, the partial key contains an offset of 2 (counting from zero) and the 2 bytes of differing data contain “EG.” Thus, the partial key contains only the position at which the key differs from the base and the two bytes of data at that position. A partial key index focuses on lowering key-comparison cost rather than reducing pointer overhead. FIG. 2 represents a partial-key index.
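The partial-key computation in the example above can be sketched as follows. The function name partial_key and its length parameter are hypothetical, and the offset is assumed to be counted from zero, which matches the “ABCDEF” and “ABEGXY” example.

```python
def partial_key(base, key, length=2):
    """Return (offset, data): where key first differs from base, and the bytes there."""
    shortest = min(len(base), len(key))
    # First position at which the two keys disagree; falls back to the shorter
    # length when one key is a prefix of the other.
    offset = next((i for i in range(shortest) if base[i] != key[i]), shortest)
    return offset, key[offset:offset + length]


offset, data = partial_key(b"ABCDEF", b"ABEGXY")
# offset is 2 and data is b"EG"
```

Only the offset and the two data bytes would be stored in the index node; a comparison that cannot be settled from them would follow the pointer to the full key in the record.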
The Domain Name System (DNS) uses a distributed network of name servers (lookup nodes) to translate text-based web addresses, such as “www.acme-co.com,” to Internet Protocol (IP) addresses, such as “192.0.2.3.” When an Internet user requests a web address, one or more name servers process the DNS request by looking up the web address in a database of registered domains. When the name server locates the web address in the database, the IP address is sent back to the user's computing device.
Some name servers must handle millions of DNS requests each second. Furthermore, the name servers must perform the resolution quickly to enhance the user experience on the Internet. Therefore, name servers may use main-memory databases to store the records needed to resolve a DNS request, allowing faster access to the data. In addition, web addresses are added to and removed from the name server database daily. To accurately resolve a DNS request, the name server must rely on an index updated in real time.
Because the traditional DNS resolution process is vulnerable to hacking (e.g., forged DNS data), the industry has begun to implement a secure version of DNS named DNSSEC (DNS Security Extensions). DNSSEC requires each DNS lookup node to authenticate the DNS request, thus ensuring that the request will not be misdirected to a fraudulent site.
To authenticate a DNSSEC request, the lookup node must determine where the web address falls in relation to the DNS zone. Internet addresses are divided into DNS zones in a hierarchical tree-like fashion. The root zone includes all top-level international, ISO country-code, and generic domains and is serviced by root name servers. Below the root zone are top-level domains (TLDs), such as “.com,” “.net,” and “.org.” The TLDs may be further divided into zones managed by organizations that register the second-level domains. These organizations may decide to delegate authority for sub-zones within lower-level domains. Thus, there may be several name servers responsible for the different zones associated with a web address. DNSSEC requires a name server to determine not only whether a particular web address exists, but also what falls just after it and just prior to it in the zone.
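The need to find the entries just before and just after a name can be illustrated with an ordered lookup. This is a minimal sketch: the function name neighbors and the sample zone contents are hypothetical, and plain string ordering is assumed here for simplicity, whereas DNSSEC itself defines a canonical ordering of names.

```python
from bisect import bisect_left


def neighbors(sorted_names, name):
    """Return (exists, previous_name, next_name) for name in a sorted zone list."""
    i = bisect_left(sorted_names, name)
    exists = i < len(sorted_names) and sorted_names[i] == name
    previous_name = sorted_names[i - 1] if i > 0 else None
    # Skip over the name itself when it is present in the zone.
    next_index = i + 1 if exists else i
    next_name = sorted_names[next_index] if next_index < len(sorted_names) else None
    return exists, previous_name, next_name


zone = ["alpha.com", "delta.com", "kappa.com"]   # illustrative names, kept sorted
found = neighbors(zone, "delta.com")
missing = neighbors(zone, "gamma.com")
```

Here found is (True, "alpha.com", "kappa.com") and missing is (False, "delta.com", "kappa.com"). An index that answers only exact-match queries could not produce the second result, which is why an ordered index structure is needed.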
Therefore, it is desirable to introduce an index structure that facilitates faster access to large main-memory databases while still retaining the ability to add and delete records from the index in real time.