With the development of networks, information is growing explosively, and the volume of data that people generate has reached an unprecedented scale. Storing and managing data on such a large scale has become a major challenge. A peer-to-peer (P2P) storage system based on distributed hash table (DHT) technology is highly scalable and supports large-scale data storage, and can therefore address this challenge well. The underlying storage engine of such a storage system is generally a key-value database (hereinafter referred to as a K-V database), that is, a non-relational database that stores and accesses data in the form of key-value pairs. Common data access operations of the storage system include insertion, search, and deletion, and the data access form is generally as follows: put (key, &value), get (key, &value), or delete (key), where key is a unique identifier of the data and value is the content of the data. In the following description, put, get, and delete correspond to an inserting operation, a searching operation, and a deleting operation, respectively.
A variable-length hash K-V database is a common hash database that can store keys and values of variable length. The basic principle of this database is to determine the storage location of each key and value using a hash algorithm and, when a hash collision is encountered, to resolve the collision using a specific data structure and a binary-tree algorithm. A logical structure of this database is shown in FIG. 1. In FIG. 1, the hash K-V database is logically divided into the following four parts: a bucket array, that is, a hash bucket whose size is the size of the hash space, where the hash bucket stores the location of each key-value pair in the corresponding storage medium; Key, which stores the key of each key-value pair; Value, which stores the value of each key-value pair; and Ptr, which, when a hash collision occurs, stores the location (or offset) of the next key-value pair with the same hash value. The foregoing components form the general structure of the hash K-V database.
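The four-part structure described above can be illustrated with a minimal in-memory sketch. It is not the implementation of any particular database. The names HashKV and Record, the use of CRC-32 as the hash algorithm, and the list that stands in for the storage medium are all assumptions made for illustration. The Ptr field is modeled here as two child links (left and right) so that colliding records form a binary tree, which is likewise an assumption. The delete operation is omitted for brevity.

```python
import zlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    """One key-value pair plus collision links (hypothetical layout)."""
    key: bytes
    value: bytes
    left: Optional[int] = None   # offset of a colliding record with a smaller key
    right: Optional[int] = None  # offset of a colliding record with a larger key


class HashKV:
    """Bucket array + records; collisions are chained as a binary tree."""

    def __init__(self, num_buckets: int = 8) -> None:
        # Each bucket holds the offset of the first record for that hash value.
        self.buckets: List[Optional[int]] = [None] * num_buckets
        # Stand-in for the storage medium: an offset is an index into this list.
        self.records: List[Record] = []

    def _bucket(self, key: bytes) -> int:
        # CRC-32 is an assumed hash function, chosen only because it is in the
        # standard library.
        return zlib.crc32(key) % len(self.buckets)

    def _append(self, key: bytes, value: bytes) -> int:
        self.records.append(Record(key, value))
        return len(self.records) - 1

    def put(self, key: bytes, value: bytes) -> None:
        b = self._bucket(key)
        off = self.buckets[b]
        if off is None:
            # Empty bucket: no collision, store the record directly.
            self.buckets[b] = self._append(key, value)
            return
        while True:
            rec = self.records[off]
            if key == rec.key:
                rec.value = value  # existing key: update in place
                return
            side = "left" if key < rec.key else "right"
            child = getattr(rec, side)
            if child is None:
                # Collision: attach the new record as a leaf of the tree.
                setattr(rec, side, self._append(key, value))
                return
            off = child

    def get(self, key: bytes) -> Optional[bytes]:
        off = self.buckets[self._bucket(key)]
        while off is not None:
            rec = self.records[off]
            if key == rec.key:
                return rec.value
            off = rec.left if key < rec.key else rec.right
        return None
```

Constructing the sketch with num_buckets=1 forces every key into the same bucket, which makes the collision tree easy to inspect: the first record inserted becomes the root, and later records hang off its left and right links.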
Because a hash algorithm is used, and a hash algorithm may inherently produce the same hash value for different inputs, hash collisions cannot be avoided. The key to the performance design of this variable-length hash K-V database therefore lies in how hash collisions are handled. Because the database uses a binary tree to resolve collisions between key-value pairs, the tree becomes relatively deep when the data volume is large. Each time a key-value pair on a leaf node needs to be read or updated, multiple nodes of the binary tree on the disk, namely those on the path to the leaf, need to be read randomly. Consequently, performance jitter is large and a long tail of performance is formed. When the data volume is relatively large and the load is relatively high, performance may degrade dramatically.
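The effect of tree depth on lookup cost can be made concrete with a small sketch that counts the nodes visited during a search of a collision tree. Each visited node stands for one random disk read. The names Node, build, and reads_for_lookup are hypothetical, the tree is an unbalanced binary search tree, and ascending insertion order is used only as a worst case that makes the tree as deep as possible.

```python
from typing import Iterable, Optional


class Node:
    """One record in a collision tree (hypothetical in-memory stand-in)."""

    def __init__(self, key: int) -> None:
        self.key = key
        self.left: Optional["Node"] = None
        self.right: Optional["Node"] = None


def insert(root: Optional[Node], key: int) -> Node:
    """Insert key into an unbalanced binary search tree and return the root."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key == node.key:
            return root
        side = "left" if key < node.key else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(key))
            return root
        node = child


def build(keys: Iterable[int]) -> Optional[Node]:
    root = None
    for k in keys:
        root = insert(root, k)
    return root


def reads_for_lookup(root: Optional[Node], key: int) -> int:
    """Count the nodes visited; each one models a random read from disk."""
    reads = 0
    node = root
    while node is not None:
        reads += 1
        if key == node.key:
            break
        node = node.left if key < node.key else node.right
    return reads


if __name__ == "__main__":
    deep = build(range(1, 8))                  # ascending keys: a 7-level chain
    shallow = build([4, 2, 6, 1, 3, 5, 7])     # same keys, 3 levels
    print(reads_for_lookup(deep, 1))           # 1
    print(reads_for_lookup(deep, 7))           # 7
    print(reads_for_lookup(shallow, 7))        # 3
```

In the deep tree, a lookup costs anywhere from 1 to 7 reads depending on the key, which is the kind of spread that shows up as jitter and a long latency tail once each read is a random disk access.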