Computer systems commonly deploy databases that utilize a dictionary or hash data structure to manage underlying data. Such a database, commonly referred to as a key-value database or key-value store, stores a collection of records (e.g., objects), with each record corresponding to one or more different fields of data. The records are stored in, managed in, and retrieved from the key-value store using keys, each of which uniquely identifies a corresponding record. Keys thus allow data to be located quickly in the key-value store.
For example, a user may interact with a local database of a networking platform through transactions. A transaction can be a complex mixture of operations such as Put, Delete, and Patch, and the database may assign each transaction a unique identifier, called a transaction-identifier. Transaction-identifiers are commonly assigned from an incrementing integer counter, so the transaction-identifier of an older transaction has a smaller integer value than that of a newer transaction.
A local database may also store recent transaction history, which can be maintained in a rotating transaction log. That is, after completion of each transaction, the database can append a transaction log record to the rotating transaction log and, if the size of the rotating transaction log exceeds a specified space bound (of memory), some of the oldest log records are removed to ensure that the overall size of the log remains within the specified bound.
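The rotation behavior described above can be illustrated with a minimal Python sketch. The class name `RotatingTransactionLog`, the `max_bytes` parameter, and the choice to measure the log size as the sum of the record lengths are assumptions made for illustration only; they are not taken from any particular database.

```python
from collections import OrderedDict


class RotatingTransactionLog:
    """Hypothetical rotating log keyed by an incrementing transaction-identifier."""

    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes    # assumed space bound, in bytes of record data
        self.records = OrderedDict()  # transaction-identifier -> record, oldest first
        self.size = 0                 # total bytes of record data currently held
        self.next_id = 1              # incrementing integer counter

    def append(self, record: bytes) -> int:
        """Append a log record, then evict the oldest records while over the bound."""
        txn_id = self.next_id
        self.next_id += 1
        self.records[txn_id] = record
        self.size += len(record)
        while self.size > self.max_bytes and self.records:
            _, oldest = self.records.popitem(last=False)  # remove the oldest entry
            self.size -= len(oldest)
        return txn_id


log = RotatingTransactionLog(max_bytes=10)
for payload in (b"aaaa", b"bbbb", b"cccc"):
    log.append(payload)
print(list(log.records))  # [2, 3]
```

In this sketch the oldest record is always the one with the smallest transaction-identifier, which is why rotation depends on the keys being kept in identifier order.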
Such a transaction log is commonly implemented using a key-value store, with the keys set as the transaction-identifiers to ensure proper rotation of the log. However, depending on the size of the log, transaction-identifiers can become large, thereby requiring significant amounts of memory to maintain the key-value store. Further, during in-memory processing, CPUs generally store integer values in fixed-width format so that operations (e.g., additions, multiplications) can easily be performed. Fixed-width format can waste space, however, in particular if a significant number of the integers corresponding to the transaction-identifiers are small in value. In addition, storing large numbers of fixed-width integers in a key-value store that resides on a slower storage medium, such as an external storage disk (e.g., HDD, SSD), can be inefficient. Because the processing is performed in the CPU and not at the external storage medium, keys and values must be transferred between the CPU and the external storage medium, and doing so with fixed-width integer keys can be slow.
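The space wasted by fixed-width keys can be seen in a short Python sketch. The 8-byte big-endian layout and the helper names `fixed_width_key` and `minimal_byte_length` are assumptions chosen for illustration; an actual store may use a different width or byte order.

```python
import struct


def fixed_width_key(txn_id: int) -> bytes:
    """Encode a transaction-identifier as an 8-byte big-endian unsigned integer."""
    return struct.pack(">Q", txn_id)


def minimal_byte_length(txn_id: int) -> int:
    """Smallest number of whole bytes that can hold the value (at least one)."""
    return max(1, (txn_id.bit_length() + 7) // 8)


for txn_id in (5, 300, 70_000):
    key = fixed_width_key(txn_id)
    # Columns: identifier, bytes used by the fixed-width key, bytes actually needed.
    print(txn_id, len(key), minimal_byte_length(txn_id))
# 5 8 1
# 300 8 2
# 70000 8 3
```

For small identifiers most of the eight bytes are leading zeros. A big-endian fixed-width key does, however, sort byte-wise in the same order as the integers it encodes, which is the property a rotating log relies on.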
To ameliorate this problem of wasted space, some existing variable-length integer encoding schemes use an arbitrary number of bytes to store the integers, such that the average number of bytes required to encode an integer is smaller than the fixed-width size. However, the existing encoding schemes are not order-preserving; that is, comparing two encoded values does not necessarily yield the same result as comparing the integers they represent. Therefore, existing encoding schemes cannot be implemented in databases that require the ordering of the records to be preserved, such as rotating transaction logs.
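The loss of ordering can be demonstrated with one widely used variable-length scheme, the little-endian base-128 "varint" (LEB128-style) encoding. The text above does not name a specific scheme, so the choice of this one, and the function name `encode_varint`, are assumptions made purely for illustration.

```python
def encode_varint(n: int) -> bytes:
    """Little-endian base-128 encoding: 7 payload bits per byte, and the high
    bit set on every byte except the last."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # final byte
            return bytes(out)


print(encode_varint(255).hex())  # ff01
print(encode_varint(256).hex())  # 8002

# Sorting by the encoded bytes puts 256 before 255.
print(sorted([255, 256], key=encode_varint))  # [256, 255]
```

Because the least significant bits are written first, a byte-wise comparison of the encoded keys is decided by the low-order bits rather than by the magnitude of the integer. A key-value store that orders keys byte-wise would therefore not hold such keys in transaction order.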