It has long been known to store data in a database with the aid of different data structures. These structures provide different advantages with respect, for instance, to used storage space and the possibility of carrying out rapid searches.
One known structure is based on a so-called hash function which enables a random distribution of elements in a table in a database to be calculated.
An element can include an internal address in a database. The element is coupled to an external key, such as a telephone number, for instance. When entering a new telephone number in a table, a calculation is made as to in which element the internal address of the telephone number shall be placed with the aid of a hash function.
The purpose of the hash function is to distribute the addresses randomly in the table so that these addresses will be distributed uniformly therein. The same hash function is then used to find the correct element when searching for a specific address.
When using a hash function on a key, there is obtained a hash value that is comprised of a data-word which normally includes 32 bits.
One type of algorithm used in this respect is linear hashing (LH), where the data-word obtained from the function includes so-called LH-bits, among other things. Put simply, the LH-bits obtained can point to a page in the database, or to a position in a register where said position points to a page, and the sought element is located somewhere on this page.
It is known to perform hashing in two stages, in order to make the search for an element more effective. The data-word obtained is divided into two groups of bits in these two stages.
The first group of bits point, for instance, to an index in a vector and this index can point to a new vector. The second group of bits points to an index in the new vector, wherein this new index points, for instance, to a first element in a linked list of elements. The sought element is then found somewhere in this linked list of elements.
Such linked lists may have mutually different lengths. An internal address lies in each element. The various links are placed physically at totally different locations within a memory, but are linked together by means of pointers.
Thus, when searching for a certain element, there is obtained the header of a linked list of elements. If the internal address sought for is not found in the first element, the search is continued in the next element in the linked list. Since the next element in the linked list is positioned physically somewhere else in the memory, further access to the memory is normally required in order to pick-up this next element. This first pick-up of the wrong element is a so-called cache-miss.
Thus, a person searching for a certain element may suffer several cache-misses, depending on the length of the lists, which is a very expensive process. In next generation processors, a cache-miss will be corresponded time-wise by hundreds of instructions.
It is therefore desirable to minimise the number of cache-misses. An ideal situation is one in which a sought element can be found when the memory is first accessed.
It is also known, according to the aforegoing, to allow a hash value to point to a position in a register or to a vector where a value in this position points further in the database, so as to enable dynamic changes to be made in a database, thereby enabling an element to be moved between, e.g., different pages, with subsequent changing of the value in the register for this element, without needing to assign a new hash value to the element.
It is also known that a distributed database will include a plurality of nodes that form a common system in a common database. The information in a distributed database is distributed in the different nodes that are included.
A node can include a primary replica of one or more parts of the database and a secondary replica of one or more other parts of the database. A primary replica with an associated secondary replica is normally referred to as a fragment.
The following publications are considered to provide a comprehensive picture of known technology in this particular field:
"Linear Hashing: A New Tool for File and Tables Addressing", W. Litwin, reprinted from VLDB-80 in Readings in Database Systems, 2nd ed., M. Stonebraker (ed.), Morgan Kaufman, 1994.
"LH*--Linear Hashing for Distributed Files", W. Litwin, M.-A. Neimat, D. A. Schneider, ACM SIGMOD Int'l. Conf. on Management of Data, 1993.
"High-Availability LH* Schemes with Mirroring", W. Litwin, M. A. Neimat.
"LH*LH: A Scalable High Performance Data Structure for Switched Microcomputers", J. S. Karlsson, W. Litwin, T. Risch, 5th Int'l. Conf. on Extending Database Technology (EDBT '96).
"Dynamic Hash Tables", P.-A. Larsson, Communications of the ACM, April, 1988.
"Main-Memory Linear Hashing--Some Enhancements of Larson's Algorithm", M. Pettersson, University of Linkoping, Sweden, March, 1993.