The present disclosure relates generally to array tables used in database management, and more particularly to a hashing scheme using compact array tables.
A hash table is a data structure used for the management of data in a computing environment. Hash tables often implement an associative array in order to build a structure that can map keys to values. A hash table uses a hash function to compute an index into an array of slots, from which the correct value can be found.
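By way of illustration only, the mapping from a key to a slot index may be sketched as follows. The table size `NUM_SLOTS`, the helper names `slot_index`, `put`, and `get`, and the use of CRC-32 as the hash function are assumptions made for this example and are not taken from the disclosure. Collisions are deliberately ignored in this sketch.

```python
import zlib

NUM_SLOTS = 8  # assumed table size, chosen only for illustration

# The array of slots; each slot holds one (key, value) pair or None.
slots = [None] * NUM_SLOTS


def slot_index(key: str, num_slots: int = NUM_SLOTS) -> int:
    """Hash the key and reduce the hash value to an index into the slot array."""
    return zlib.crc32(key.encode("utf-8")) % num_slots


def put(key: str, value) -> None:
    """Store the pair in its slot, overwriting any previous occupant."""
    slots[slot_index(key)] = (key, value)


def get(key: str):
    """Return the value stored for key, or None if the slot holds no match."""
    entry = slots[slot_index(key)]
    if entry is not None and entry[0] == key:
        return entry[1]
    return None


put("alice", 30)
```

After the call to `put`, `get("alice")` returns the stored value, while a key that was never inserted yields `None`.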
A hash function would ideally assign each possible key to a unique slot. Unfortunately, this is often not possible when data is processed dynamically and new entries are continuously added to the table after it is created. In many instances, the hash function unintentionally assigns different keys to the same slot, creating a situation known as a collision. Some schemes are implemented to minimize collisions. Other schemes assume that collisions are unavoidable and try to implement ways to accommodate them.
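As a minimal sketch of one way a scheme may accommodate collisions, the example below uses separate chaining, in which each slot holds a list of entries. Separate chaining is a common textbook technique chosen here for illustration; it is not asserted to be the scheme of the present disclosure. The class name `ChainedTable` and the deliberately weak byte-sum hash are hypothetical and serve only to force a collision.

```python
class ChainedTable:
    """Hash table that tolerates collisions by chaining entries per slot."""

    def __init__(self, num_slots: int = 4):
        # One independent list (chain) per slot.
        self.slots = [[] for _ in range(num_slots)]

    def _index(self, key: str) -> int:
        # Deliberately weak hash (sum of bytes), so that keys made of the
        # same characters, such as "ab" and "ba", land in the same slot.
        return sum(key.encode("utf-8")) % len(self.slots)

    def put(self, key: str, value) -> None:
        bucket = self.slots[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # update an existing entry
                return
        bucket.append((key, value))  # new key: extend the chain

    def get(self, key: str, default=None):
        for existing_key, value in self.slots[self._index(key)]:
            if existing_key == key:
                return value
        return default


table = ChainedTable()
table.put("ab", 1)
table.put("ba", 2)  # collides with "ab" but is kept in the same chain
```

Both keys remain retrievable even though they share a slot, at the cost of a linear scan of the chain on each lookup.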
A good hash function and implementation algorithm are essential for hash table performance. A basic requirement of a good hash function is to provide a uniform distribution of hash values. A non-uniform distribution increases the number of collisions and the cost of resolving them. Uniformity, however, is difficult to ensure, sometimes leading to pockets where data is more densely placed. The latter is often referred to as skewing and is another problem that needs to be addressed.
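The effect of a non-uniform distribution may be illustrated by counting how many keys fall into each slot under two different hash functions. The sketch below is an assumption-laden example: the helper `slot_loads`, the two hash functions `first_byte_hash` and `crc32_hash`, the key set, and the slot count of 16 are all hypothetical and chosen only to make the skew visible.

```python
import zlib
from collections import Counter


def slot_loads(keys, hash_fn, num_slots):
    """Return a list giving the number of keys assigned to each slot."""
    counts = Counter(hash_fn(key) % num_slots for key in keys)
    return [counts.get(i, 0) for i in range(num_slots)]


def first_byte_hash(key: str) -> int:
    # Poor hash: looks only at the first byte of the key.
    return key.encode("utf-8")[0]


def crc32_hash(key: str) -> int:
    # Hash that depends on every byte of the key.
    return zlib.crc32(key.encode("utf-8"))


keys = [f"user{i}" for i in range(1000)]

# Every key starts with "u", so the poor hash sends all 1000 keys to a
# single slot: an extreme case of skew.
skewed = slot_loads(keys, first_byte_hash, 16)

# The same keys distributed using a hash over the full key.
spread = slot_loads(keys, crc32_hash, 16)
```

Comparing `max(skewed)` with `max(spread)` gives a rough measure of the skew: the larger the most heavily loaded slot, the more collisions must be resolved there.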