1. Technical Field
The present invention generally relates to data sharing management. More particularly, the present invention relates to hashing schemes.
2. Background Art
In a data sharing environment, different transactions running on different nodes or central electronic complexes (hereinafter, CECs) may require access to, and the ability to alter the contents of, the same data unit at the same time. Where two or more CECs require access to the same data unit and any one of them needs to modify its contents, rather than all merely reading them, the situation is referred to as a "real contention". To avoid real contentions, each access to a shared data unit is accomplished via a request, lock and release arrangement. However, this arrangement requires tracking the availability of each data unit, which can be a burdensome task for the data sharing environment. To reduce this burden, hashing schemes have been developed.
A hashing scheme provides concurrency control; that is, it tracks the use of data units and prevents concurrent modifications thereto. Hashing schemes utilize a hash table, having few entries relative to the number of data units, into which access information is mapped. Which hash entry a given data unit will be paired with is determined through the use of a hash function.
As an example of a hash function, consider a data sharing environment having thousands of data units and 100 hash table entries, from 0 to 99. A data unit number is examined to determine if it is more than two digits; if so, the data unit number is truncated to the last two digits. Thus, data unit number 1034 would correspond to hash table entry 34. This process of examination and truncation is the hash function.
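The truncation described above can be sketched in a few lines. This is only an illustration of the example, not a prescribed implementation; the names `TABLE_SIZE` and `hash_entry` are hypothetical, and data unit numbers are assumed to be non-negative integers.

```python
# Illustrative sketch of the two-digit truncation hash from the example.
# TABLE_SIZE and hash_entry are hypothetical names, not taken from the source.
TABLE_SIZE = 100  # hash table entries numbered 0 to 99


def hash_entry(data_unit_number: int) -> int:
    """Return the hash table entry for a non-negative data unit number.

    Taking the remainder modulo 100 keeps the last two decimal digits,
    and leaves numbers of one or two digits unchanged.
    """
    return data_unit_number % TABLE_SIZE


print(hash_entry(1034))  # 34
print(hash_entry(7))     # 7
```

Using the remainder operation covers both branches of the examination step, since a number of at most two digits is already its own remainder modulo 100.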
Although existing hashing schemes effectively prevent concurrent modifications to the same data unit, they often result in a high rate of "false contentions". A false contention occurs when there is no real contention, but the different data units being accessed hash to the same hash entry. In the above example, a false contention would occur if access were requested to data units 634 and 1034 at the same time, whether both were to be modified, both were to be read, or one were to be read and the other modified, since both correspond to hash entry 34. Because the two data units are different, access would ultimately be allowed. However, resources are wasted in resolving the false contention; that is, in determining that the contention is false rather than real. Thus, efficiency would improve if a hashing scheme were developed that decreases the likelihood of false contentions.
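The distinction between real and false contentions can be illustrated with a small classifier built on the two-digit hash from the example above. This is a hedged sketch: the function names, the `"read"`/`"write"` mode strings and the returned labels are assumptions made for illustration, and the treatment of two reads of the same data unit as no contention follows from the definition of real contention given earlier.

```python
# Illustrative sketch only; names, mode strings and labels are hypothetical.
TABLE_SIZE = 100


def hash_entry(data_unit_number: int) -> int:
    """Two-digit truncation hash from the example (last two decimal digits)."""
    return data_unit_number % TABLE_SIZE


def classify_contention(first, second):
    """Classify two simultaneous accesses, each a (data_unit_number, mode) pair.

    Returns "real" when both target the same data unit and at least one
    modifies it, "false" when different data units share a hash entry,
    and "none" otherwise.
    """
    unit_a, mode_a = first
    unit_b, mode_b = second
    if hash_entry(unit_a) != hash_entry(unit_b):
        return "none"  # different hash entries: nothing to resolve
    if unit_a != unit_b:
        return "false"  # same entry, different data units
    if "write" in (mode_a, mode_b):
        return "real"  # same data unit, at least one modification
    return "none"  # same data unit, both merely reading


print(classify_contention((634, "read"), (1034, "write")))   # false
print(classify_contention((1034, "read"), (1034, "write")))  # real
```

The work needed to reach the `"false"` branch, comparing the actual data unit identities after the hash entries match, is the wasted effort the passage above describes.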
One way to decrease the likelihood of concurrent hashes to the same hash entry is to increase the size of the hash table; that is, to increase the number of hash table entries. Given the same number of data units, an increase in the number of hash table entries would result in fewer data units corresponding to each entry, decreasing the likelihood of concurrent hashes to the same entry. However, this simple solution runs counter to the purpose of the hashing scheme, which is to minimize the resources required to track data unit accesses, in this case by keeping the hash table small. An increase in hash table size would require a greater amount of valuable memory space.
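The effect of table size on the likelihood of two data units sharing an entry can be estimated with a short calculation. The sketch below assumes, purely for illustration, that data units are numbered consecutively from 0, that the hash is the remainder modulo the table size, and that any two distinct data units are equally likely to be accessed together; the function name is hypothetical.

```python
# Illustrative sketch under the stated assumptions; not a measured result.
from collections import Counter
from math import comb


def shared_entry_probability(num_units: int, table_size: int) -> float:
    """Fraction of distinct data unit pairs that map to the same hash entry."""
    units_per_entry = Counter(unit % table_size for unit in range(num_units))
    colliding_pairs = sum(comb(count, 2) for count in units_per_entry.values())
    return colliding_pairs / comb(num_units, 2)


small_table = shared_entry_probability(10_000, 100)
large_table = shared_entry_probability(10_000, 1_000)
print(f"100 entries:  {small_table:.6f}")
print(f"1000 entries: {large_table:.6f}")
```

Under these assumptions a tenfold larger table reduces the fraction of colliding pairs by roughly a factor of ten, at the cost of ten times as many entries to store.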
Another way to decrease the likelihood of false contentions is to choose a smaller data unit, such as records instead of pages. This would allow access to different records within the same page. However, when the data unit size is small, many access requests must be processed, utilizing a greater amount of valuable resources and/or taking a significantly longer time to process.
Thus, a need exists for a hashing scheme that minimizes the number of false contentions without overburdening resources.