In computer science, an associative array, map, symbol table, or dictionary is an abstract data type composed of a collection of (key, value) pairs; a key is used to address the value associated therewith. Keys or “data keys” may be any items of data, such as numbers, names, or addresses; values or “type values” may be categories or groupings to which subsets of the data keys may be assigned. For example, numeric data keys may be assigned to the type values “odd” and “even.” A common task in data and computer processing involves determining the type value Vi of a data key Kj, given a set of a priori associations between data keys and type values. For example, it may be known that data keys K1-K10 are of type value V1 and that data keys K11-K20 are of type value V2; a data structure may be created to store these associations. If a data key Kj of unknown type is encountered, its type value Vi may be determined by looking up the data key Kj in the data structure, which returns the corresponding type value Vi.
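The look-up described above may be illustrated with a minimal sketch. It assumes, purely for illustration, that the data keys K1-K20 are represented by the integers 1 through 20 and the type values by the strings "V1" and "V2"; the names `associations` and `lookup_type_value` are hypothetical and do not appear in the original text.

```python
# Hypothetical a priori associations: keys 1-10 are of type "V1",
# keys 11-20 are of type "V2".
associations = {}
for key in range(1, 11):
    associations[key] = "V1"
for key in range(11, 21):
    associations[key] = "V2"


def lookup_type_value(key):
    """Return the type value stored for `key`, or None if the key is unknown."""
    return associations.get(key)


if __name__ == "__main__":
    print(lookup_type_value(3))
    print(lookup_type_value(15))
    print(lookup_type_value(42))
```

A built-in dictionary stores every association exactly, so the result is always correct for known keys, and keys with no stored association are reported as unknown.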
A straightforward implementation of the data structure that stores the associations might include an array, look-up table, or similar construct. This implementation, however, would consume a prohibitively large amount of memory when the set of data keys Kj and/or the set of type values Vi becomes large (e.g., on the order of millions or billions of entries). Such a large data structure might also increase the processing power and time required to complete a look-up request to undesirable levels.
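The growth in memory may be estimated with a rough calculation. The sketch below assumes, hypothetically, that each stored association occupies 8 bytes for the data key and 1 byte for the type value, with no per-entry overhead; these sizes and the function names are illustrative assumptions and are not taken from the original text.

```python
def table_size_bytes(num_keys, key_bytes=8, value_bytes=1):
    """Estimate the raw storage for `num_keys` (key, value) entries.

    The per-entry sizes are assumptions; real tables add overhead
    (pointers, padding, unused slots), so this is a lower bound.
    """
    return num_keys * (key_bytes + value_bytes)


def to_gib(num_bytes):
    """Convert a byte count to gibibytes (2**30 bytes)."""
    return num_bytes / 2**30


if __name__ == "__main__":
    for num_keys in (10**6, 10**9):
        size = table_size_bytes(num_keys)
        print(f"{num_keys} keys: {size} bytes ({to_gib(size):.2f} GiB)")
```

Under these assumptions, one million associations need about 9 MB, while one billion need about 9 GB before any overhead is counted.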
A more sophisticated implementation might use a probabilistic data structure to store the associations; such a data structure trades accuracy for a smaller memory footprint and/or faster look-ups. For example, the probabilistic data structure might consume less memory than the straightforward implementation, but would return a type value Vi for a given data key Kj that is accurate only within a certain margin of error. This margin of error may be unacceptable for many applications; however, decreasing the margin of error of the probabilistic data structure to an acceptable level may increase the memory footprint of the probabilistic data structure to an unacceptable level.
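One way to picture this trade-off is with a Bloom filter per type value. The original text does not name a particular probabilistic data structure, so the Bloom filter is used here only as an assumed example, and the class, parameters (`num_bits`, `num_hashes`), and function names are hypothetical. Each filter stores a fixed number of bits regardless of how many keys are added; a look-up never misses a stored key but may report a type value for a key that was never assigned to it.

```python
import hashlib


class BloomFilter:
    """A small Bloom filter: fixed memory, possible false positives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive `num_hashes` bit positions by hashing the key with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # True for every added key; occasionally True for other keys.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


# One filter per type value (sizes chosen arbitrarily for illustration).
filters = {"V1": BloomFilter(256, 3), "V2": BloomFilter(256, 3)}
for key in range(1, 11):
    filters["V1"].add(key)
for key in range(11, 21):
    filters["V2"].add(key)


def candidate_type_values(key):
    """Return every type value whose filter reports the key as possibly present."""
    return [value for value, f in filters.items() if f.might_contain(key)]
```

In this sketch, shrinking `num_bits` saves memory but raises the chance that `candidate_type_values` returns a wrong or extra type value, while enlarging it lowers the error rate at the cost of memory, which is the tension described above.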
A need therefore exists for a system and method of determining type values Vi for given data keys Kj with greater accuracy and a reduced memory footprint.