The present invention relates to a data store, and more specifically to an information processor configured to build a data store capable of efficiently storing keys.
A growing number of applications, such as those relating to language processing and user management, need to store a large number of character strings, such as words, phrases, persons' names, and URLs, in a limited amount of memory, which calls for high space efficiency. A highly space-efficient data store makes it possible to manage such a large number of character strings within limited memory and thus to implement the aforementioned applications efficiently.
Traditionally, a hash map (or hash table) has been used as a data store for this purpose. The hash map is a data structure in which keys are mapped to values by using a hash function. A value is registered under a key and is later retrieved by referencing that key. Because values can be registered and retrieved one key at a time, a hash map can be built incrementally. Hash maps also enable high-speed access, because both search and addition complete in roughly constant time on average regardless of the number of elements stored. One challenge in using hash maps, however, is that the table must be kept sufficiently sparse to keep the rate of hash collisions low, which reduces memory space efficiency.
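The relationship between table sparsity and hash collisions described above can be illustrated with a minimal sketch. It assumes an open-addressing table with linear probing and an FNV-1a hash; the class name `OpenAddressingMap`, the `probes` counter, and the sample words are hypothetical choices made for illustration only and do not come from any particular hash map implementation.

```python
def fnv1a(key: str) -> int:
    """Deterministic 32-bit FNV-1a hash (an assumption of this sketch)."""
    h = 0x811C9DC5
    for byte in key.encode("utf-8"):
        h ^= byte
        h = (h * 0x01000193) & 0xFFFFFFFF
    return h


class OpenAddressingMap:
    """Fixed-capacity hash map using linear probing (illustrative only)."""

    def __init__(self, capacity: int):
        self.slots = [None] * capacity  # each slot is None or (key, value)
        self.size = 0
        self.probes = 0  # total slots inspected by put() and get() so far

    def _find(self, key):
        """Return the index holding key, the first empty slot on its
        probe path, or None if the table is full and key is absent."""
        i = fnv1a(key) % len(self.slots)
        for _ in range(len(self.slots)):
            self.probes += 1
            slot = self.slots[i]
            if slot is None or slot[0] == key:
                return i
            i = (i + 1) % len(self.slots)  # collision: try the next slot
        return None

    def put(self, key, value):
        i = self._find(key)
        if i is None:
            raise RuntimeError("table is full")
        if self.slots[i] is None:
            self.size += 1
        self.slots[i] = (key, value)

    def get(self, key, default=None):
        i = self._find(key)
        if i is None or self.slots[i] is None:
            return default
        return self.slots[i][1]


words = ["apple", "apply", "banana", "band", "bandit", "cat", "cater", "dog"]
dense = OpenAddressingMap(capacity=8)    # completely full after the inserts
sparse = OpenAddressingMap(capacity=64)  # mostly empty after the inserts
for n, w in enumerate(words):
    dense.put(w, n)
    sparse.put(w, n)
print("slots inspected (dense): ", dense.probes)
print("slots inspected (sparse):", sparse.probes)
```

In this sketch, the sparse table reserves eight times as many slots for the same eight keys. That extra, mostly unused space is the cost paid to keep probe sequences short.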
A trie, also known as a prefix tree, is an ordered tree data structure used to store an associative array whose keys are usually strings. Typically, no node in the tree stores the key associated with that node; instead, the node's position in the tree defines the key with which it is associated. A trie implemented with a double-array is known as another data store for the aforementioned usage. The trie implemented with the double-array (hereinafter, a double-array or a double-array trie) is a data structure in which keys are stored by means of the link structure of the tree. The double-array is inferior to the hash map in terms of data access speed, but is known to achieve relatively high memory space efficiency. Choosing between the two options therefore involves a trade-off between access speed and space efficiency.
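The way a node's position, rather than a stored key, identifies the key can be illustrated with a minimal sketch. It assumes a plain linked trie whose nodes hold a dictionary of child links; it is not a double-array implementation, which would instead pack the same transitions into integer arrays. The names `TrieNode`, `Trie`, `insert`, `lookup`, and `keys_with_prefix` are hypothetical and chosen for illustration only.

```python
class TrieNode:
    """One trie node. It stores no key; the path from the root spells it."""

    def __init__(self):
        self.children = {}     # character -> TrieNode (the link structure)
        self.terminal = False  # True if a registered key ends at this node
        self.value = None      # value registered for that key, if any


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, key: str, value) -> None:
        """Follow or create one link per character, then mark the end node."""
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.terminal = True
        node.value = value

    def _walk(self, key: str):
        """Return the node reached by following key, or None if a link is missing."""
        node = self.root
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def lookup(self, key: str, default=None):
        node = self._walk(key)
        if node is None or not node.terminal:
            return default
        return node.value

    def keys_with_prefix(self, prefix: str):
        """Collect every registered key that starts with prefix."""
        start = self._walk(prefix)
        if start is None:
            return []
        found, stack = [], [(start, prefix)]
        while stack:
            node, spelled = stack.pop()
            if node.terminal:
                found.append(spelled)
            for ch, child in node.children.items():
                stack.append((child, spelled + ch))
        return found


trie = Trie()
for n, w in enumerate(["banana", "band", "bandit", "cat"]):
    trie.insert(w, n)
print(trie.lookup("band"))
print(sorted(trie.keys_with_prefix("ban")))
```

In this sketch, the keys "banana", "band", and "bandit" share the nodes for "b", "a", and "n", so their common prefix is stored only once. This sharing is one reason trie-based stores can be space-efficient for large sets of similar strings.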